ML needs a new programming language – Interview with Chris Lattner

283 points by melodyogonna a day ago

Thank you for all the great interest in the podcast and in Mojo. If you're interested in learning more, Mojo has a FAQ that covers many topics (including "why not make Julia better" :-) here: https://docs.modular.com/mojo/faq/

Mojo also has a bunch of documentation https://docs.modular.com/mojo/ as well as hundreds of thousands of lines of open source code you can check out: https://github.com/modular/modular

The Mojo community is really great, please consider joining, either our discourse forum: https://forum.modular.com/ or discord https://discord.com/invite/modular chat.

-Chris Lattner

bsaul 18 hours ago

Fan here.
I watched a lot of your talks about mojo, where you mention how mojo benefits from very advanced compiler technology. But i've never seen you give concret example of this advanced technology. Please, can you write a blog about that, going as deep and hardcore tech as you can ? As i'm not a compiler dev i'll probably understand 20% of it, but hopefully i'll start to get a sense of how advanced the whole thing is.
- melodyogonna 18 hours ago
  
  They've given a talk abut this on the LLVM dev meeting: https://www.youtube.com/watch?v=SEwTjZvy8vw&
  They have some fancy tech around compile-time interpreter, mojopkg, and their code generator. They also have a graph compiler (not mentioned in the talk) that can fuse kernels written in Mojo.
  - timmyd 16 hours ago
    
    You might also enjoy this series: https://www.modular.com/democratizing-ai-compute which goes into a lot of the details.
tialaramex 15 hours ago

The FAQ answers: "Is the Mojo Playground still available?" by pointing me to the playground, but, the playground itself says "Playground will be removed with the next release of 25.6"
As a FAQ answer to "is this available?" pointing me to a thing which says it will be removed at some unspecified point sort of misses the point. I guess maybe the answer is "Not for long" ?
hirvi74 9 hours ago

Thank you for all you have done over the years, Chris. Your work has been an endless source of inspiration for myself and many others.
Razengan 17 hours ago

Thanks for Swift, one of the best languages I've used (even if the tooling around it leaves something to be desired), along with GDScript, for different reasons ^^
WaxProlix 14 hours ago

Hey Chris, great to see you here. Still waiting for my hefty kickstart of Light Table to pay off, any updates there?
(for real though, this looks cool; what kinds of longevity should we expect from such a project - how can we be sure?)
- nl 13 hours ago
  
  Light Table is by Chris Granger. This is Chris Lattner who previously created Swift
  - dismalaf 11 hours ago
    
    I think LLVM is more monumental than Swift personally.
    
    GeekyBear 9 hours ago
    
    LLVM, MLIR, Clang, Swift, and now Mojo.
    He had quite a long and distinguished career when it comes to open sourced compiler tech.

MontyCarloHall a day ago

The reason why Python dominates is that modern ML applications don't exist in a vacuum. They aren't the standalone C/FORTRAN/MATLAB scripts of yore that load in some simple, homogeneous data, crunch some numbers, and spit out a single result. Rather, they are complex applications with functionality extending far beyond the number crunching, which requires a robust preexisting software ecosystem.

For example, a modern ML application might need an ETL pipeline to load and harmonize data of various types (text, images, video, etc., all in different formats) from various sources (local filesystem, cloud storage, HTTP, etc.) The actual computation then must leverage many different high-level functionalities, e.g. signal/image processing, optimization, statistics, etc. All of this computation might be too big for one machine, and so the application must dispatch jobs to a compute cluster or cloud. Finally, the end results might require sophisticated visualization and organization, with a GUI and database.

There is no single language with a rich enough ecosystem that can provide literally all of the aforementioned functionality besides Python. Python's numerical computing libraries (NumPy/PyTorch/JAX etc.) all call out to C/C++/FORTRAN under the hood and are thus extremely high-performance, and for functionality they don't implement, Python's C/C++ FFIs (e.g. Python.h, NumPy C integration, PyTorch/Boost C++ integration) are not perfect, but are good enough that implementing the performance-critical portions of code in C/C++ is much easier compared to re-implementing entire ecosystems of packages in another language like Julia.

benzible a day ago

Python's ecosystem is hard to beat, but Elixir/Nx already does a lot of what Mojo promises. EXLA gives you GPU/TPU compilation through XLA with similar performance to Mojo's demos, Explorer handles dataframes via Polars, and now Pythonx lets you embed Python when you need those specialized libraries.
The real difference is that Elixir was built for distributed systems from day one. OTP/BEAM gives the ability to handle millions of concurrent requests as well as coordinating across GPU nodes. If you're building actual ML services (not just optimizing kernels), having everything from Phoenix / LiveView to Nx in one stack built for extreme fault-tolerance might matter more than getting the last bit of performance out of your hardware.
- devbug 4 hours ago
  
  I recently built out training and inference at a FinTech (for fraud and risk) using Elixir and tried this very approach…
  We’re now using Python for training and forking Ortex (ONNX) for inference.
  The ecosystem just isn’t there, especially for training. It’s a little better for inference but still has significant gaps. I will eventually have time to push contributions upstream but Python has so much momentum behind it.
  Livebooks are amazing though and a better experience than anything a Python offers sans libraries.
- melodyogonna a day ago
  
  Who uses this Exla in production?
  - benzible 15 hours ago
    
    These guys, for one: https://www.amplified.ai
    See: https://www.youtube.com/watch?v=5FlZHkc4Mq4
benreesman 15 hours ago

I'm in kind of a different place with it on the inference side.
I've got these crazy tuned up CUDA kernels that are relatively straightforward to build in isolation and really where all the magic happens, and there's this new CUTLASS 3 stuff and modern C++ can call it all trivially.
And then there's this increasingly thin film of torch crap that's just this side of unbuildable and drags in this reference counting and broken setup.py and it's a bunch of up and down projections to the real hacker shit.
I'm thinking I'm about one squeeze of the toothpaste tube from just squuezing that junk out and having a nice, clean, well-groomed C++ program that can link anything and link into anything.
- pjmlp 9 hours ago
  
  CUTLASS 4 has first class support for Python.
  - saagarjha 5 hours ago
    
    In fact I doubt the C++ API will be getting much love moving forward
    
    pjmlp 5 hours ago
    
    At GTC 2025, NVidia introduced two major changes in CUDA ecosystem.
    First class support for Python JIT/DSLs across the whole ecosystem.
    Change the way C++ is used and taught, more focused on standard C++ support and libraries, than low level CUDA extensions.
    So in a way, I think you're kind of right.
    
    benreesman a minute ago
    
    Nah, their people are way involved in mdarray and ROCm is looking to have the "oh no its broken again" bit flipped off in the RDNA 4/5 cycle.
    NVIDIA wants Python and C++ people, they want a new thing to moat up on, and they know it has to be legitimately good to defy economic gravity on chips a lot of companies can design and fan now.
nialv7 10 hours ago

You argument is circular. Python has all this ecosystem _because_ it have been the language of choice for ML for a decade. At this point it's difficult to beat, but doesn't explain why it was chosen all those years ago.
- 317070 5 hours ago
  
  I was there when it was chosen all those years ago.
  At the time (2007-2009), Matlab was the application of choice for what would become "deep" learning research. Though it had its warts, and licensing issues. It was easy for students to get started with and to use, also as a lot of them were not from computer science backgrounds, but often from statistics, engineering or neuroscience.
  When autograd came (this was even before gpu's), people needed something more powerful than matlab, yet familiar. Numpy already existed, and python+numpy+matplotlib give you an environment and a language very similar to matlab. The biggest hurdle was that python is zero-indexed.
  If things went slightly different, I reckon we might have ended up using Octave or lua. I reckon Octave was too restrictive and poorly documented for autograd. On the other hand, lua was too dissimilar to matlab. I think it was Theano, the first widely used python autograd, and then later PyTorch, that really sealed the deal for python.
  - breuleux an hour ago
    
    We chose Python for Theano because Python was already the language of choice for our research lab. If it had been my choice, I would probably have picked Scheme (I was really into macros at that time) or Ruby (I think it's better designed than Python). But if we had done it in another language than Python, frankly, I'm not sure it would have taken off in the first place. Python already had quite a bit of inertia, likely thanks to numpy and matplotlib.
  - nickpeterson 3 hours ago
    
    You were there 30 years ago, when the strength of men failed?
- chickenzzzzu 10 hours ago
  
  Not only is their argument circular but it is wrong. There is no need to use 50 million lines of Python, Pytorch, Numpy, Linux, Cmake, CUDA, and god knows how many other layers of madness to do inference.
  It is literally on the order of tens of thousands of lines of code, instead of tens of millions, to do Vulkan ML, especially if you strip out the parts of the kernel you don't need.
Hizonner a day ago

This guy is worried about GPU kernels, which are never, ever written in Python. As you point out, Python is a glue language for ML.
> There is no single language with a rich enough ecosystem that can provide literally all of the aforementioned functionality besides Python.
That may be true, but some of us are still bitter that all that grew up around an at-least-averagely-annoying language rather than something nicer.
- MontyCarloHall a day ago
  
  >This guy is worried about GPU kernels
  Then the title should be "why GPU kernel programming needs a new programming language." I can get behind that; I've written CUDA C and it was not fun (though this was over a decade ago and things may have since improved, not to mention that the code I wrote then could today be replaced by a couple lines of PyTorch). That said, GPU kernel programming is fairly niche: for the vast majority of ML applications, the high-level API functions in PyTorch/TensorFlow/JAX/etc. provide optimal GPU performance. It's pretty rare that one would need to implement custom kernels.
  >which are never, ever written in Python.
  Not true! Triton is a Python API for writing kernels, which are JIT compiled.
  - catgary a day ago
    
    I agree with you that writing kernels isn’t necessarily the most important thing for most ML devs. I think an MLIR-first workflow with robust support for the StableHLO and LinAlg dialects is the best path forward for ML/array programming, so on one hand I do applaud what Mojo is doing.
    But I’m much more interested in how MLIR opens the door to “JAX in <x>”. I think Julia is moving in that direction with Reactant.jl, and I think there’s a Rust project doing something similar (I think burn.dev may be using ONNX has an even higher-level IR). In my ideal world, I would be able to write an ML model and training loop in some highly verified language and call it from Python/Rust for training.
  - pama 18 hours ago
    
    Although they are incredibly useful, the ML applications, or MLops, or ML Devops, or whatever other production-related application tech stack terminology may come to mind, are providing critical scaffolding infrastructure and glue but are not strictly what comes to mind when you only use the term “Machine Learning”. The key to machine learning is the massively parallel ability to efficiently train large neural network models, and the key to using the benefits of these trained models is the ability to rapidly evaluate them. Yes you need to get data for the training and you need complex infrastructure for the applications but the conferences, papers, studies on machine learning dont refer to these (other than in passing), in part because they are solvable, and in part because they are largely orthogonal to ML. Another way to think about it is that the nvidia price is what goes to infinity in this recent run and not the database or disk drive providers. So if someone finds a way to make the core ML part better with a programming language solution that is certainly very welcome and the title is appropriate. (The fact that GPU programming is considered niche in the current state of ML is a strong argument for keeping the title as is.)
  - seanmcdirmid 8 hours ago
    
    There are a lot of tricks for writing GPU code in a high level language and using some sort of meta programming to make it work out (I think Conal Elliott first did this in Haskell, where he also does symbolic differentiation in the same paper!).
- jimbokun 20 hours ago
  
  > That may be true, but some of us are still bitter that all that grew up around an at-least-averagely-annoying language rather than something nicer.
  Don't worry. If you stick around this industry long enough you'll see this happen several more times.
  - Hizonner 19 hours ago
    
    I'm basically retired. But I'm still bitter about each of the times...
    
    anakaine 18 hours ago
    
    Im not.
    Move on with life and be happy. What we have is functional, easy to develop, and well supported.
    
    Hizonner 18 hours ago
    
    My bitterness is the only thing that keeps me happy.
- anothernewdude 6 hours ago
  
  > rather than something nicer.
  Python was the something nicer. A lot of the other options were so much worse.
- ModernMech a day ago
  
  > This guy is worried about GPU kernels, which are never, ever written in Python. As you point out, Python is a glue language for ML.
  That's kind of the point of Mojo, they're trying to solve the so-called "two language problem" in this space. Why should you need two languages to write your glue code and kernel code? Why can't there be a language which is both as easy to write as Python, but can still express GPU kernels for ML applications? That's what Mojo is trying to be through clever use of LLVM MLIR.
  - nostrademons 20 hours ago
    
    It's interesting, people have been trying to solve the "two language problem" since before I started professionally programming 25 years ago, and in that time period two-language solutions have just gotten even more common. Back in the 90s they were usually spoken about only in reference to games and shell programming; now the pattern of "scripting language calls out to highly-optimized C or CUDA for compute-intensive tasks" is common for webapps, ML, cryptocurrency, drones, embedded, robotics, etc.
    I think this is because many, many problem domains have a structure that lends themselves well to two-language solutions. They have a small homogenous computation structure on lots of data that needs to run extremely fast. And they also have a lot of configuration and data-munging that is basically quick one-time setup but has to be specified somewhere, and the more concisely you can specify it, the less human time development takes. The requirements on a language designed to run extremely fast are going to be very different from one that is designed to be as flexible and easy to write as possible. You usually achieve quick execution by eschewing flexibility and picking a programming model that is fairly close to the machine model, but you achieve flexibility by having lots of convenience features built into the language, most of which will have some cost in memory or indirections.
    There've been a number of attempts at "one language to rule them all", notably PL/1, C++, Julia (in the mathematical programming subdomain), and Common Lisp, but it often feels like the "flexible" subset is shoehorned in to fit the need for zero-cost abstractions, and/or the "compute-optimized" subset is almost a whole separate language that is bolted on with similar but more verbose syntax.
    
    Karrot_Kream 15 hours ago
    
    From what I can tell, gaming has mostly just embraced two language solutions. The big engines Unity, Unreal, and Godot have tight cores written in C/C++ and then scripting languages that are written on top. Hobby engines like Love2D often also have a tight, small core and are extensible with languages like Lua or Fennel.
    Modern Common Lisp also seems to have given up its "one language to rule them all" mindset and is pretty okay with just dropping into CFFI to call into C libraries as needed. Over the years I've come to see that mindset as mostly a dead-end. Python, web browsers, game engines, emacs, these are all prominent living examples of two-language solutions that have come to dominate in their problem spaces.
    One aspect of the "two language problem" that I find troubling though is that modern environments often ossify around the exact solution. For example, it's very difficult to have something like PyTorch in say Common Lisp even though libcuda and libdnn should be fairly straightforward to wrap in Common Lisp (see [1] for Common Lisp CUDA bindings.) JS/TS/WASM that runs in the browser often is dependent on Chrome's behavior. Emacs continues to be tied to its ancient, tech-debt ridden C runtime. There seems to be a lot of value tied into the glue between the two chosen languages and it's hard to recreate that value with other HLLs even if the "metal" language/runtime stays the same.
    [1]: https://github.com/takagi/cl-cuda
    
    nostrademons 15 hours ago
    
    This may be because while the computational core is small, much of the code and the value of the overall solution are actually in the HLL. That's the reason for the use of a HLL in the first place.
    PyTorch is actually quite illustrative as being a counterexample that proves the rule. It was based on Torch, which had very similar if not identical BLAS routines but used Lua as the scripting language. But now everybody uses PyTorch because Lua development stopped in 2017, so all the extra goodies that people rely on now are in the Python wrapper.
    The only exception seems to be when multiple scripting languages are supported, and at roughly equal points of development. So for example - SQLite continues to have most of its value in the C substrate, and is relatively easy to port to other languages, because it has so many language bindings that there's a strong incentive to write new functionality in C and keep the API simple. Ditto client libraries for things like MySQL, PostGres, MongoDB, Redis, etc. ZeroMQ has a bunch of bindings that are largely dumb passthroughs to the underlying C++ substrate.
    But even a small imbalance can lead to that one language being preferenced heavily in supporting tooling and documentation. Pola.rs is a Rust substrate and ships with bindings for Python, R, and Node.js, but all the examples on the website are in Python or Rust, and I rarely hear of a non-Python user picking it up.
    
    Karrot_Kream 14 hours ago
    
    Very interesting observation on SQLite vs Pola.rs. Also, how could I forget that Torch was originally a Lua library when I used it forever ago.
    I also wonder how much of the ossification comes from the embodied logic in the HLL. SQLite wrappers tend to be very simple and let the C core do most of the work. Something like PyTorch on the other hand layers on a lot of logic onto underlying CUDA/BLAS that is essential complexity living solely in Python the HLL. This is also probably why libcurl has so many great wrappers in HLLs because libcurl does the heavy lifting.
    The pain point I see repeatedly in putting most of the logic into the performant core is asynchrony. Every HLL seems to have its own way to do async execution (Python with asyncio, Node with its async runtime, Go with lightweight green threads (goroutines), Common Lisp with native threads, etc.) This means that the C core needs to be careful as to what to expose and how to accommodate various asynchrony patterns.
    
    soVeryTired 19 hours ago
    
    There's a very interesting video about the "1.5 language problem" in Julia [0]. The point being that when you write high-performance Julia it ends up looking nothing like "standard" Julia.
    It seems like it's just extremely difficult to give fine-grained control over the metal while having an easy, ergonomic language that lets you just get on with your tasks.
    [0] https://www.youtube.com/watch?v=RUJFd-rEa0k
    
    imtringued 6 hours ago
    
    You say this is some ideal outcome, but I want to get as far away from python and C++ as possible.
    Also, no. I can't use Python for inference, because it is too slow, so I have to export to tensorflow lite and run the model in C++, which essentially required me to rewrite half the code in C++ again.
  - bjourne 14 hours ago
    
    > Why can't there be a language which is both as easy to write as Python, but can still express GPU kernels for ML applications? That's what Mojo is trying to be through clever use of LLVM MLIR.
    It already exists. It is called PyTorch/JAX/TensorFlow. These frameworks already contain sophisticated compilers for turning computational graphs into optimized GPU code. I dare say that they don't leave enough performance on the table for a completely new language to be viable.
    
    davidatbu 8 hours ago
    
    Last I checked , all of pytorch, tensorflow, and Jax sit at a layer of abstraction that is above GPU kernels. They avail GPU kernels (as basically nodes in the computational graph you mention), but they don't let you write GPU kernels.
    Triton, CUDA, etc, let one write GPU kernels.
    
    bjourne 2 hours ago
    
    Yes, they kinda do. The computational graph you specify is completely different from the execution schedule it is compiled into. Whether it's 1, 2, or N kernels is irrelevant as long as it runs fast. Mojo being an HLL is conceptually no different from Python. Whether it will, in the future, become better for DNNs, time will tell.
    
    davidatbu 2 hours ago
    
    I assume HLL=Higher Level Language? Mojo definitely avails lower-level facilities than Python. Chris has even described Mojo as "syntactic sugar over MLIR". (For example, the native integer type is defined in library code as a struct).
    > Whether it's 1, 2, or N kernels is irrelevant.
    Not sure what you mean here. But new kernels are written all the time (flash-attn is a great example). One can't do that in plain Python. E.g., flash-attn was originally written in C++ CUDA, and now in Triton.
    
    boroboro4 2 hours ago
    
    Torch.compile sits at both the level of computation graph and GPU kernels and can fuse your operations by using triton compiler. I think something similar applies to Jax and tensorflow by the way of XLA, but I’m not 100% sure.
    
    davidatbu 2 hours ago
    
    Good point. But the overall point about Mojo availing a different level of abstraction as compared to Python still stands: I imagine that no amount of magic/operator-fusion/etc in `torch.compile()` would let one get reasonable performance for an implementation of, say, flash-attn. One would have to use CUDA/Triton/Mojo/etc.
    
    saagarjha 5 hours ago
    
    There's plenty of performance on the table but I don't think it will be captured by a new language.
  - adsharma 14 hours ago
    
    Python -> Mojo -> MLIR ... <target hardware>
    Yes, you can write mojo with python syntax and transpile. You'd end up with something similar to Julia's 1.5 language problem.
    Since the mojo language is not fully specified, it's hard to understand what language constructs can't be efficiently expressed in the python syntax.
    Love MLIR and Mojo as debuggable/performant intermediate languages.
  - bobajeff 17 hours ago
    
    I don't think Mojo can solve the two language problem. Maybe if it was going to be superset of Python? Anyway I think that was actually Julia's goal not Mojo's.
    
    davidatbu 8 hours ago
    
    Being a Python superset is literally a goal of Mojo mentioned in the podcast.
    Edit: from other posts on this page, I've realized that being a superset of Python is now regarded a nice-to-have by Modular, not a must-have. They realized it's harder than they thought initially, basically.
- almostgotcaught 19 hours ago
  
  > which are never, ever written in Python
  nah never ever ever ever ever ... except
  https://github.com/FlagOpen/FlagGems
  https://github.com/linkedin/Liger-Kernel
  https://github.com/meta-pytorch/applied-ai
  https://github.com/gpu-mode/triton-index
  https://github.com/AlibabaPAI/FLASHNN
  https://github.com/unslothai/unsloth
  the number of people commenting on this stuff that don't know what they're actually talking about grows by leaps and bounds every day...
  - Hizonner 18 hours ago
    
    I stand corrected. I should have known people would be doing that in Python.
    How many of the world's total FLOPs are going through those?
    
    saagarjha 5 hours ago
    
    A lot. OpenAI uses Triton for their critical kernels. Meta has torch.compile using it too. I know Anthropic is not using Triton but I think their homegrown compiler is also Python. xAI is using CUTLASS which is C++ but I wouldn't be surprised if they start using the Python API moving forward.
    
    almostgotcaught 3 hours ago
    
    Anthropic is a Jax shop
    
    almostgotcaught 16 hours ago
    
    Triton is a backend for PyTorch. Lately it is the backend. So it's definitely double digits percentage if not over 50%.
    
    cavisne 8 hours ago
    
    Doesn’t triton write its own intermediate language that then compiles to PTX?
    
    saagarjha 5 hours ago
    
    It has a fairly standard MLIR pipeline
    
    almostgotcaught 5 hours ago
    
    Yes and?
    
    lairv 13 hours ago
    
    It's the backend for torch.compile, pytorch eager mode will still use cuBLAS/cuDNN/custom CUDA kernels, not sure what's the usage of torch.compile
    
    almostgotcaught 12 hours ago
    
    > not sure what's the usage of torch.compile
    consider that at minimum both FB and OAI themselves definitely make heavy use of the Triton backend in PyTorch.
fellowmartian 12 hours ago

Ironically Python is the worst language for everything you’ve described. Packaging is pain, wheels are pain, everything breaks all the time. It’s only great for those standalone scripts. Nobody in their right mind would design Python the way it turned out if the goal was to be the main ML language.
halayli 18 hours ago

Fortunately, Chris knows what he's doing. https://docs.modular.com/mojo/manual/python/
pjmlp 9 hours ago

I was there when Perl and Tcl were the main actors, that is why VTK used Tcl originally.
Python dominates, because 25 years ago places like CERN started to adopt Python as their main scripting language, and eventually got used for more tasks than empowered shell scripts.
It is like arguing why C dominates and nothing else can ever replace it.
- cdavid 8 hours ago
  
  I agree ability to use python to "script HPC" was key factor, but by itself would not have been enough. What really made it dominate is numpy/scipy/matplotlib becoming good enough to replace matlab 20 years ago, and enabled an explosion of tools on top of it: pandas, scikit learn, and the DL stuff ofc.
  This is what differentiates python from other "morally equivalent" scripting languages.
goatlover 20 hours ago

> There is no single language with a rich enough ecosystem that can provide literally all of the aforementioned functionality besides Python.
Have a hard time believing C++ and Java don't have rich enough ecosystems. Not saying they make for good glue languages, but everything was being written in those languages before Python became this popular.
- j2kun 20 hours ago
  
  Yeah the OP here listed a bunch of Python stuff that all ends up shelling out to C++. C++ is rich enough, period, but people find it unpleasant to work in (which I agree with).
  It's not about "richness," it's about giving a language ecosystem for people who don't really want to do the messy, low-level parts of software, and which can encapsulate the performance-critical parts with easy glue
  - lairv 13 hours ago
    
    I tried to statically link DuckDB to one of my C++ project earlier this year and it took me 3 days to have something working on Windows/Linux/MacOS (just to be able to use the dependency)
    While I'm not a C++ expert, doing the same in Python is just one pip install away, so yeah both "richness" and "ease of use" of the ecosystem matters
  - FuckButtons 18 hours ago
    
    I mean, you’ve basically described why people use Python, it’s a way to use C/C++ without having to write it.
    
    anakaine 18 hours ago
    
    And ill take that reason every single day. I could spend days or more working out particular issues in C++, or I could use a much nicer to use glue language with a great ecosystem and a huge community driving it and get the same task done in minutes to hours.
- flourpower471 15 hours ago
  
  Ever tried to write a web scraper in c++?

frou_dh a day ago

Listening to this episode, I was quite surprised to hear that even now in Sept 2025, support for classes at all is considered a medium-term goal. The "superset of Python" angle was thrown around a lot in earlier discussions of Mojo 1-2 years ago, but at this rate of progress seems a bit of a pie-in-the-sky aspiration?

adgjlsfhk1 a day ago

superset of Python was never a goal. It was a talking point to try and build momentum that was quietly dropped once it served it's purpose of getting Mojo some early attention.
- red2awn 16 hours ago
  
  I don't think they were intentionally rug pulling the community. More likely they started out thinking they can build a python superset, realised it wasn't a good idea so quickly pivoted to the current design. Making Mojo more Python-like is now on their long term roadmap (https://docs.modular.com/mojo/roadmap/#phase-3-dynamic-objec...).
- fwip 20 hours ago
  
  I tend to agree, which is why I can't recommend Mojo, despite thinking their tech is pretty innovative. If they're willing to lie about something that basic, I can't trust any of their other claims.
- ModernMech 20 hours ago
  
  I hope that’s not what it was, that makes them seem very manipulative and dishonest. I was under the impression it was a goal, but they dropped it when it became apparent it was too hard. That’s much more reasonable to understand.
  - adgjlsfhk1 18 hours ago
    
    I'd agree with you if Mojo was a project by an inexperienced team, but it wasn't. Chris Latner alone has decades of PL experience and the issues with making Python fast are fundamental and well known. anyone who comes around with a sales pitch for making a portion implementation that's significantly faster than Pypy is either incredibly naive or lying
    
    ivell 6 hours ago
    
    I am regularly tracking Mojo both in Discord and their forums.
    It was indeed their goal to support Python as superset. Many discussions were around that. So I would not claim any malice in there.
    
    ModernMech 18 hours ago
    
    For me, it’s because Chris is part of the team I’m willing to give them the benefit of the doubt. I will assume ego-driven naïveté over malice. It’s not unheard of.
    
    chrislattner 13 hours ago
    
    Thank you, that's not entirely wrongbut not the full picture. Our initial explanation had two problems actually,
    1) we were "ambitiously optimistic" (different way of saying "ego-driven naïveté" perhaps :) ) and 2) the internet misread our long-term ambitions as being short-term goals.
    We've learned that the world really really wants a better Python and the general attention spans of clickbait are very short - we've intentionally dialed back to very conservative claims to avoid the perception of us overselling.
    We're still just as ambitious though!
    -Chris
bartvk 3 hours ago

I suspect it’s not about rate of progress but rather that they don’t like OOP.
melodyogonna 18 hours ago

It was always a long-term goal.

tree_enjoyer 18 hours ago

I know it's a bit of trope to say this on HN... but why not Lisp?

If I make the assumption that future ML code will be written by ML algorithms (or at least 'transpiled' from natural language), and Lisp S-Expressions are basically the AST, it would make sense that it would be the most natural language for a machine to write in. As a side benefit, it's already the standard to have a very feature complete REPL, so as a replacement for Python it seems it would fit well

nextos 12 hours ago

Yann LeCun and others created https://lush.sourceforge.net, which was a Lisp for ML.
It was probably the best option during the 2000s, and only superseded when Theano (Python) and Torch (Lua) emerged.
A Lisp comeback in this niche would be great. Python has fantastic libraries, but the language leaves a lot to be desired.
layer8 17 hours ago

LLMs have trouble counting parentheses accurately. ;)
- trallnag 15 hours ago
  
  We need a lisp that is written in JSON. That should fix it
  - sorry_i_lisp 7 hours ago
    
    https://github.com/delaneyj/jackalope
  - cjohnson318 15 hours ago
    
    It would REBOLutionize everything.
pjmlp 9 hours ago

Because it got burned in the last AI hype cycle, and many devs still think Emacs + SBCL is the only way, never even bothered to open the websites of LispWorks, Allegro Common Lisp or Clozure (not the JVM one).

seabrookmx 13 hours ago

The mojo faq talks about the language as if it's a strict superset (or aiming to be) of Python. Or that mojo "is" Python.

Yet the roadmap says:

> As Mojo matures through phase 3, we believe Mojo will become increasingly compatible with Python code and deeply familiar to Python users, except more efficient, powerful, coherent, and safe. Mojo may or may not evolve into a full superset of Python, and it's okay if it doesn't.

This is incredibly confusing. If it's _not_ aiming for Python compatibility, why are we talking about Python at all?

Also, is anyone actually considering using an emoji as a file extension?

mmphosis 2 hours ago

> Also, is anyone actually considering using an emoji as a file extension?
Definitely.U+2615
pansa2 13 hours ago

AFAICT Mojo has (or plans to have) Python-like syntax and easy interop with Python.
I’m not sure it’s any more Python-like than that - the similarity to Python seems to be heavily overstated for marketing reasons.
- itsn0tm3 8 hours ago
  
  I‘m pretty sure in their early communications the stated very clearly that Mojo is going to be a clear superset of Python. Seems like they paddled back a bit in that regard.
  - davidatbu 8 hours ago
    
    Yeah this is slightly confusing for me as well. Even in this very podcast, being a superset of Python was mentioned as a goal (albeit a long term one).

_aavaa_ a day ago

Yeah, except Mojo’s license is a non-starter.

auggierose a day ago

Wow, just checked it out, and they distinguish (for commercial purposes) between CPU & Nvidia on one hand, and other "accelerators" (like TPU or AMD) on the other hand. For other accelerators you need to contact them for a license.
https://www.modular.com/blog/a-new-simpler-license-for-max-a...
- _aavaa_ a day ago
  
  Yes; in particular see sections 2-4 of [0].
  They say they'll open source in 2026 [1]. But until that has happened I'm operating under the assumption that it won't happen.
  [0]: https://www.modular.com/legal/community
  [1]: https://docs.modular.com/mojo/faq/#will-mojo-be-open-sourced
  - mdaniel a day ago
    
    > I'm operating under the assumption that it won't happen.
    Or, arguably worse: my expectation is that they'll open source it, wait for it to get a lot of adoption, possibly some contribution, certainly a lot of mindshare, and then change the license to some text no one has ever heard of that forbids use on nvidia hardware without paying the piper or whatever
    If it ships with a CLA, I hope we never stop talking about that risk
  - actionfromafar a day ago
    
    Same
rs186 20 hours ago

To my naive mind, any language that is controlled by a single company instead of a non profit is a non-starter. Just look at how many companies reacted when Java license change happened. You must be either an idiot or way too smart for me to understand to base your business on a language like Mojo instead of Python.
- discreteevent 19 hours ago
  
  It's hard to think of a language for server side programming that I could go back and recommend to build their business on 20 or 30 years ago than java. I don't think aws or even Google regret it. Think of the amount of profitable requests being processed by the jvm all over the world for years by every kind of business.
  - const_cast 18 hours ago
    
    That's just because oracle are assholes but not that big of assholes.
    If OpenJDK did not or could not exist, we would all be mega fucked. Luckily alternative Java implementations exist, are performant, and are largely indistinguishable from the non-free stuff. Well, except whatever android has going on... That stuff is quirky.
    But if you look at dotnet, prior to opensourcing it, it was kind of a fucked up choice to go with. You were basically locked into Windows Server for your backend and your application would basically slowly rot over time as you relied on legacy windows subsystems that even Microsoft barely cared to support, like COM+.
    There were always alternative dotnet runtimes like Mono, but they were very not feature complete. Which, ironically, is why we saw so much Java. The CLR is arguably a better designed VM and C# a better language, but it doesn't matter.
    
    pjmlp 8 hours ago
    
    OpenJDK is mainly developed by Oracle employees, the other big one OpenJ9 is developed by IBM, everything else like Azul, PTC, Aicas, microEJ,.... are commercial.
  - rs186 17 hours ago
    
    That's definitely true. But we are talking about now, aren't we? A decision of which language to use today is very different from the considerations 20/30 years ago.
  - _aavaa_ 19 hours ago
    
    Yeah, but whose JVM are they running on? I’d be surprised if they’re all paying the lawnmower.
- saagarjha 5 hours ago
  
  How do you feel about CUDA?
  - rs186 3 hours ago
    
    You don't really have a choice, and the alternatives are worse. Plus, it's not the same question as choosing a language -- this is more about choosing hardware vendor their SDK where options could be either plenty or very limited depending on what you work on.

whimsicalism 10 hours ago

I find the attempt behind Mojo bold, but unfortunately a proprietary language a la matlab is a non-starter for me and I assume many other people.

davidatbu 8 hours ago

Fwiw, Chris has detailed in many podcasts that Mojo will definitely be entirely open source in the future, and that the only reason it isn't now is that he learnt from his experience running Swift as an open source project that the open contribution and planning model is a bad fit, and basically counter productive, at this stage of a language's lifetime.
That is why they opted to open source in stages (ie, the stdlib is open source right now, and they expect to open source the compiler soon).
- procaryote 7 hours ago
  
  That's nice... once it's fully open source and forkable, it might be worth a look

nromiun a day ago

Weird that there has been no significant adoption of Mojo. It has been quite some time since it got released and everyone is still using PyTorch. Maybe the license issue is a much bigger deal than people realize.

pjmlp a day ago

I personally think they overshot themselves.
First of all some people really like Julia, regardless of how it gets discussed on HN, its commercial use has been steadily growing, and has GPGPU support.
On the other hand, regardless of the sore state of JIT compilers on CPU side for Python, at least MVidia and Intel are quite serious on Python DSLs for GPGPU programming on CUDA and One API, so one gets close enough to C++ performance while staying in Python.
So Mojo isn't that appealing in the end.
- dsharlet 20 hours ago
  
  The problem I've seen is this: in order to get good performance, no matter what language you use, you need to understand the hardware and how to use the instructions you want to use. It's not enough to know that you want to use tensor cores or whatever, you also need to understand the myriad low level requirements they have.
  Most people that know this kind of thing don't get much value out of using a high level language to do it, and it's a huge risk because if the language fails to generate something that you want, you're stuck until a compiler team fixes and ships a patch which could take weeks or months. Even extremely fast bug fixes are still extremely slow on the timescales people want to work on.
  I've spent a lot of my career trying to make high level languages for performance work well, and I've basically decided that the sweet spot for me is C++ templates: I can get the compiler to generate a lot of good code concisely, and when it fails the escape hatch of just writing some architecture specific intrinsics is right there whenever it is needed.
  - adgjlsfhk1 20 hours ago
    
    The counterpoint to this is that having a language that has a graceful slide between python like flexibility and hand optimized assembly is really useful. The thing I like most about Julia is it is very easy to both write fast somewhat sloppy code (e.g. for exploring new algorithms), but then you can go through and tune it easily for maximal performance and get as fast as anything out there.
    
    wolvesechoes 18 hours ago
    
    > easily for maximal performance and get as fast as anything out there.
    Optimizing Julia is much harder than optimizing Fortran or C.
    
    postflopclarity a few seconds ago
    
    for equal LOC, sure. for equal semantics, less true
- mvieira38 20 hours ago
  
  > First of all some people really like Julia, regardless of how it gets discussed on HN, its commercial use has been steadily growing
  Got any sources on that? I've been interested in learning Julia for a while but don't because it feels useless compared to Python, especially now with 3.13
  - xgdgsc 10 hours ago
    
    https://www.reddit.com/r/Julia/comments/1efxp0j/comment/lfob...
  - pjmlp 9 hours ago
    
    Of course, because Internet is where we always have to prove ourselves.
    https://info.juliahub.com/industries/case-studies-1/author/j...
  - adgjlsfhk1 19 hours ago
    
    what about python 3.13 is significant for you? if it's multithreading you likely should be prepared for disappointment. Free threading is ~30% slower than GIL and the first rule of multi threaded code is to first optimize the hell out of the single threaded version.
    
    wolvesechoes 18 hours ago
    
    Probably the same stuff as with 3.12 or 3.11 or 3.10: good docs, huge ecosystem, wide knowledge base, detailed reference.
- nickpsecurity a day ago
  
  Here's some benefits it might try to offer as differentiators:
  1. Easy packaging into one executable. Then, making sure that can be reproducible across versions. Getting code from prior, AI papers to rub can be hard.
  2. Predictability vs Python runtime. Think concurrent, low-latency GC's or low/zero-overhead abstractions.
  3. Metaprogramming. There have been macro proposals for Python. Mojo could borrow from D or Rust here.
  4. Extensibility in a way where extensions don't get too tied into the internal state of Mojo like they do Python. I've considered Python to C++, Rust, or parallelized Python schemes many times. The extension interplay is harder to deal with than either Python or C++ itself.
  5. Write once, run anywhere, to effortlessly move code across different accelerators. Several frameworks are doing this.
  6. Heterogenous, hot-swappable, vendor-neutral acceleration. That's what I'm calling it when you can use the same code in a cluster with a combination of Nvidia GPU', AMD GPU's, Gaudi3's, NPU's, SIMD chips, etc.
  - pjmlp a day ago
    
    Agree in most points, however I still can't use it today on Windows, and it needs that unavoidable framework.
    Languages on their own is very hard to gain adoption.
raggi 20 hours ago

I'm on the systems side, and I find some of what Chris and team are doing with Mojo pretty interesting and could be useful to eradicate a bunch of polyglot ffi mess across the board. I can't invest in it or even start discussions around using it until it's actually open.
- bobajeff 18 hours ago
  
  Yeah I'm in the same boat. I plan to prototype in python and then speed up the slow bits in a low level language. I've narrowed my options to C++ and Mojo.
  C++ just seems like a safer bet but I'd love something better and more ergonomic.
melodyogonna a day ago

It is not ready for general-purpose programming. Modular itself tried offering a Mojo api for their MAX engine, but had to give up because the language still evolved too rapidly for such an investment.
As per the roadmap[1], I expect to start seeing more adoption once phase 1 is completed.
1. https://docs.modular.com/mojo/roadmap
singularity2001 a day ago

Is it really released? Last time I checked it was not open sourced I don't want to rely on some proprietary vaporware stack.
- melodyogonna a day ago
  
  It is released but not open-source. Modular was aiming to open-source the compiler by Q4 2026; however, Chris now says they could be able to do that considerably faster, perhaps early 2026[1].
  If you're interested, they think the language will be ready for open source after completing phase 1 of the roadmap[2].
  1.https://youtu.be/I0_XvXXlG5w?si=KlHAGsFl5y1yhXnm&t=943
  2. https://docs.modular.com/mojo/roadmap
jb1991 a day ago

It says at the top:
> write state of the art kernels
Mojo seems to be competing with C++ for writing kernels. PyTorch and Julia are high-level languages where you don't write the kernels.
- Alexander-Barth a day ago
  
  Actually in julia you can write kernels with a subset of the julia language:
  https://cuda.juliagpu.org/stable/tutorials/introduction/#Wri...
  With KernelAbstractions.jl you can actually target CUDA and ROCm:
  https://juliagpu.github.io/KernelAbstractions.jl/stable/kern...
  For python (or rather python-like), there is also triton (and probably others):
  https://pytorch.org/blog/triton-kernel-compilation-stages/
  - davidatbu 8 hours ago
    
    Chris's claim (at least with regards to Triton) is that it avails 80% of the performance, and they're aiming for closer to 100%.
- jakobnissen a day ago
  
  I think Julia aspires to be performant enough that you can write the kernels in Julia, so Julia is more like Mojo + Python together.
  Although I have my doubts that Julia is actually willing to make the compromises which would allow Julia to go that low level. I.e. semantic guarantees about allocations and inference, guarantees about certain optimizations, and more.
- pjmlp a day ago
  
  You can write kernels with Python using CUDA and Open API SDKs in 2025, that is one of the adoption problems regarding Mojo.
poly2it 20 hours ago

I definitely think the license is a major holdback for the language. Very few individuals or organisation for that matter would like to invest in a new closed stack. CUDA is accepted simply because it has been along for such a long time. GPGPU needs a Linux moment.
pansa2 a day ago

Sounds to me like it's very incomplete:
> maybe a year, 18 months from now [...] we’ll add classes
subharmonicon 15 hours ago

The market tends to be pretty efficient for things like these. We’ve seen significant rapid adoption of several different ML solutions over the last decade, yet Mojo languishes. I think that’s a clear sign they aren’t solving the real-world pain points that users are hitting, and are building a rather niche solution that only appeals to a small number of people, no matter how good their execution may be.
fnands a day ago

It's still very much in a beta stage, so a little bit hard to use yet.
Mojo is effectively an internal tool that Modular have released publicly.
I'd be surprised to see any serious adoption until a 1.0 state is reached.
But as the other commented said, it's not really competing with PyTorch, it's competing with CUDA.
ModernMech 20 hours ago

They’re not going to see serious adoption before they open source. It’s just a rule of programming languages at this point if you don’t have the clout to force it, and Modular does not. People have been burned too many times by closed source languages.

Cynddl a day ago

Anyone knows what Mojo is doing that Julia cannot do? I appreciate that Julia is currently limited by its ecosystem (although it does interface nicely with Python), but I don't see how Mojo is any better then.

thetwentyone a day ago

Especially because Julia has pretty user friendly and robust GPU capabilities such as JuliaGPU and Reactant[2] among other generic-Julia-code to GPU options.
1: https://enzymead.github.io/Reactant.jl/dev/ 2: https://enzymead.github.io/Reactant.jl/dev/
- jb1991 a day ago
  
  I get the impression that most of the comments in this thread don't understand what a GPU kernel is. These high-level languages like Python and Julia are not running on the kernel, they are calling into other kernels usually written in C++. The goal is different with Mojo, it says at the top of the article:
  > write state of the art kernels
  You don't write kernels in Julia.
  - arbitrandomuser a day ago
    
    >You don't write kernels in Julia.
    The package https://github.com/JuliaGPU/KernelAbstractions.jl was specifically designed so that julia can be compiled down to kernels.
    Julia's is high level yes, but Julia's semantics allow it to be compiled down to machine code without a "runtime interpretter" . This is a core differentiating feature from Python. Julia can be used to write gpu kernels.
  - ssfrr a day ago
    
    It doesn’t make sense to lump python and Julia together in this high-level/low-level split. Julia is like python if numba were built-in - your code gets jit compiled to native code so you can (for example) write for loops to process an array without the interpreter overhead you get with python.
    People have used the same infrastructure to allow you to compile Julia code (with restrictions) into GPU kernels
  - adgjlsfhk1 a day ago
    
    Julia's GPU stack doesn't compile to C++. it compiles Julia straight to GPU assembly.
  - jakobnissen a day ago
    
    Im pretty sure Julia does JIT compilation of pure Julia to the GPU: https://github.com/JuliaGPU/GPUCompiler.jl
    
    actionfromafar a day ago
    
    ” you should use one of the packages that builds on GPUCompiler.jl, such as CUDA.jl, AMDGPU.jl, Metal.jl, oneAPI.jl, or OpenCL.jl”
    Not sure how that organization compares to Mojo.
  - pjmlp a day ago
    
    See new cu tile architecture on CUDA, designed from the ground up with Python in mind.
Alexander-Barth a day ago

I guess that the interoperability with Python is a bit better. But on the other hand, the PythonCall.jl (allowing calling python from julia) is quite good and stable. In Julia, you have quite good ML frameworks (Lux.jl and Flux.jl). I am not sure that you have mojo-native ML frameworks which are similarly usable.
jakobnissen a day ago

Mojo to me looks significantly lower level, with a much higher degree of control.
Also, it appears to be more robust. Julia is notoriously fickle in both semantics and performance, making it unsuitable for foundational software the way Mojo strives for.
- Archit3ch 18 hours ago
  
  > Also, it appears to be more robust.
  Sure, Mojo the language is more robust. Until its investors decide to 10x the licensing Danegeld.
bobajeff 18 hours ago

I've looked into making Python modules with Julia and it doesn't look like that is very well supported right now. Where as it's a core feature of Mojo.
- dunefox 17 hours ago
  
  Shouldn't something like this work? https://github.com/JuliaPy/PythonCall.jl
  - bobajeff 16 hours ago
    
    That might work but can't seem to find much information on using it to create a pip installable module though.
MohamedMabrouk a day ago

* Compiling arbitrary Julia code into a native standalone binary (a la rust/C++) with all its consequcnes.
ubj a day ago

> Anyone knows what Mojo is doing that Julia cannot do?
First-class support for AoT compilation.
https://docs.modular.com/mojo/cli/build
Yes, Julia has a few options for making executables but they feel like an afterthought.
jb1991 a day ago

Isn't Mojo designed for writing kernels? That's what it says at the top of the article:
> write state of the art kernels
Julia and Python are high-level languages that call other languages where the kernels exist.
- Sukera a day ago
  
  No, you can write the kernels directly in Julia using KernelAbstractions.jl [1].
  [1] https://juliagpu.github.io/KernelAbstractions.jl/stable/
hansvm a day ago

[0] https://danluu.com/julialang/

tony 19 hours ago

Chris Lattner also has a very good episode on Lex Fridman (Episode 381, June 2nd, 2023):

- https://www.youtube.com/watch?v=pdJQ8iVTwj8

- https://open.spotify.com/episode/6flH0XxwdIbayoXTHOgAfI

- https://podcasts.apple.com/us/podcast/381-chris-lattner-futu...

He has two other episodes on the show:

- https://www.youtube.com/watch?v=nWTvXbQHwWs (Episode 131, October 18th, 2020)

- https://www.youtube.com/watch?v=yCd3CzGSte8 (Episode 21, May 13th, 2019)

seanmcdirmid 8 hours ago

This headline must really irk someone like Robert Harper. Unfortunate acronym collision, but ML (programming language) at least could be disambiguated that way.

JonChesterfield 20 hours ago

ML seems to be doing just fine with python and cuda.

davidatbu an hour ago

Yeah the rate of progress in AI definitely makes it seem like that from the outside for me too.
But having never written cuda, I have to rely on authority to some extent for this question. And it seems to me like few are in a better position to opine on whether there's a better story to be had for the software-hardware boundary in ML than the person who wrote MLIR, Swift-for-Tensorflow (alongside with making that work on TPUs and GPUs), ran ML at Tesla for some time, was VP at SiFive, ... etc.
poly2it 20 hours ago

Python and CUDA are not very well adapted for embedded ML.

threeducks a day ago

When I was young, I enjoyed messing around with new languages, but as time went on, I realized that there is really very little to be gained through new languages that can not be obtained through a new library, without the massive downside of throwing away most of the ecosystem due to incompatibility. Also, CuPy, Triton and Numba already exist right now and are somewhat mature, at least compared to Mojo.

jakobnissen a day ago

Usually people create languages to address issues that cannot be addressed by a library because they have different semantics on a deeper level.
Like, Rust could not be a C++ library, that does not make sense. Zig could not be a C library. Julia could not be a Python library.
There is some superficial level of abstraction where all programming languages do is interchangeable computation and therefore everything can be achieved in every language. But that superficial sameness doesn't correspond to the reality of programming.
- threeducks a day ago
  
  I agree with your examples, but is there anything new that Mojo brings to the table that could not be achieved with a Python library?
- throwawaymaths a day ago
  
  famously people have tried to make erlang a library and failed twice by my count (erjang, ergo)
dwattttt a day ago

If a learning a new language didn't change how you think about programming, it wasn't a language worth learning.
- gugagore 5 minutes ago
  
  "A language that doesn't affect the way you think about programming is not worth knowing." ― Alan J. Perlis
- threeducks a day ago
  
  Learning new languages did change how I think about programming. For example, Clojure's immutability and functional nature had a strong influence on how I write my (mostly Python) code these days. I learned how to write efficient code for CPUs with C and C++, and for GPUs with CUDA and OpenCL. I learned math with Matlab and Octave, and declarative programming with Prolog.
  With Mojo, on the other hand, I think a library (or improvements to an existing library) would have been a better approach. A new language needlessly forks the developer community and duplicates work. But I can see the monetary incentives that made the Mojo developers choose this path, so good for them.
ActionHank a day ago

Would love to know which languages you learned that were so similar that you didn't gain much.
Just comparing for example c++, c#, and typescript. These are all c-like, have heavy MS influence, and despite that all have deeply different fundamentals, concepts, use cases, and goals.
- threeducks a day ago
  
  I have learned a lot from other programming languages, but developing a new programming language and building an ecosystem around it is a huge amount of work. In the case of the Mojo programming language, it would have been more beneficial to the programming community as a whole if the developers had spent their time improving existing libraries instead of developing a new language.

postflopclarity a day ago

Julia could be a great language for ML. It needs more mindshare and developer attention though

singularity2001 a day ago

What's the current state of time to first plot and executable size? Last time it was several seconds to get a 200 MB hello world. I'm sure they are moving in the right direction the only questions is are they there yet?
- moelf a day ago
  
  with juliac.jl and --trim, hello world is now 1MB and compiles in a second.
  more realistic examples of compiling a Julia package into .so: https://indico.cern.ch/event/1515852/contributions/6599313/a...
- postflopclarity a day ago
  
  improving, slowly. 5 steps forward 3 steps back.
  1.9 and 1.10 made huge gains in package precompilation and native code caching. then attentions shifted and there were some regressions in compile times due to unrelated things in 1.11 and the upcoming 1.12. but at the same time, 1.12 will contain an experimental new feature `--trim` as well as some further standardization around entry points to run packages as programs, which is a big step towards generating self-contained small binaries. also nearly all efforts in improving tooling are focused on providing static analysis and helping developers make their program more easily compilable.
  it's also important a bit to distinguish between a few similar but related needs. most of what I just described applies to generating binaries for arbitrary programs. but for the example you stated "time to first plot" of existing packages, this is already much improved in 1.10 and users (aka non-package-developers) should see sub-second TTFP, and TTFX for most packages they use that have been updated to use the precompilation goodies in recent versions
- adgjlsfhk1 a day ago
  julia> @time begin
  using Plots display(plot(rand(8))) end 1.074321 seconds
  On Julia 1.12 (currently at release candidate stage), <1mb hello world is possible with juliac (although juliac in 1.12 is still marked experimental)
- ModernMech 21 hours ago
  
  I recently looked into making Julia binaries, and it's not at all a good process. They say it's supported, but it's not exactly as easy as "cargo build" to get a Julia binary out. And the build process involves creating this minimal version of Julia you're expected to ship with your binary, so build times were terrible. I don't know if that gets amortized though.
  As far as the executable size, it was only 85kb in my test, a bouncing balls simulation. However, it required 300MB of Julia libraries to be shipped with it. About 2/3 of that is in libjulia-codegen.dll, libLLVM-16jl.dll. So you're shipping this chunky runtime and their LLVM backend. If you're willing to pay for that, you can ship a Julia executable. It's a better story than what Python offers, but it's not great if you want small, self-contained executables.
  - postflopclarity 21 hours ago
    
    note that as a few other commenters have pointed out, this situation will improve greatly in 1.12 (although still many rough edges)
    
    ModernMech 18 hours ago
    
    Yeah, that's what I've been hearing about Julia for about 10 years now: "situation will improve greatly in the next version, but still many rough edges remain".
    I've been involved in a few programming language projects, so I'm sympathetic as to how much work goes into one and how long they can take.
    At the same time, it makes me wary of Julia, because it highlights that progress is very slow. I think Julia is trying to be too much at once. It's hard enough to be a dynamic, interactive language, but they also want to claim to be performant and compiled. That's a lot of complexity for a small team to handle and deliver on.
    
    adgjlsfhk1 14 hours ago
    
    Different things have been improving for the past 10 years. TTFX is in a good spot now, multithreading is in a pretty good spot now, GC issues for most people are basically solved. AOT compilation was pretty much the last big item that a lot of people wanted, and that has recently (within the last year) gotten to the point where the code exists, it's merged, and it will be releasing soon (and then spending the next year or two getting improved)
    
    pjmlp 8 hours ago
    
    That has been the situation with many programming languages until they finally explode, including Python, which is 34 years old, and also was largely ignored for a decade until Zope in 2000's.
numbers_guy a day ago

What makes Julia "great" for ML?
- macawfish a day ago
  
  Built-in autodifferentiation and amazing libraries built around it, plus tons of cutting edge applied math libraries that interoperate automatically, thanks to Julia's well conceived approach to the expression problem (multiple dispatch). Aside from that, the language itself is like a refined python so it should be pretty friendly off the bat to ML devs.
  What Julia needs though: wayyyy more thorough tooling to support auto generated docs, well integrated with package management tooling and into the web package management ecosystem. Julia attracts really cutting edge research and researchers writing code. They often don't have time to write docs and that shouldn't really matter.
  Julia could definitely use some work in the areas discussed in this podcast, not so much the high level interfaces but the low level ones. That's really hard though!
- postflopclarity a day ago
  
  I would use the term "potentially great" rather than plain "great"
  but all the normal marketing words: in my opinion it is fast, expressive, and has particularly good APIs for array manipulation
  - numbers_guy a day ago
    
    Interesting. I am experimenting with different ML ecosystems and wasn't really considering Julia at all but I put it on the list now.
    
    postflopclarity a day ago
    
    Glad to hear. I've found it's a very welcoming community.
    I'll warn you that Julia's ML ecosystem has the most competitive advantage on "weird" types of ML, involving lots of custom gradients and kernels, integration with other pieces of a simulation or diffeq, etc.
    if you just want to throw some tensors around and train a MLP, you'll certainly end up finding more rough edges than you might in PyTorch
    
    salty_biscuits 2 hours ago
    
    Yes, my experience has been that it is great if you need to do something particularly weird, but less smooth to do something ordinary.
    
    macawfish a day ago
    
    If I wanted to get into research ML, I'd pick Julia no doubt. It allows both conventional ML techniques where we throw tons of parameters at the problem, but additionally a more nimble style where we can train over ordinary functions.
    Combine that with all the cutting edge applied math packages often being automatically compatible with the autodiff and GPU array backends, even if the library authors didn't think about that... it's a recipe for a lot of interesting possibilities.
- bobbylarrybobby 19 hours ago
  
  It's a low level language with a high level interface. In theory, GC aside, you should be able to write code as performant as C++ without having to actually write C++. It's also homoiconic and the compiler is part of the language’s API, so you can do neat things with macros that let more or less you temporarily turn it into a different language.
  In practice, the Julia package ecosystem is weak and generally correctness is not a high priority. But the language is great, if you're willing to do a lot of the work yourself.
mdaniel a day ago

I don't understand why in the world someone would go from one dynamically typed language to another. Even the kernels example cited below is "eh, the types are whatever you want them to be" https://cuda.juliagpu.org/stable/tutorials/introduction/#Wri...
Then again, I am also open to the fact that I'm jammed up by the production use of dynamically typed languages, and maybe the "for ML" part means "I code in Jupyter notebooks" and thus give no shits about whether person #2 can understand what's happening
- postflopclarity a day ago
  
  It's very important that readers, writers, maintainers, etc. of code are able to easily understand what that code is doing.
  explicit and strict types on arguments to functions is one way, but certainly not the only way, nor probably the best way to effect that
  - mdaniel a day ago
    
    I would actually be curious to hear your perspective on the "best way" that isn't typechecking. I literally cannot comprehend why someone would write such a thing
    I readily admit that I am biased in that I believe that having a computer check that every reference to every relationship does what it promises, all the time
    
    postflopclarity a day ago
    
    first and foremost great documentation & design docs cannot be surpassed as a tool to explain and understand code. and that is entirely language agnostic.
    more generally, the most important bits of a particular function to understand is
    * what should it be called with
    * what should it return
    * what side effects might it have
    and the "what" here refers to properties in a general sense. types are a good shortcut to signify certain named collections of properties (e.g., the `Int` type has arithmetic properties). but there are other ways to express traits, preconditions, postconditions, etc. besides types
    
    const_cast 18 hours ago
    
    but there are other ways to express traits, preconditions, postconditions, etc. besides types
    You can also put that in the type system, and expressive languages do. Its just a compiler limitations when we can't.
    I mean, even in C++ with concepts we can do most of that. And C++ doesn't have the most expressive type system.
    
    lgas a day ago
    
    I mean documentation can be wrong whereas types can't, so it seems like it's strictly a worse tool if your goal is to understand what's actually going on and not what someone said was going on at some point in the past.
    
    postflopclarity a day ago
    
    > whereas types can't
    they sure can...
    
    olddustytrail 20 hours ago
    
    Can they? How does that work?
    
    adgjlsfhk1 19 hours ago
    
    types can be (and almost always are) overly restrictive, preventing otherwise valid code from running. they can also be under powered, and not expressing necessary invariants for the algorithm.
    
    lgas 15 hours ago
    
    I didn't make any claims about that, just that they can't be wrong. And by that, I didn't mean you can't choose the wrong types, just that once you've chosen types that compile they can't be incompatible or other than what they are.
    That being said, I've always found the argument that types can be overly restrictive and prevent otherwise valid code from running unconvincing. I've yet to see dynamic code that benefits from this alleged advantage.
    Nearly universally the properly typed code for the same thing is better, more reliable and easier for new people to understand and modify. So sure, you can avoid all of this if the types are really what bother you, but it feels a bit like saying "there are stunts I can pull off if I'm not wearing a seatbelt that I just can't physically manage if I am."
    If doing stunts is your thing, knock yourself out, but I'd rather wear a seatbelt and be more confident I'm going to get to my destination in one piece.
rvz a day ago

> It needs more mindshare and developer attention though
That is the problem. Julia could not compete against Python's mindshare.
A competitor to Python needs to be 100% compatible with its ecosystem.

brainzap 3 hours ago

side note: we need a new syntax, to separate human comments from machine/compiler/llm/type comments

lordofgibbons 18 hours ago

I'm the primary target audiance for Mojo and was very interested in it, but I just wish they didn't keep Exceptions. This backwards compatibility with Python syntax is extremely overrated and not worth the cost of bringing language warts from the 90s.

God, I hate exceptions so much. I have never seen anyone use exceptions correctly in either Java (at FAANG) or in any regular Python application.

I'm much more in favor of explicit error handling like in Go, or the syntax sugar Rust provides.

chrislattner 13 hours ago

Exceptions in Mojo are just syntax sugar for Result types. You don't have to use them if you don't want, and the overhead is not like C++ exceptions.
UncleEntity 16 hours ago

Which kind of begs the question: what it the correct way to use exceptions?
Like, python says to throw an exception when an iterator reaches the end so if my custom C iterator does that is it wrong? I do kind of want to be able to do the whole 'for i in whatever: do(stuff(i))' thing so...
- rybosome 14 hours ago
  
  Not the OP, but I assume they mean that it's encoded in the type system.
  For example Rust gives you a `Result<Thing, ErrorType>`, which might be a `Thing` or might be one of the possible error types given by `ErrorType`. So when you get a function's return value, you have to deal with the fact that it might have failed with this specific error.

typpilol 6 hours ago

How comes no one has mentioned the weird file extension

monkeyelite 8 hours ago

If there is one thing Chris is good at, it’s promoting himself and his work.

davidatbu 8 hours ago

Just to make sure I understand you correctly, you're claiming that the person who started (and lead onto maturity) LLVM, Swift, and MLIR, writing millions of lines of c++ and leading dozens of engineers in the process, is primarily good at self promotion (as opposed to, say, language design, or compilers, or tackling really hard and long term software projects)?
- monkeyelite 8 hours ago
  
  That’s right, and saying otherwise is actually robbing him of his rightful talent and ability.
  > really hard and long term software project
  That’s kind of what I mean. He commits to projects and gets groups of people talking about them and interested.
  Imagine how hard it was to convince everyone at apple to use his language - and how many other smart engineers projects were not chosen. It’s not even clear the engineering merits were there for that one.
  - davidatbu 7 hours ago
    
    So I think a demonstrative example of your claim would be if you knew someone who is as accomplished with regards to compilers, language design, tackling really hard long term projects, but not as good at self promotion, and elaborate on what the lack of that skill-set caused.
    The only other person I know of who has started and lead to maturity multiple massive and infrastructural software projects is Fabrice Bellard. I've never ran into him self promoting (podcasts, HN, etc), and yet his projects are widely used and foundational.
    It seems to me like the evidence points to "if you tackle really hard, long term, and foundational software projects successfully, people will use it, regardless of your ability to self promote."
    
    monkeyelite 30 minutes ago
    
    > tackling really hard long term projects, but not as good at self promotion
    Fabrice is one of my examples. Walter Bright (who does spend effort on promotion). Anyone who works on the V8 compiler at Google, or query engine on Postgres who we have never heard of.
    > It seems to me like the evidence points to "if you tackle really hard, long term, and foundational software projects successfully, people will use it,
    That’s a common belief among engineers. If you have worked at a large company you know that’s just now how big efforts like switching from Objective-C to Swift get done.

dboreham a day ago

ML *is* a programming language.

jhbadger 16 hours ago

It's even standard! (well, one version of it is called that)
https://en.wikipedia.org/wiki/Standard_ML

atbpaca a day ago

Mojo looks like the perfect balance between readability (python-like syntax) and efficiency (rust-like performance).

torginus a day ago

I think Mojo's cool and there's definitely a place for a modern applications programming language with C++ class(ish) performance, aka what Swift wanted to be but got trapped in the Apple ecosystem (designed by the same person as Mojo).

The strong AI focus seems to be a sign of the times, and not actually something that makes sense imo.

tomovo a day ago

While I appreciate all his work on LLVM, Chris Lattner's Swift didn't work out so well for me, so I'm cautious about this.
Swift has some nice features. However, the super slow compilation times and cryptic error messages really erase any gains in productivity for me.
- "The compiler is unable to type-check this expression in reasonable time?" On an M3 Pro? What the hell!?
- To find an error in SwiftUI code I sometimes need to comment everything out block by block to narrow it down and find the culprit. We're getting laughs from Kotlin devs.
- davidatbu an hour ago
  
  Fwiw, Chris has mentioned both of those as lessons he took from Swift that he'd like to avoid for Mojo.
- melodyogonna a day ago
  
  I think Swift is really successful in that there are so many new Apple developers who would use Swift now but wouldn't have used ObjC.
- elpakal a day ago
  
  To be fair to Chris, I’ve only seen the message about compiler not being able to type check the expression in swiftui closure hell. I think he left (maybe partly) because of the SwiftUI influence on Swift.
  - drivebycomm 19 hours ago
    
    [dead]
    
    chrislattner 13 hours ago
    
    Mojo learns a lot from the mistakes in Swift, including this one. Mojo compiles much faster and doesn't have exponential time type checking! :)
    
    valleyer 12 hours ago
    
    That last paragraph (the personal attack) is completely needless.
    
    drivebycomm 10 hours ago
    
    Not a personal attack, and the post is completely valid, but it won't bother me much if you succeed in having my posts censored, as Hacker News (and Western media in general) often censors many people's posts for a variety of purposes, some deeply and completely immoral and corrupt, for this was a drive-by comment from a temporary account.
    That programming language designers have to be careful about a type system and its type checking's asymptotic time complexity was 100% widely known before Swift was first created. Some people like to diss on mathematics, but this stuff can have severe practical and widespread engineering consequences. I don't expect everyone to master everything, but budding programming language designers should then at least realize that there may be important issues, and for instance mitigate any issues with having one or more experts checking the relevant aspects.
fnands a day ago

> The strong AI focus seems to be a sign of the times, and not actually something that makes sense imo.
It has been Mojo's explicit goal from the start. It has it's roots in the time that Chris Lattner spent at Google working on the compiler stack for TPUs.
It was explicitly designed to by Python-like because that is where (almost) all the ML/AI is happening.
diggan a day ago

> The strong AI focus seems to be a sign of the times, and not actually something that makes sense imo.
Are you sure about that? I think Mojo was always talked about as "The language for ML/AI", but I'm unsure if Mojo was announced before the current hype-cycle, must be 2-3 years at this point right?
- torginus a day ago
  
  According to wikipedia it was announced in May 2023
ModernMech 20 hours ago

It makes a lot of sense when you look at how much money they have raised:
https://techcrunch.com/2023/08/24/modular-raises-100m-for-ai...
You don’t raise $130M at a $600M valuation to make boring old dev infrastructure that is sorely needed but won’t generate any revenue because no one is willing to pay for general purpose programming languages in 2025.
You raise $130M to be the programming foundation of next Gen AI. VCs wrote some big friggen checks for that pitch.

theanonymousone a day ago

https://xkcd.com/927/

Sorry, couldn't resist.

Mars008 10 hours ago

It's not clear about multithreading on CPU. Without it it's hard to use modern it efficiently. Something like TBB library in C++ would be nice.

icanthulahoop 20 hours ago

stopped after the first line. isn't Vikram Adve also the creator of LLVM? I prefer terms like co-creator, co-invented, etc.

defraudbah 7 hours ago

clankers are getting out of hand

CyberDildonics 19 hours ago

I don't think ML does need a new programming language. You give up an extreme amount of progress in tools and libraries when you move to a new language.

I haven't seen new languages that market themselves for specific features that couldn't be done just as easily through straight classes with operator overloading.

dboreham a day ago

ML is a programming language.

a3w a day ago

Meta Language is shortened to ML. Great language, and even fathered further ML dialects.
Machine Learning is shortened to ML, too.
This posting is about "Why ML needed Mojo", but does not tell us why the license of Mojo is garbage.
M. Learning as an example of compute intensive tasks could have been the Rails moment for Ruby here, but seems like Mojo is dead on arrival — it was trending here at hackernews when announced, but no one seems to talk about it now.
---------------
(I like em-dashes, but not written wit any AI, except for language tools spellchecker)

blizdiddy a day ago

Mojo is the enshitification of programming. Learning a language is too much cognitive investment for VC rugpulls. You make the entire compiler and runtime GPL or you pound sand, that has been the bar for decades. If the new cohort of programmers can’t hold the line, we’ll all suffer.

j2kun 20 hours ago

What are you ranting about? Lattner has a strong track record of producing valuable, open source software artifacts (LLVM, Swift, MLIR) used across the industry.
pjmlp a day ago

For decades, paying for compiler tools was a thing.
- analog31 a day ago
  
  True, but aren't we in a better place now? I think the move to free tools was motivated by programmers, and not by their employers. I've read that it became hard to hire people if you used proprietary tools. Even the great Microsoft open-sourced their flagship C# language. And it's ironic but telling that the developers of proprietary software don't trust proprietary tools. And every developer looks at the state of the art in proprietary engineering tooling, such as CAD, and retches a little bit. I've seen many comments on HN along those lines.
  And "correlation is not causality," but the occupation with the most vibrant job market until recently was also the one that used free tools. Non-developers like myself looked to that trend and jumped on the bandwagon when we could. I'm doing things with Python that I can't do with Matlab because Python is free.
  Interestingly, we may be going back to proprietary tools, if our IDE's become a "terminal" for the AI coding agents, paid for by our employers.
  - pjmlp a day ago
    
    Not really, as many devs rediscover public domain, shareware, demos and open core, because it turns out there are bills to pay.
    If you want the full C# experience, you will still be getting Windows, Visual Studio, or Rider.
    VSCode C# support is under the same license as Visual Studio Community, and lack several tools, like the advanced graphical debugging for parallel code and code profiling.
    The great Microsoft has not open sourced that debugger, nor many other tools on .NET ecosystem, also they can afford to subsidise C# development as gateway into Azure, and being valued in 4 trillion, the 2nd biggest in the world.
    
    mdaniel a day ago
    
    > If you want the full C# experience, you will still be getting Windows, Visual Studio, or Rider.
    I don't believe the first two are true, and as a point of reference Rider is part of their new offerings that are free for non-commercial use https://www.jetbrains.com/rider/#:~:text=free%20for%20non-co...
    I also gravely, gravely doubt the .NET ecosystem has anything in the world to do with Azure
    
    pjmlp 21 hours ago
    
    Prove me wrong showing how to do Sharepoint or Office 365 addons with Rider, as bonus points provide the screenshots of parallel debugging and profiling experience, alongside .NET visualizers for debugging, and a bit of hot code reloading in Windows frameworks as well.
    Azure pays for .NET, and projects like Aspire.
- blizdiddy a day ago
  
  I’d prefer to not touch a hot stove twice. Telling me what processors I can use is Oracle- level rent seeking, and it should be mocked just like Oracle.
  - pjmlp a day ago
    
    I am quite sure Larry thinks very foundly of such folks when having vacations on his yatch or paying the bills to land the private jet off airport opening times.
- const_cast 18 hours ago
  
  Yes and it sucked, and those companies who relied on that largely got conned into it and then saw their tooling slowly decay and their application becomes legacy garbage.
  Its not just that OS tooling is "free", it's also better and works for way longer. If you relied on proprietary Delphi-compatible tooling, well... you fucked up!
  - pjmlp 9 hours ago
    
    You mean like the tooling for iOS, Nintendo, PlayStation, XBox, Windows, CUDA?
- kuschkufan a day ago
  
  And it sucked so hard that GNU and LLVM were born.
  - pjmlp 20 hours ago
    
    LLVM was a research project embraced by Apple to avoid GCC and anything GPL.
    Apple and Google have purged most GPL stuff out of their systems, after making clang shine.

Shorel a day ago

We have C++ :)

curtisszmania 11 hours ago

[dead]

mitch_said a day ago

[dead]