I find it somewhat ironic that you pitch this as "No callbacks. No promises. No async/await keywords. Just Ruby code that scales."
When you literally show in the example right above that you need both an "async do" and a "end.map(&:wait)".
I'll add - the one compelling argument you make, about needing a db connection per worker, is mitigated with something like pgbouncer without much work. The OS overhead per thread (or hell, even per process: https://jacob.gold/posts/serving-200-million-requests-with-c...) isn't an argument I really buy, especially given your use case is long-running LLM chat tasks as stated above.
Personally - if I really want to be fast and efficient I'm not picking Ruby anyway (or Python for that matter - but at least Python has the huge ecosystem for the LLM/AI space right now).
Fair point on the syntax, I should have been clearer. What I meant is that your existing Ruby code doesn't need modifications. In Python you'd need to use a different HTTP library, add `async def` and `await` everywhere, etc. In Ruby the same `Net::HTTP` call works in both sync and async contexts.
The `Async do` wrapper is only needed at the orchestration level, not throughout your codebase. That's a huge difference in practice.
Regarding pgbouncer - yes, it helps with connection pooling, but you still have the fundamental issue of 25 workers = 25 max concurrent LLM streams. Your 26th user waits. With fibers, you can handle thousands on the same hardware because they yield during the 30-60s of waiting for tokens.
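To put rough numbers on the "26th user waits" point, here is a back-of-the-envelope sketch. It assumes one stream per worker and that every stream takes the same time; `seconds_until_start` is a made-up helper for illustration, not part of any library.

```ruby
# Simplified model: each worker handles exactly one stream at a time,
# and every stream is assumed to last `stream_seconds`.
def seconds_until_start(user_number, workers:, stream_seconds:)
  # Users are served in "waves" of `workers` at a time.
  waves_ahead = (user_number - 1) / workers
  waves_ahead * stream_seconds
end

puts seconds_until_start(25, workers: 25, stream_seconds: 30) # 25th user starts right away
puts seconds_until_start(26, workers: 25, stream_seconds: 30) # 26th user waits one full stream
```

Real traffic is messier (streams differ in length, users arrive at different times), so treat this as an illustration of the cap, not a capacity plan.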
Sure, for pure performance you'd pick another language. But that's not the point - the point is that you can get much better performance for IO-bound workloads in Ruby today, without switching languages or rewriting everything.
It's about making Ruby better at what it's already being used for, not competing with system languages.
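For anyone who wants to see the "fibers yield while waiting" idea in isolation, here is a toy sketch using only core `Fiber`. It is not how the async gem works internally: there is no scheduler and no real I/O, and `make_stream` / `run_all` are hypothetical names. `Fiber.yield` just marks the spot where a real stream would be waiting on the network.

```ruby
# Toy model: each Fiber stands in for one chat stream.
def make_stream(id, tokens, log)
  Fiber.new do
    tokens.each do |token|
      log << "stream #{id}: #{token}"
      Fiber.yield # hand control back while "waiting" for the next token
    end
    :done # returned by the final resume once the stream is finished
  end
end

# Round-robin over all streams on a single thread until each one is done.
def run_all(streams)
  pending = streams.dup
  pending.reject! { |stream| stream.resume == :done } until pending.empty?
end

log = []
streams = [1, 2].map { |id| make_stream(id, %w[Hello world], log) }
run_all(streams)
puts log
```

The log comes out interleaved (stream 1, stream 2, stream 1, stream 2), which is the whole point: one thread makes progress on many streams because each one steps aside whenever it has nothing to do.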
> Personally - if I really want to be fast and efficient I'm not picking Ruby anyways (or python for that matter - but at least python has the huge ecosystem for the LLM/AI space right now).
"Fast and efficient" can mean almost anything. You can be fast and efficient in Ruby at handling thousands of concurrent llm chats (or other IO-bound work), as per the article. You can also be fast and efficient at CPU-bound work (it's possible to enjoy Ruby while keeping in mind how it will translate into C). You probably cannot be fast and efficient at micro-managing memory allocations in Ruby. If you're ok to brush ruby aside over a vague generalization, maybe you just don't see its appeal in the first place, which is fair, but that makes the other reasons you provide kind of moot.
Gotta give credit for wonderfully clear writing. You can tell a person understands what they're saying by how well they express it. Reads smooth, and makes me see the author's mental model.
As far as substance: I love Ruby libraries that allow you to simply "insert any Ruby code". Many libraries tell you to call specific declarative functions, but I think Ruby shines at letting you use Ruby, instead of some limited subset of it. Examples of not-great approaches (imo) are libraries that try to take over how you write code, and give you a special declarative syntax for runtime type checking, building services out of lambdas, composing functions. Ruby's async is an example of "just insert any Ruby in here". You can build runtime type checking the same way: allow people to check the value with any Ruby code they like. Essentially, I agree with the author's sentiment, and wish more people appreciated the beauty of this approach.
Mmmm...
Aren't threads overkill for an IO workload? You can do a lot with 1 thread and epoll(7).
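A minimal sketch of that single-thread style in Ruby, for reference. It uses `IO.select` (select(2)-style readiness, the portable cousin of epoll(7)) over a few in-process pipes; `drain_all` is a hypothetical helper, and the pipes stand in for sockets.

```ruby
# One thread watches several pipes at once and reads whichever is ready.
def drain_all(readers)
  received = {}
  watching = readers.dup
  until watching.empty?
    ready, = IO.select(watching) # block until at least one pipe is readable
    ready.each do |io|
      begin
        chunk = io.read_nonblock(1024)
        (received[io] ||= +"") << chunk
      rescue EOFError
        watching.delete(io) # writer closed: stop watching this pipe
        io.close
      end
    end
  end
  received
end

pipes = 3.times.map { IO.pipe }
pipes.each_with_index do |(_reader, writer), i|
  writer.write("reply #{i}")
  writer.close
end

messages = drain_all(pipes.map(&:first)).values.sort
puts messages
```

This works, but you end up writing the readiness loop and per-connection state by hand; a fiber scheduler does roughly this bookkeeping for you so the calling code can stay sequential.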
Author here. Thank you, that means a lot!
Happy to answer any questions.
"these microseconds add up to real latency"
While I love Ruby, if performance is your main motivation, you would not be using a scripting language.
What an interesting perspective on Ruby async. The I/O multiplexing example was quite fascinating to see as well.
Python and Ruby developers discovering what was standard in JavaScript a decade ago.
*yawn*
Imitation is the sincerest form of flattery, at least.