pjmlp 7 hours ago

Most of the Python libraries are, anyway, bindings to native libraries.

Any other ecosystem is able to plug into the same underlying native libraries, or even call them directly if it's written in the same language.

In a way it is kind of interesting to see the performance pressure going on in the Python world; otherwise the CPython folks would never have reconsidered their stance on performance.

  • OptionOfT 7 hours ago

    Most of these native libraries' output isn't 1:1 mappable to Python. Depending on the data, you need to write native data wrappers or, worse, marshal the data into managed memory. The overhead can be high.

    It gets worse because Python doesn't expose memory management to you. Initially this is an advantage, but later on it causes bloat.

    Python is an incredibly easy interface over these native libraries, but has a lot of runtime costs.

    • nicce 6 hours ago

      > Python is an incredibly easy interface over these native libraries, but has a lot of runtime costs.

      It also means that many people use Python without understanding which part of the code is actually fast. They mix Python code with wrappers around native libraries, and sometimes the Python code slows down the overall work substantially without people realizing where the fault lies. E.g. mixing plain Python math with NumPy bindings, when the whole thing could be done in NumPy alone.
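      A toy illustration of the trap (hypothetical code, assuming NumPy is installed): both functions compute the same vector norm, but the first loops in Python over a NumPy array, paying interpreter cost per element, while the second stays inside NumPy's native code.

```python
import numpy as np

def slow_norm(xs: np.ndarray) -> float:
    # Python-level loop: every element crosses the C/Python boundary,
    # so the NumPy array buys almost nothing here.
    total = 0.0
    for x in xs:
        total += float(x) * float(x)
    return total ** 0.5

def fast_norm(xs: np.ndarray) -> float:
    # One call into NumPy's native code: no per-element overhead.
    return float(np.sqrt(np.dot(xs, xs)))

xs = np.arange(1_000, dtype=np.float64)
assert abs(slow_norm(xs) - fast_norm(xs)) < 1e-6  # same answer, very different cost
```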

    • __coaxialcabal 6 hours ago

      Have you had any success using LLMs to rewrite Python to rust?

      • throwup238 4 hours ago

        They’re very good at porting code between languages, but going from a dynamically typed language with a large standard library to a static one with a large library ecosystem requires a bit more hand-holding. It helps to specify the Rust libraries you want to use (and their versions), and you’ll probably want to give a few rounds of feedback and error correction before the code is ready.

    • pjmlp 6 hours ago

      Yet another reason to use native compiled languages with bindings to the same C and C++ libraries.

      If using C++20 onwards, it is relatively easy to have similar high-level abstractions; one just needs to let go of the C-isms that many insist on using.

      Here Rust clearly has an advantage, in that it doesn't allow copy-pasting of C-like code.

      Naturally D and Swift, with their safety and C++ interop, would be options as well.

  • oersted 6 hours ago

    Indeed, but Python is used to orchestrate all these lower-level libraries. If you have Python on top, you often want to call these libraries in a loop, or more often, within parallelized multi-stage pipelines.

    Overhead and parallelization limitations become a serious issue then. Frameworks like PySpark take your Python code and are able to distribute it better, but it's still (relatively) very slow and clunky. Or they can limit what you can do to a natively implemented DSL (often SQL, some DataFrame API, or an API to define DAGs and execute them within a native engine), but you can't do much serious data work without UDFs, where again Python comes in. There are tricks, but you can never really avoid the limitations of the Python interpreter.
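    A toy stand-in for the UDF problem (hypothetical code, not PySpark itself): the Python-level loop below plays the role of a per-row UDF, while the C-implemented builtin `sum` plays the role of the native engine. Both give the same answer; the per-row interpreter dispatch is what dominates once data gets large.

```python
from timeit import timeit

rows = list(range(100_000))

def python_sum(xs):
    # Stand-in for a Python UDF: every iteration executes interpreter
    # bytecode, just like a UDF invoked once per row.
    total = 0
    for x in xs:
        total += x
    return total

assert python_sum(rows) == sum(rows)  # identical result either way

# builtin sum() runs in C, like work pushed down into a native engine
t_udf = timeit(lambda: python_sum(rows), number=20)
t_native = timeit(lambda: sum(rows), number=20)
print(f"Python-level loop: {t_udf:.3f}s vs native sum: {t_native:.3f}s")
```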

lmeyerov 7 hours ago

At least for Louie.ai, basically genAI-native computational notebooks where operational analysts ask for intensive analytics tasks, like pulling Splunk/Databricks/Neo4j data, wrangling it in some runtime, clustering/graphing it, and generating interactive viz, Python has ups and downs:

On the plus side, it means our backend gets to handle small/mid datasets well. Apache Arrow adoption in analytics packages is strong, so zero-copy and columnar flows over many rows are normal. Pushing that to the GPU or another process is also great.

OTOH, one of our greatest issues is the GIL. Yes, it shows up a bit in single-user code (not discussed in the post), especially when doing divide-and-conquer flows for a user. However, the bigger issue is stuffing many concurrent users into the same box to avoid blowing your budget. We would like the memory-sharing benefits of threading but, because of the GIL, want the isolation benefits of multiprocess. A bit same-but-different: we stream results to the browser as agents progress in your investigation, and that has not been as smooth as it has been for us in other languages.
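To make the GIL point concrete, here's a minimal sketch (hypothetical code, not Louie.ai's): a CPU-bound task fanned out over threads completes correctly, but on CPython the threads are serialized by the GIL; real parallelism means switching to processes and giving up cheap memory sharing.

```python
from concurrent.futures import ThreadPoolExecutor

def cpu_bound(n: int) -> int:
    # Pure-Python arithmetic holds the GIL throughout, so threads running
    # this are time-sliced rather than truly parallel on CPython.
    return sum(i * i for i in range(n))

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(cpu_bound, [100_000] * 4))

# For actual parallelism you'd swap in ProcessPoolExecutor (guarded by
# `if __name__ == "__main__":`), trading shared memory for isolation.
assert len(results) == 4 and len(set(results)) == 1
```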

And moving to multiprocess is no panacea. E.g., a local embedding engine is expensive to run in-process per worker because modern models have high RAM needs. So that biases toward using a local inference server for what is meant to be an otherwise local call, which is doable, but representative of the extra work needed for production-grade software.

Interesting times!

elpalek 5 hours ago

LangChain and other frameworks are too bloated. They're good for demos, but I highly recommend building your own pipeline in production. It's not really that complicated, and you get much better control over the implementation. Plus you don't need the 99% of packages that come with LangChain, which reduces security vulnerabilities.

I've written a series of RAG notebooks on how to implement RAG in Python directly, with minimal packages. I know it's not in Rust or C++, but it can give you some ideas on how to do things directly.

https://github.com/yudataguy/RawRAG
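In the same spirit, the retrieval half of a RAG pipeline fits in a screen of standard-library Python. This is a toy sketch, not code from the repo above: the bag-of-words "embedding" stands in for a real embedding model call.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real pipeline would call an
    # embedding model here instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = [
    "Rust is a systems programming language",
    "Python is popular for machine learning",
    "RAG retrieves documents before generation",
]
index = [(doc, embed(doc)) for doc in docs]

def retrieve(query: str, k: int = 1) -> list:
    # Rank all documents by cosine similarity to the query vector.
    qv = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(qv, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

print(retrieve("which documents does RAG retrieve?"))
```

The retrieved text would then be stuffed into the LLM prompt; swapping the toy embedding for a model API call is the only structural change needed.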

  • cpill 3 hours ago

    The trouble is that the LangChain community is large and jumps on the latest research papers almost immediately as they come out, which is a big advantage if you're a small team.

bborud 6 hours ago

It would be helpful to move to a compiled language with a decent toolchain. Rust and Go are good candidates.

zitterbewegung 5 hours ago

This is a comparison of apples to oranges. LangChain has an order of magnitude more examples, integrations, and features, and it also rewrote its whole architecture to try to make the chaining more understandable. I don't see enough documentation in this pipeline to understand how to migrate my app to it. I also realize it would take me at least a week to even migrate my own app to LangChain's rewrite.

LangChain is used because it was a first mover, and that's also its Achilles heel; it's not used for speed at all.

dmezzetti 7 hours ago

I've covered this before in articles such as this: https://neuml.hashnode.dev/building-an-efficient-sparse-keyw...

You can make anything performant if you know the right buttons to push. While Rust makes it easy in some ways, Rust is also a difficult language for many developers to work with. There is a tradeoff.

I'd also say LangChain's primary goal isn't performance; it's convenience and functionality coverage.

  • timonv 4 hours ago

    Cool, that's a fun read! I recently added sparse vector support to fastembed-rs with SPLADE, not BM25. Still, it would be nice to compare the two.

RcouF1uZ4gsC 6 hours ago

Why not use C++?

For the most part, these aren't security critical components.

You already have a massive amount of code you can use, like, say, llama.cpp.

You get the performance that you do with Rust.

Compared to Python, in addition to performance, you also get a much easier deployment story.

  • oersted 6 hours ago

    If you already have substantial experience with C++, this could be a good option. But I'd say nowadays that learning to use Rust *well* is much easier than learning to use C++ *well*. And even though the ecosystem is a lot less mature, I'd say it's already better in Rust for these use cases.

    Indeed, here security (generally safety) is a secondary concern and is not the main reason for choosing Rust, although welcome. It's just that Rust has everything that C++ gives you, but in a more modern and ergonomic package. Although, again, I can see how someone already steeped in C/C++ for years might not feel that, and reasonably so. But I think I can fairly safely say that Rust is just "a better C++" from the perspective of someone starting from scratch now.

    • outworlder 4 hours ago

      Indeed.

      Plus, one doesn't usually just 'learn C++'. It's a herculean effort, and I've yet to meet anyone, even people using C++ exclusively for their whole careers, who could confidently say they "know C++". They may be comfortable with whatever subset of C++ their company uses, while another company's codebase will look completely alien, often with entire features ignored that they relied on, and vice versa.

      Despite that, it's still a substantial time commitment, to the point that many (if not most) people working in C++ have made it their career; it's not just a tool anymore at that point. They may be more willing to jump entire industries than jump to another language. It's a generalization, but I have seen it far too often at this point.

      If someone is making a significant time investment starting today, I too would suggest investing in Rust instead. It also requires a decent time investment, but the rewards are great. Instead of learning where all the (hidden) landmines are, you learn how to write code that can't have those landmines in the first place. You aren't losing much either, other than the ability to read existing C++ codebases.

    • riku_iki 3 hours ago

      > But I'd say nowadays that learning to use Rust well is much easier than learning to use C++ well.

      For someone (me) who was making this choice recently, it is not that obvious. I tried to learn through Rust examples and the ecosystem, and there are many more WTF moments compared to when I am writing C++ as "C with classes" plus Boost. Especially when writing close-to-the-metal performance code, Rust has many abstractions with unobvious performance implications.

  • roca 4 hours ago

    Lots of reasons, but a big one is that dependency and build management in C++ is absolutely hellish unless you use something like Conan, which nobody knows. In Rust, you use Cargo and everyone is happy.

    • pjmlp 2 hours ago

      There are lots of things I don't know until I learn how to use them, duh.

      Cargo is great for pure Rust codebases; otherwise it's build.rs or having to learn another build system, and then people aren't that happy any longer.

    • riku_iki 2 hours ago

      You can always use something as simple as Make for your C++ project, manually vendoring dependencies into a libs folder.

  • timonv 4 hours ago

    I've worked with C++ in the past; it's subject to taste. I like how Rust's rigidness enables rapid change _without_ breaking things.

    Besides, the ML ecosystem is also quite mature: llama.cpp has native bindings (which Swiftide supports), there are ONNX bindings, ndarray (NumPy in Rust) works great, there's Candle, and lots of processing utilities. Additionally, as other languages rewrite parts of their stacks in Rust, those pieces are more often than not available to Rust users as well.

  • Philpax 5 hours ago

    Why use C++? What's the benefit over Rust here?

  • IshKebab 6 hours ago

    Rust is much better than C++ overall and far easier to debug (C++ is prone to very difficult to debug memory errors which don't happen in Rust).

    The main reasons to use C++ these days are compatibility with existing code (C++ and Rust are a bit of a pain to mix), and if a big dependency is C++ (e.g. Qt).

    • pjmlp 2 hours ago

      Additionally, the industry-standard GPGPU APIs and their tooling ecosystem.

      Maybe one day we'll get Live++ or a Visual Studio-grade debugging experience for Rust, given that plenty of Microsoft projects now use Rust.

zie1ony 6 hours ago

DSPy is in Python, so it must be Python. Sorry bro :P

swyx 6 hours ago

I mean, LLM-based or not has nothing to do with it; this is a standard optimization story, scripting language vs. systems language.

  • godelski 6 hours ago

    Shhhh, let this one go. So many people don't get optimization and why it's needed that I'll take anything we can get. Hell, I routinely see people saying no one needs to know C because Python calls C in "the backend" (who the fuck writes "the backend" then?). The more people who learn some HPC and parallelism, the better.

    • pjmlp 6 hours ago

      Even better if they would learn about these amazing managed languages where we can introspect the machine code generated by their dynamic compilers.

      • godelski 5 hours ago

        Agree, but idk what the gateway in is since I'm so desperate for people to just get the basic concepts.

serjester 6 hours ago

I'm surprised they don't talk about the business side of this - did they have users complaining about the speed? At the end of the day they only increased performance by 50%.

This kind of optimization seems awesome once you have a somewhat mature product, but you really have to wonder if it's the best use of a startup's very limited bandwidth.

  • timonv 4 hours ago

    Core maintainer of Swiftide here. That's a fair comment! Additionally, it's interesting to note that in the Swiftide benchmark almost all the time is spent in FastEmbed / ONNX. A more involved follow-up with chunking and transformation could be very interesting and, anecdotally, shows far bigger differences. We did not have the time yet to fully dive into this.

    Personally, I just love code being fast, and Rust is incredible to work with. Exceptions granted, I'm more productive in Rust than in any other language. And it's fun.

  • godelski 6 hours ago

      > At the end of day they only increased performance by 50%.
    
      > only 50%.
    
    I'm sorry... what?! That's a lot of improvement and will save you a lot of money. 10% increases are quite large!

    Think about it this way: if you have a task that takes an hour and you turn that into 59 minutes and 59 seconds, it might seem like nothing (about 0.03%). But now consider you have a million users; that's a million seconds saved, or about 277 hours! This can save you money: you are often paying by the hour in one way or another (even if you own the system, your energy has a cost that's dynamic). If this is a task run frequently, you're saving a lot of time in aggregate, despite not much per person. But even for a single person, this is helpful if more devs do it. Death by a thousand cuts.

    But in the specific case, if a task takes an hour and you save 50%, your task takes 30 minutes. Maybe the task here took only a few minutes, but people will be chaining these together quite a lot.
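    Restating the back-of-the-envelope arithmetic above:

```python
# 1 second shaved off a 1-hour task, aggregated across a million runs/users.
per_run_saving_s = 1
fraction_of_task = per_run_saving_s / 3600            # tiny per run (~0.03%)
total_hours = per_run_saving_s * 1_000_000 / 3600     # large in aggregate (~277.8 h)
print(f"{fraction_of_task:.4%} per run, {total_hours:.1f} hours total")
```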

    • jahewson 5 hours ago

      > 10% increases are quite large!

      You have to ask yourself, 10% of what? I don’t usually mind throwing 10% more compute or memory at a problem, but I do mind if it’s 10x more. I’ve shipped 100x perf improvements in the past where 1.5x would have been a waste of engineering time. A more typical case is a 10x or 20x improvement that’s worth a few days of coding. Now, if I’m working on a mature system that’s had tens of thousands of engineering hours devoted to it, and is used by thousands of users, then I might be quite happy with 10%. Though I also may not! The broader context matters.

      • godelski 5 hours ago

        Sure, but I didn't shy away from the fact that it is case dependent. In fact, you're just talking about the metaoptimization. Which for any optimization, needs to be considered too.

    • lpapez 6 hours ago

      Maybe these optimizations benefit the two users who do the operation three times a year.

      In such an extreme case no amount of optimization work would be profitable.

      So the parent comment asks a very valid question: how much total time was saved by this and who asked for it to be saved (paying or free tier customers for example)?

      People who see the business side of things rightfully fear when they hear the word "optimization", it's often not the best use of limited development resources - especially in an early stage product under development.

      • sroussey 5 hours ago

        I do wish that when people write about optimization that they would then multiply by usage, or something similar.

        Another way is to show CPU usage over a fleet of servers before and after. And then reshuffle the servers and use fewer and use the number of servers no longer needed as the metric.

        Number of servers have direct costs, as well as indirect costs, so you can even derive a dollar value. More so if you have a growth rate.

        • godelski 5 hours ago

            > I do wish that when people write about optimization that they would then multiply by usage, or something similar.
          
          How? You can give specific examples and then people make the same complaints because it isn't relevant to their use case. It's fairly easy to extrapolate the numbers to specific cases. We are humans, and we can fucking generalize. I'll agree there isn't much to the article, but I find this ask a bit odd. Do you not have all the information to make that calculation yourself? They should have done that if they're addressing their manager, but it looks like a technical blog where I think it is fair to assume the reader is technical and can make these extrapolations themselves.

      • godelski 5 hours ago

          > So the parent comment asks a very valid question: how much total time was saved by this and who asked for it to be saved (paying or free tier customers for example)?
        
        That is a hard question to answer because it very much depends on the use case, which is why I gave a vague response in my comment. Truth be told, __there is no answer__ BECAUSE it depends on context. In the case of AI agents, yeah, 50% is going to save you a ton of money. If you make LLM calls once a day, then no, probably not. Part of being the developer is determining this tradeoff. Specifically, that's what technical managers are for: communicating technical stuff to business people (sure, your technical manager might not be technical, but someone being bad at their job doesn't make the point irrelevant; it just means someone else needs to do the job).

        You're right about early-stage products, but there are lots of moderate and large businesses (and yes, startups) that don't optimize but should. Most software is never optimized, and that has led to a lot of enshittification. Yes, move fast and break things, but go back and clean up, optimize, and reduce your tech debt, because you left a mess of broken stuff in your wake. But it is weird to pigeonhole this to early-stage startups.