rvz:
Totally makes sense to have C++- or Rust-based AI models for inference instead of the bloated networks run on Python, with their sub-optimal inference and fine-tuning costs.

Minimal-overhead or zero-cost abstractions around deep learning libraries implemented in those languages give some hope that people like ggerganov are not afraid of the 'don't roll your own deep learning library' dogma, and now we can see the results: DL on the edge and local AI are the future of efficiency in deep learning.

We'll see, but Python just can't compete on speed at all, which is why Modular's Mojo compiler is another project that solves the problem properly while keeping almost 1:1 familiarity with Python.
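
For concreteness, a minimal sketch of the kind of speed gap at issue, comparing an interpreted Python loop with the same work dispatched to NumPy's C/BLAS backend (the vector size and the NumPy comparison are illustrative choices of mine, and actual timings vary by machine):

    # Dot product: pure-Python loop vs. NumPy's C/BLAS backend.
    import timeit
    import numpy as np

    n = 1_000_000
    a, b = np.random.rand(n), np.random.rand(n)
    al, bl = a.tolist(), b.tolist()

    def dot_pure_python(x, y):
        # Every iteration pays Python's interpreter dispatch overhead.
        total = 0.0
        for xi, yi in zip(x, y):
            total += xi * yi
        return total

    py_t = timeit.timeit(lambda: dot_pure_python(al, bl), number=10)
    np_t = timeit.timeit(lambda: np.dot(a, b), number=10)
    print(f"pure Python: {py_t:.3f}s  NumPy (C/BLAS): {np_t:.3f}s")

The interpreted loop is typically one to two orders of magnitude slower, which is the overhead that compiled approaches like Mojo aim to remove without giving up Python's syntax.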

survirtual:
Python is generally just the glue language over underlying, highly optimized C++ libraries, so the improvements aren't only about languages. I would imagine Facebook is less focused on inference, so it didn't bother to build a highly optimized LLM inference engine; there also isn't much of a business case for CPU-bound LLMs at enterprise scale, so why code for that? Besides, llama.cpp can be called from Python, and Python could still do all the glue.
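
To make that last point concrete, a minimal sketch of the "Python as glue over llama.cpp" pattern, assuming the third-party llama-cpp-python bindings (pip install llama-cpp-python) and a placeholder GGUF model path, neither of which comes from the comment itself:

    # Python only marshals the prompt in and the tokens out; the
    # quantized matmuls and KV-cache management all run in llama.cpp's
    # C/C++ core.
    from llama_cpp import Llama

    llm = Llama(model_path="./models/model.Q4_K_M.gguf", n_ctx=2048)

    output = llm(
        "Q: Why is most ML Python code just glue? A:",
        max_tokens=64,
        stop=["Q:"],
    )
    print(output["choices"][0]["text"])

Python contributes essentially nothing to the per-token cost here, so the glue role survives even when the engine itself is C++.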

There is no language war. Use whatever tool gets effective results for the mission at hand.