
186 points by darkolorin | 5 comments

We wrote our inference engine in Rust; it is faster than llama.cpp in all of the use cases. Your feedback is very welcome. It was written from scratch with the idea that you can add support for any kernel and platform.
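
To make the "any kernel and platform" idea concrete, here is a minimal Rust sketch of one way a pluggable backend/kernel abstraction can look. The trait and type names are hypothetical illustrations, not the project's actual API:

    // Hypothetical sketch only: trait and type names are illustrative and do
    // not come from the project being discussed.
    use std::collections::HashMap;

    /// A compute backend (CPU, Metal, CUDA, ...) providing kernels for one platform.
    trait Backend {
        fn name(&self) -> &'static str;
        /// C = A * B for row-major f32 matrices, (m x k) * (k x n).
        fn matmul(&self, a: &[f32], b: &[f32], c: &mut [f32], m: usize, k: usize, n: usize);
    }

    /// Naive portable CPU fallback so every platform has at least one backend.
    struct CpuBackend;

    impl Backend for CpuBackend {
        fn name(&self) -> &'static str { "cpu-naive" }

        fn matmul(&self, a: &[f32], b: &[f32], c: &mut [f32], m: usize, k: usize, n: usize) {
            for i in 0..m {
                for j in 0..n {
                    let mut acc = 0.0f32;
                    for p in 0..k {
                        acc += a[i * k + p] * b[p * n + j];
                    }
                    c[i * n + j] = acc;
                }
            }
        }
    }

    /// Registry so new kernels/platforms plug in without touching the core engine.
    struct KernelRegistry {
        backends: HashMap<&'static str, Box<dyn Backend>>,
    }

    impl KernelRegistry {
        fn new() -> Self {
            Self { backends: HashMap::new() }
        }
        fn register(&mut self, backend: Box<dyn Backend>) {
            self.backends.insert(backend.name(), backend);
        }
        fn get(&self, name: &str) -> Option<&dyn Backend> {
            self.backends.get(name).map(|b| b.as_ref())
        }
    }

    fn main() {
        let mut registry = KernelRegistry::new();
        registry.register(Box::new(CpuBackend));

        // [[1,2],[3,4]] * [[5,6],[7,8]] = [[19,22],[43,50]]
        let a = [1.0, 2.0, 3.0, 4.0];
        let b = [5.0, 6.0, 7.0, 8.0];
        let mut c = [0.0f32; 4];
        let backend = registry.get("cpu-naive").expect("registered above");
        backend.matmul(&a, &b, &mut c, 2, 2, 2);
        println!("{c:?}"); // [19.0, 22.0, 43.0, 50.0]
    }

A Metal or CUDA backend would implement the same trait with platform-specific kernels and be registered the same way; that is the general shape the "add support of any kernel and platform" claim suggests.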
1. ewuhic No.44571799
>faster than llama.cpp in all of the use cases

What's your deliberate, well-thought-out roadmap for achieving adoption similar to llama.cpp's?

replies(2): >>44572037 #>>44576529 #
2. pants2 No.44572037
Probably getting acquired by Apple :)
3. khurs No.44576529
Ollama is the leader, isn't it?

Brew stats (downloads, last 30 days):

Ollama - 28,232
Llama.cpp - 7,826
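
For anyone who wants to reproduce numbers like these, here is a small Rust sketch that queries the public formulae.brew.sh formula API. The exact JSON layout ("analytics/install/30d/<formula>") and the reqwest/serde_json dependencies are assumptions on my part; check against the live endpoint:

    // Cargo.toml (assumed):
    //   reqwest = { version = "0.12", features = ["blocking", "json"] }
    //   serde_json = "1"
    use serde_json::Value;

    /// Fetch a formula's 30-day install count from formulae.brew.sh.
    /// The JSON pointer below is my reading of the API layout, not a
    /// documented guarantee.
    fn installs_30d(formula: &str) -> Result<Option<u64>, Box<dyn std::error::Error>> {
        let url = format!("https://formulae.brew.sh/api/formula/{formula}.json");
        let body: Value = reqwest::blocking::get(&url)?.json()?;
        Ok(body
            .pointer(&format!("/analytics/install/30d/{formula}"))
            .and_then(Value::as_u64))
    }

    fn main() -> Result<(), Box<dyn std::error::Error>> {
        for formula in ["ollama", "llama.cpp"] {
            match installs_30d(formula)? {
                Some(n) => println!("{formula}: {n} installs in the last 30 days"),
                None => println!("{formula}: no analytics data found"),
            }
        }
        Ok(())
    }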

replies(1): >>44578802 #
4. DiabloD3 No.44578802
Ollama isn't an inference engine; it's a GUI slapped onto a perpetually out-of-date vendored copy of Llama.cpp underneath.

So, if you're trying to actually count Llama.cpp downloads, you'd combine those two. Also, I imagine most users on OSX aren't using Homebrew; they're getting it directly from the GH releases, so you'd also have to count those.

replies(1): >>44579160 #
5. imtringued No.44579160{3}
Actually, Ollama has stopped using llama.cpp and is using ggml directly nowadays.