
186 points darkolorin | 1 comment

We wrote our inference engine in Rust, and it is faster than llama.cpp in all of the use cases we tested. Your feedback is very welcome. It was written from scratch with the idea that you can add support for any kernel and any platform.
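One common way to make an engine pluggable across kernels and platforms in Rust is a backend trait that each platform implements. This is only an illustrative sketch of that pattern; the trait and type names here are hypothetical and not taken from the actual engine.

```rust
// Hypothetical pluggable-backend sketch: each platform (CPU, Metal, CUDA, ...)
// implements the Backend trait, and the engine dispatches through it.
trait Backend {
    fn name(&self) -> &'static str;
    // One example kernel: element-wise vector addition.
    fn add(&self, a: &[f32], b: &[f32]) -> Vec<f32>;
}

// A reference CPU implementation; a Metal or CUDA backend would
// implement the same trait with platform-specific kernels.
struct CpuBackend;

impl Backend for CpuBackend {
    fn name(&self) -> &'static str {
        "cpu"
    }
    fn add(&self, a: &[f32], b: &[f32]) -> Vec<f32> {
        a.iter().zip(b).map(|(x, y)| x + y).collect()
    }
}

fn main() {
    // The engine only sees `dyn Backend`, so new platforms plug in
    // without touching the dispatch code.
    let backend: Box<dyn Backend> = Box::new(CpuBackend);
    let out = backend.add(&[1.0, 2.0], &[3.0, 4.0]);
    println!("{} -> {:?}", backend.name(), out); // cpu -> [4.0, 6.0]
}
```

Trait objects add one layer of dynamic dispatch at the op boundary, which is negligible next to the kernel work itself; generics with monomorphization are an alternative when even that overhead matters.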
1. zackangelo No.44575833
We also wrote our inference engine in Rust for mixlayer; happy to answer any questions from anyone trying to do the same.

Looks like this uses ndarray and MPSGraph (which I did not know about!); we opted to use candle instead.