602 points emrah | 2 comments
justanotheratom No.43743956
Has anyone packaged one of these in an iPhone app? I'm sure it's doable, but I'm curious what tokens/sec is achievable these days. I would love to ship "private" AI apps if we can get reasonable tokens/sec.
replies(4): >>43743983 #>>43744244 #>>43744274 #>>43744863 #
1. nolist_policy No.43744863
FWIW, I can run Gemma-3-12b-it-qat on my Galaxy Fold 4 with 12 GB RAM at around 1.5 tokens/s. I use plain llama.cpp with Termux.
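For reference, a minimal sketch of the setup described above: building llama.cpp from source inside Termux and running a quantized Gemma model on-device, CPU-only. The GGUF filename and thread count are assumptions; point `-m` at whatever quantized model file you actually downloaded.

```shell
# Inside Termux on Android (assumes build tools are available via pkg)
pkg install clang cmake git

# Build llama.cpp from source
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release -j

# Run the model (filename is an assumption -- substitute your own
# GGUF, e.g. a Q4_0 QAT build of Gemma 3 12B from Hugging Face)
./build/bin/llama-cli \
  -m gemma-3-12b-it-qat-Q4_0.gguf \
  -t 4 \
  -p "Hello"
```

Throughput in the ~1.5 tokens/s range is plausible for a 12B model on a phone SoC, since CPU-only inference at this size is largely memory-bandwidth-bound.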
replies(1): >>43745150 #
2. Casteil No.43745150
Does this turn your phone into a personal space heater too?