111 points galeos | 1 comment
balazstorok ◴[] No.43714642[source]
Does anyone have a good understanding of how 2B models can be useful in production? What tasks are you using them for? I wonder which tasks you can fine-tune them on to produce 95-99% results (if any).
replies(7): >>43714663 #>>43714744 #>>43714864 #>>43714922 #>>43714969 #>>43715153 #>>43715192 #
1. future10se ◴[] No.43714969[source]
The on-device models used for Apple Intelligence (writing tools, notification and email/message summaries, etc.) are around 3B parameters.

I mean, they could be better (to put it nicely), but there is a legitimate use-case for them and I'd love to see more work in this space.

https://machinelearning.apple.com/research/introducing-apple...

https://arxiv.org/abs/2407.21075