Does anyone have a good understanding of how 2B models can be useful in production? What tasks are you using them for? I wonder what tasks you can fine-tune them on to get 95-99% results (if anything).
replies(7):
I mean, they could be better (to put it nicely), but there is a legitimate use-case for them and I'd love to see more work in this space.
https://machinelearning.apple.com/research/introducing-apple...
From a draft article of mine, experimenting with open-source text embeddings (the scores are distances, so lower means a closer match):
./match venture capital
purchase 0.74005488647684
sale 0.80926752301733
place 0.81188663814236
positive sentiment 0.90793311875207
negative sentiment 0.91083707598925
time 0.9108697315425
./store silicon valley
./match venture capital
silicon valley 0.7245139487301
purchase 0.74005488647684
sale 0.80926752301733
place 0.81188663814236
positive sentiment 0.90793311875207
negative sentiment 0.91083707598925
time 0.9108697315425
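
For anyone who wants to play with the same idea, here's a minimal sketch of the store/match loop in Python. To be clear, everything in it is my assumption, not the commenter's actual tool: I'm using the sentence-transformers library with the all-MiniLM-L6-v2 model, and treating scores as cosine distance (lower = closer), which is how the output above is ordered.

    # Minimal sketch of the ./store / ./match demo above.
    # Assumptions (mine, not the original poster's): the sentence-transformers
    # library, the all-MiniLM-L6-v2 model, and cosine distance (lower = closer).
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")

    # The "stored" terms, mirroring the session above.
    store = ["silicon valley", "purchase", "sale", "place",
             "positive sentiment", "negative sentiment", "time"]

    def match(query: str) -> None:
        # Embed the query and every stored term, then print the terms
        # sorted by cosine distance, closest first.
        q = model.encode(query, convert_to_tensor=True)
        docs = model.encode(store, convert_to_tensor=True)
        dists = (1 - util.cos_sim(q, docs))[0]
        for term, d in sorted(zip(store, dists.tolist()), key=lambda p: p[1]):
            print(f"{term} {d:.5f}")

    match("venture capital")

The exact numbers will differ from the session above since we don't know which embedding model produced it, but the ranking should come out similar.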
Of course, you need to figure out what these black boxes actually understand. For sentiment analysis, for example, instead of matching against "positive" and "negative" you might get better results with anchor terms like "kawaii" and "student debt", depending on how the embedding model internalized positives and negatives from its training data.
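
To make that concrete, here's a hedged sketch of sentiment-as-nearest-anchor on top of the same assumed setup as the previous snippet. The anchor terms are the comment's own examples; per the point being made, you'd have to probe what a given model actually associates with positive/negative before trusting them.

    # Sentiment as nearest-anchor matching, continuing the sketch above.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model, as above
    anchors = {"kawaii": "positive", "student debt": "negative"}

    def sentiment(text: str) -> str:
        # Embed the text and the anchor terms, then return the label
        # of whichever anchor is most similar.
        t = model.encode(text, convert_to_tensor=True)
        a = model.encode(list(anchors), convert_to_tensor=True)
        sims = util.cos_sim(t, a)[0]  # higher similarity = closer anchor
        best = list(anchors)[int(sims.argmax())]
        return anchors[best]

    print(sentiment("this plush toy is adorable"))  # expected: positive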