←back to thread

261 points fzliu | 1 comments | | HN request time: 0.208s | source
Show context
mech4lunch ◴[] No.42163961[source]
The colab measures dot product values 0.428 and 0.498, describing them as "...similarity value is quite high." Is that high? Can you design a system that confidently labels data with a 0.4 threshold?
replies(3): >>42164339 #>>42165357 #>>42165524 #
1. fzliu ◴[] No.42165357[source]
While the raw similarity score does matter, what typically matters more is the score relative to other documents. In the case of the examples in the notebook, those values were the highest in relative terms.

I can see why this may be unclear/confusing -- we will correct it. Thank you for the feedback!