
261 points fzliu | 1 comments
mech4lunch ◴[] No.42163961[source]
The colab measures dot product values 0.428 and 0.498, describing them as "...similarity value is quite high." Is that high? Can you design a system that confidently labels data with a 0.4 threshold?
replies(3): >>42164339 #>>42165357 #>>42165524 #
1. minimaxir ◴[] No.42165524[source]
A 0.4 with cosine similarity is not the same as a 0.4 with sigmoid thresholding.

0.4 cosine similarity is pretty good for real-world data that isn't a near-identical duplicate.
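
To make the scale difference concrete, here is a rough sketch in plain Python. The vectors and the logit are toy values I made up for illustration, not anything from the colab. Cosine similarity lives in [-1, 1], where 0 means orthogonal, while a sigmoid output lives in (0, 1) and is usually read as a probability with 0.5 as the midpoint. So a 0.4 sits well above the "unrelated" baseline on one scale and below the midpoint on the other.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; range is [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def sigmoid(z):
    """Logistic function; range is (0, 1), with sigmoid(0) == 0.5."""
    return 1.0 / (1.0 + math.exp(-z))

# Toy unit vectors (assumed for illustration) built so their cosine is 0.4.
a = [1.0, 0.0]
b = [0.4, math.sqrt(1.0 - 0.4 ** 2)]

cos_ab = cosine_similarity(a, b)
angle_deg = math.degrees(math.acos(cos_ab))

# The logit that a sigmoid maps to 0.4 is negative, i.e. the classifier
# leans toward the negative class.
logit_for_04 = math.log(0.4 / 0.6)
prob = sigmoid(logit_for_04)

print(f"cosine similarity: {cos_ab:.3f} (angle {angle_deg:.1f} degrees)")
print(f"sigmoid output:    {prob:.3f} (logit {logit_for_04:.3f})")

# The two scales have different neutral points.
print("orthogonal vectors:", cosine_similarity([1.0, 0.0], [0.0, 1.0]))
print("opposite vectors:  ", cosine_similarity([1.0, 0.0], [-1.0, 0.0]))
print("sigmoid at logit 0:", sigmoid(0.0))
```

Whether 0.4 is a usable cutoff for a given embedding model still depends on how that model's similarities are distributed, so in practice the threshold would be picked by looking at scores on labeled pairs rather than by borrowing the 0.5 convention from classifiers.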