All-in-one embedding model for interleaved text, images, and screenshots
(blog.voyageai.com)
261 points
fzliu
| 1 comments |
17 Nov 24 07:42 UTC
mech4lunch
◴[
17 Nov 24 12:55 UTC
]
No.
42163961
>>42162622 (OP)
#
The colab measures dot product values 0.428 and 0.498, describing them as "...similarity value is quite high." Is that high? Can you design a system that confidently labels data with a 0.4 threshold?
replies(3):
>>42164339
#
>>42165357
#
>>42165524
#
1.
minimaxir
◴[
17 Nov 24 17:38 UTC
]
No.
42165524
>>42163961
#
A 0.4 with cosine similarity is not the same as a 0.4 with sigmoid thresholding.
0.4 cosine similarity is pretty good for real-world data that isn't a near-identical duplicate.
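To make the distinction concrete, here is a minimal sketch in plain Python. The vectors, the dimension (256), and the number of random pairs are illustrative assumptions, not values from the colab or the model. It shows that a cosine of 0.4 is a point on a [-1, 1] geometric scale where unrelated random directions sit near 0, while a sigmoid output of 0.4 is a probability-like score below the usual 0.5 cutoff.

```python
import math
import random


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def sigmoid(z):
    """Logistic function, maps any real logit into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Toy 2-D unit vectors constructed so that their cosine is exactly 0.4.
a = [1.0, 0.0]
b = [0.4, math.sqrt(1.0 - 0.4 ** 2)]
cos_ab = cosine_similarity(a, b)

# A sigmoid output of 0.4 corresponds to a negative logit,
# i.e. the classifier leans towards "no".
logit_for_04 = math.log(0.4 / 0.6)

# Baseline: cosine between unrelated random Gaussian vectors.
# DIM and N_PAIRS are arbitrary choices for illustration only.
DIM = 256
N_PAIRS = 200
rng = random.Random(0)


def random_vector(dim):
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


random_cosines = [
    cosine_similarity(random_vector(DIM), random_vector(DIM))
    for _ in range(N_PAIRS)
]
max_random_cosine = max(abs(c) for c in random_cosines)

print("cosine(a, b):", cos_ab)
print("logit for sigmoid = 0.4:", logit_for_04)
print("largest |cosine| among random pairs:", max_random_cosine)
```

Real embeddings are not uniformly random directions, so this baseline only illustrates the scale. A usable threshold would still have to be calibrated on labeled pairs from the actual model.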