Un Ministral, Des Ministraux
(mistral.ai)
216 points
veggieroll
| 1 comments |
16 Oct 24 14:31 UTC
barbegal
◴[
16 Oct 24 16:04 UTC
]
No.
41860730
>>41859466 (OP)
#
Does anyone know why Mistral uses a 17-bit (131k) vocabulary? I'm sure it's more efficient at encoding text, but a token ID no longer fits into a 16-bit register, which must make it less efficient computationally?
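For reference, the arithmetic behind the "17 bit" figure can be checked in a few lines. This is only a sketch: it assumes "131k" means exactly 2**17 = 131072 entries, and the variable names are made up for illustration.

```python
# Assumption: "131k" means exactly 2**17 = 131072 vocabulary entries.
VOCAB_SIZE = 2 ** 17

# Token IDs run from 0 to VOCAB_SIZE - 1, so the largest ID decides the width.
max_token_id = VOCAB_SIZE - 1
bits_needed = max_token_id.bit_length()

# Largest value an unsigned 16-bit integer can hold.
UINT16_MAX = 2 ** 16 - 1
fits_in_16_bits = max_token_id <= UINT16_MAX

print(f"largest token ID: {max_token_id}")
print(f"bits needed: {bits_needed}")
print(f"fits in 16 bits: {fits_in_16_bits}")
```

Under that assumption the IDs need one bit more than a 16-bit integer offers, so in practice they would be stored in the next wider integer type.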
replies(1):
>>41865000
#
1.
cpldcpu
◴[
16 Oct 24 23:40 UTC
]
No.
41865000
>>41860730
#
The token IDs are immediately mapped to embeddings (very large vectors), so the 17-bit values only serve as lookup indices and are not used for any computation.
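A minimal sketch of that lookup, in plain Python with no ML library. The table size assumes "131k" means 2**17 entries, the embedding width of 4 is a toy value (real models use thousands of dimensions), and `embed` is a hypothetical helper, not any library's API.

```python
import random
from array import array

VOCAB_SIZE = 2 ** 17  # assumption: "131k" = 131072 entries
EMBED_DIM = 4         # toy width chosen for illustration only

# Flat table of VOCAB_SIZE rows with EMBED_DIM floats each, filled with random values.
rng = random.Random(0)
table = array("f", (rng.random() for _ in range(VOCAB_SIZE * EMBED_DIM)))


def embed(token_ids):
    """Return one float vector per token ID by slicing the row out of the table."""
    return [table[t * EMBED_DIM:(t + 1) * EMBED_DIM] for t in token_ids]


# IDs above 65535 need more than 16 bits; "L" items are at least 32 bits wide.
token_ids = array("L", [0, 65535, 65536, 131071])
vectors = embed(token_ids)

# From here on, the model would only do arithmetic on the float vectors.
for token_id, vector in zip(token_ids, vectors):
    print(token_id, [round(x, 3) for x in vector])
```

The integer width only affects how the IDs are stored and how the row offset is computed, once per token. Everything after that operates on the float vectors.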