
296 points | todsacerdoti | 1 comment
smeeth (No.44368465)
The main limitation of tokenization is actually logical operations, including arithmetic. IIRC, most of LLMs' poor performance on math problems can be attributed to the strange ways numbers get carved into tokens: typical BPE vocabularies split digit strings into uneven multi-digit chunks, so place value is effectively hidden from the model.
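You can see it directly with a tokenizer (a quick sketch using the tiktoken library; the exact splits depend on the vocabulary):

    # pip install tiktoken
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")  # GPT-4-era BPE vocabulary
    for s in ["12938762.3", "1000", "1001", "999999"]:
        pieces = [enc.decode([t]) for t in enc.encode(s)]
        print(s, "->", pieces)
    # Numbers typically come back as 1-3 digit chunks, and neighboring
    # integers can tokenize completely differently, so the model has to
    # learn arithmetic over these chunks rather than over digits.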

I'd like to see a math/logic benchmark for tokenization schemes that captures this. BPB/perplexity is fine, but it's not everything.
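Even something as simple as exact-match accuracy on generated arithmetic, swept across tokenization schemes, would be a start (a hypothetical sketch; model_answer stands in for whatever model/tokenizer pair is under test):

    import random

    def arithmetic_bench(model_answer, n=500, digits=8, seed=0):
        """Exact-match accuracy on random addition problems.

        model_answer: hypothetical callable mapping a question string
        to the model's answer string for the scheme being tested.
        """
        rng = random.Random(seed)
        correct = 0
        for _ in range(n):
            a, b = rng.randrange(10**digits), rng.randrange(10**digits)
            if model_answer(f"{a}+{b}=").strip() == str(a + b):
                correct += 1
        return correct / n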

1. williamdclt (No.44375446)
Even if LLMs get better at arithmetic, they don't seem like the right tool for the job.

LLMs might never be able to crunch numbers reliably, but I'd expect them to be very good at identifying the right formula and its inputs for a problem ("I need the answer to x*y, where x=12938762.3 and y=902832.2332") and then calling a math engine (a calculator, Wolfram Alpha, whatever) to do the actual computation. That's what humans do anyway!
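A minimal sketch of that division of labor (hypothetical: call_llm stands in for whatever chat-completion API you use; only the extraction step touches the model):

    import json
    import operator
    from decimal import Decimal

    OPS = {"+": operator.add, "-": operator.sub,
           "*": operator.mul, "/": operator.truediv}

    def call_llm(prompt: str) -> str:
        """Hypothetical stand-in for any chat-completion API."""
        raise NotImplementedError  # wire up your provider of choice

    def solve(problem: str) -> Decimal:
        # The model's only job: identify the operation and its inputs.
        spec = json.loads(call_llm(
            'Return only JSON like {"op": "*", "x": 1.5, "y": 2} '
            "for this problem: " + problem))
        x, y = Decimal(str(spec["x"])), Decimal(str(spec["y"]))
        # A deterministic engine does the actual number crunching.
        return OPS[spec["op"]](x, y)

    # With a model wired in: solve("what is 12938762.3 * 902832.2332?")
    # returns Decimal('11681531662152.96836'), exact to the last digit.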