
579 points paulpauper | 2 comments
gundmc ◴[] No.43603886[source]
This was published the day before Gemini 2.5 was released. I'd be interested if they see any difference with that model. Anecdotally, that is the first model that really made me go wow and made a big difference for my productivity.
replies(4): >>43603928 #>>43603961 #>>43604159 #>>43610218 #
jonahx ◴[] No.43603928[source]
I doubt it. It still flails miserably like the other models on anything remotely hard, even with plenty of human coaxing. For example, try to get it to solve: https://www.janestreet.com/puzzles/hall-of-mirrors-3-index/
replies(2): >>43604005 #>>43604027 #
flutas ◴[] No.43604027[source]
FWIW, 2.5-exp was the only model that got a problem I gave it right, compared to Claude 3.7 and o1 (or any of the other free models in Cursor).

It was reverse engineering ~550MB of Hermes bytecode from a React Native app, with each function split into a separate file for grep-ability and LLM compatibility.

The others would all start off right, then quickly default to grepping at random for what they expected to find, which failed quickly. 2.5 traced the function all the way back to the networking call and provided the expected response payload.

All the others hallucinated the networking response I was trying to figure out. 2.5 provided it accurately enough for me to intercept the request and use the response it gave to get what I wanted to show up.

replies(1): >>43604169 #
1. arkmm ◴[] No.43604169[source]
How did you fit 550MB of bytecode into the context window? Was this using 2.5 in an agentic framework? (i.e. repeated model calls and tool usage)
replies(1): >>43606152 #
2. flutas ◴[] No.43606152[source]
I manually pre-parsed the bytecode file with awk into a bazillion individual files that were each just one function, and gave it the hint to grep to sort through them. This was all done in Cursor.

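    awk '/^=> \[Function #/ {
        # New function header: close the previous output file first
        if (out) close(out);
        # Keep only the text between # and the next space
        fn = $0; sub(/^.*#/, "", fn); sub(/ .*/, "", fn);
        out = "function_" fn ".txt"
    }
    # Write every line to the file for the current function
    { if (out) print > out }' bundle.hasm

Quick example of the output it gave and its process.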

https://i.imgur.com/Cmg4KK1.png

https://i.imgur.com/ApNxUkB.png
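
For anyone who wants to try the split-then-grep step on something small, here is a minimal sketch. The sample `bundle.hasm` contents are made up for illustration: only the `=> [Function #` header prefix comes from the awk pattern above, while the function names, instructions, and URL are hypothetical. The search string `api/profile` is likewise just an example.

```shell
# Work in a throwaway directory so nothing real gets overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical miniature disassembly; real Hermes output will look different.
cat > bundle.hasm <<'EOF'
=> [Function #1 fetchProfile of 20 bytes]:
    LoadConstString r0, "https://example.invalid/api/profile"
    Ret r0
=> [Function #2 renderProfile of 12 bytes]:
    LoadConstString r0, "title"
    Ret r0
EOF

# Same split as above: one output file per function header.
awk '/^=> \[Function #/ {
    if (out) close(out)
    fn = $0; sub(/^.*#/, "", fn); sub(/ .*/, "", fn)
    out = "function_" fn ".txt"
}
{ if (out) print > out }' bundle.hasm

# List the per-function files that mention a string of interest.
# find ... -exec avoids expanding a glob over a huge number of files.
matches=$(find . -name 'function_*.txt' -exec grep -l 'api/profile' {} + | sort)
echo "$matches"
```

Going through `find` instead of `grep pattern function_*.txt` matters once there are enough files that the shell glob would exceed the argument-length limit.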