(aggressivelyparaphrasing.me)

48 points markerz | 1 comment | 31 Mar 25 19:41 UTC

esomod ◴[03 Apr 25 14:30 UTC] No.43570236[source]▶

>>43538986 (OP) #

o3-mini solved it

https://chatgpt.com/share/67ee9b1c-57d4-8005-b7ec-16fceab1ff...

replies(2): >>43570293 #>>43571020 #

lqet ◴[03 Apr 25 15:19 UTC] No.43571020[source]▶

>>43570236 #

Looking at this, I do not understand how people can be bearish about LLMs.

replies(4): >>43571292 #>>43571480 #>>43571695 #>>43573177 #

1. riku_iki ◴[03 Apr 25 16:03 UTC] No.43571695[source]▶

>>43571020 #

One concern is that coding and logic puzzles are verticals where LLMs have lots of training data and need only a small context window, which is why they do well there, but that doesn't necessarily scale or generalize to other topics. For example, I have yet to see an agent that would grab, say, the Postgres codebase from GitHub, add a nontrivial feature, and send a patch that gets accepted.

AI/Math Puzzle