
577 points simonw | 1 comment | source
bgwalter ◴[] No.44724997[source]
The GLM-4.5 model utterly fails at creating ASCII art or factorizing numbers. It can "write" Space Invaders because there are literally thousands of open source projects out there.

This is another example of LLMs being dumb copiers that do not understand human prompts.

But there is one positive side to this: if this photocopying business can be run locally, the stocks of OpenAI etc. should go to zero.

replies(1): >>44725037 #
simonw ◴[] No.44725037[source]
Why would you use an LLM to factorize numbers?
replies(1): >>44725141 #
bgwalter ◴[] No.44725141[source]
Because we are told that they can solve IMO problems. Yet they fail at basic math problems, not only at factorization but also when probed with relatively basic symbolic math that would not require invoking an external program.

Also, if they fail they could say so instead of giving a hallucinated answer. First the models lie and say that factoring a 20-digit number takes vast amounts of computing. Then, if pointed to a factorization program, they pretend to execute it and lie about the output.
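
For reference, the external-program route is trivial. A minimal sketch, assuming Python with the sympy library installed (factorint, expand, and symbols are standard sympy functions):

    # Assumes sympy is installed (pip install sympy).
    # factorint() returns the prime factorization as a {prime: exponent} dict.
    from sympy import factorint, expand, symbols

    n = 2**64 + 1  # a 20-digit number, known to factor as 274177 * 67280421310721
    print(factorint(n))  # -> {274177: 1, 67280421310721: 1}

    # Basic symbolic math of the kind in question is also a one-liner:
    x = symbols("x")
    print(expand((x + 1)**3))  # -> x**3 + 3*x**2 + 3*x + 1

The point is that a model only needs to call such a tool, not reinvent it; factorint typically handles numbers of this size in a fraction of a second.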

There is no intelligence or flexibility apart from stealing other people's open source code.

replies(1): >>44725261 #
simonw ◴[] No.44725261[source]
That's why the IMO results were so notable: that was one of those moments where new models were demonstrated doing something that they had previously been unable to do.
replies(2): >>44725609 #>>44728194 #
bgwalter ◴[] No.44728194[source]
The results were private and the methodology was not revealed. Even Tao, who was bullish on "AI", is starting to question the process.
replies(1): >>44728474 #
simonw ◴[] No.44728474[source]
The same result has also been achieved by a Google DeepMind team and by at least one group of independent researchers using publicly available models and careful prompting tricks.