
625 points | lukebennett | 1 comment
Animats ◴[] No.42139919[source]
"While the model was initially expected to significantly surpass previous versions of the technology behind ChatGPT, it fell short in key areas, particularly in answering coding questions outside its training data."

Right. If you generate some code with ChatGPT, and then try to find similar code on the web, you usually will. Search for unusual phrases in comments and for variable names. Often, something from Stack Overflow will match.
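One way to check this yourself: pull the distinctive bits out of a generated snippet, the comment text and the longer identifiers, and paste them quoted into a code search engine. A minimal sketch, where the snippet being inspected is invented for illustration:

    import ast
    import io
    import tokenize

    generated = '''\
    # fall back to pruning stale etags seen at the CDN edge
    def prune_stale_etag_entries(cache_map):
        return {k: v for k, v in cache_map.items() if v.fresh}
    '''

    # Comments via the tokenizer, named definitions via the AST.
    comments = [
        tok.string
        for tok in tokenize.generate_tokens(io.StringIO(generated).readline)
        if tok.type == tokenize.COMMENT
    ]
    tree = ast.parse(generated)
    identifiers = {
        node.name for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.ClassDef))
    }

    # Quoted phrases make good exact-match search queries.
    for query in comments + sorted(identifiers):
        print(f'"{query.lstrip("# ")}"')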

LLMs do search and copy/paste with idiom translation and some transliteration. That's good enough for a lot of common problems. Especially in the HTML/JavaScript space, where people solve the same problems over and over. Or problems covered in textbooks and classes.

But it does not look like artificial general intelligence emerges from LLMs alone.

There's also the elephant in the room - the hallucination/lack of confidence metric problem. The curse of LLMs is that they return answers which are confident but wrong. "I don't know" is rarely seen. Until that's fixed, you can't trust LLMs to actually do much on their own. LLMs with a confidence metric would be much more useful than what we have now.
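For what it's worth, per-token log probabilities already give a weak proxy, though not the calibrated "I don't know" signal being asked for. A minimal sketch against the openai-python v1 client (the model name is illustrative, and it assumes OPENAI_API_KEY is set):

    import math
    from openai import OpenAI

    client = OpenAI()

    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": "Who wrote the TeX typesetting system?"}],
        logprobs=True,
    )

    # Geometric mean of per-token probabilities: a crude confidence
    # proxy, not a calibrated estimate of correctness.
    logprobs = [t.logprob for t in resp.choices[0].logprobs.content]
    confidence = math.exp(sum(logprobs) / len(logprobs))
    print(resp.choices[0].message.content)
    print(f"mean token probability: {confidence:.3f}")

The catch is that high token probability and factual correctness routinely diverge, which is exactly the hallucination problem.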

replies(4): >>42139986 #>>42140895 #>>42141067 #>>42143954 #
dmd ◴[] No.42139986[source]
> Right. If you generate some code with ChatGPT, and then try to find similar code on the web, you usually will.

People who "follow" AI as the latest fad to comment on and sound intelligent about repeat claims like this constantly, even though they only hold for the most trivial hello-world problems.

I write code all day every day. I use Copilot and the like all day every day (in my case, on medical imaging software), and all day every day it is incredibly useful and writes nearly exactly the code I would have written, but faster. And none of it appears anywhere else; I've checked.

replies(5): >>42140406 #>>42142508 #>>42142654 #>>42143451 #>>42145565 #
ngai_aku ◴[] No.42140406[source]
You’re solving novel problems all day every day?
replies(2): >>42140436 #>>42144250 #
dmd ◴[] No.42140436[source]
Pretty much, yes. My job is pretty fun; it mostly entails things like "take this horrible file workflow some research assistant came up with while high 15 years ago and turn it into a newer horrible file format a NEW research assistant came up with (also while high) 3 years ago" - and automate this in our data processing pipeline.
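For the curious, a toy sketch of the kind of layout migration I mean; every filename scheme below is invented, and the real formats are far messier:

    import re
    import shutil
    from pathlib import Path

    SRC = Path("legacy_dumps")     # hypothetical old layout
    DST = Path("pipeline_inbox")   # hypothetical new layout

    # Old scheme (invented): scan_<subject>_<session>_t1.raw
    OLD = re.compile(r"scan_(\d+)_(\d+)_t1\.raw$")

    for path in SRC.glob("*.raw"):
        m = OLD.match(path.name)
        if not m:
            print(f"skipping unrecognised file: {path.name}")
            continue
        subject, session = (int(g) for g in m.groups())
        # New scheme (also invented): sub-015/ses-02/anat/sub-015_ses-02_T1w.raw
        out_dir = DST / f"sub-{subject:03d}" / f"ses-{session:02d}" / "anat"
        out_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, out_dir / f"sub-{subject:03d}_ses-{session:02d}_T1w.raw")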
replies(3): >>42140978 #>>42141764 #>>42141794 #
delusional ◴[] No.42141764[source]
If I understand that correctly you're converting file formats? That's not exactly "novel"
replies(1): >>42142072 #
llm_trw ◴[] No.42142072{3}[source]
This is exactly the type of novel work that LLMs are good at. It's tedious and has annoying internal logic, but that logic is quite flat and there are a million examples to generalise from.

What they fail at is code with high cyclomatic complexity. Back in the Llama 2 finetune days I wrote a script that would break each node in the control flow graph down into its own prompt using literate programming, and the results were amazing for the time. Using the same prompts I'd get correct code in every language I tried.
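A loose reconstruction of that idea, not the original script: use Python's ast module to slice a function into its branch and loop nodes and emit one low-complexity prompt per node.

    import ast
    import textwrap

    SOURCE = textwrap.dedent("""\
        def classify(x):
            if x < 0:
                return "negative"
            for digit in str(x):
                if digit == "7":
                    return "lucky"
            return "plain"
        """)

    tree = ast.parse(SOURCE)
    prompts = []
    for node in ast.walk(tree):
        # One prompt per control-flow node: branches and loops.
        if isinstance(node, (ast.If, ast.For, ast.While)):
            snippet = ast.get_source_segment(SOURCE, node)
            prompts.append(
                "Port this control-flow node to the target language, "
                "preserving its behaviour exactly:\n\n" + snippet
            )

    for i, prompt in enumerate(prompts):
        print(f"--- prompt {i} ---\n{prompt}\n")

Each prompt then covers only a flat slice of the control flow, which is what kept the per-prompt complexity low.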