1479 points sandslash | 3 comments
mentalgear ◴[] No.44316934[source]
Meanwhile, this morning I asked Claude 4 to write a simple EXIF normalizer. After two rounds of prompting it to double-check its code, I still had to point out that it makes no sense to load the entire image for reorientation if the EXIF orientation is already correct.

Vibe vs reality, and anyone actually working in the space daily can attest to how brittle these systems are.

Maybe this changes in SWE with more automated tests in verifiable simulators, but the real world is far too complex to simulate in its vastness.
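
For illustration, the EXIF point above (check the orientation tag before decoding any pixels) could be sketched roughly like this. It is a stdlib-only sketch, not the code Claude produced: the function names `read_exif_orientation` and `needs_reorientation` are made up for this example, and the parser assumes a well-formed JPEG whose metadata segments all carry a length field.

```python
import struct
from typing import Optional

ORIENTATION_TAG = 0x0112  # EXIF tag id for Orientation


def _orientation_from_tiff(tiff: bytes) -> Optional[int]:
    """Look up the Orientation tag in IFD0 of a TIFF-structured EXIF blob."""
    if len(tiff) < 8:
        return None
    if tiff[:2] == b"II":
        endian = "<"  # little-endian (Intel)
    elif tiff[:2] == b"MM":
        endian = ">"  # big-endian (Motorola)
    else:
        return None
    (ifd_offset,) = struct.unpack(endian + "I", tiff[4:8])
    if ifd_offset + 2 > len(tiff):
        return None
    (count,) = struct.unpack(endian + "H", tiff[ifd_offset:ifd_offset + 2])
    for i in range(count):
        start = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes
        entry = tiff[start:start + 12]
        if len(entry) < 12:
            return None
        tag, _type, _n = struct.unpack(endian + "HHI", entry[:8])
        if tag == ORIENTATION_TAG:
            # Orientation is a SHORT stored inline in the value field.
            (value,) = struct.unpack(endian + "H", entry[8:10])
            return value
    return None


def read_exif_orientation(jpeg_bytes: bytes) -> Optional[int]:
    """Return the EXIF orientation from JPEG header bytes, or None if absent."""
    if jpeg_bytes[:2] != b"\xff\xd8":  # missing SOI marker: not a JPEG
        return None
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            return None  # simplification: give up on unexpected bytes
        marker = jpeg_bytes[pos + 1]
        if marker == 0xDA:  # start of scan: compressed image data follows
            return None
        (seg_len,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        payload = jpeg_bytes[pos + 4:pos + 2 + seg_len]
        if marker == 0xE1 and payload[:6] == b"Exif\x00\x00":
            return _orientation_from_tiff(payload[6:])
        pos += 2 + seg_len
    return None


def needs_reorientation(jpeg_bytes: bytes) -> bool:
    """True only for orientations 2-8; 1 or a missing tag means leave the file alone."""
    return read_exif_orientation(jpeg_bytes) in range(2, 9)
```

Only when `needs_reorientation` returns true would a normalizer go on to decode and rewrite the full image; in every other case the file can be passed through untouched.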

replies(7): >>44317104 #>>44317116 #>>44317136 #>>44317214 #>>44317305 #>>44317622 #>>44317741 #
ramon156 ◴[] No.44317136[source]
The real question is how long it'll take until they're not brittle
replies(3): >>44317160 #>>44317197 #>>44317483 #
kubb ◴[] No.44317160[source]
Or will they ever be reliable? Your question is already making an assumption.
replies(3): >>44317316 #>>44317424 #>>44317731 #
diggan ◴[] No.44317316[source]
They're reliable already if you change the way you approach them. These probabilistic token generators probably never will be "reliable" if you expect them to 100% always output exactly what you had in mind, without iterating in user-space (the prompts).
replies(1): >>44317546 #
kubb ◴[] No.44317546[source]
I also think they might never become reliable.
replies(2): >>44317591 #>>44317599 #
diggan ◴[] No.44317591[source]
But what does that mean? If you tell the LLM "Say just 'hi' without any extra words or explanations", do you not get "hi" back from it?
replies(2): >>44317612 #>>44318187 #
kubb ◴[] No.44318187[source]
Sometimes I get "Hi!", sometimes "Hey!".
replies(1): >>44318270 #
diggan ◴[] No.44318270[source]
Which model? Just tried a bunch: ChatGPT, OpenAI's API, Claude, Anthropic's API and DeepSeek's API with both chat and reasoner. Every single one replied with a single "hi".
replies(1): >>44318659 #
1. throwdbaaway ◴[] No.44318659[source]
o3-mini-2025-01-31 with high reasoning effort replied with "Hi" after 448 reasoning tokens.

gpt-4.5-preview-2025-02-27 replied with "Hi!"

replies(1): >>44319506 #
2. diggan ◴[] No.44319506[source]
> o3-mini-2025-01-31 with high reasoning effort replied with "Hi" after 448 reasoning tokens.

I got "hi", as expected. What is the full system prompt + user message you're using?

https://i.imgur.com/Y923KXB.png

> gpt-4.5-preview-2025-02-27

Same "hi": https://i.imgur.com/VxiIrIy.png

replies(1): >>44324512 #
3. throwdbaaway ◴[] No.44324512[source]
Ah right, my bad. Somehow I thought the prompt was only:

    Say just 'hi'

while the "without any extra words or explanations" part was for the readers of your comment. Perhaps kubb also made a similar mistake.

I used an empty system prompt.
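
For anyone who wants to repeat the comparison in this subthread, one way to keep it honest is to pin the exact prompt string and tally the raw replies over several runs. The sketch below assumes a caller-supplied `ask(prompt) -> str` function wrapping whichever API is being tested; `tally_replies`, `exact_match_rate` and `fake_model` are hypothetical names, and `fake_model` is only a stand-in for demonstration, not a claim about how any real model behaves.

```python
from collections import Counter
from typing import Callable

# The two prompt variants that got mixed up in this subthread.
FULL_PROMPT = "Say just 'hi' without any extra words or explanations"
SHORT_PROMPT = "Say just 'hi'"


def tally_replies(ask: Callable[[str], str], prompt: str, runs: int = 20) -> Counter:
    """Call the model `runs` times with the same prompt and count each raw reply."""
    return Counter(ask(prompt) for _ in range(runs))


def exact_match_rate(tally: Counter, expected: str = "hi") -> float:
    """Fraction of replies that match `expected` exactly (case and punctuation included)."""
    total = sum(tally.values())
    return tally[expected] / total if total else 0.0


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real API call, used only to exercise the harness."""
    return "hi" if prompt == FULL_PROMPT else "Hi!"


if __name__ == "__main__":
    for prompt in (FULL_PROMPT, SHORT_PROMPT):
        tally = tally_replies(fake_model, prompt, runs=5)
        print(repr(prompt), dict(tally), exact_match_rate(tally))
```

Swapping `fake_model` for a real client call, with the system prompt held fixed, would show whether a difference comes from the model or from the prompt text.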