Andrej Karpathy: Software in the era of AI [video]

(www.youtube.com)

1479 points sandslash | 2 comments | 19 Jun 25 00:33 UTC


mentalgear ◴[19 Jun 25 09:33 UTC] No.44316934[source]▶

Meanwhile, I asked Claude 4 this morning to write a simple EXIF normalizer. After two rounds of prompting it to double-check its code, I still had to point out that it makes no sense to load the entire image for reorientation if the EXIF orientation is already correct.

Vibe vs reality, and anyone actually working in the space daily can attest to how brittle these systems are.

Maybe this changes in SWE with more automated tests in verifiable simulators, but the real world is far too complex to simulate in its vastness.
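
To make the point about the unnecessary load concrete, here is a minimal sketch of the cheap check that can run before any pixel data is decoded. It uses only the Python standard library and walks the JPEG segment headers to find the EXIF orientation tag (0x0112). The function names are hypothetical, it assumes a seekable JPEG stream, and it ignores edge cases such as padding bytes between segments, so treat it as an illustration rather than a complete normalizer.

```python
import io
import struct
from typing import BinaryIO, Optional

ORIENTATION_TAG = 0x0112  # standard EXIF/TIFF tag id for orientation


def _orientation_from_tiff(tiff: bytes) -> Optional[int]:
    """Scan IFD0 of a TIFF block for the orientation tag."""
    if len(tiff) < 8:
        return None
    if tiff[:2] == b"II":
        endian = "<"  # little-endian TIFF
    elif tiff[:2] == b"MM":
        endian = ">"  # big-endian TIFF
    else:
        return None
    magic, ifd_offset = struct.unpack(endian + "HI", tiff[2:8])
    if magic != 42 or ifd_offset + 2 > len(tiff):
        return None
    (count,) = struct.unpack(endian + "H", tiff[ifd_offset:ifd_offset + 2])
    for i in range(count):
        start = ifd_offset + 2 + 12 * i
        entry = tiff[start:start + 12]
        if len(entry) < 12:
            return None
        tag, _type, _n = struct.unpack(endian + "HHI", entry[:8])
        if tag == ORIENTATION_TAG:
            # A SHORT value is stored in the first two bytes of the value field.
            return struct.unpack(endian + "H", entry[8:10])[0]
    return None


def read_exif_orientation(stream: BinaryIO) -> Optional[int]:
    """Return the EXIF orientation of a JPEG stream, or None if absent.

    Only segment headers and the APP1 payload are read; the compressed
    image data is never touched.
    """
    if stream.read(2) != b"\xff\xd8":  # SOI marker missing: not a JPEG
        return None
    while True:
        header = stream.read(4)
        if len(header) < 4 or header[0] != 0xFF:
            return None
        marker = header[1]
        (length,) = struct.unpack(">H", header[2:4])
        if marker == 0xDA or length < 2:  # start of scan, or corrupt length
            return None
        if marker == 0xE1:  # APP1 may hold EXIF (or something else, e.g. XMP)
            payload = stream.read(length - 2)
            if payload[:6] == b"Exif\x00\x00":
                return _orientation_from_tiff(payload[6:])
        else:
            stream.seek(length - 2, io.SEEK_CUR)  # skip unrelated segment


def needs_reorientation(stream: BinaryIO) -> bool:
    """True only for orientations 2-8; 1, missing, or invalid means no work."""
    return read_exif_orientation(stream) in range(2, 9)
```

A normalizer built on this would call `needs_reorientation` first and return early when it is false, decoding the full image only for files that actually need to be rotated or flipped.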

replies(7): >>44317104 #>>44317116 #>>44317136 #>>44317214 #>>44317305 #>>44317622 #>>44317741 #

1. ApeWithCompiler ◴[19 Jun 25 10:28 UTC] No.44317214[source]▶

>>44316934 #

A manager in our company introduced Gemini as a chat bot coupled to our documentation.

> It failed to write out our company name. The rest was riddled with hallucinations too, hardly worth mentioning.

I wish this were rage bait aimed at others, but what should my feelings be? After all, this is the tool that's sold to me, the one I am expected to work with.

replies(1): >>44317642 #

2. gorbachev ◴[19 Jun 25 11:33 UTC] No.44317642[source]▶

>>44317214 (TP) #

We had exactly the opposite experience. CoPilot was able to answer questions accurately and reformatted the existing documentation to fit the context of users' questions, which made the information much easier to understand.

Code examples, which we offer as sort of reference implementations, were also adapted to fit the specific questions without many issues. Granted, these aren't whole applications, but 10-25 line examples of doing API setup / calls.

We didn't, of course, just send users' questions directly to CoPilot. Instead there's a bit of prompt magic behind the scenes that tweaks the context so that CoPilot can produce better quality results.
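
The comment does not say what that prompt layer looks like, so the following is only a guess at the general shape: a hypothetical `build_prompt` helper that wraps the user's question with relevant documentation excerpts and an instruction to stay within them. The function name, the `product_name` parameter, and the wording of the instructions are all assumptions for illustration, not the commenter's actual setup.

```python
from typing import Sequence


def build_prompt(
    question: str,
    doc_excerpts: Sequence[str],
    product_name: str = "ExampleProduct",  # hypothetical placeholder name
) -> str:
    """Wrap a user question with documentation context before sending it to a model."""
    # Label each excerpt so the answer can refer back to its source.
    context = "\n\n".join(
        f"[excerpt {i}]\n{text.strip()}"
        for i, text in enumerate(doc_excerpts, start=1)
    )
    return (
        f"You answer questions about {product_name} using only the excerpts below.\n"
        "If the excerpts do not cover the question, say so instead of guessing.\n\n"
        f"{context}\n\n"
        f"User question: {question.strip()}\n"
    )
```

How the excerpts are selected (search, embeddings, hand-picked pages) is left out here, since the original comment gives no detail on it.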
