Hallucinations in code are the least dangerous form of LLM mistakes

> No amount of meticulous code review—or even comprehensive automated tests—will demonstrably prove that code actually does the right thing. You have to run it yourself!

Absolutely not. If your testing depends on a human running the code by hand, your testing has already failed. Your tests do need to include both positive and negative cases, though: if they don't include "things should crash and burn given ...", your tests are incomplete.
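
To make "positive and negative" concrete, here's a minimal sketch using Python's standard `unittest`. The function `parse_port` and its inputs are hypothetical, made up purely for illustration; the point is only that the failure cases get asserted just as explicitly as the happy path.

```python
import unittest


def parse_port(text: str) -> int:
    """Hypothetical function under test: parse a TCP port number from a string."""
    port = int(text)  # int() raises ValueError for non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


class ParsePortTests(unittest.TestCase):
    def test_accepts_valid_ports(self):
        # Positive cases, including both boundaries of the valid range.
        for text, expected in [("1", 1), ("8080", 8080), ("65535", 65535)]:
            with self.subTest(text=text):
                self.assertEqual(parse_port(text), expected)

    def test_rejects_invalid_ports(self):
        # Negative cases: every one of these inputs must fail loudly.
        for text in ["", "abc", "0", "-1", "65536", "80.5"]:
            with self.subTest(text=text):
                with self.assertRaises(ValueError):
                    parse_port(text)
```

Save it to a file and run `python -m unittest` on it. If the range check is ever dropped, the second test fails even though the first still passes.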

> If you’re using an LLM to write code without even running it yourself, what are you doing?

Running code through tests is literally running the code. Turn on code coverage so that you get yelled at for LLM code that has no tests, and set up CI/CD to reject code that isn't tested. By all means push to master on your own projects, but for production code you'd better have checks in place (coverage, unit, integration, and ideally docs) that keep not-fully-tested code from landing.

The real problem is that LLMs will happily give you not just code but test cases too. The same prudence applies as with test cases someone added to a PR/MR: just because there are tests doesn't mean they're good tests, or enough tests. Review them on the assumption that they're testing the wrong thing entirely.
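
One rough way to check that, sketched below: run the tests against a deliberately broken implementation and see whether they notice. Everything here (`clamp`, `clamp_buggy`, `weak_test`, `thorough_test`) is a hypothetical example, not any particular tool's API.

```python
def clamp(value, low, high):
    """Intended behavior: restrict value to the range [low, high]."""
    return min(high, max(low, value))


def clamp_buggy(value, low, high):
    """Deliberately wrong: the upper bound is never applied."""
    return max(low, value)


def weak_test(fn):
    # Looks reasonable in a diff, but never exercises the upper bound.
    return fn(5, 0, 10) == 5 and fn(-3, 0, 10) == 0


def thorough_test(fn):
    # Adds the missing cases: a value above the range and both boundaries.
    return (
        weak_test(fn)
        and fn(15, 0, 10) == 10
        and fn(10, 0, 10) == 10
        and fn(0, 0, 10) == 0
    )


if __name__ == "__main__":
    for name, fn in [("clamp", clamp), ("clamp_buggy", clamp_buggy)]:
        print(name, "weak:", weak_test(fn), "thorough:", thorough_test(fn))
```

The weak test passes for both implementations, so it tells you nothing about the bug; only the thorough one rejects `clamp_buggy`. A test suite that can't tell working code from broken code is the thing to look for in review.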