
858 points | cryptophreak | 1 comment
wiremine No.42936346
I'm going to take a contrarian view and say it's actually a good UI, but it's all about how you approach it.

I just finished a small project where I used o3-mini and o3-mini-high to generate most of the code. I averaged around 200 lines of code an hour, including the business logic and unit tests; the total was around 2,200 lines. So not a big project, but not a throwaway script either. The code was perfectly fine for what we needed. This is the third time I've done this, and each time I get faster and better at it.

1. I find a "pair programming" mentality is key. I focus on the high-level code and let the model handle the lower-level code. I review all of the generated code and provide feedback; blindly accepting it is a terrible approach.

2. Generating unit tests is critical. Once I like the gist of some code, I ask for some smoke tests. Again, review those tests like a peer's code and adjust as needed.

3. Be liberal with starting a new chat: the models easily get confused as the context window grows. If you start to see things go sideways, start over.

4. Give it code examples. Don't prompt with English only.
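A minimal sketch of points 1 and 2 in practice, with a hypothetical `slugify` function standing in for model-generated code (the function name, body, and tests are illustrative, not from the project described above):

```python
# Workflow sketch: I write the signature and docstring (the high-level
# intent), the model fills in the body, then I ask it for smoke tests
# and review both before accepting anything.
import re

def slugify(title: str) -> str:
    """Turn an article title into a URL slug (model-generated body)."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")

# Model-generated smoke tests, reviewed and kept:
assert slugify("Hello, World!") == "hello-world"
assert slugify("  spaces   everywhere  ") == "spaces-everywhere"
assert slugify("Already-a-slug") == "already-a-slug"
```

The tests double as the code review: if one looks wrong, that usually means the model and I disagree about the spec, and it is worth a round of feedback before moving on.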

FWIW, o3-mini was the best model I've seen so far; Sonnet 3.5 New is a close second.

Syzygies No.42946146
An environment such as Cursor supports many approaches to working with AI. "Chat" is the instructions printed on the box, perhaps how its developers use it, but far from the only mode it actually supports.

It is helpful to frame this in the historical arc described by Yuval Harari in his recent book "Nexus" on the evolution of information systems. We're at the dawn of history for how to work with AI, and actively visualizing the future has an immediate ROI.

"Chat" is caveman oral tradition. It is like attempting a complex Ruby project through the periscope of an `irb` session. One needs an IDE to manage a complex code base. We all know this, but we haven't connected the dots: we need to approach prompt management the same way.

Flip ahead in Harari's book, and he describes rabbis writing texts on how to interpret [texts on how to interpret]* holy scriptures. Like Christopher Nolan's movie "Inception" (his second most relevant work after "Memento"), I've found myself several dreams deep collaborating with AI to develop prompts for [collaborating with AI to develop prompts for]* writing code together. Test the whole setup on multiple fresh AI sessions, as if one is running a business school laboratory on managerial genius, till AI can write correct code in one shot.

Duh? Good managers already understand this, working with teams of people. Technical climbers work cliffs this way. And AI was a blithering idiot until we understood how to simulate recursion in multilayer neural nets.

AI is a Rorschach inkblot test. Talk to it like a kindergartner, and you see the intelligence of a kindergartner. Use your most talented programmer to collaborate with you in preparing precise and complete specifications for your team, and you see a talented team of mature professionals.

We all experience degradation of long AI sessions. This is not inevitable; "life extension" needs to be tackled as a research problem. Just as old people get senile, AI fumbles its own context management over time. Civilization has advanced by developing technologies for passing knowledge forward. We need to engineer similar technologies for providing persistent memory, to make each successive AI session smarter than the last. Authoring this knowledge helps each session survive longer. If we fail to see this, we're condemning ourselves to stay cavemen.

Compare the history of computing. There was a lot of philosophy and abstract mathematics about the potential for mechanical computation, but our worldview exploded when we could actually plug the machines in. We're at the same inflection point for theories of mind, semantic compression, structured memory. Indeed, philosophy was an untestable intellectual exercise before; now we can plug it in.

How do I know this? I'm just an old mathematician, in my first month trying to learn AI for one final burst of productivity before my father's dementia arrives. I don't have time to wait for anyone's version of these visions, so I computed them.

In mathematics, the line in the sand between theory and computation keeps moving. Indeed, I helped move it by computerizing my field when I was young. Mathematicians still contribute theory, and the computations help.

A similar line in the sand is moving, between visionary creativity and computation. LLMs are association engines of staggering scope, and what some call "hallucinations" can be harnessed to generalize from all human endeavors to project future best practices. Like how to best work with AI.

I've tested everything I say here, and it works.