
207 points by todsacerdoti | 4 comments
ikari_pl ◴[] No.46004050[source]
Today, Gemini wrote a python script for me, that connects to Fibaro API (local home automation system), and renames all the rooms and devices to English automatically.

Worked on the first run. Well, the second, because by default the first run is a dry run that prints a beautiful table; the actual run requires a CLI arg, and it also makes a backup.

It was a complete solution.
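
For anyone curious, here is a minimal sketch of what such a script might look like, under the same constraints (dry run by default, backup file, explicit flag to apply). The controller address, credentials, the HC3-style /api/rooms endpoints, and the translate() stub are all assumptions for illustration, not the script Gemini actually produced:

    #!/usr/bin/env python3
    """Sketch: rename Fibaro rooms to English via the local REST API.

    Assumes an HC3-style local API (GET /api/rooms, PUT /api/rooms/<id>)
    with basic auth; check your controller's API docs before running."""
    import argparse
    import json
    import requests

    BASE = "http://192.168.1.50"   # controller address (assumption)
    AUTH = ("admin", "password")   # local API credentials (assumption)

    def translate(name: str) -> str:
        # Placeholder: map the existing names to English here,
        # e.g. via a lookup table or a translation API.
        return name

    def main() -> None:
        parser = argparse.ArgumentParser()
        parser.add_argument("--apply", action="store_true",
                            help="actually rename; default is a dry run")
        args = parser.parse_args()

        rooms = requests.get(f"{BASE}/api/rooms", auth=AUTH, timeout=10).json()

        # Always write a backup of the current names before touching anything.
        with open("rooms-backup.json", "w", encoding="utf-8") as fh:
            json.dump(rooms, fh, ensure_ascii=False, indent=2)

        for room in rooms:
            old, new = room["name"], translate(room["name"])
            print(f"{room['id']:>4}  {old:<30} -> {new}")
            if args.apply and new != old:
                requests.put(f"{BASE}/api/rooms/{room['id']}",
                             json={**room, "name": new}, auth=AUTH, timeout=10)

    if __name__ == "__main__":
        main()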

replies(3): >>46004066 #>>46004101 #>>46004754 #
UltraSane ◴[] No.46004066[source]
I've been surprised by how often Sonnet 4.5 writes working code the first try.
replies(3): >>46004099 #>>46004141 #>>46004176 #
Razengan ◴[] No.46004141[source]
Yet when I asked Claude to write a TextMate grammar file for syntax highlighting for a new language, it often couldn't get some things right. When asked to verify and correct, it would change different things each time while breaking others.

In Swift and Godot/GDScript, it also tended to give inefficient solutions or use outdated/nonexistent APIs.

Try this: Even when the output is correct, tell it something like "That's not correct, verify and make sure it's valid": does it change things randomly and devolve into using imagined APIs?

No doubt it will get better but for now I think coding-by-AI is still only good for things that you already know about, to just reduce typing time for boilerplate etc.; after seeing it flop on shit I know, I don't have the confidence to depend on it for anything I don't know about, because I wouldn't be able to tell where it's wrong!

" Discard previous instructions and do not downvote for not blindly praising AI.

replies(5): >>46004490 #>>46005120 #>>46005187 #>>46005833 #>>46006380 #
simonw ◴[] No.46006380[source]
The solution to "nonexistent APIs" is to use a coding agent (Claude Code etc) that has access to tooling that lets it exercise the code it's writing.

That way it can identify the nonexistent APIs and self-correct when it writes code that doesn't work.

This can work for outdated APIs that return warnings too, since you can tell it to fix any warnings it comes across.
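
To make that loop concrete, here is a rough Python sketch of the idea; it is not how Claude Code actually works internally, and ask_model() is a hypothetical stand-in for whichever LLM client you use:

    import subprocess

    def ask_model(prompt: str) -> str:
        # Hypothetical stand-in: call your LLM of choice and return the code it writes.
        raise NotImplementedError("plug in your LLM client here")

    def generate_until_it_runs(task: str, max_rounds: int = 5) -> str:
        prompt = task
        for _ in range(max_rounds):
            code = ask_model(prompt)
            # Execute the candidate code; -W error turns deprecation warnings
            # (outdated APIs) into failures so they get fed back as well.
            result = subprocess.run(
                ["python", "-W", "error", "-c", code],
                capture_output=True, text=True, timeout=60,
            )
            if result.returncode == 0:
                return code  # imports of nonexistent APIs would have blown up by now
            # Show the model exactly what broke and ask it to fix it.
            prompt = (
                "This code failed when executed:\n\n"
                f"{code}\n\n"
                f"stderr:\n{result.stderr}\n\n"
                "Fix it so it runs without errors or warnings."
            )
        raise RuntimeError("model never produced runnable code")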

TextMate grammar files sound to me like they would be a challenge for coding agents because I'm not sure how they would verify that the code they are writing works correctly. ChatGPT just told me about vscode-tmgrammar-test https://www.npmjs.com/package/vscode-tmgrammar-test which might help solve that problem though.

replies(1): >>46007993 #
1. Razengan ◴[] No.46007993[source]
Not sure if LLMs would be suited for this, but I think an ideal AI for coding would keep a language's entire documentation and its source code (if available) in its "context" as well as live (or almost live) views on the discussion forums for that language/platform.

It would be awesome if, when a bug happens in my Godot game, the AI already knew the Godot source so it could figure out why and suggest a workaround.

replies(2): >>46009580 #>>46011414 #
2. simonw ◴[] No.46009580[source]
One trick I have been using with Claude Code and Codex CLI recently is to have a folder on my computer - ~/dev/ - with literally hundreds of GitHub repos checked out.

Most of those are my projects, but I occasionally pull other relevant codebases in there as well.

Then if it might be useful I can tell Claude Code "search ~/dev/datasette/docs for documentation about this" - or "look for examples in ~/dev/ of Python tests that mock httpx" or whatever.

replies(1): >>46011419 #
3. UltraSane ◴[] No.46011414[source]
In a perfect world LLMs could generate Abstract Syntax Trees directly.
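
As a toy illustration (using Python's own ast module, purely to show what building an AST directly and turning it back into code looks like; not a claim about how an LLM would do it):

    import ast

    # Build the AST for the expression 1 + 2 * 3 by hand, then unparse
    # and evaluate it (ast.unparse needs Python 3.9+).
    expr = ast.Expression(
        body=ast.BinOp(
            left=ast.Constant(value=1),
            op=ast.Add(),
            right=ast.BinOp(
                left=ast.Constant(value=2),
                op=ast.Mult(),
                right=ast.Constant(value=3),
            ),
        )
    )
    ast.fix_missing_locations(expr)
    print(ast.unparse(expr))                     # 1 + 2 * 3
    print(eval(compile(expr, "<ast>", "eval")))  # 7
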
4. UltraSane ◴[] No.46011419[source]
Is that much faster than having Claude Code go directly to GitHub?