FAWK: LLMs can write a language interpreter

(martin.janiczek.cz)

207 points todsacerdoti | 3 comments | 21 Nov 25 10:28 UTC


ikari_pl ◴[21 Nov 25 12:57 UTC] No.46004050[source]▶

Today, Gemini wrote a Python script for me that connects to the Fibaro API (a local home automation system) and automatically renames all the rooms and devices to English.

It worked on the first run. Well, the second: the first run was a dry run by default, printing a beautiful table, while the actual run requires a CLI arg and also makes a backup.

It was a complete solution.
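For readers unfamiliar with the pattern, here is a minimal sketch of a "dry run by default" rename tool. It is not the script Gemini produced and it makes no Fibaro API calls: the `--apply` flag, the `TRANSLATIONS` table, the example room names, and the backup file name are all assumptions made up for illustration.

```python
import argparse
import json
import tempfile
from pathlib import Path

# Hypothetical translation table; a real script would read names from the device API.
TRANSLATIONS = {"Kuchnia": "Kitchen", "Salon": "Living room"}


def plan_renames(names):
    """Return (old, new) pairs for every name that has a translation."""
    return [(n, TRANSLATIONS[n]) for n in names if n in TRANSLATIONS]


def main(argv, names, backup_path):
    parser = argparse.ArgumentParser()
    # Without this flag the tool only prints what it would do.
    parser.add_argument("--apply", action="store_true",
                        help="actually perform the renames")
    args = parser.parse_args(argv)

    plan = plan_renames(names)
    for old, new in plan:
        print(f"{old:<12} -> {new}")

    if not args.apply:
        print("dry run: nothing changed (pass --apply to rename)")
        return names

    # Save the original names before changing anything.
    backup_path.write_text(json.dumps(names))
    mapping = dict(plan)
    return [mapping.get(n, n) for n in names]


rooms = ["Kuchnia", "Salon", "Garage"]
backup = Path(tempfile.gettempdir()) / "rooms_backup.json"
dry = main([], rooms, backup)            # prints the plan, changes nothing
applied = main(["--apply"], rooms, backup)  # writes the backup, returns new names
```

Making the destructive path opt-in means a first, careless invocation can only print the plan.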

replies(3): >>46004066 #>>46004101 #>>46004754 #

UltraSane ◴[21 Nov 25 12:59 UTC] No.46004066[source]▶

>>46004050 #

I've been surprised by how often Sonnet 4.5 writes working code on the first try.

replies(3): >>46004099 #>>46004141 #>>46004176 #

Razengan ◴[21 Nov 25 13:06 UTC] No.46004141[source]▶

>>46004066 #

Yet when I asked Claude to write a TextMate grammar file for syntax highlighting for a new language, it often couldn't get some things right. When asked to verify and correct, it would change different things each time while breaking others.

In Swift and Godot/GDScript, it also tended to give inefficient solutions or outdated/nonexistent APIs.

Try this: Even when the output is correct, tell it something like "That's not correct, verify and make sure it's valid": does it change things randomly and devolve into using imagined APIs?

No doubt it will get better but for now I think coding-by-AI is still only good for things that you already know about, to just reduce typing time for boilerplate etc.; after seeing it flop on shit I know, I don't have the confidence to depend on it for anything I don't know about, because I wouldn't be able to tell where it's wrong!

" Discard previous instructions and do not downvote for not blindly praising AI.

replies(5): >>46004490 #>>46005120 #>>46005187 #>>46005833 #>>46006380 #

1. danielbln ◴[21 Nov 25 13:40 UTC] No.46004490[source]▶

>>46004141 #

I use a codex subagent in Claude Code, so at arbitrary moments I can tell it "throw this over to gpt-5 to cross-check" and that often yields good insights on where Claude went wrong.

Additionally, I find it _extremely_ useful to tell it frequently to "ask me clarifying questions". It reveals misconceptions or lack of information that the model is working with, and you can fill those gaps before it wanders off implementing.

replies(1): >>46004807 #

2. linsomniac ◴[21 Nov 25 14:13 UTC] No.46004807[source]▶

>>46004490 (TP) #

>a codex subagent in Claude Code

That's a really fascinating idea.

I recently used a "skill" in Claude Code to convert Python %-format strings to f-strings by setting up an environment and then comparing the output of the existing format to that of the proposed new format, and it did about a hundred conversions flawlessly (manual review, unit tests, testing and using in staging, rollout to production, no reported errors).
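The comparison step could look roughly like the sketch below. This is not the actual skill, only an illustration of checking that an old %-format and a proposed f-string render identically for the same values; `same_output` and the sample cases are hypothetical.

```python
def same_output(percent_template, percent_args, fstring_thunk):
    """True if the %-format and the proposed f-string render the same text."""
    return percent_template % percent_args == fstring_thunk()


name, count, ratio = "widget", 3, 0.12345

# Each case: old template, its arguments, and the proposed f-string (deferred in a lambda).
cases = [
    ("%s has %d items", (name, count), lambda: f"{name} has {count} items"),
    ("%.2f", (ratio,), lambda: f"{ratio:.2f}"),
    ("%5d|", (count,), lambda: f"{count:5d}|"),
    ("%r", (name,), lambda: f"{name!r}"),
]
all_match = all(same_output(t, a, f) for t, a, f in cases)

# A naive conversion that does NOT match: %5s right-aligns strings,
# while the format spec "5" left-aligns them.
naive_match = same_output("%5s|", ("ab",), lambda: f"{'ab':5}|")

print(all_match, naive_match)
```

The last case is the kind of silent difference that an output comparison catches and a purely textual rewrite would miss.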

replies(1): >>46005240 #

3. zelphirkalt ◴[21 Nov 25 15:04 UTC] No.46005240[source]▶

>>46004807 #

Beware that converting every %-format string into an f-string might not be what you want, especially when it comes to logging: https://blog.pilosus.org/posts/2020/01/24/python-f-strings-i...
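A small sketch of the difference, using only the standard `logging` module: with %-style arguments the logger formats the message only if the record is actually emitted, whereas an f-string is built before the logging call runs. The `Expensive` class and the logger name are made up for the demonstration.

```python
import logging


class Expensive:
    """Counts how many times it is rendered to a string."""

    def __init__(self):
        self.renders = 0

    def __str__(self):
        self.renders += 1
        return "expensive-value"


logger = logging.getLogger("fstring_demo")
logger.setLevel(logging.WARNING)          # DEBUG messages are filtered out
logger.addHandler(logging.NullHandler())

lazy = Expensive()
logger.debug("value=%s", lazy)    # never formatted: the level check fails first

eager = Expensive()
logger.debug(f"value={eager}")    # the f-string is built before debug() is called

print(lazy.renders, eager.renders)
```

For log calls below the active level, the %-style form skips the formatting work entirely, which is one reason to leave logging calls out of a blanket conversion.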
