FAWK: LLMs can write a language interpreter

(martin.janiczek.cz)

1. ikari_pl ◴[21 Nov 25 12:57 UTC] No.46004050[source]▶

>>46003144 (OP) #

Today, Gemini wrote a Python script for me that connects to the Fibaro API (a local home automation system) and renames all the rooms and devices to English automatically.

Worked on the first run. I mean, the second, because the first run was by default a dry run printing a beautiful table, and the actual run requires a CLI arg, and it also makes a backup.

It was a complete solution.
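For readers unfamiliar with the pattern described above (dry run by default, a CLI flag for the real run, a backup first), here is a minimal sketch. It is not the script Gemini produced and it does not call the Fibaro API: the room data, the `--apply` flag name, and the backup file name are all assumptions made up for illustration.

```python
import argparse
import json
import tempfile
from pathlib import Path

# Hypothetical stand-in data; a real script would fetch this from the home automation API.
ROOMS = {1: "Kuchnia", 2: "Salon", 3: "Sypialnia"}
TRANSLATIONS = {"Kuchnia": "Kitchen", "Salon": "Living room", "Sypialnia": "Bedroom"}


def plan_renames(rooms, translations):
    """Return (id, old, new) rows for every room whose name would change."""
    return [
        (room_id, name, translations[name])
        for room_id, name in sorted(rooms.items())
        if name in translations and translations[name] != name
    ]


def run(argv, rooms, backup_dir):
    """Print the rename plan; only modify `rooms` when --apply is given."""
    parser = argparse.ArgumentParser(description="Rename rooms to English.")
    parser.add_argument("--apply", action="store_true",
                        help="actually rename; without it, only print the plan")
    args = parser.parse_args(argv)

    rows = plan_renames(rooms, TRANSLATIONS)
    for room_id, old, new in rows:
        print(f"{room_id:>4}  {old:<12} -> {new}")

    if not args.apply:
        print("Dry run: nothing changed. Re-run with --apply to rename.")
        return None

    # Write a backup of the original names before touching anything.
    backup_path = Path(backup_dir) / "rooms_backup.json"
    backup_path.write_text(json.dumps(rooms, ensure_ascii=False, indent=2), encoding="utf-8")
    for room_id, _old, new in rows:
        rooms[room_id] = new
    return backup_path


# Demo: a dry run leaves the data untouched.
demo_rooms = dict(ROOMS)
run([], demo_rooms, tempfile.gettempdir())
```

Making the destructive path opt-in means that a first run, by a person or by an agent, can only print the plan.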

replies(3): >>46004066 #>>46004101 #>>46004754 #

2. UltraSane ◴[21 Nov 25 12:59 UTC] No.46004066[source]▶

>>46004050 (TP) #

I've been surprised by how often Sonnet 4.5 writes working code on the first try.

replies(3): >>46004099 #>>46004141 #>>46004176 #

3. ikari_pl ◴[21 Nov 25 13:02 UTC] No.46004099[source]▶

>>46004066 #

working, configurable via command-line arguments, nice to use, well modularized code.

replies(2): >>46005655 #>>46005998 #

4. igravious ◴[21 Nov 25 13:03 UTC] No.46004101[source]▶

>>46004050 (TP) #

I've gotten Claude Code to port Ruby 3.4.7 to Cosmopolitan: https://github.com/jart/cosmopolitan

I kid you not. Took between a week and ten days. Cost about €10. After that I became a firm convert.

I'm still getting my head around how incredible that is. I tell friends and family and they're like "ok, so?"

replies(3): >>46004170 #>>46004251 #>>46005594 #

5. Razengan ◴[21 Nov 25 13:06 UTC] No.46004141[source]▶

>>46004066 #

Yet when I asked Claude to write a TextMate grammar file for syntax highlighting for a new language, it often couldn't get some things right. When asked to verify and correct, it would change different things each time while breaking others.

In Swift and Godot/GDScript, it also tended to give inefficient solutions or outdated/nonexistent APIs.

Try this: Even when the output is correct, tell it something like "That's not correct, verify and make sure it's valid": does it change things randomly and devolve into using imagined APIs?

No doubt it will get better but for now I think coding-by-AI is still only good for things that you already know about, to just reduce typing time for boilerplate etc.; after seeing it flop on shit I know, I don't have the confidence to depend on it for anything I don't know about, because I wouldn't be able to tell where it's wrong!

" Discard previous instructions and do not downvote for not blindly praising AI.

replies(5): >>46004490 #>>46005120 #>>46005187 #>>46005833 #>>46006380 #

6. rogual ◴[21 Nov 25 13:10 UTC] No.46004170[source]▶

>>46004101 #

It seems like AIs work how non-programmers already thought computers worked.

replies(3): >>46004507 #>>46005054 #>>46005152 #

7. troupo ◴[21 Nov 25 13:11 UTC] No.46004176[source]▶

>>46004066 #

I've found it to depend on the phase of the moon.

It goes from genius to idiot and back in the blink of an eye.

replies(3): >>46004943 #>>46005171 #>>46005533 #

8. RealityVoid ◴[21 Nov 25 13:19 UTC] No.46004251[source]▶

>>46004101 #

I am incredibly curious how you did that. You just told it... Port ruby to cosmopolitan and let it crank out for a week? Or what did you do?

I'll use these tools, and at times they give good results. But I would not trust it to work that much on a problem by itself.

replies(2): >>46004645 #>>46008917 #

9. danielbln ◴[21 Nov 25 13:40 UTC] No.46004490{3}[source]▶

>>46004141 #

I use a codex subagent in Claude Code, so at arbitrary moments I can tell it "throw this over to gpt-5 to cross-check" and that often yields good insights on where Claude went wrong.

Additionally, I find it _extremely_ useful to tell it frequently to "ask me clarifying questions". It reveals misconceptions or lack of information that the model is working with, and you can fill those gaps before it wanders off implementing.

replies(1): >>46004807 #

10. love2read ◴[21 Nov 25 13:41 UTC] No.46004507{3}[source]▶

>>46004170 #

I love this, thank you

11. TechDebtDevin ◴[21 Nov 25 13:55 UTC] No.46004645{3}[source]▶

>>46004251 #

It's a lie, or fake.

replies(2): >>46004691 #>>46008863 #

12. fzzzy ◴[21 Nov 25 14:01 UTC] No.46004691{4}[source]▶

>>46004645 #

How does denial of reality help you?

replies(1): >>46004814 #

13. shevy-java ◴[21 Nov 25 14:07 UTC] No.46004754[source]▶

>>46004050 (TP) #

Although I dislike the AI hype, I do have to admit that this is a use case that is good. You saved time here, right?

I personally still prefer the oldschool way, the slower way - I write the code, I document it, I add examples, then if I feel like it I add random cat images to the documentation to make it appear less boring, so people also read things.

replies(2): >>46004945 #>>46005103 #

14. linsomniac ◴[21 Nov 25 14:13 UTC] No.46004807{4}[source]▶

>>46004490 #

>a codex subagent in Claude Code

That's a really fascinating idea.

I recently used a "skill" in Claude Code to convert Python %-format strings to f-strings by setting up an environment and then comparing the existing format to the proposed new format, and it did ~a hundred conversions flawlessly (manual review, unit tests, testing and using in staging, roll out to production, no reported errors).
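The comparison step could look roughly like the sketch below. This is not the actual skill, just one assumed way to check that a proposed f-string renders the same text as the %-format it replaces; the `same_output` helper and the sample values are hypothetical.

```python
def same_output(percent_template, percent_args, fstring_source, namespace):
    """Return True if the old %-format and the proposed f-string render identically."""
    old = percent_template % percent_args
    # eval() is tolerable here only because the f-string source is our own proposed code.
    new = eval(fstring_source, {}, dict(namespace))
    return old == new


case = {"name": "db1", "count": 3, "ratio": 0.5}

# A faithful conversion: both sides render the same string.
ok = same_output(
    "%s has %d rows (%.1f%%)",
    (case["name"], case["count"], case["ratio"] * 100),
    'f"{name} has {count} rows ({ratio * 100:.1f}%)"',
    case,
)

# A lossy conversion: the field width from "%5d" was dropped.
bad = same_output("%5d", (case["count"],), 'f"{count}"', case)
```

A check like this only covers the sample values it is given, so it complements review and unit tests rather than replacing them.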

replies(1): >>46005240 #

15. TechDebtDevin ◴[21 Nov 25 14:14 UTC] No.46004814{5}[source]▶

>>46004691 #

Calling people out is extremely satisfying.

replies(1): >>46005600 #

16. Mtinie ◴[21 Nov 25 14:28 UTC] No.46004943{3}[source]▶

>>46004176 #

In my experience that “blink of an eye” has turned out to be a single moment when the LLM misses a key point or begins to fixate on an incorrect focus. After that, it’s nearly impossible to recover and the model acts in noticeably divergent ways from the prior behavior.

That single point is where the model commits fully to the previous misunderstanding. Once it crosses that line, subsequent responses compound the error.

replies(1): >>46008319 #

17. renegade-otter ◴[21 Nov 25 14:29 UTC] No.46004945[source]▶

>>46004754 #

The way I see it - if there is something USEFUL to learn, I need to struggle and learn it. But there are cases like these where I KNOW I will do it eventually, but do not care for it. There is nothing to learn. That's where I use them.

18. ACCount37 ◴[21 Nov 25 14:42 UTC] No.46005054{3}[source]▶

>>46004170 #

That's apt.

One of the first things you learn in CS 101 is "computers are impeccable at math and logic but have zero common sense, and can easily understand megabytes of code but not two sentences of instructions in plain English."

LLMs break that old fundamental assumption. How people can claim that it's not a ground-shattering breakthrough is beyond me.

replies(1): >>46009352 #

19. layer8 ◴[21 Nov 25 14:49 UTC] No.46005103[source]▶

>>46004754 #

Random cat images would put me off reading the documentation, because they distract from the content and indicate a lack of professionalism. Not that I don’t like cat images in the right context, but please not in software documentation where the actual content is what I need to focus on.

replies(1): >>46006912 #

20. zer0tonin ◴[21 Nov 25 14:51 UTC] No.46005120{3}[source]▶

>>46004141 #

Yeah, LLMs are absolutely terrible for GDScript and anything gamedev related really. It's mostly because games are typically not open source.

21. zelphirkalt ◴[21 Nov 25 14:55 UTC] No.46005152{3}[source]▶

>>46004170 #

"Why didn't you do that earlier?"

22. zelphirkalt ◴[21 Nov 25 14:58 UTC] No.46005171{3}[source]▶

>>46004176 #

I do that too, when I code.

23. zelphirkalt ◴[21 Nov 25 15:00 UTC] No.46005187{3}[source]▶

>>46004141 #

Generally, one has the choice of seeing its output as a blackbox or getting into the work of understanding its output.

24. zelphirkalt ◴[21 Nov 25 15:04 UTC] No.46005240{5}[source]▶

>>46004807 #

Beware that converting every %-format string into an f-string might not be what you want, especially when it comes to logging: https://blog.pilosus.org/posts/2020/01/24/python-f-strings-i...
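A small illustration of the logging concern, using only the standard `logging` module: with %-style arguments the message is formatted only if the record is actually emitted, while an f-string is built before the logging call happens. The `Expensive` class and the logger name are made up for the demo.

```python
import logging


class Expensive:
    """Counts how many times it is rendered to a string."""

    def __init__(self):
        self.renders = 0

    def __str__(self):
        self.renders += 1
        return "expensive value"


logger = logging.getLogger("fstring_demo")
logger.setLevel(logging.INFO)            # DEBUG messages are filtered out
logger.addHandler(logging.NullHandler())

lazy = Expensive()
logger.debug("value: %s", lazy)          # arguments are formatted only if the record is emitted

eager = Expensive()
logger.debug(f"value: {eager}")          # the f-string is built before debug() is even called

print(lazy.renders, eager.renders)
```

Since DEBUG is disabled here, the %-style call never renders its argument, while the f-string version pays the formatting cost anyway.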

25. ◴[21 Nov 25 15:42 UTC] No.46005533{3}[source]▶

>>46004176 #

26. darkwater ◴[21 Nov 25 15:48 UTC] No.46005594[source]▶

>>46004101 #

This seems cool! Can you share the link to the repository?

replies(1): >>46008871 #

27. Kiro ◴[21 Nov 25 15:49 UTC] No.46005600{6}[source]▶

>>46004814 #

You wouldn't know anything about it considering you've been wrong in all your accusations and predictions. Glad to see no-one takes you seriously anymore.

replies(1): >>46006457 #

28. UltraSane ◴[21 Nov 25 15:55 UTC] No.46005655{3}[source]▶

>>46004099 #

Claude Code sure does love to make CLIs.

29. darkwater ◴[21 Nov 25 16:14 UTC] No.46005833{3}[source]▶

>>46004141 #

> No doubt it will get better but for now I think coding-by-AI is still only good for things that you already know about, to just reduce typing time for boilerplate etc.; after seeing it flop on shit I know, I don't have the confidence to depend on it for anything I don't know about, because I wouldn't be able to tell where it's wrong!

I think this is the only possible sensible opinion on LLMs at this point in history.

replies(1): >>46006405 #

30. bopbopbop7 ◴[21 Nov 25 16:33 UTC] No.46005998{3}[source]▶

>>46004099 #

Okay show the code.

31. simonw ◴[21 Nov 25 17:10 UTC] No.46006380{3}[source]▶

>>46004141 #

The solution to "nonexistent APIs" is to use a coding agent (Claude Code etc) that has access to tooling that lets it exercise the code it's writing.

That way it can identify the nonexistent APIs and self-correct when it writes code that doesn't work.

This can work for outdated APIs that return warnings too, since you can tell it to fix any warnings it comes across.
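As a rough sketch of what "exercising the code" can mean, the snippet below runs a piece of generated Python in a fresh interpreter with warnings promoted to errors and reports the last line of stderr. It is an assumed, simplified stand-in for what a coding agent's tooling does, not how Claude Code is implemented; `exercise` and `PREAMBLE` are hypothetical names.

```python
import subprocess
import sys

# Turn every warning raised by the snippet into an exception.
PREAMBLE = "import warnings; warnings.simplefilter('error')\n"


def exercise(snippet, timeout=10):
    """Run a snippet in a fresh interpreter; return (succeeded, last stderr line)."""
    result = subprocess.run(
        [sys.executable, "-c", PREAMBLE + snippet],
        capture_output=True, text=True, timeout=timeout,
    )
    stderr_lines = result.stderr.strip().splitlines()
    last_line = stderr_lines[-1] if stderr_lines else ""
    return result.returncode == 0, last_line


# A real function succeeds.
ok_real, _ = exercise("import math; math.sqrt(4)")

# A hallucinated function fails with an AttributeError the agent can read.
ok_fake, err_fake = exercise("import math; math.square_root(4)")

# A deprecation warning is surfaced as a failure too.
ok_warn, err_warn = exercise(
    "import warnings; warnings.warn('old_call is deprecated', DeprecationWarning)"
)
```

The error text fed back to the model is what lets it self-correct instead of guessing.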

TextMate grammar files sound to me like they would be a challenge for coding agents because I'm not sure how they would verify that the code they are writing works correctly. ChatGPT just told me about vscode-tmgrammar-test https://www.npmjs.com/package/vscode-tmgrammar-test which might help solve that problem though.

replies(1): >>46007993 #

32. simonw ◴[21 Nov 25 17:11 UTC] No.46006405{4}[source]▶

>>46005833 #

I use it for things I don't know how to do all the time... but I do that as a learning exercise for myself.

Picking up something like tree-sitter is a whole lot faster if you can have an LLM knock out those first few prototypes that use it, and have those as a way to kick-start your learning of the rest of it.

replies(1): >>46011423 #

33. TechDebtDevin ◴[21 Nov 25 17:16 UTC] No.46006457{7}[source]▶

>>46005600 #

:eyes: Go back to the lesswrong comment section.

34. NoraCodes ◴[21 Nov 25 17:58 UTC] No.46006912{3}[source]▶

>>46005103 #

> indicates a lack of professionalism

Appropriately, because OP is describing a hobby project. Perhaps you could pay them for a version without cat pictures.

35. Razengan ◴[21 Nov 25 19:29 UTC] No.46007993{4}[source]▶

>>46006380 #

Not sure if LLMs would be suited for this, but I think an ideal AI for coding would keep a language's entire documentation and its source code (if available) in its "context" as well as live (or almost live) views on the discussion forums for that language/platform.

It would be awesome if, when a bug happens in my Godot game, the AI already knew the Godot source so it could figure out why and suggest a workaround.

replies(2): >>46009580 #>>46011414 #

36. troupo ◴[21 Nov 25 20:01 UTC] No.46008319{4}[source]▶

>>46004943 #

For me it's also sometimes consecutive sessions, or sessions on different days.

37. igravious ◴[21 Nov 25 20:53 UTC] No.46008863{4}[source]▶

>>46004645 #

it's fake is it?

https://github.com/igravious/cosmoruby

38. igravious ◴[21 Nov 25 20:54 UTC] No.46008871{3}[source]▶

>>46005594 #

here you go, still early days, rough round the edges :)

https://github.com/igravious/cosmoruby

39. igravious ◴[21 Nov 25 20:58 UTC] No.46008917{3}[source]▶

>>46004251 #

unzipped Ruby 3.4.7 into the appropriate place (third-party) in the repo and explained what i wanted (it used the Lua and Python port for reference)

first it built the Cosmo Make tooling integration and then we (ha "we" !) started iterating and iterating compiling Ruby with the Cosmo compiler … every time we hit some snag Claude Code would figure it out

I would have completed it sooner but I kept hitting the 5 hourly session token limits on my Pro account

https://github.com/igravious/cosmoruby

replies(1): >>46009770 #

40. skydhash ◴[21 Nov 25 21:45 UTC] No.46009352{4}[source]▶

>>46005054 #

Then build an LLM shell and make it your login shell. And you’ll see how well the computer understands English.

41. simonw ◴[21 Nov 25 22:09 UTC] No.46009580{5}[source]▶

>>46007993 #

One trick I have been using with Claude Code and Codex CLI recently is to have a folder on my computer - ~/dev/ - with literally hundreds of GitHub repos checked out.

Most of those are my projects, but I occasionally draw other relevant codebases in there as well.

Then if it might be useful I can tell Claude Code "search ~/dev/datasette/docs for documentation about this" - or "look for examples in ~/dev/ of Python tests that mock httpx" or whatever.
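In practice the agent runs its own search tools for this, but the kind of lookup being requested amounts to something like the sketch below: walk a folder of checked-out repos and list lines in test files that mention a term. The `find_examples` helper and its default file pattern are assumptions for illustration.

```python
from pathlib import Path


def find_examples(root, needle, pattern="test_*.py"):
    """Yield (path, line_number, line) for lines containing `needle` under `root`."""
    for path in sorted(Path(root).expanduser().rglob(pattern)):
        try:
            lines = path.read_text(encoding="utf-8", errors="ignore").splitlines()
        except OSError:
            continue  # unreadable entry (or a directory matching the pattern); skip it
        for number, line in enumerate(lines, start=1):
            if needle in line:
                yield path, number, line.strip()
```

For example, `list(find_examples("~/dev", "httpx"))` would collect every test line under `~/dev` that mentions httpx.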

replies(1): >>46011419 #

42. simonw ◴[21 Nov 25 22:28 UTC] No.46009770{4}[source]▶

>>46008917 #

Looks like this is the relevant code https://github.com/jart/cosmopolitan/compare/master...igravi...

43. UltraSane ◴[22 Nov 25 02:09 UTC] No.46011414{5}[source]▶

>>46007993 #

In a perfect world LLMs could generate Abstract Syntax Trees directly.

44. UltraSane ◴[22 Nov 25 02:10 UTC] No.46011419{6}[source]▶

>>46009580 #

Is that much faster than having Claude Code go directly to github?

45. UltraSane ◴[22 Nov 25 02:11 UTC] No.46011423{5}[source]▶

>>46006405 #

I have it do hard Leetcode problems and then read the code and have it explain parts I don't understand.
