467 points mraniki | 1 comment
HarHarVeryFunny ◴[] No.43535790[source]
I'd like to see an honest attempt by someone to use one of these SOTA models to code an entire non-trivial app. Not a "vibe coding" flappy bird clone or minimal iOS app (call API to count calories in photo), but something real - say 10K LOC type of complexity, using best practices to give the AI all the context and guidance necessary. I'm not expecting the AI to replace the programmer - just to be a useful productivity tool when we move past demos and function writing to tackling real world projects.

It seems to me that where we are today, AI is only useful for coding for very localized tasks, and even there mostly where it's something commonplace and where the user knows enough to guide the AI when it's failing. I'm not at all convinced it's going to get much better until we have models that can actually learn (vs pre-trained) and are motivated to do so.

replies(6): >>43535869 #>>43535969 #>>43536042 #>>43536795 #>>43536842 #>>43538608 #
kaiokendev ◴[] No.43535969[source]
I made this NES emulator with Claude last week [0]. I'd say it was a pretty non-trivial task. It involved feeding the model a lot of NESDev docs, Disch mapper docs, and test ROM output + assembly source code to work from.

[0]: https://kaiokendev.github.io/nes/

replies(2): >>43536216 #>>43539114 #
HarHarVeryFunny ◴[] No.43536216[source]
How would you characterize the overall structural complexity of the project, and degree of novelty compared to other NES emulators Claude may have seen during training?

I'd be a bit suspect of an LLM getting an emulator right, when all it has to go on is docs and no ability to test (since pass criteria is "behaves same as something you don't have access to")... Did you check to see the degree to which it may have been copying other NES emulators?

replies(1): >>43536426 #
kaiokendev ◴[] No.43536426[source]
> How would you characterize the overall structural complexity of the project, and degree of novelty compared to other NES emulators Claude may have seen during training?

Highly complex, fairly novel.

Emulators themselves, for any chipset or system, have a very learnable structure: there are some modules, each having their own registers and ways of moving data between those registers, and perhaps ways to send interrupts between those modules. That's oversimplifying a bit, but if you've built an emulator once, you generally won't be blindsided when it comes to building another one. The bulk of the work lies in dissecting the hardware, which has already been done for the NES, and more open architectures typically have their entire pinouts and processes available online. All that to say - I don't think Claude would have difficulty implementing most emulators - it's good enough at programming and parsing assembly that as long as the underlying microprocessor architecture is known, it can implement it.
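To make the "modules, registers, interrupts" structure concrete, here is a minimal sketch in TypeScript. It is not code from the linked project: the class names (`Cpu`, `Ppu`, `Bus`) and methods (`requestNmi`, `enterVblank`) are hypothetical, and the behavior is only loosely modeled on the NES (2 KB of mirrored RAM, a PPU control register whose top bit enables an NMI at vblank, a status register whose vblank flag clears on read). Everything else a real emulator needs is left out.

```typescript
// Hypothetical sketch: all names are illustrative, not taken from the linked emulator.

class Cpu {
  nmiPending = false;
  nmiCount = 0;
  pc = 0;
  lastOpcode = 0;

  // Another module calls this to signal an interrupt.
  requestNmi(): void {
    this.nmiPending = true;
  }

  step(bus: Bus): void {
    if (this.nmiPending) {
      // A real core would push state and jump through the NMI vector here.
      this.nmiPending = false;
      this.nmiCount++;
      return;
    }
    // Fetch only; decode and execute are omitted in this sketch.
    this.lastOpcode = bus.read(this.pc);
    this.pc = (this.pc + 1) & 0xffff;
  }
}

class Ppu {
  private ctrl = 0;   // control register (bit 7: raise NMI at vblank)
  private status = 0; // status register (bit 7: vblank flag)
  private raiseNmi: () => void;

  constructor(raiseNmi: () => void) {
    this.raiseNmi = raiseNmi;
  }

  writeRegister(index: number, value: number): void {
    if (index === 0) this.ctrl = value & 0xff;
  }

  readRegister(index: number): number {
    if (index === 2) {
      const value = this.status;
      this.status &= 0x7f; // reading the status register clears the vblank flag
      return value;
    }
    return 0;
  }

  enterVblank(): void {
    this.status |= 0x80;
    if (this.ctrl & 0x80) this.raiseNmi();
  }
}

// The bus decides which module owns each address.
class Bus {
  private ram = new Uint8Array(0x800);
  private ppu: Ppu;

  constructor(ppu: Ppu) {
    this.ppu = ppu;
  }

  read(addr: number): number {
    if (addr < 0x2000) return this.ram[addr & 0x07ff]; // mirrored RAM
    if (addr < 0x4000) return this.ppu.readRegister(addr & 7);
    return 0; // unmapped in this sketch
  }

  write(addr: number, value: number): void {
    if (addr < 0x2000) this.ram[addr & 0x07ff] = value & 0xff;
    else if (addr < 0x4000) this.ppu.writeRegister(addr & 7, value);
  }
}

const cpu = new Cpu();
const ppu = new Ppu(() => cpu.requestNmi());
const bus = new Bus(ppu);

bus.write(0x2000, 0x80); // enable NMI via a register write
ppu.enterVblank();       // PPU sets its flag and raises the interrupt
cpu.step(bus);           // CPU services the pending NMI
```

The point of the shape is that each module only knows about its own registers, and the bus and the interrupt callback are the only paths between them, which is what makes the pattern transferable from one system to the next.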

As far as other NES emulators go, this project does many things in non-standard ways. For instance, I use per-pixel rendering whereas many emulators use scanline rendering. I use an AudioWorklet with various mixing effects for audio, whereas other emulators use something much simpler or don't even bother fully implementing the APU. I can comfortably say there's no NES emulator out there written the way this one is written.
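For readers unfamiliar with the per-pixel vs scanline distinction, the toy sketch below shows one simplified way to think about it. It is an illustration under stated assumptions, not how this emulator or any particular scanline emulator is written: the "background" is just a function of x plus a scroll offset, and `DotHook`, `renderScanline`, and `renderPerPixel` are made-up names. The scanline version latches the scroll value once per row, while the per-pixel version samples it at every dot, so a register write in the middle of a line only shows up in the second one.

```typescript
// Toy model with hypothetical names; real renderers fetch tiles, attributes, and sprites.
const WIDTH = 256; // visible pixels per NES scanline

interface PpuState {
  scrollX: number;
}

// Called before each dot so the caller can simulate a mid-line register write.
type DotHook = (dot: number, state: PpuState) => void;

function backgroundPixel(x: number, scrollX: number): number {
  return (x + scrollX) & 0xff;
}

// Scanline style: state is latched once, then the whole row is produced from it.
function renderScanline(state: PpuState, hook: DotHook): Uint8Array {
  const row = new Uint8Array(WIDTH);
  const scrollAtStart = state.scrollX;
  for (let x = 0; x < WIDTH; x++) {
    hook(x, state); // the write still happens, but this row never sees it
    row[x] = backgroundPixel(x, scrollAtStart);
  }
  return row;
}

// Per-pixel style: state is sampled at every dot.
function renderPerPixel(state: PpuState, hook: DotHook): Uint8Array {
  const row = new Uint8Array(WIDTH);
  for (let x = 0; x < WIDTH; x++) {
    hook(x, state);
    row[x] = backgroundPixel(x, state.scrollX);
  }
  return row;
}

// Simulate a scroll change halfway across the line.
const midLineWrite: DotHook = (dot, state) => {
  if (dot === 128) state.scrollX = 8;
};

const scanlineRow = renderScanline({ scrollX: 0 }, midLineWrite);
const perPixelRow = renderPerPixel({ scrollX: 0 }, midLineWrite);
```

In this toy, both rows agree to the left of dot 128 and differ from there on. The trade-off is the usual one: per-dot sampling costs more work per frame but tracks mid-line changes without special cases.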

> I'd be a bit suspect of an LLM getting an emulator right, when all it has to go on is docs and no ability to test (since pass criteria is "behaves same as something you don't have access to")... Did you check to see the degree to which it may have been copying other NES emulators?

Purely JavaScript-based NES emulators are few in number, and those that implement all aspects of the system even fewer, so I can comfortably say it doesn't copy any of the ones I've seen. I would be surprised if it did, since I came up with most of the abstractions myself and guided Claude heavily. While Claude can't get docs on its own, I can. I put all the relevant documentation in the context window myself, along with the test ROM output and source code. I'm still commanding the LLM myself; it's not like I told Claude to build an emulator and left it alone for 3 days.

replies(1): >>43536511 #
1. HarHarVeryFunny ◴[] No.43536511[source]
Interesting - thanks!

Even with your own expert guidance, it does seem impressive that Claude was able to complete a project like this without getting bogged down in the complexity.