My 2.5 year old laptop can write Space Invaders in JavaScript now (GLM-4.5 Air)

(simonwillison.net)

577 points simonw | 2 comments | 29 Jul 25 13:45 UTC


AlexeyBrin ◴[29 Jul 25 14:02 UTC] No.44723521[source]▶

>>44723316 (OP) #

Most likely its training data included countless Space Invaders implementations in various programming languages.

replies(6): >>44723664 #>>44723707 #>>44723945 #>>44724116 #>>44724439 #>>44724690 #

NitpickLawyer ◴[29 Jul 25 14:19 UTC] No.44723707[source]▶

>>44723521 #

This comment is ~3 years late. Every model since GPT-3 has had the entirety of available code in its training data. That's not a gotcha anymore.

We went from ChatGPT's "oh, look, it looks like Python code but everything is wrong" to "here's a full-stack boilerplate app that does what you asked and works zero-shot" inside 2 years. That's the kicker. And the sauce isn't just in the training set; models now go through post-training and RL and a bunch of other stuff to get to where we are. Not to mention the insane abilities with extended context (the first models were 2k/4k max), agentic stuff, and so on.

These kinds of comments are really missing the point.

replies(7): >>44723808 #>>44723897 #>>44724175 #>>44724204 #>>44724397 #>>44724433 #>>44729201 #

haar ◴[29 Jul 25 14:26 UTC] No.44723808[source]▶

>>44723707 #

I've had little success with agentic coding, and what success I have had has been paired with hours of frustration, where I'd have been better off doing it myself for anything but the most basic tasks.

Even then, once you start to build up complexity within a codebase, the results have often been worse than starting over: "I'll start generating it all from scratch again, and include this as an addition to the initial long-tail specification prompt as well", and even then... it's been a crapshoot.

I _want_ to like it. The times where it initially "just worked" felt magical and inspired me with the possibilities. That's what prompted me to get more engaged and use it more. The reality of doing so is just frustration, and wishing things _actually worked_ anywhere close to expectations.

replies(1): >>44724064 #

aschobel ◴[29 Jul 25 14:43 UTC] No.44724064[source]▶

>>44723808 #

Bingo, it's magical but the learning curve is very very steep. The METR study on open-source productivity alluded to this a bit.

I am definitely at a point where I am more productive with it, but it took a bunch of effort.

replies(2): >>44724470 #>>44724770 #

1. haar ◴[29 Jul 25 15:41 UTC] No.44724770[source]▶

>>44724064 #

Apologies if I was unclear.

The more I've used it, the more I've disliked how poor its results are, and the more I've realised I would have been better served by doing it myself and following a methodical path for things that I didn't have experience with.

It's easier to step through a problem as I'm learning and making small changes than to deal with an LLM going "It's done, and production ready!" when it just straight up doesn't work for 101 different tiny reasons.

replies(1): >>44732184 #

2. airspresso ◴[30 Jul 25 09:03 UTC] No.44732184[source]▶

>>44724770 (TP) #

My preferred approach to avoid that outcome is to divide & conquer the problem. Ask the LLM to implement each small bit in the order you'd implement it yourself given what you know about the codebase.
