
577 points | simonw
AlexeyBrin
Most likely its training data included countless Space Invaders implementations in various programming languages.
NitpickLawyer
This comment is ~3 years late. Every model since GPT-3 has had the entirety of publicly available code in its training data. That's not a gotcha anymore.

We went from ChatGPT's "oh look, it looks like Python code but everything is wrong" to "here's a full-stack boilerplate app that does what you asked and works zero-shot" inside two years. That's the kicker. And the sauce isn't just the training set: models now go through post-training, RL, and a bunch of other stuff to get to where we are. Not to mention the insane gains in extended context (the first models were 2k/4k max), agentic workflows, and so on.

These kinds of comments are really missing the point.

haar
I've had little success with agentic coding, and what success I have had has come paired with hours of frustration; for anything but the most basic tasks I'd have been better off doing it myself.

Even then, as complexity builds up within a codebase, the results have often been worse than "I'll regenerate it all from scratch and fold this into the initial long-tail specification prompt as well", and even then... it's been a crapshoot.

I _want_ to like it. The times when it initially "just worked" felt magical and inspired me with the possibilities. That's what prompted me to get more engaged and use it more. The reality of doing so has just been frustration, and wishing things _actually worked_ anywhere close to expectations.

aschobel
Bingo, it's magical but the learning curve is very, very steep. The METR study on open-source developer productivity alluded to this a bit.

I am definitely at a point where I am more productive with it, but it took a bunch of effort.

devmor
The subjects in the study you are referencing also believed that they were more productive with it. What metrics do you have to convince yourself you aren't under the same illusory bias they were?
simonw
Yesterday I used ffmpeg to extract the frame at the 13-second mark of a video as a JPEG.

If I didn't have an LLM to figure that out for me I wouldn't have done it at all.
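
(For reference, one plausible invocation for that kind of task; the file names here are placeholders, not necessarily what was actually run:

  ffmpeg -ss 00:00:13 -i input.mp4 -frames:v 1 -q:v 2 frame.jpg

-ss before -i seeks to the 13-second mark, -frames:v 1 grabs a single frame, and -q:v 2 keeps the JPEG quality high.)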

dingnuts
It is nice to use LLMs to generate ffmpeg commands, because those can be pretty tricky, but really, you wouldn't have just used the man page before?

That the author is allergic to man pages explains a lot about Django, lol.

ben_w
I remember, as a kid, people asking a teacher how to spell a word and generally getting the answer "look it up in a dictionary"… which you can only do if you already have a shortlist of possible spellings.

*nix man pages are the same: if you already know which tool can solve your problem, they're easy to use. But you have to already have a shortlist of tools that can solve your problem, before you even know which man pages to read.

adastra22
That’s what GNU info is for, of course.