I failed to recreate the 1996 Space Jam website with Claude

(j0nah.com)

549 points thecr0w | 2 comments | 07 Dec 25 17:18 UTC

999900000999 ◴[07 Dec 25 18:05 UTC] No.46183645[source]▶

Space Jam website design as an LLM benchmark.

This article is a bit negative. Claude gets close; it just can't get the order right, which is something OP can fix manually.

I prefer GitHub Copilot because it's cheaper and integrates with GitHub directly. I'll have times where it'll get it right, and times when I have to try 3 or 4 times.

replies(4): >>46183660 #>>46183768 #>>46184119 #>>46184297 #

1. smallnix ◴[07 Dec 25 18:20 UTC] No.46183768[source]▶

>>46183645 #

That's not the point of the article. It's about Claude/LLMs being overconfident about producing a pixel-perfect recreation.

replies(1): >>46185413 #

2. jacquesm ◴[07 Dec 25 21:37 UTC] No.46185413[source]▶

>>46183768 (TP) #

All AIs are overconfident. It's impressive what they can do, but at the same time it's extremely unimpressive what they can't do while passing it off as the best thing since sliced bread. 'Perfect! Now I see the problem.' 'Thank you for correcting that, here is a perfect recreation of problem 'x' that will work with your hardware.' (Never mind the 10 glaring mistakes.)

I've tried these tools a number of times and put a good bit of effort into learning how to maximize the return. By the time you know what prompt to write, you've solved the problem yourself.
