(j0nah.com)

549 points thecr0w | 1 comments | 07 Dec 25 17:18 UTC

Wowfunhappy ◴[07 Dec 25 17:57 UTC] No.46183598

Claude is not very good at using screenshots. The model may technically be multi-modal, but its strength is clearly in reading text. I'm not surprised it failed here.

dcanelhas ◴[07 Dec 25 19:25 UTC] No.46184296

>>46183598 #

Even with text, parsing content in 2D seems to be a challenge for every LLM I have interacted with. Try getting a chatbot to make an ascii-art circle with a specific radius and you'll see what I mean.

Wowfunhappy ◴[07 Dec 25 21:20 UTC] No.46185268

>>46184296 #

I don't really consider ASCII art to be text. It requires a completely different type of reasoning. A blind person can understand text if it's read out loud. A blind person really can't understand ASCII art if it's read out loud.

I failed to recreate the 1996 Space Jam website with Claude