I’ve been working on a few benchmarks to test how well LLMs can recreate interfaces from screenshots (https://github.com/alechewitt/llm-ui-challenge). From my basic tests, GPT-5.2 seems slightly better at these UI recreations. For example, in the MS Word replica, it implemented the undo/redo buttons in addition to the bold/italic formatting that GPT-5.1 also handled, and it generally seemed a bit closer to the original screenshot (https://alechewitt.github.io/llm-ui-challenge/outputs/micros...).
In the VS Code test, it also added the tabs that weren’t visible in the screenshot! (https://alechewitt.github.io/llm-ui-challenge/outputs/vs_cod...).
Generate an SVG of an octopus operating a pipe organ
Generate an SVG of a giraffe assembling a grandfather clock
Generate an SVG of a starfish driving a bulldozer
https://gally.net/temp/20251107pelican-alternatives/index.ht...
GPT-5.2 Pro cost about 80 cents per prompt through OpenRouter, so I stopped there. I don’t feel like spending that much on all thirty prompts.
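For a rough sense of scale, here is a back-of-the-envelope sketch of what the full run would have cost. It assumes a flat 80 cents for every one of the thirty prompts, which is only the approximate figure above; real per-prompt cost varies with output length. The function name `total_cost_cents` is made up for illustration.

```python
def total_cost_cents(cents_per_prompt: int, num_prompts: int) -> int:
    """Estimated total in cents, assuming every prompt costs the same."""
    return cents_per_prompt * num_prompts


# Assumption: a flat 80 cents per prompt across all 30 prompts.
total = total_cost_cents(80, 30)
print(f"${total / 100:.2f}")  # prints "$24.00"
```

Working in whole cents keeps the arithmetic exact; the division only happens when formatting the result.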
https://clocks.brianmoore.com/
Kimi or DeepSeek are probably the best.
And lately, Claude (web) started drawing ASCII charts from one day to the next, instead of the colorful infographic-style images it produced before (which were only slightly better than the ASCII charts).