Interesting idea; this benchmark maps fairly closely to the kinds of output I typically ask LLMs to generate for me day-to-day.
/vote: Your prompt will be answered by four random, anonymous models. You pick the one you prefer and crown the winner, tournament-style.
/leaderboard: See which models are currently winning, as determined by voter preferences (a toy sketch of how votes could roll up into rankings follows after this list).
/play: Iterate quickly by seeing four models respond to the same input and pressing space to regenerate the results you don’t lock in.
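For the curious, here is a simplified sketch of how pairwise picks like these could be tallied into a leaderboard. This is not our production ranking code; the win-rate approach and the Vote shape below are assumptions purely for illustration.

    // Hypothetical sketch only: illustrates one simple way votes could
    // roll up into rankings (win rate per model), not the site's actual method.
    type Vote = { winner: string; losers: string[] };

    function computeLeaderboard(votes: Vote[]): { model: string; winRate: number }[] {
      const wins = new Map<string, number>();
      const appearances = new Map<string, number>();

      for (const { winner, losers } of votes) {
        // Every model shown in the matchup counts as one appearance.
        for (const model of [winner, ...losers]) {
          appearances.set(model, (appearances.get(model) ?? 0) + 1);
        }
        wins.set(winner, (wins.get(winner) ?? 0) + 1);
      }

      return [...appearances.keys()]
        .map((model) => ({
          model,
          winRate: (wins.get(model) ?? 0) / appearances.get(model)!,
        }))
        .sort((a, b) => b.winRate - a.winRate);
    }

    // Example: two rounds, four anonymous models each, one winner per round.
    const votes: Vote[] = [
      { winner: "model-a", losers: ["model-b", "model-c", "model-d"] },
      { winner: "model-c", losers: ["model-a", "model-b", "model-d"] },
    ];
    console.log(computeLeaderboard(votes));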
We were especially impressed by the quality of DeepSeek and Grok, and by the variance between categories (judging by the results so far, OpenAI is very good at game dev but seems to struggle everywhere else).
We’ve learned a lot, and are curious to hear your comments and questions. Excited to make this better!