612 points meetpateltech | 7 comments
1. simonw ◴[] No.42952478[source]
I upgraded my llm-gemini plugin to handle this, and shared the results of my "Generate an SVG of a pelican riding a bicycle" benchmark here: https://simonwillison.net/2025/Feb/5/gemini-2/

The pricing is interesting: Gemini 2.0 Flash-Lite is 7.5c/million input tokens and 30c/million output tokens - half the price of OpenAI's GPT-4o mini (15c/60c).

Gemini 2.0 Flash isn't much more: 10c/million for text/image input, 70c/million for audio input, 40c/million for output. Again, cheaper than GPT-4o mini.
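
To make the comparison concrete, here is a rough sketch of the arithmetic using only the per-million-token prices quoted above. The dictionary keys are display labels rather than API model identifiers, and the workload (2M input tokens, 0.5M output tokens) is a made-up assumption for illustration.

```python
# Prices in US cents per million tokens, as quoted in this comment.
# Keys are display labels only, not API model identifiers.
PRICES = {
    "Gemini 2.0 Flash-Lite": {"input": 7.5, "output": 30.0},
    "Gemini 2.0 Flash": {"input": 10.0, "output": 40.0},  # text/image input rate
    "GPT-4o mini": {"input": 15.0, "output": 60.0},
}


def cost_cents(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in cents for a given number of input and output tokens."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 2M input tokens and 0.5M output tokens.
for model in PRICES:
    print(f"{model}: {cost_cents(model, 2_000_000, 500_000):.1f}c")
```

At these rates Flash-Lite comes out at exactly half the GPT-4o mini cost for any mix of input and output tokens, since both of its prices are half of GPT-4o mini's.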

replies(5): >>42952546 #>>42953320 #>>42954077 #>>42954864 #>>42955373 #
2. zamadatix ◴[] No.42952546[source]
Is there a way to see/compare the shared results for all of the LLMs you've tested this prompt on in one place? The 2.0 Pro result seems decent, but I don't have a baseline to tell whether that's because it really is or because the other 2 are just "extremely bad" or something.
replies(1): >>42953688 #
3. iimaginary ◴[] No.42953320[source]
The only benchmark worth paying attention to.
4. nolist_policy ◴[] No.42953688[source]
Search by tag: https://simonwillison.net/tags/pelican-riding-a-bicycle/
5. qingcharles ◴[] No.42954077[source]
Not a bad pelican from 2.0 Pro! The singularity is almost upon us :)
6. ◴[] No.42954864[source]
7. mattlondon ◴[] No.42955373[source]
The SVGs are starting to look actually recognisable! You'll need a new benchmark soon :)