2127 points bakugo | 2 comments
freediver ◴[] No.43164170[source]
Kagi LLM benchmark updated with general purpose and thinking mode for Sonnet 3.7.

https://help.kagi.com/kagi/ai/llm-benchmark.html

It appears to be the second most capable general-purpose LLM we have tried (second to Gemini 2.0 Pro, ahead of GPT-4o). It is less impressive in thinking mode, landing at about the same level as o1-mini and o3-mini (with an 8192-token thinking budget).

Overall a very nice update: you get a higher-quality, higher-speed model at the same price.

We hope to enable it in Kagi Assistant within 24h!

replies(8): >>43164279 #>>43164282 #>>43164709 #>>43164800 #>>43164997 #>>43165104 #>>43169517 #>>43171532 #
1. guelo ◴[] No.43164997[source]
How did you choose the 8192-token thinking budget? I've often seen DeepSeek R1 use way more than that.
replies(1): >>43173457 #
2. freediver ◴[] No.43173457[source]
It was arbitrary, and even with this budget it is already more verbose (and slower) overall than all the other thinking models; check the token counts and latency in the table.
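
The thinking budget discussed above is a per-request parameter in Anthropic's Messages API. A minimal sketch of how an 8192-token budget would be passed, assuming the official Python SDK; the model ID and prompt are illustrative placeholders:

```python
# Sketch of setting a per-request "thinking budget" for a Claude 3.7
# Sonnet call via Anthropic's Messages API. The request is built as a
# plain dict here so the shape is visible; the actual call is shown in
# a comment below.
request_kwargs = {
    "model": "claude-3-7-sonnet-20250219",  # placeholder model ID
    # max_tokens must exceed budget_tokens, since it covers both the
    # thinking tokens and the final answer.
    "max_tokens": 16000,
    "thinking": {"type": "enabled", "budget_tokens": 8192},
    "messages": [
        {"role": "user", "content": "Prove that sqrt(2) is irrational."}
    ],
}

# With the official SDK (requires an API key) this would be sent as:
#   import anthropic
#   response = anthropic.Anthropic().messages.create(**request_kwargs)

# Sanity check: the budget must fit inside the overall token limit.
assert request_kwargs["thinking"]["budget_tokens"] < request_kwargs["max_tokens"]
```

Raising `budget_tokens` gives the model more room to reason at the cost of latency and output tokens, which is the trade-off the benchmark table reflects.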