Big day for open source Chinese model releases - DeepSeek-v3-0324 came out today too, an updated version of DeepSeek v3 now under an MIT license (previously it was a custom DeepSeek license). https://simonwillison.net/2025/Mar/24/deepseek/
And it still can't answer this:
Q: "9.11 and 9.9, which one is larger?"
A: "To determine which number is larger between 9.11 and 9.9, let's compare them step by step. Both numbers have the same whole number part: 9. 9.11 has a 1 in the tenths place. 9.9 has a 9 in the tenths place. Since 9 (from 9.9) is greater than 1 (from 9.11), we can conclude that 9.9 is larger than 9.11."
"Final Answer: 9.9"
I don't think anything else needs to be said here.
Answer is correct no?
9.9 - 9.11 = 0.79
Might want to check your math? Seems right to me
But the answer is correct? 9.9 is larger than 9.11
Parent is thinking Semantic Versioning.
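For anyone skimming, the two readings in this thread can be sketched in a few lines of Python. The helper names below are made up for illustration, and the version comparison is a simplified tuple compare of dot-separated integers, not a full SemVer parser.

```python
from decimal import Decimal


def larger_as_decimal(a: str, b: str) -> str:
    """Return whichever string is larger when both are read as decimal numbers."""
    return a if Decimal(a) > Decimal(b) else b


def larger_as_version(a: str, b: str) -> str:
    """Return whichever string is larger when read as dot-separated version parts.

    Hypothetical helper: handles only plain integer parts like "9.11".
    """
    def key(s: str) -> tuple:
        return tuple(int(part) for part in s.split("."))

    return a if key(a) > key(b) else b


print(larger_as_decimal("9.11", "9.9"))   # 9.9
print(larger_as_version("9.11", "9.9"))   # 9.11
print(Decimal("9.9") - Decimal("9.11"))   # 0.79
```

Read as numbers, 9.9 wins because 9.90 > 9.11. Read as versions, 9.11 wins because minor version 11 comes after minor version 9. The question asked about numbers, so the model's answer holds.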
Lol, well I guess we've achieved the functional equivalent of AGI, at least for you. Please don't delete your comment.
jschoe's post is actually a Turing test for us. :)
(just kidding jschoe)
+1 to Deepseek
-1 to humanity
This is hilarious, especially if it's unintentional.
He's Poe's law testing us.
One of many pet peeves with semver
> I don't think anything else needs to be said here.
Will this humbling moment change your opinion?
I will now refer to this as the jschoe test in my writing and publications as well!
It's interesting to think that maybe one of the most realistic consequences of reaching artificial superintelligence will be when its answers start wildly diverging from human expectations and we think it's being "increasingly wrong".
You just failed the Turing test, now we know you're an LLM.
Based on the presented reasoning, that means humanity wins! Yay!