
544 points | tosh | 1 comment
simonw ◴[] No.43464227[source]
Big day for open source Chinese model releases - DeepSeek-v3-0324 came out today too, an updated version of DeepSeek v3 now under an MIT license (previously it was a custom DeepSeek license). https://simonwillison.net/2025/Mar/24/deepseek/
replies(5): >>43464375 #>>43464498 #>>43464686 #>>43465383 #>>43467111 #
jschoe ◴[] No.43467111[source]
And it still can't answer this:

Q: "9.11 and 9.9, which one is larger?"

A: "To determine which number is larger between 9.11 and 9.9, let's compare them step by step. Both numbers have the same whole number part: 9. 9.11 has a 1 in the tenths place. 9.9 has a 9 in the tenths place. Since 9 (from 9.9) is greater than 1 (from 9.11), we can conclude that 9.9 is larger than 9.11."

"Final Answer: 9.9"

I don't think anything else needs to be said here.
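
For anyone who wants to check the quoted comparison, here is a minimal sketch in Python. It assumes both inputs are plain decimal numerals, and the helper names `larger_decimal` and `tenths_digit` are made up for illustration; they are not from the model or the thread.

```python
from decimal import Decimal


def larger_decimal(a: str, b: str) -> str:
    """Return whichever decimal string denotes the larger number (a on ties)."""
    return a if Decimal(a) >= Decimal(b) else b


def tenths_digit(s: str) -> int:
    """Return the first digit after the decimal point, or 0 if there is none."""
    _, _, frac = s.partition(".")
    return int(frac[0]) if frac else 0


# Same whole-number part (9), so the tenths digits decide the comparison.
print(tenths_digit("9.11"), tenths_digit("9.9"))
print(larger_decimal("9.11", "9.9"))
```

Using `Decimal` rather than `float` keeps the comparison exact for numerals written as strings.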

replies(17): >>43467119 #>>43467120 #>>43467131 #>>43467132 #>>43467138 #>>43467147 #>>43467148 #>>43467381 #>>43467419 #>>43467451 #>>43467459 #>>43467485 #>>43467500 #>>43467530 #>>43467732 #>>43467781 #>>43468916 #
AuryGlenz ◴[] No.43467147[source]
I suggest we’ve now passed what shall be dubbed the jschoe test ;)
replies(2): >>43467458 #>>43468092 #
1. sebastiennight ◴[] No.43468092[source]
I will now refer to this as the jschoe test in my writing and publications as well!

It's interesting to think that one of the most realistic consequences of reaching artificial superintelligence may be that its answers start wildly diverging from human expectations, and we conclude it's becoming "increasingly wrong".