
544 points | tosh | 1 comment
simonw:
Big day for open source Chinese model releases - DeepSeek-v3-0324 came out today too, an updated version of DeepSeek v3 now under an MIT license (previously it was a custom DeepSeek license). https://simonwillison.net/2025/Mar/24/deepseek/
echelon:
Pretty soon I won't be using any American models. It'll be a 100% Chinese open source stack.

The foundation model companies are screwed. Only shovel makers (Nvidia, infra companies) and product companies are going to win.

refulgentis:
I've been waiting since November for 1, just 1*, model other than Claude that can reliably do agentic tool call loops. As long as the Chinese open models are chasing reasoning and benchmark maxxing vs. mid-2024 US private models, I'm very comfortable with somewhat ignoring these models.

(this isn't idle prognostication hinging on my personal hobby horse. I've got skin in the game: I'm virtually certain I have the only AI client that is able to reliably do tool calls with open models in an agentic setting. llama.cpp got a massive contribution to make this happen, and the big boys who bother, like ollama, are still using a dated json-schema-forcing method that doesn't comport with recent local model releases that can do tool calls. IMHO we're comfortably past the point where products using these models can afford to focus on conversational chatbots; that's cute, but a commodity to give away per standard 2010s SV thinking)
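For readers unfamiliar with what "agentic tool call loop" means here, a minimal sketch follows. This is not the commenter's actual client; the `chat` callable and `TOOLS` table are illustrative stand-ins for a model API and tool registry. The point is the loop shape: the model emits a structured tool call, the client executes it and feeds the result back, repeating until the model answers without requesting a tool.

```python
import json

# Hypothetical tool registry; a real client would register file I/O, shell, etc.
TOOLS = {
    "read_file": lambda path: f"<contents of {path}>",
}

def run_agent_loop(chat, messages, max_steps=8):
    """Drive the model until it answers without requesting a tool.

    `chat` is assumed to return a dict with either a "tool_call"
    (name + arguments) or a final "content" string.
    """
    for _ in range(max_steps):
        reply = chat(messages)
        call = reply.get("tool_call")
        if call is None:
            return reply["content"]          # final answer, loop ends
        result = TOOLS[call["name"]](**call["arguments"])
        messages.append({"role": "tool",     # feed tool output back to the model
                         "name": call["name"],
                         "content": json.dumps(result)})
    raise RuntimeError("agent did not converge within max_steps")
```

The reliability problem the comment describes lives in the `reply.get("tool_call")` step: open models often emit malformed or mis-schematized calls, which is why clients resort to grammar or JSON-schema forcing.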

* OpenAI's can, but are a little less...grounded?...situated? i.e. they can't handle "read this file and edit it to do $X". Same-ish for Gemini, though sometimes I feel like the only person in the world who actually waits for the experimental models to go GA; as per the letter of the law, I shouldn't deploy them until then

anon373839:
A bit of a tangent, but what are your thoughts on code agents compared to the standard "blobs of JSON" approach? I haven't tried it myself, but it does seem like it would be a better fit for existing LLMs' capabilities.