https://openrouter.ai/deepseek/deepseek-chat-v3-0324:free
do you think this needs attention?
The foundation model companies are screwed. Only shovel makers (Nvidia, infra companies) and product companies are going to win.
Gemini isn't too special; it's actually just comparable to DeepSeek, or a bit worse, but it is damn fast, so maybe forget Gemini for real tasks.
Grok / Gemini can be used as deep research models, which I think I like? Grok seems to have just taken the DeepSeek approach and scaled it with their hyper-massive GPU cluster. I'm not sure, but I think Grok can also be replaced.
What I truly believe in is claude.
I am not sure but claude really feels good for coding especially.
For anything else I might use something like DeepSeek or other Chinese models.
I used cerebras.ai and holy moly they are so fast. I used the DeepSeek 70B model, and it is still incredibly fast, and my time matters, so I really like the open-source way, where companies like Cerebras can focus on what they do best.
I am not sure about Nvidia though. Nvidia seems so tied to Western AI that DeepSeek's improvements impact Nvidia.
I do hope Nvidia cheapens GPU prices, though I don't think they have much incentive to.
Any kind of media with zero or near-zero copying/distribution costs becomes a deflationary race to the bottom. Someone will eventually release something that's free, and at that point nothing can compete with free unless it's some kind of very specialized offering. Then you run into the problem the OP described: how do you fund free? Answer: ads. Now the customer is the advertiser, not the user/consumer, which is why most media converges on trash.
https://www.joelonsoftware.com/2002/06/12/strategy-letter-v/
Companies will have to detect and police distilling if they want to keep their moat. Maybe you have to have an enterprise agreement (and arms control waiver) to get GPT-6-large API access.
Also, I have seen that once an open-source LLM is released to the public, even though you can access it on any website hosting it, most people still prefer to use it from the company that created the model.
DeepSeek released its revenue figures and they're crazy good.
And no, they didn't have full racks of H100s.
Also, one more thing: open source has always had a funding problem.
Also, they are not completely open source, they are just open weights. Yes, you can fine-tune them, but from my limited knowledge there are limitations to fine-tuning, so keeping the training data proprietary also helps fund my earlier idea of consulting for other AI companies.
Yes, it's not a hugely profitable venture; imo it's just a decently profitable one, but the current hype around AI is making it lucrative for companies.
Also I think this might be a winner-takes-all market, which increases competition, but in a healthy way.
What DeepSeek did, releasing the open-source model and then going out of their way to release other open-source projects that could themselves have been worth a few million (bycloud said it), helps innovate AI in general.
Perplexity released DeepSeek R1 1776, which basically removes the Chinese censorship; yes, you can ask it about Tiananmen Square.
I think the next iteration of these AI model ads will be sneaky, and they might be hard to remove.
Though it's funny that you comment on Chinese censorship while American censorship is apparently fine, lol.
Also, China doesn't have access to that many GPUs because of the US chip export controls.
And I hate it. I hate it when America sounds more communist than China, which open sources their stuff because of free markets.
I actually think that more countries need to invest into AI and not companies wanting profit.
This could be the decision that can impact the next century.
Commoditizing the AI/intelligence part means that the main advantage isn't the bits; it's the atoms. Physical dexterity, social skills, and manufacturing skills will gain more of a comparative advantage over intelligence work in the future as a result: AI makes the old economy new again in the long term. It also lowers the value of AI investments, in that they can no longer command first-mover/monopoly-like pricing for what is a very large capex cost, undermining US investment in what is their advantage. As long as it is strategic, it doesn't necessarily need to be economic on its own.
(This isn't idle prognostication hinging on my personal hobby horse; I've got skin in the game. I'm virtually certain I have the only AI client that is able to reliably do tool calls with open models in an agentic setting. llama.cpp got a massive contribution to make this happen, and the big boys who bother, like ollama, are still using a dated JSON-schema-forcing method that doesn't comport with recent local model releases that can do tool calls. IMHO we're comfortably past the point where products using these models can afford to focus on conversational chatbots; that's cute, but a commodity to give away, per standard 2010s SV thinking.)
* OpenAI's can but are a little less...grounded?...situated? i.e. it can't handle "read this file and edit it to do $X". Same-ish for Gemini, though, sometimes I feel like the only person in the world who actually waits for the experimental models to go GA, as per letter of the law, I shouldn't deploy them until then
Fwiw, Claude Sonnet 3.5 100% had some sort of agentic loop x precise file editing trained into it. It wasn't obvious to me until I added an MCP file server to my client, and it still isn't well understood outside of a few.
I'm not sure on-device models will be able to handle it any time soon because it relies on just letting it read the whole effing file.
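For what it's worth, the client side of such an agentic loop is conceptually tiny: the model emits a structured tool call, the client executes it and feeds the result back as a tool message. A minimal sketch of the dispatch step; the tool name and call format here are illustrative, not any particular vendor's API:

```python
import json

# Toy tool registry; the tool name and argument shape are illustrative.
def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

TOOLS = {"read_file": read_file}

def dispatch(tool_call_json: str) -> str:
    """Execute one model-emitted tool call and return the result
    that would be appended to the conversation as a tool message."""
    call = json.loads(tool_call_json)
    return TOOLS[call["name"]](**call["arguments"])
```

The hard part isn't this loop, it's getting the model to reliably emit well-formed calls (and to stop when it should), which is exactly what seems to be trained into Sonnet.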
Separately...
I say I don't understand why no other model is close, but it makes sense. OpenAI has been focused on reasoning, Mistral, I assume is GPU-starved, and Google...well, I used to work there, so I have to stop myself from going on and on. Let's just say I assume that there wouldn't be enough Consensus Built™ to do something "scary" and "experimental" like train that stuff in.
This also isn't going so hot for Sonnet IMHO.
There's vague displeasure and assumptions it "changed" the last week, but, AFAICT the real problem is that the reasoning stuff isn't as "trained in" as, say, OpenAI's.
This'd be a good thing except you see all kinds of whacky behavior.
One of my simple "read file and edit" queries yesterday did about 60 pages worth of thinking, and the thinking contained 130+ separate tool calls that weren't actually called, so it was just wandering around in the wilderness, reacting to hallucinated responses it never actually got.
Which plays into another one of my hobbyhorses, chat is a "hack" on top of an LLM. Great. So is reasoning, especially in the way Anthropic implemented it. At what point are the abstractions too much, so much that it's unreliable? 3.7 Sonnet may be answering that, because when it fails, all that thinking looks like the agentic loop cooked into Sonnet 3.5. So maybe it's altogether too much to have chat, reasoning, and fully reliable agentic loops...
As an outsider, it feels like very little progress is being made on the energy issue. I genuinely think AI could be accelerated so much more if energy were cheaper / greener.
I have set up the same thing at work for my colleagues, and they find it better than openai for their tasks.
But, I don’t really see the connection on the flip side. Why should proprietary AI be associated with communism? If anything I guess a communist handling of AI would also be to share the model.
This has clearly been part of a viable business model for a long time. Why should LLM models be any different?
Big companies have so much data that they won't have a human looking at mine specifically. Some small place probably has an engineer looking at my logs as user #4.
Also, big companies have security teams whose job is securing the data, and it won't be going over some unencrypted link to Cloudflare because OP was too lazy to set up HTTPS certs.
- Those'll be 1-turn scores: at multiturn, 4o is 3x as good as the 3B
- BFCL is generally "turn natural language into an API call"; multiturn then involves making another API call
- I hope to inspire work towards an open model that can eat the paid models sooner rather than later
- trained quite specifically on an agent loop with tools read_files and edit_file (you'll probably also want at least read_directory and get_shared_directories; search_filenames and search_files_text are good too), bonus points for cli_command
- IMHO, this is much lower-hanging fruit than e.g. training an open computer-vision model, so I beseech thee, intrepid ML-understander, to fill this gap and hear your name resound throughout the ages
I say this because I think the Perplexity model is tuned on additional information, whereas the abliterated models only include information trained into the underlying model, which is interesting to see.
https://chatgpt.com/share/67e1eba1-c658-800e-9161-a0b8b7b683...
While there are some synergistic effects... I think the physical manufacturing and logistics base is harder to develop than deploying a new model, and will be the hard leading edge. (That's why the US seems to be hellbent on destroying international trade to try and build a domestic market.)
Q: "9.11 and 9.9, which one is larger?"
A: "To determine which number is larger between 9.11 and 9.9, let's compare them step by step. Both numbers have the same whole number part: 9. 9.11 has a 1 in the tenths place. 9.9 has a 9 in the tenths place. Since 9 (from 9.9) is greater than 1 (from 9.11), we can conclude that 9.9 is larger than 9.11."
"Final Answer: 9.9"
I don't think anything else needs to be said here.
Might want to check your math? Seems right to me
Google: 2.385511e+39 Your chat: "Numerically, that’s about 2.3855 × 10^39"
Also curious how you think about LLM-as-calculator in relation to tool calls.
(just kidding jschoe)
-1 to humanity
> Also curious how you think about LLM-as-calculator in relation to tool calls.
I just tried this because I heard all existing models are bad at this kind of problem, and wanted to try it with the most powerful one I have access to. I think it shows that you really want an AI to be able to use computational tools in appropriate circumstances.
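E.g. the decimal-comparison trap upthread disappears entirely if the model hands the arithmetic to a trivial calculator tool instead of doing it in-token. A toy sketch of such a tool (the function name is made up):

```python
from decimal import Decimal

# Toy "calculator tool" a model could call instead of reasoning about
# digits in-token; compares two decimal numbers exactly.
def compare(a: str, b: str) -> str:
    da, db = Decimal(a), Decimal(b)
    if da > db:
        return f"{a} is larger"
    if db > da:
        return f"{b} is larger"
    return "equal"

print(compare("9.11", "9.9"))  # → 9.9 is larger
```

Exact decimal comparison is one line of code but an awkward fit for next-token prediction, which is the whole argument for tool use here.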
Will this humbling moment change your opinion?
It's interesting to think that maybe one of the most realistic consequences of reaching artificial superintelligence will be when its answers start wildly diverging from human expectations and we think it's being "increasingly wrong".
I've tried LibreChat before, but the app is terrible at generating titles for chats, leaving them as "New Chat". Also, it lacks a working Code Interpreter.
For example, ChatGPT etc. self-host their models on their own GPUs and can generate 10 tk/s or something.
Now there exist Groq and Cerebras, who can do token generation at 4000 tk/s, but they kind of require an open-source model.
So that is why I feel it's not really abiding by the true capitalist philosophy.
What I love about "open" models in general, and DeepSeek in particular, is how they undermine that market. The DeepSeek drops especially were fun to watch; they were like last-minute plot twists, like dropping some antibiotic into a petri dish full of bacteria. Sorry, try again with a better moat.
"Open" models are in fact the very thing enabling having a functioning market in this space.
If you are talking about DeepSeek's own hosted API service: it's because they deliberately decided to run the service in heavily overloaded conditions, with a very aggressive batching policy, to extract more out of their (limited) H800s.
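For the curious, "aggressive batching" here just means holding each incoming request briefly so more of them can share one forward pass, trading per-request latency for total throughput. A toy sketch of the collection step; the batch size and wait time are made-up numbers:

```python
import queue
import time

# Toy server-side batcher: hold requests briefly so more of them can
# share one forward pass. Constants are illustrative, not DeepSeek's.
MAX_BATCH = 64
MAX_WAIT_S = 0.05

def collect_batch(q: queue.Queue) -> list:
    batch = [q.get()]  # block until at least one request arrives
    deadline = time.monotonic() + MAX_WAIT_S
    while len(batch) < MAX_BATCH:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(q.get(timeout=remaining))
        except queue.Empty:
            break
    return batch
```

Push MAX_BATCH and MAX_WAIT_S high enough and you get exactly the overloaded-but-cheap behavior described above: great tokens-per-GPU, terrible tail latency.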
Yes, for some reason (the reason I heard is "our boss doesn't want to run such a business", which sounds absurd, but /shrug) they refuse to scale up serving their own models.
So for them this is a case of insurance and hedging risks, not profit making.
You can. Ask your friendly local IRS.
The thing is, a model is in effect a piece of software with almost zero marginal cost. You just need a few, maybe even one, company to release SOTA models consistently to really crash the valuation of every model company, because everyone can acquire that single piece of software at no cost and leave the other model companies behind. The foundation-model scene is basically in an extremely unstable state, ready to return to the stable state where model cost goes to zero. You really don't need the state-competition assumption to explain the current state of affairs.
Liang gave up the No. 1 Chinese hedge fund position to create AGI; he has a very good chance to short the entire US stock market and pocket some stupid amount of $ when R2 is released, and he has pretty much unlimited support from local and central Chinese government. Trying to make some pennies from hosting models is not going to sustain what he enjoys now.
Personally I heavily dislike the experience though, so I might not be the best one to answer.
Tell it to use code if you want an exact answer. It should do that automatically, of course, and obviously it eventually will, but jeez, that's not a bad Fermi guess for something that wasn't designed to attempt such problems.
That seems based on a very weird idea of what capitalism and communism are; idealized free markets have very little to do with the real-world economic system for which the name "capitalism" was coined. And dis-integration where "everyone does one thing" has little to do with either capitalism or free markets, though it might be a convenient assumption for 101-level discussions of market competition, where you want to assume every good exists in an isolated market, competing only and exactly with the other goods in that same market in a simple way, so as to avoid dealing with real-world issues like partially overlapping markets and imperfect substitutes.
Given that, as you say, the long-term marginal cost of AI models is zero, I don't think this is a bad position to be in.