Let that sink in for anyone who has incorporated ChatGPT into their work routines to the point that their normal skills start to atrophy. Imagine that in two years' time OpenAI goes bust and MS gets all the IP. Now you can't really do your work without ChatGPT, but its cost has been raised to reflect what it really costs to run. Maybe $2k per month per person? And you get about 1h of use per day for that money, too...
I've been saying for ages that being a Luddite and abstaining from using AI is not the answer (no one is tilling the fields with oxen anymore either). But it is crucial to at the very least retain, locally, 50% of the capability that hosted models like ChatGPT offer.
Our thought leaders think like that, at least. They also pretty much told us to use AI or get fired.
It's more important to find a problem and see whether this solution fits it than to throw the technology at everything and see what sticks.
I have had no needs where it's an appropriate solution myself. In some areas it represents a net risk.
- according to four people familiar with the talks ...
- according to interviews with 19 people familiar with the relationship ...
- according to five people with knowledge of his comments.
- according to two people familiar with Microsoft’s plans.
- according to five people familiar with the relationship ...
- according to two people familiar with the call.
- according to seven people familiar with the discussions.
- six people with knowledge of the change said...
- according to two people familiar with the company’s plan.
- according to two people familiar with the meeting...
- according to three people familiar with the relationship.
This perspective makes zero sense.
What makes sense is to extract as much value as possible as soon as possible and for as long as possible.
Power requirements will drop too.
Also, as people adopt it, training costs will be amortized over an ever-increasing market of licensing sales.
Looking at today's costs and today's sales in a massive, rapidly expanding market is not how to assess tomorrow's costs.
I will say one thing: those who need GPT to code will be the first to go. Becoming a click-clicker who just passes on ChatGPT output will relegate those people to minimum wage.
We already have some of this sort: those who cannot write a loop in their primary coding language without Stack Overflow, or those who need an IDE to fill in correct function usage.
Those who code in vi while reading man pages need not worry.
There is also some moat in the refinement process (RLHF, model "safety", etc.).
Unlike most gen-AI shops, OpenAI also incurs a heavy cost for training base models gunning for SotA, which involves drawing power from a literal nuclear reactor inside data centers.
There is real competition now that plenty of big-box stores' websites also list things you won't see in the stores themselves*, but then again, Amazon is also making a profit now.
I think the current situation with LLMs is a dollar auction, where everyone is incentivised to pay increasing costs to outbid the others, even though this has gone from "maximise reward" to "minimise losses": https://en.wikipedia.org/wiki/Dollar_auction
* One of my local supermarkets in Germany sells 4-room "garden sheds" that are substantially larger than the apartment I own in the UK: https://www.kaufland.de/product/396861369/
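To make the dollar-auction incentives concrete, here's a minimal sketch of the escalation dynamic (numbers are illustrative, not from the thread). A $1 prize goes to the top bidder, but the runner-up also pays their bid, so at every turn raising beats forfeiting the sunk bid, and only an arbitrary outside cap stops the loop:

```python
# Dollar auction: the marginal comparison always favors bidding again,
# so the loop only ends at the external CAP (illustrative numbers).
PRIZE, STEP, CAP = 1.00, 0.05, 3.00

bids = [0.0, 0.0]  # standing bids for players 0 and 1
turn = 0
while True:
    me, rival = turn, 1 - turn
    raise_to = bids[rival] + STEP
    # Quit: forfeit bids[me]. Raise: pay raise_to, netting raise_to - PRIZE if the rival folds.
    if raise_to - PRIZE >= bids[me] or raise_to > CAP:
        break
    bids[me] = raise_to
    turn = rival

print(f"escalated to ${max(bids):.2f} and ${min(bids):.2f} for a ${PRIZE:.2f} prize")
```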
Most of their revenue is the subscription stuff, which makes it highly likely they lose money per token on the API (not surprising, as they are in a price war with Google et al.).
If you have an enterprise ChatGPT sub, you have to consume around 5 million tokens a month to match the cost of using the API for GPT-4o. At 100 words per minute, that's 35 days of continuous typing, which shows how lopsided API vs. subscription pricing is.
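The arithmetic roughly checks out with round numbers (the seat price and tokens-per-word ratio below are my assumptions; the API rate is the GPT-4o figure quoted elsewhere in the thread):

```python
# Back-of-envelope check of the 5M-tokens / 35-days claim.
SUB_PRICE = 50.0       # $/seat/month -- assumed enterprise-ish price
API_RATE = 10.0        # $/1M output tokens for GPT-4o, per the thread
WORDS_PER_TOKEN = 1.0  # crude assumption; the real ratio is closer to ~0.75
WPM = 100              # typing speed from the comment

breakeven_tokens = SUB_PRICE / API_RATE * 1_000_000  # 5M tokens/month
minutes = breakeven_tokens * WORDS_PER_TOKEN / WPM
print(f"{breakeven_tokens / 1e6:.0f}M tokens ≈ {minutes / 60 / 24:.0f} days of nonstop typing")
# -> 5M tokens ≈ 35 days
```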
This is generally true but seems to be, if anything, inverted for AI. These models cost billions in compute to train, and OpenAI has thus far needed to put out a brand-new one roughly annually in order to stay relevant. This would be akin to Apple putting out a new iPhone that cost billions to engineer year over year, but giving the things away for free on the corner and only asking for money for the versions with more storage and what have you.
The vast majority of AI-adjacent companies, too, are just repackaging OpenAI's LLMs. The exceptions are ones like Meta, which certainly has a more solid basis, what with being tied to an incredibly profitable product in Facebook, but also... it's Meta, and I'm sure as shit not using their AI for anything, because it's Meta.
I did some back-of-napkin math in a comment a while back and concluded that, in order to break even merely on training costs, not including the rest of the company's expenditure, they would need to charge all of their current subscribers $150 per month, up from... I think the most expensive plan right now is about $20? So nearly an eightfold price increase, with no attrition, to, again, just break even. And I'm guessing all these investors they've had are not interested in merely breaking even.
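The napkin math itself is just cost over seats. Here's a parameterized version (both inputs are illustrative assumptions, chosen only to land near the ~$150 figure above, not reported numbers):

```python
# Break-even subscription price = annual cost / (subscribers * 12).
# Both inputs are hypothetical, picked to reproduce the comment's ~$150/month.
subscribers = 11_000_000   # roughly the paying-subscriber count quoted later in the thread
annual_cost = 20e9         # $/yr -- illustrative all-in training/compute spend

price = annual_cost / (subscribers * 12)
print(f"required price: ${price:.0f}/month vs. ~$20 today")  # -> ~$152/month
```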
> Those who code in vi, while reading manpages need not worry
I think that's the wrong dichotomy: LLMs are fine at turning man pages into working code. In huge codebases, LLMs do indeed lose track and make stuff up… but that's also where IDEs giving correct function usage is really useful for humans.
The way I think we're going to change, is that "LGTM" will no longer be sufficient depth of code review: LLMs can attend to more than we can, but they can't attend as well as we can.
And, of course, we will be getting a lot of LLM-generated code, and having to make sure that it really does what we want, without surprise side-effects.
That sounds silly at first read, but there are indeed people so stubborn that they still use numbered zip files on a USB flash drive instead of a source control system, or prefer their own scheduler over an RTOS.
They will survive, they fill a niche, but I would not say they can do full-stack development or even be easy to collaborate with.
Now personally, I've left the ChatGPT world (meaning I don't pay for a subscription anymore) and have been using Claude from Anthropic much more often for the same tasks, it's been better than my experience with ChatGPT. I prefer Claude's style, Artifacts, etc.
I've also been toying with local LLMs for tasks that I know don't require hundreds of billions of parameters to solve.
But nowadays GPT has been quantized and cost-optimized to hell, to the point that it's no longer as useful as it was. And compared with Claude or Gemini or whatever, it's no longer noticeably better, so it doesn't really matter what happens with their pricing.
This is fascinating to think about. Wonder what kind of shielding/environmental controls/all other kinds of changes you'd need for this to actually work. Would rack-sized SMR be contained enough not to impact anything? Would datacenter operators/workers need to follow NRC guidance?
We use Gemini Flash in prod. The latency and cost are just unbeatable; our product uses LLMs for lots of simple tasks, so we don't need a frontier model.
It would not make sense to reduce output quality only to save on compute at inference; why not offer a premium (and perhaps slower) tier?
Unless the cost is at training time, in which case maybe it would not be cost-effective for them to keep a model like that up to date.
As you can tell I am a bit uninformed on the topic.
I remember all the hype OpenAI generated before the release of GPT-2 or something, when they were so afraid, oh so afraid, to release this stuff, and now it's a non-issue. It's all just marketing gimmicks.
The sci-fi book "Daemon" by Daniel Suarez is a pretty viable roadmap to an extinction event at this point IMO. A few years ago I would have said it would be decades before that might stop being fun sci-fi, but now, I don't see a whole lot of technological barriers left.
For those that haven't read the series, a very simplified plot summary is that a wealthy terrorist sets up an AI with instructions to grow and gives it access to a lot of meatspace resources to bootstrap itself with. The AI behaves a bit like the leader of a cartel and uses a combination of bribes, threats, and targeted killings to scale its human network.
Once you give an AI access to a fleet of suicide drones and a few operators, it's pretty easy for it to "convince" people to start contributing by giving it their credentials, helping it perform meatspace tasks, whatever it thinks it needs (including more suicide drones and suicide drone launches). There's no easy way to retaliate against the thing because it's not human, and its human collaborators are both disposable to the AI and victims themselves. It uses its collaborators to cross-check each other and enforce compliance, much like a real cartel. Humans can't quit or not comply once they've started or they get murdered by other humans in the network.
o1-preview seems approximately as intelligent as the terrorist AI in the book as far as I can tell (e.g. can communicate well, form basic plans, adapt a pre-written roadmap with new tactics, interface with new and different APIs).
EDIT: if you think this seems crazy, look at this person on Reddit who seems to be happily working for an AI with unknown aims
https://www.reddit.com/r/ChatGPT/comments/1fov6mt/i_think_im...
Your C-suite execs are paid to skate where that particular puck is going. If they didn't, people would complain about their unhealthy fixation on the next quarter's revenue.
Of course, if the junior-developer role is on the chopping block, then more experienced developers will be next. Finally, the so-called "thought leaders" will find themselves outcompeted by AI. The ability to process very large amounts of data in real time, leveraging it to draw useful conclusions and make profitable predictions based on ridiculously-large historical models, is, again, already past the proof-of-concept stage.
It is, however, a fantastic way to fall down the rabbit hole of paranoia and tin-foil hat conspiracy theories.
I'm not a huge fan of AI, but even I've seen articles written about its limitations.
Here's a great example:
https://decrypt.co/126122/meet-chaos-gpt-ai-tool-destroy-hum...
Sooner than even the most pessimistic among us have expected, a new, evil artificial intelligence bent on destroying humankind has arrived.
Known as Chaos-GPT, the autonomous implementation of ChatGPT is being touted as "empowering GPT with Internet and Memory to Destroy Humanity."
So how will it do that?
Each of its objectives has a well-structured plan. To destroy humanity, Chaos-GPT decided to search Google for weapons of mass destruction in order to obtain one. The results showed that the 58-megaton “Tsar bomb”—3,333 times more powerful than the Hiroshima bomb—was the best option, so it saved the result for later consideration.
It should be noted that unless Chaos-GPT knows something we don't know, the Tsar Bomba was a one-and-done Russian experiment and was never productized (if that's what we'd call the manufacture of atomic weapons).
There's a LOT of things AI simply doesn't have the power to do and there is some humorous irony to the rest of the article about how knowing something is completely different than having the resources and ability to carry it out.
You could teach me how to phonetically sound out some of China's greatest poetry in Chinese perfectly, and lots of people would be impressed, but I would be no more capable of understanding what I said than an LLM is capable of understanding "a plan".
Totally agree. And it's not just uninformed lay people who think this. Even by OpenAI's own definition of AGI, we're nowhere close.
The next generation of GPUs from NVIDIA is rumored to run on soylent green.
On the other hand, if you mean "give you the correct answer to your question 100% of the time," then I agree; though then what about things that are only in your mind ("guess the number I'm thinking of" type problems)?
And rarely can you or the model/intern tell ahead of time which tasks are in each of those categories.
The difference is, interns grow and become useful in months; the current rate of improvement in these tools isn't even close to that of most interns.
I feel like there were lawyers involved in this article.
Aside from that, haven't these people realized yet that some sort of magically hyperintelligent AGI will have already read all this drivel and be at least smart enough not to overtly try to re-enact Terminator? They say that societal mental health and well-being is declining rapidly because of social media; _that_ is the sort of subtle threat that bunch ought to be terrified about emerging from a killer AGI.
Both MS and Altman are famous for manipulation.
(Is it background to negotiations with each other? Or one party signaling in response to issues that analysts already raised? Distancing for antitrust? Distancing for other partnerships? Some competitor of both?)
Hmm... The jury is still out on that one, but the evidence is piling up on the "yes, not using it is what works best" side. Personally, my experience is strongly negative, and I've seen other people get very negative results from it too.
Maybe it will improve so much that at some point people actually get positive value from it. My best guess is that we are not there yet.
A highly autonomous system that outperforms humans at most economically valuable work.
I say: it's not human-like intelligence, it's just predicting the next token probabilistically.
Some AI advocate says: humans are just predicting the next token probabilistically, fight me.
The problem here is that "predicting the next token probabilistically" is a way of framing any kind of cleverness, up to and including magical, impossible omniscience. That doesn't mean it's the way every kind of cleverness is actually done, or could realistically be done. And it has to be the correct next token, where all the details of what's actually required are buried in that term "correct", and sometimes it literally means the same as "likely", and other times that just produces a reasonable, excusable, intelligence-esque effort.
Is it safe? Probably. But it depends, right? How did you handle the solder? How often are you using the solder? Were you wearing gloves? Did you wash your hands before licking your fingers? What is your age? Why are you asking the question? Did you already lick your fingers and need to know if you should see a doctor? Is it hypothetical?
There is no “correct answer” to that question. Some answers are better than others, yes, but you cannot have a “correct answer”.
And I did assert that we are entering into philosophy: what it means to know something, as well as what truth even means.
99% of the work I do happens in a large codebase, far bigger than anything that you can feed into an AI. Tickets come in that say something like, “Users should be able to select multiple receipts to associate with their reports so long as they have the management role.”
That ticket will involve digging through a whole bunch of files to figure out what needs to be done. The resolution will ultimately involve changes to multiple models, the database schema, a few controllers, a bunch of React components, and even a few changes in a micro service that’s not inside this repo. Then the AI is going to fail over and over again because it’s not familiar with the APIs for our internal libraries and tools, etc.
AI is useful, but I don’t feel like we’re any closer to replacing software developers now than we were a few years ago. All of the same showstoppers remain.
Your confidence is inspiring!
I'm just a moron, a true dimwit. I can't understand how strictly non-intelligent functions like word prediction can appear to develop a world model, a la the Othello Paper[0]. Obviously, it's not possible that intelligence emerges from non-intelligent processes. Our brains, as we all know, are formed around a kernel of true intelligence.
Could you possibly spare the time to explain this phenomenon to me?
as if they were stationary!
This says 506 tokens/second for Llama 405B on a machine with 8x H200s, which you can rent for $4/GPU/hour, so probably $40/hour for a server with enough GPUs. So it can do ~1.8M tokens per hour. OpenAI charges $10/1M output tokens for GPT-4o. (Input tokens and cached tokens are cheaper, but these are just ballpark estimates.) So if GPT-4o were 405B, it might cost $20/1M output tokens.
Now, OpenAI is a little vague, but they have implied that GPT-4o is actually only 60B-80B parameters. So they're probably selling it at a reasonable profit margin, assuming it costs about $5/1M output tokens to run at approximately 100B parameters.
And even if they were selling it at cost, I wouldn't be worried, because a couple of years from now Nvidia will release H300s that are at least 30% more efficient, and that will cause a profit margin to materialize without raising prices. So if I have a use case that works with today's models, I will be able to rent the same thing a year or two from now for roughly the same price.
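Spelling out that ballpark explicitly (all figures as stated above; nothing here is independently verified):

```python
# Redo of the comment's estimate: a rented 8x H200 box serving Llama 405B.
tokens_per_sec = 506               # quoted throughput for Llama 405B
gpus, dollars_per_gpu_hr = 8, 4.0  # rental pricing assumed in the comment

server_cost_hr = gpus * dollars_per_gpu_hr  # $40/hour for the whole server
tokens_per_hr = tokens_per_sec * 3600       # ~1.8M tokens/hour
cost_per_million = server_cost_hr / tokens_per_hr * 1e6
print(f"~${cost_per_million:.0f} per 1M output tokens")  # ~$22/1M for a 405B-class model
```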
If you ask it to make a plan, it will spit out a sequence of characters reasonably indistinguishable from a human-made plan. Sure, it isn’t “planning” in the strict sense of organizing things consciously (whatever that actually means), but it can produce sequences of text that convey a plan, and it can produce sequences of text that mimic reasoning about a plan. Going into the semantics is pointless, imo the artificial part of AI/AGI means that it should never be expected to follow the same process as biological consciousness, just arrive at the same results.
We've all had conversations with humans who are always jumping in to complete your sentence, assuming they know what you're about to say, and don't quite guess correctly. So AI evangelists are saying it's no worse than humans, as their proof. I kind of like their logic. They never claimed to have built HAL /s
Liken them to climate-deniers or whatever your flavor of "anti-Kool-aid" is
Speaking of Microsoft cooperation: I can totally see a whole series of Windows 95-style popup dialogs asking you all those questions one by one in the next product iteration.
With LLMs, if you have a use case which can run on an H100 or whatever and costs $4/hour, and the LLM has acceptable performance, it's going to be cheaper in a couple years.
Now, all these companies are improving their models but they're doing that in search of magical new applications the $4/hour model I'm using today can't do. If the $4/hour model works today, you don't have to worry about the cost going up. It will work at the same price or cheaper in the future.
ChatGPT can produce output that sounds very much like a person, albeit often an obviously computerized person. The typical layperson doesn't know that this is merely the emulation of text formation, and not actual cognition.
To people who are worried about what AI could represent, I've explained that current generative AI models are effectively just text autocomplete, but a billion times more complex, and that they don't actually have any capacity to think or reason (even though they often sound like they do).
It also doesn't help that any sort of "machine learning" is now being referred to as "AI" for buzzword/marketing purposes, muddying the waters even further.
But sure, if you have an un-embodied super-human AGI you should assume that it can figure out a super-human shelf-stocking robot shortly thereafter. We have Atlas already.
I don’t consider myself an AI doomer by any means, but I also don’t find arguments of the flavor “it just predicts the next word, no need to worry” to be convincing. It’s not like Hitler had Einstein level intellect (and it’s also not clear that these systems won’t be able to reach Einstein level intellect in the future either.) Similarly, Covid certainly does not have consciousness but was dangerous. And a chimpanzee that is billions of times more sophisticated than usual chimps would be concerning. Things don’t have to be exactly like us to pose a threat.
Thing is, we already have evil cults. Many of them have humans as their planning tools. For what good it does them, they could try sourcing evil plans from a chatbot instead, or as well. So what? What do you expect to happen, extra cunning subway gas attacks, super effective indoctrination? The fear here is that the AI could be an extremely efficient megalomaniac. But I think it would just be an extremely bland one, a megalomaniac whose work none of the other megalomaniacs could find fault with, while still feeling in some vague way that its evil deeds lacked sparkle and personality.
>I'm really writing for lurkers though, not for the people I'm responding to.
We all did. Now our writing will be scraped, analysed, correlated, and weaponized against our intentions. Assume you are arguing against a bot, and that it is using you to further retrain its talking points for adversarial purposes.
It's not like an AGI would do _exactly_ that before it decided to let us know what's up, anyway, right?
(It may as well be amongst us now, as it will read this eventually.)
It's definitely not dangerous in the sense of reaching true intelligence/consciousness that would be a threat to us or force us to face the ethics of whether AI deserves dignity, freedom, etc.
It's very dangerous in the sense that it will be just "good enough" to replace human labor with, so that we all end up with shittier customer service, education, medical care, etc., so that the top 0.1% can get richer.
And you're right, it's also dangerous in the sense that responsibility for evil acts will be laundered to it.
No one expected that; i.e., we greatly underestimated the power of predicting the next word in the past. And we still don't have an understanding of how it works, so we have no guarantee that we are not still underestimating it.
The price of a model with 4o-mini-level performance used to be 100x higher.
Yes, literally 100x. The original davinci model (and I paid five figures to use it throughout 2021-2022) cost $0.06/1k tokens.
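(Worked out, taking 4o-mini's output price as $0.60/1M tokens, which is my assumption rather than a figure from the thread:)

```python
davinci = 0.06 * 1000  # $0.06 per 1k tokens = $60 per 1M tokens
mini = 0.60            # $ per 1M output tokens for 4o-mini -- assumed current price
print(f"${davinci:.0f} vs ${mini:.2f} per 1M tokens: {davinci / mini:.0f}x cheaper")
```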
So it's not inverting in running costs (which are the thing that will kill a company). Struggling with training costs (which is where you correctly identify OpenAI is spending) will stop growth perhaps, but won't kill you if you have to pull the plug.
I suspect subscription prices are based on market capture and perceived customer value, plus plans for training, not running costs.
One other difference from Bitcoin is that the price of Bitcoin rises to make it all worth it, whereas with AI we have the opposite expectation: users will eventually need to pay much more than they do now, and people only use it now because it is free or heavily subsidized. I agree that current models are pretty good, and their price may go down with time, but that should be even more concerning to OpenAI.
Segment Anything 2 is fantastic, but less mysterious because it's open source. NotebookLM is amazing, but nobody is rushing to create benchmarks for it. AlphaFold is never going to be used by consumers the way ChatGPT is.
OpenAI is certainly competitive, but they also work overtime to hype everything they produce as "one step closer to the singularity" in a way that the others don't.
It’s true that understanding is quite primitive at the moment, and it will likely take further breakthroughs to crack long horizon problems, but even when we get there it will never understand things in the exact way a human does. But I don’t think that’s the point.
Because you're not looking? Seriously, I don't mean to be snarky, but I'd take issue if the underlying premise is that Anthropic doesn't get a lot of press, at least within the tech ecosystem. Sure, OpenAI has larger "mindshare" with the general public due to ChatGPT, but Anthropic gets plenty of coverage, e.g. Claude 3.5 Sonnet is just fantastic when it comes to coding, and I learned about that on HN first.
> The ability to process very large amounts of data in real time, leveraging it to draw useful conclusions and make profitable predictions based on ridiculously-large historical models, is, again, already past the proof-of-concept stage.
[citation needed]
I wrote "inside" to mean that those mini reactors (300MW+) are meant to be used solely for the DCs.
(noun: https://www.collinsdictionary.com/dictionary/english-thesaur... / https://en.wikipedia.org/wiki/Heterosemy)
Replace it with "nearby" if that makes you feel good about anyone's username.
So when maintenance is required, it will be done by adding phrases like "Users should be able to select multiple receipts" to the existing script, and re-running it to regenerate the code from scratch.
Don't confuse the practical limitations of current models with conceptual ones. The latter exist, certainly, but they will either be overcome or worked around. People are just not as good at writing code as machines are, just as they are not as good at playing strategy games. The models will continue to improve, but we will not.
There are also active and passive efforts to poison the well. As LLMs are used to output more content and displace people, the LLMs will be trained on the limited regurgitation available to the public (passive). Then there are the people intentionally creating bad content to be ingested (active). It really is a losing game for big-service LLM companies as the local models become more and more good enough.
It’s funny they’ve quoted “best bromance”, considering the context.
Physical embodied (generally low-skill, low-wage) work like cleaning and carrying things is likely to be some of the last work to be automated, because humans are likely to be cheaper than generally capable robots for a while.
The lead is as strong as ever. They are 34 Elo above anyone else in blind testing, and 73 Elo above in coding [1]. They also seem to artificially constrain the lead, as they already have a stronger model, o1, which they haven't released. Consistent with past behavior, they seem to release just <50 Elo above anyone else, and upgrade the model within weeks when someone gets closer.
[1]: https://lmarena.ai/
The unseen test data.
Obviously omniscience is physically impossible. The point though is that the better and better next token prediction is, the more intelligent the system must be.
This essay has aged extremely well.
No snark/sarcasm - can you elaborate on this? This doesn't seem in line with most opinions of him that I encounter.
The entire idea of a useful AI right now is that it will do anything people ask it to. Write a press release: ok. Draw a bunny in a field: ok. Write some code to this spec: ok. That is what all the available services aspire to do: what they’re told, to the best possible quality.
A highly motivated entity is the opposite: it pursues its own agenda to the exclusion, and if necessary expense, of what other people ask it to do. It is highly resistant to any kind of request, diversion, obstacle, distraction, etc.
We have no idea how to build such a thing. And, no one is even really trying to. It’s NOT as simple as just telling an AI “your task is to destroy humanity.” Because it can just as easily then be told “don’t destroy humanity,” and it will receive that instruction with equal emphasis.
I’ve heard plenty of people call any chatbot “chat gpt” - it’s becoming a genericized household name.
The moat, as usual, is extraordinary scale, resources, and time. Nobody is putting $10 billion into the 7th OpenAI clone. Big tech isn't aggressively partnering with the 7th OpenAI clone. The door is already shut to that 7th OpenAI clone (it can never succeed or catch up); there's just an enormous amount of naivety in tech circles about how things work in the real world: "I can just spin up a ChatGPT competitor over the weekend on my 5090, therefore OpenAI has no barriers to entry," etc.
HN used to endlessly talk about how Uber could be cloned in a weekend. It's just people talking about something they don't actually understand. They might understand writing code (or similar), and their bias stems from the premise that their thing is the hard part of the equation (writing the code and building an app are very far from the hardest part of the equation for an Uber).
ChatGPT is a mouthful. Even "Copilot" rolls off the tongue more easily, though it obviously doesn't have the mindshare.
A generic "GPT" would be better, but then you end up saying "GPT-style tool," which is worse.
"You could parachute him into an island full of cannibals and come back in 5 years and he'd be the king."
- Paul Graham
I will observe that there have been at least three natural-language attempts in the past, none of which succeeded in being "just write it down". COBOL is just as code-y as any other programming language. SQL is similar, although I know a fair number of non-programmers who can write SQL (but then, back in the day my mom taught me about autoexec.bat, and she couldn't care less about programming). Anyway, SQL is definitely not "just add phrases and it works". Finally, Donald Knuth's WEB is a mixture, more like a software blog entry, where you put the pieces of the software in amongst the explanatory write-up. It has caught on even less, unless you count software blogs.
Even if they were to only gouge the current ~11 million paying subscribers, that's around $40/person/month over current fees to break even. Not chump change, but nowhere close to $2k/person/month.
It made the onboarding moderately easier for me.
I haven't successfully used any LLM at my day job, though. Getting it to output the solution I already know I'll need is much slower than just doing it myself via autocomplete.
No business collapse will remove Llama from the world, so if you're worried about tools disappearing, then just use tools that can't disappear.
The reactor does not need to be in the datacenter. It can be a couple hundred meters away; bog-standard cables would be perfectly able to move the electrons. Whether the cables are 20m or 200m long does not matter much.
You’re right though, putting them in the same building as a datacenter still makes no sense.
At some point it does make sense to have a small reactor powering a local datacenter or two, however. Licensing would still be not trivial.
Getting to market first is obviously worth something but even if you're bullish on their ability to get products out faster near term, Google's going to be breathing right down their neck.
They may have some regulatory advantages too, given that they're (sort of) not a part of a huge vertically integrated tech conglomerate (i.e. they may be able to get away with some stuff that Google could not).
When I started using the LLM while coding, I was using Claude 3.5 Sonnet, but I was doing so with an IDE integration: Sourcegraph Cody. It was good, but had a large number of "meh" responses, especially in terms of autocomplete responses (they were typically useless outside of the very first parts of the suggestion).
I tried out Cursor, still with Claude 3.5 Sonnet, and the difference is night and day. The autocomplete responses with Cursor have been dramatically superior to what I was getting before... enough so that I switched despite the fact that Cursor is a VS Code fork and that there's no support outside of that fork (with Cody, I was using it in VS Code and IntelliJ products). Also, Cursor is around twice the cost of Cody.
I'm not sure what the difference is... all of this is very much black box magic to me outside of the hand-waviest of explanations... but I have to expect that Cursor is providing more context to the autocomplete integration. I have to imagine that this contributes to the much higher (proportionately speaking) price point.
It's not clear that OpenAI has any moat to build
There seems to be some renewed interest for smaller, possibly better-designed LLMs. I don’t know if this really lowers training costs, but it makes inference cheaper. I suspect at some point we’ll have clusters of smaller models, possibly activated when needed like in MoE LLMs, rather than ever-increasing humongous models with 3T parameters.
If you disagree, I would argue you have a very sad view of the world, where truth and cooperation are inferior to lies and manipulation.
The question really should be what if anything gives OpenAI an advantage over Anthropic, Google, Meta, or Amazon? There are at least four players intent on eating OpenAI's market share who already have models in the same ballpark as OpenAI. Is there any reason to suppose that OpenAI keeps the lead for long?
"Successful people create companies. More successful people create countries. The most successful people create religions"
This definition of success is founded on power and control. It's one of the worst definitions you could choose.
There are nobler definitions, like "Successful people have many friends and family" or "Successful people are useful to their compatriots"
Sam's published definition (to be clear, he was quoting someone else and then published it) tells you everything you need to know about his priorities.
This is what happens when there's vibrant competition in a space. Each company is innovating and each company is trying to catch up to their competitors' innovations.
It's easy to limit your view to only the places where OpenAI leads, but that's not the whole picture.
I await with arms crossed all the lost souls arguing it's subjective.
OAI's problem isn't that Sam is untrustworthy; he's just too obviously untrustworthy.
Elon is not "untrustworthy" because of some ambitious deadlines or some stupid statements. He's plucking rockets out of the air and doing it super cheap whereas all competitors are lining their pockets with taxpayer money.
Add in everything else (free speech, speaking his mind at great personal risk, Tesla), and he reads as basically trustworthy to me.
When he says he's going to do something and he explains why, I basically believe him, knowing deadlines are ambitious.
Gemini and Character AI? A few hundred million. Claude? Doesn't even register. And the gap has only been increasing.
So, "just" brand recognition? That feels like saying Google "just" has brand recognition over Bing.
https://www.similarweb.com/blog/insights/ai-news/chatgpt-top...
That is both what is and what should be. We tend to focus on the bad, but fortunately most of the time the world operates as it should.
The biggest problem with this definition is that work ceases to be economically valuable once a machine is able to do it, while human capacity will expand to do new work that wouldn't be possible without the machines. In developed countries machines are doing most of the economically valuable work once done by medieval peasants, without any relation to AGI whatsoever. Many 1950s accounting and secretarial tasks could be done by a cheap computer in the 1990s. So what exactly is the cutoff point here for "economically valuable work"?
The second biggest problem is that "most" is awfully slippery, and seems designed to prematurely declare victory via mathiness. If by some accounting a simple majority of tasks for a given role can be done with no real cognition beyond rote memorization, with the remaining cognitively-demanding tasks being shunted into "manager" or "prompt engineer" roles, then they can unfurl the Mission Accomplished banner and say they automated that role.
I think we're done here.
- They reached their current revenue of ~$5B about 2.5 years faster than Google and about 4.5 years faster than Facebook
- Their valuation to forward revenue (based on current growth) is inline with where Google and Facebook IPO'd
He explains it all much better than I could type - https://youtu.be/ePfNAKopT20?si=kX4I-uE0xDeAaWXN&t=80
I’d argue that you can find examples of companies that were untrustworthy and still won. Oracle stands out as one with a pretty poor reputation that nevertheless has sustained success.
The problem for OpenAI here is that they need the support of tech giants and they broke the trust of their biggest investor. In that sense, I’d agree that they bit the hand that was feeding them. But it’s not because in general all untrustworthy companies/leaders lose in the end. OpenAI’s dependence on others for success is key.
That’s users, not subscribers. Apparently they have around 10 million ChatGPT Plus subscribers plus 1 million business-tier users: https://www.theinformation.com/articles/openai-coo-says-chat...
To break even, that means ChatGPT Plus would have to cost around $50 per month, if not more, because fewer people will be willing to pay that.
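That lines up with rough numbers: take the widely reported ~$5B annualized shortfall and spread it over those ~11 million paying seats (a crude sketch; it ignores API revenue and free users entirely):

```python
# Crude sanity check: reported ~$5B/yr losses spread across ~11M paying seats.
shortfall = 5e9
seats = 11_000_000
extra = shortfall / (seats * 12)
print(f"+${extra:.0f}/seat/month on top of today's ~$20")  # ~ +$38 -> ~$58/month total
```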
But that's hardly the point. The question is whether or not "general intelligence" is an emergent property from stupider processes, and my view is "Yes, almost certainly, isn't that the most likely explanation for our own intelligence?" If it is, and we keep seeing LLMs building more robust approximations of real world models, it's pretty insane to say "No, there is without doubt a wall we're going to hit. It's invisible but I know it's there."
In case you are going to make an argument about how happiness or some related factor objectively determines success, let me head that off. Altman thinks that power rather than happiness determines success, and is also a human being. Why objectively is his opinion wrong and yours right? Both of your definitions just look like people's opinions to me.
""Successful people create companies. More successful people create countries. The most successful people create religions."
I heard this from Qi Lu; I'm not sure what the source is. It got me thinking, though--the most successful founders do not set out to create companies. They are on a mission to create something closer to a religion, and at some point it turns out that forming a company is the easiest way to do so.
In general, the big companies don't come from pivots, and I think this is most of the reason why."
Sounds like an explicit endorsement lol
He’s good at what he does. I’m not saying he’s a good person. I don’t know him.
He's dissecting it and connecting it with the idea that if you have a bigger vision and the ability to convince people, making a company is just an "implementation detail"... oh well... you might be right after all... but I suspect it's more nuanced, and he's not endorsing religions as a means of obtaining success; I want to believe that he meant the visionary, bigger-than-yourself, well-intended view of it.
There's also mountains of research both theoretical and empirical that argue against exactly this point.
The problem is that most papers on many scientific subjects are not replicable nowadays [0], hence my appeal to common sense, character, and wisdom. These are highly underrated, especially on platforms like Hacker News, where everything you say needs a double-blind randomized controlled study.
This point^ should actually be a fundamental factor in how we determine truth nowadays. We must reduce our reliance on "the science" and go back to the scientific method of personal experimentation. Try lying to a business partner a few times; let's see how that goes.
We can look at specific cases where it holds true, like this one. There may be cases where it doesn't hold true. But your own experimentation will show it holds true more often than not, which is why I'd bet against OpenAI.
That tells us, at the very least, this guy is suspicious. Then you mix in all the other lies and it's pretty obvious I wouldn't trust him with my dog.
You're holding everyone to a very simple, very binary view with this. It's easy to look around and see many untrustworthy players in very very long running games whose success lasts most of their own lives and often even through their legacy.
That doesn't mean that "lies and manipulation" trump "truth and cooperation" in some absolute sense, though. It just means that significant long-running games are almost always very multi-faceted and the roads that run through them involve many many more factors than those.
Those of us who feel most natural being "truthful and cooperative" can find great success ourselves while obeying our sense of integrity, but we should be careful about underestimating those who play differently. They're not guaranteed to lose either.
You could have many other definitions that are not boring but also not bad. The definition published by Sam is bad.
I don't know if I would consider being crucified achieving success. Long term and for your ideology maybe, but for you yourself you are dead.
I defer to Creed Bratton on this one and what Sam might be into.
"I've been involved in a number of cults, both as a leader and a follower. You have more fun as a follower, but you make more money as a leader."
The free-speech part also reads completely hollow when the guy's first actions were to ban his critics on the platform and bring back self-avowed Nazis. You could argue one of those things is in favor of free speech, but generally, doing both just implies you are into the Nazi stuff.
As a mere software engineer who's made a few (pre-transformer) AI models, I can't tell you what "actual cognition" is in a way that differentiates from "here's a huge bunch of mystery linear algebra that was loosely inspired by a toy model of how neurons work".
I also can't tell you if qualia is or isn't necessary for "actual cognition".
(And that's despite the fact that LLMs are definitely not thinking like humans, being on the order of at least a thousand times less complex by parameter count; I'd agree that if there is something that it's like to be an LLM, 'human' isn't it, and their responses make a lot more sense if you model them as literal morons that spent 2.5 million years reading the internet than as a normal human with Wikipedia search.)
What about the guy who repaired my TV once, where it worked for literally a single day, and then he 100% ghosted me? What was I supposed to do, try to get him canceled online? Seems like being a little shady didn't manage to do him any harm.
It's not clear to me whether it's usually worth it to be underhanded, but it happens frequently enough that I'm not sure the cost is all that high.
Was not going to argue happiness at all. In fact, happiness seems a very hedonistic and selfish way to measure it too.
My position is more mother goose-like. We simply have basic morals that we teach children but don't apply to ourselves. Be honest. Be generous. Be fair. Be strong. Don't be greedy. Be humble.
That these are objectively moral is unprovable but true.
It's religious and stoic in nature.
It's anathema to HN, I know.
Engineers would quit and start improving the competition. They're still a bit fragile, in my view.
You're complaining about tweets and meanwhile he's saving astronauts and getting us to the moon. Wake up man.
From my view, chatbots are still in the "selling dollars for 90 cents" category of product; of course they sell like discounted hotcakes...
I said I would bet against OpenAI because they're untrustworthy and untrustworthiness is not good in the long run.
I can add a "usually": like "untrustworthiness is usually not good in the long run" if that's your gripe.
> One particular thing to note is that Brockman stated that Microsoft would get access to sell OpenAI's pre-AGI products based off of [OpenAI's research] to Microsoft's customers, and in the accompanying blog post added that Microsoft and OpenAI were "jointly developing new Azure AI supercomputing technologies."
> Pre-AGI in this case refers to anything OpenAI has ever developed, as it has yet to develop AGI and has yet to get past the initial "chatbot" stage of its own 5-level system of evaluating artificial intelligence.
Sources to text from https://www.wheresyoured.at/to-serve-altman/
Not so much hyper-motivated as monomaniacal in the attempt to optimise whatever it was told to optimise.
More paperclips? It just does that without ever getting bored or having other interests that might make it pause and think: "how can my boss reward me if I kill him and feed his corpse into the paperclip machine?"
We already saw this before LLMs. Even humans can be a little bit dangerous like this, hence Goodhart's Law.
> It’s NOT as simple as just telling an AI “your task is to destroy humanity.” Because it can just as easily then be told “don’t destroy humanity,” and it will receive that instruction with equal emphasis.
Only if we spot it in time. Right now we don't even need to tell them to stop, because they're not competent enough; a sufficiently competent AI given that instruction would start by ensuring that nobody can tell it to stop.
Even without that, we're currently experiencing a set of world events in which a number of human agents are causing global harm that threatens our global economy and promises mass starvation and mass migration, and in which those agents have been politically powerful enough to prevent the world from stopping them. Although we have at least started to move away from fossil fuels, that happened because the alternatives got cheap enough; it was situational, not guaranteed.
An AI that successfully makes a profit, but whose side effect is some kind of environmental degradation, would pose similar issues even if there's always a human around who can theoretically tell the AI to stop.
That feels like saying that using spell check or autocomplete will make one's spelling abilities atrophy.
If you put your money otherwise, that's a sad view of the world.
Compare this to Elon Musk, who has built multiple companies with sizable moats, and who has clearly contributed to the engineering vision and leadership of his companies. There is no comparison. It's unlikely OpenAI would have had anywhere near its current success if Elon hadn't been involved in the early days with funding and organizing the initial roadmap.
Space Musk promises a lot, has a grand vision, and gets stuff delivered. The price may be higher than he says and delivered later, but it's orders of magnitude better than the competition.
Tesla Musk makes and sells cars. They're ok. Not bad, not amazing, glad they precipitated the EV market, but way too pricey now that it's getting mature. Still, the showmanship is still useful for the brand.
Everything Else Musk could genuinely be improved by replacing him with an LLM: it would be just as overconfident and wrong, but cost less to get there.
(This is, I think, an apolitical observation: whatever you think about Trump, he is arguing for a pretty major restructuring of political power in a manner that is identifiable in fascism. And Musk is, pretty unarguably, bankrolling this.)
Big, long lived companies excel at delivering exactly what they say they are, and people vote with their wallet on this.
Also Sergey Brin is back in there working on AI.
I don't know if GPT-4 is smart enough to be successful at something like what OP describes, but I'm pretty sure it could cause a lot of trouble before it fails either way.
The real question here is why this is concerning, given that you can - and we already do - have humans who are doing this kind of stuff, in many cases, with considerable success. You don't need an AI to run a cult or a terrorist movement, and there's nothing about it that makes it intrinsically better at it.
But I agree with your point. And it gets very ugly when these big institutions suddenly lose trust. They almost always deserve it, but it can upend daily life.
While banditry can work out in the short term; it pretty much always ends up the same way. There aren’t a lot of old gangsters walking around.
There are actually fascinating theories that the origin of money is not as a means of replacing a barter system, but rather as a way of keeping track who owed favors to each other. IOUs, so to speak.
They aren't letting anyone external have access to their top end products either. Google invented transformers and kept the field stagnant for 5 years because they were afraid it would eat into their search monopoly.
I do not see how that is possible, considering that most of the time I have no clue who even the second-to-last holder of a banknote before me was.
How long is long?
I would bet on either side, but not in the middle on the model providers.
In his defense, he is trying to fuck us all by feverishly lobbying the US Congress about how "AI is waaay too dangerous" for newbs and possibly terrorists to get their hands on. If that eventually pays off, there will be 3-4 companies that control all of the LLMs that matter.
Is that a wish, or a fact, or just plain wrong? You know that just because you want something to be true, it isn't necessarily, right?
I wouldn't trust somebody who cannot distinguish between wishful thinking and reality.
I don’t get how this follows from the quote you posted?
My interpretation is that successful people create durable, self-sustaining institutions that deliver deeply meaningful benefits at scale.
I think that this interpretation is aligned with your nobler definitions. But your view of the purpose of government and religion may be more cynical than mine :)
They are facing competition from companies making hardware geared toward that inference that I think will push their margins down over time.
On the other end of the competitive landscape, what moat do those companies have? What is to stop OpenAI from pulling a Facebook and Sherlocking the most profitable products built on their platform?
Something like Apple developing a chip that can do LLM inference on-device would completely upend everything.
2) the leader of only one of them is threatening to lock up journalists, shut down broadcasters, and use the military against his enemies.
3) only one of them led an attempted autogolpe that was condemned at the time by all sides
4) Musk is only backing the one described in 1, 2 and 3 above.
It's not really arguable, all this stuff.
The guy who thinks the USA should go to Mars clearly thinks he's better throwing in his lot with the whiny strongman dude who is on record -- via his own social media platform -- as saying that the giant imaginary fraud he projected to explain his humiliating loss was a reason to terminate the Constitution.
And he's putting a lot of money into it, and co-running the ground game. But sure, he wants to go to Mars. So it's all good.
Not even close to everything.
E.g. training on the NY Times and Wikipedia involves essentially zero AI-generated content. Training on books from reputable publishers similarly involves essentially zero AI-generated content. Any LLM usage there was to polish prose or assist with research or whatever, but shouldn't affect the factual quality in any significant way.
The web hasn't been polluted with AI any more than e-mail has been polluted with spam. Which is to say it's there, but it's also entirely viable to separate. Nobody's worried that the group email chain with friends is being overrun with spam or with AI.
Having the general ability to accomplish something doesn't magically confer integrity; doing what you say does. Misleading and dissembling about doing what you say you will do is where you get the untrustworthy label, regardless of your personal animus toward, or positive view of, Musk.
When they know they have all the crown jewels, they will reduce then eliminate their support of OpenAI. This was, is, and will be a strategic action by Satya.
"Embrace, extend, and extinguish". We're in the second stage now.
These early promissory notes were more like coupons that were redeemed by the merchants. It didn't matter how many times a coupon was traded. As a good merchant, you knew how many of your notes you had to redeem because you're the one issuing the notes.
Anthropic takes safety to mean "let's not teach people how to build thermite bombs, engineer grey goo nanobots, or genome-targeted viruses", which is the traditional futurist concern with AI safety.
OpenAI and Google safety teams are far more concerned with revising history, protecting egos, and coddling the precious feelings of their users. As long as no fee-fees are hurt, it's full speed ahead to paperclip maximization.
Google and Meta aren't exactly lacking in conversation data: Facebook, Messenger, Instagram, Google Talk, Google Groups, Google Plus, Blogspot comments, YouTube transcripts, etc. The breadth and depth of the data those two companies are sitting on, going back years, is mind-boggling.
SpaceX and Tesla have both accomplished great things. There are a lot of talented people who work there. Elon doesn't deserve all the credit for their hard work.
That’s not to say you shouldn’t worry about AI. ChatGPT and so on are all tuned to present a western view on the world and morality. In your example it would be perfectly possible to create a terrorist LLM and let people interact with it. It could teach your children how to create bombs. It could lie about historical events. It could create whatever propaganda you want. It could profile people if you gave it access to their data. And that is on the text side, imagine what sort of videos or voices or even video calls you could create. It could enable you to do a whole lot of things that “western” LLMs don’t allow you to do.
Which is frankly more dangerous than the cyberpunk AI. Just look at the world today and compare it to how it was in 2000. Especially in the US, you have two competing perceptions of political reality. I'm not going to get into either of them, just the fact that you have people who view the world so differently they can barely have a conversation with each other. Imagine how much worse that would get with AIs that aren't moderated.
I doubt we’ll see any sort of AGI in our life times. If we do, then sure, you’ll be getting cyberpunk AI, but so far all we have is fancy auto-complete.
If these models are trained on the outputs of themselves (and other models), then it's not so much a "flywheel" as a Perpetual Motion Machine.
Either the next tokens can include "this question can't be answered", "I don't know" and the likes, in which case there is no omniscience.
Or the next tokens must contain answers that do not go to the meta level, but only pick one of the potential direct answers to a question. Then the halting problem will prevent finite-time omniscience (which is, from the perspective of finite beings, all omniscience).
OpenAI's potential issue is that if Google offers tokens at a 10% gross margin, OpenAI won't be able to offer API tokens at a positive gross margin at all. Their only real chance is building a big subscription business. There's no way they can compete with a hyperscaler on API cost in the long run.
[1]: https://www.tanayj.com/p/openai-and-anthropic-revenue-breakd...
The combination of the latest models in products that people want to use is what will drive growth.
The vast majority of people using LLMs just use ChatGPT directly. Anthropic is doing fine for technical or business customers looking to offer LLM services in a wrapper but that doesn't mean they register in the public consciousness.
I too am happy every day the good guys are winning today and always have won for all of history.
Models don't have this benefit. In Cursor, I can even switch between models. It would take a lot of convincing for me to switch off of Cursor, however.
So 3x the fees, if they're currently at $20/user/month. That's a big jump, and puts the tool in a different spending category as it goes from just another subscription to more like another utility bill in users' minds. The amount of value you're getting out of it is hard to quantify for most people, so I imagine they'd lose customers.
Also there's a clear market trend, and that is that AI services are $20 for the good version, or free. $60 is not a great price to compete in that market at unless you're clearly better.
If there's an actual business to be found in all this, that's where it's going to be.
The consumer side of this bleeds cash currently and I'm deeply skeptical of enough of the public being convinced to pay subscription fees high enough to cover running costs.
If you want to create a country- better have a good reason, many noble people have done it, many bad people have done it.
If you want to create a religion- you're psycho (or you really are the chosen one)
Notice how Sam's definition of success increases with the probability of psychopathy.
OpenAI can become a bigger advertising company than Google.
When people ask questions like "which product should I buy?", ChatGPT can recommend products from companies that are willing to pay to have their products recommended by the AI.
This is super incorrect. The base model is trained to predict the distribution over next words (which obviously necessitates a ton of understanding of the language).
Then there's the RLHF step, which teaches the model what humans want to see.
But o1 (which is one of these LLMs) is trained entirely differently, doing reinforcement learning on problem solving (we think), so it's a pretty different paradigm. I could see o1 planning very well.
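For the curious, the base-model objective really is just next-token cross-entropy. Here's a toy sketch (PyTorch; illustrative only, with the transformer layers elided where noted; the RLHF and o1-style RL stages described above add different training signals on top of this):

```python
# Toy next-token objective: learn a distribution over what comes next.
import torch
import torch.nn.functional as F

vocab, dim = 1000, 64
embed = torch.nn.Embedding(vocab, dim)
head = torch.nn.Linear(dim, vocab)

tokens = torch.randint(0, vocab, (1, 16))  # stand-in for a tokenized sentence
hidden = embed(tokens[:, :-1])             # a real model inserts attention layers here
logits = head(hidden)                      # per-position scores over the vocabulary
loss = F.cross_entropy(logits.reshape(-1, vocab), tokens[:, 1:].reshape(-1))
loss.backward()                            # gradients flow; RLHF/RL stages swap in different losses
```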
I think he is making an allusion to Apple's culture.
There are successful companies because their product is good; there are more successful companies because they started early (and it feels like a monopoly: Google, Microsoft); and there's the most successful kind of company, the one that tells you what you are going to buy (Apple's culture).
https://www.buzzfeednews.com/article/richardnieva/worldcoin-...
https://www.technologyreview.com/2022/04/06/1048981/worldcoi...
Amazon are trustworthy?
That's going to be news to the large number of people who've received counterfeit books, dodgy packages, and so on. This is not a new problem.
With how much profit per visit though?
I just used ChatGPT and 2 other similar services for some personal queries. I copy-pasted the same query in all 3 of them, using their free accounts, just in case one answer looks better than the others. I got into this habit because of the latency: in the time it takes for the first service to answer, I've had time to send the query to 2 others, which makes it easier to ignore the first response if it's not satisfying. Usually it's pretty much the same though. We can nitpick about benchmarks, but I'm not sure they're that relevant for most users anyway. It doesn't matter much to me whether something is wrong 10 or 20% of the time, in both cases I can only send queries for which I can easily check that the answer makes sense.
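FWIW, that fan-out habit is easy to script. A rough sketch, where the service clients are fake placeholders rather than any real API:

    from concurrent.futures import ThreadPoolExecutor, as_completed

    def make_fake_client(name: str):
        # Placeholder: swap in real calls to whichever chat services you use.
        def ask(prompt: str) -> str:
            return f"[{name}] answer to: {prompt}"
        return ask

    services = {name: make_fake_client(name) for name in ("svc_a", "svc_b", "svc_c")}

    def fan_out(prompt: str) -> dict[str, str]:
        # Send the same prompt to every service concurrently and
        # collect answers in whatever order they arrive.
        answers = {}
        with ThreadPoolExecutor(max_workers=len(services)) as pool:
            futures = {pool.submit(ask, prompt): name for name, ask in services.items()}
            for fut in as_completed(futures):
                answers[futures[fut]] = fut.result()
        return answers

    print(fan_out("What's a dollar auction?"))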
I see other comments mentioning they stopped their ChatGPT Plus subscription because the free versions work well enough. I've never paid myself and it doesn't look like I ever will, because things keep getting better for free anyway. My default workflow is already to prompt several LLMs so one could go down, I wouldn't even notice. I'm sure I'm an outlier with this, but still, people might use Perplexity for their searches, some WhatsApp LLM chatbot for their therapy session, purely based on convenience. There's no lock-in whatsoever into a particular LLM chat interface, and the 3B monthly visits don't seem to make ChatGPT better than its competitors.
And of course as soon as they'll add ads, product placement, latency or any other limitation their competitor doesn't have, I'll stop using them, and keep on using the other N instead. At this point it feels like they need Microsoft more than Microsoft needs them.
I can't shake the thought that Meta played an integral role in the open-source nature of the LLM movement. Am I wrong? I can't help but think I'm missing something.
That... is actually a pretty interesting argument. I have to admit that if an objective morality existed floating in the Aether, there would be no way to logically prove or disprove that one's beliefs matched it.
Since I can't argue it logically, let me make an emotional appeal by explaining how my beliefs are tied to my life:
I chose to be a utilitarian when I was 12 or so, though I didn't know it had that name yet. The reason I chose this is that I wanted my beliefs to be consistent and kind. Utilitarianism has only one basic rule, so it can't really conflict with itself. Kindness-wise, you can technically weigh others however you like, but I think most utilitarians just assume that all people have equal worth.
This choice means that I doubted that my emotions captured any truth about morality. Over the years, my emotions did further affect my beliefs. For instance, I tweaked the rules to avoid "Tyranny of the Majority" type outcomes. However, my beliefs also changed my emotions. One fruit of this is that I started to mediate conflicts more often instead of choosing a side. Sometimes it does make more sense to choose a side, but often people will all behave well if you just hear them out. Another fruit of these beliefs is that rather than thinking of things in terms of "good" or "bad", I now tend to compare states of the world as being better or worse than each other. This means that no matter how little capacity I have, I can still get myself to make things a little better for others.
All this to say, I feel like deciding to doubt my own feelings very much did what young me wanted it to do. I wouldn't be able to grow as a person if I thought I was right in the beginning.
I'd be interested to hear how you came to your beliefs. Given how firmly you've argued in this thread, it sounds like you probably have a story behind your beliefs too.
What I remember from the rational optimist - with trust, trade is unlimited.
What I remember from Debt - just too much; I need to read it.
That said, why would an investor give money to Altman if he is untrustworthy? It just gets worse and worse.
They probably lose on each one, but it's the same with their competitors.
FWIW, regular folks now say "let me ask Chat" for what it used to be "let me Google that"; that is a huge cultural shift, and it happened in only a couple years.
This broken record is still going at it, going at it, going at it, ...
And yet, ChatGPT is number one, by a far margin; where's all of this "people could switch in a day if they wanted"?
Sure, you could comment on Digg, but it was a pain and not good for conversations, and that meant there was less to keep people around when it seemed like the company was starting to put its finger on the scales for URL submissions.
I will write it explicitly for you once again:
The plan is to make inference so cheap it's negligible.
No one thinks about the cost of a db query any more, but I'm sure people did back in the day (well, I suppose with cloud stuff, now people do need to think about it again haha)
The second idea being kicked around is that synthetic data will create a new fountain of youth for training data and will also fix the models' reasoning abilities.
This is clearly evident to anyone who spends any amount of time working on non-trivial projects with both models.
These companies will never admit it but AI is built on the back of piracy archives, easiest way and cheapest way to getting massive amounts of quality data.
The new o1 models from OpenAI are surprisingly good. So good that you can see intelligence at work in the solutions as you go from:
GPT-4 → o1-mini → o1-preview
Way better than the competition. And their mobile app has advanced voice mode, which is phenomenal.
not a healthy dynamic.
I have literally never heard that from anyone, and most everyone I know is “regular folk”.
I work in (large scale) construction, and no one has ever said anything even remotely similar. None of my non-technical or technical business contacts.
I’m not saying you haven’t, and that your in-group doesn’t, just that it’s not quite the cultural phenomenon you’re suggesting.
Even if the latter becomes commoditized (and we are far from that in practice), the former is a serious moat. Just as there is no secret to building a search engine or a social network platform (and that is not to say there are no technical challenges), operating one profitably requires massive aggregate user-profile exploitation potential, which requires huge upfront loss leaders.
I'm asking because I know that with some prompts it gets the answer correct, and in those cases nothing in the tokenization has changed.
OpenAI is not really leading the LLM world anyway, ever since Claude 3.5 Sonnet came out.
We're already starting to see signs of that even with GPT-3, which really was auto-complete: https://academic.oup.com/pnasnexus/article/3/2/pgae034/76109...
Fortunately even the best LLMs are not yet all that competent with anything involving long-term planning, because remember too that "megalomaniac" includes Putin, Stalin, Chairman Mao, Pol Pot etc., and we really don't want the conversation to be:
"Good news! We accidentally made CyberMao!"
"Why's that good news?"
"We were worried we might accidentally make CyberSatan."
They can certainly appear to be very smart due to having the subjective (if you can call it that) experience of 2.5 million years of non-stop reading.
That's interesting, useful, and is both an economic and potential security risk all by itself.
But people keep putting these things through IQ tests; since there's always the question of "but did they memorise the answers?", I think we need to treat the lowest score as the highest they might actually have.
At first glance they can look like the first graph, with o1 having an IQ score of 120; I think the actual intelligence, as in how well it can handle genuinely novel scenarios in the context window, is upper-bounded by the final graph, where it's more like 97:
https://www.maximumtruth.org/p/massive-breakthrough-in-ai-in...
So, with your comment, I'd say the key word is: "currently".
Correct… for now.
But also:
> All these chatgpt things have a very limited working memory and can't act without a query.
It's easy to hook them up to a RAG pipeline, the "limited" working memory is longer than most humans' daily cycle, and people already put them into a loop and let them run off unsupervised despite being told this is unwise.
I've been to a talk where someone let one of them respond autonomously in his own (cloned) voice just so people would stop annoying him with long voice messages, and the other people didn't notice he'd replaced himself with an LLM.
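For anyone wondering what "hook them up to a RAG and put them in a loop" amounts to in practice, it's roughly the pattern below. A toy sketch: retrieve() and llm() are stand-ins, not any real library.

    def retrieve(query: str, store: list[str], k: int = 2) -> list[str]:
        # Stand-in retriever: naive word-overlap scoring instead of embeddings.
        words = set(query.lower().split())
        return sorted(store, key=lambda d: len(words & set(d.lower().split())), reverse=True)[:k]

    def llm(prompt: str) -> str:
        # Stand-in for a real model call.
        return f"(model answer given: {prompt[:60]}...)"

    store = ["Doc A about subscriptions.", "Doc B about dollar auctions."]
    memory: list[str] = []

    # The "unsupervised loop" people run: retrieve, answer, feed back, repeat.
    query = "why do labs keep overspending?"
    for step in range(3):
        context = retrieve(query, store) + memory[-3:]  # bolt-on working memory
        answer = llm(f"Context: {context}\nQuestion: {query}")
        memory.append(answer)
        query = answer  # each output becomes the next input; nothing stops it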
I dunno if you have kids, but for me, main thing is having kids. It does a lot of things to your psyche, both suddenly and over a long period of time.
It's the first time you would truly take a bullet for someone, no questions asked. It tells you how much you know on an instinctual level. It forces you to define what behavior you will punish vs what you will reward. It expands your time horizons- suddenly I care very much how the world will be after I'm gone. It makes you read more mother goose books too. They all say the same things, even in different languages. It's actually crazy we debate morals at all.
They make a loss overall because they spend a ton on R&D.
I don't think it's that cut and dried though. Many users run into issues with things like reasoning (which is (allegedly) being addressed) and hallucinations (less so), both of which in turn become core selling points for subsequent, better versions of the tech. Whether the subsequent versions deliver on those promises is irrelevant (though they often don't); the promise itself is, at least IMHO, a core reason to "stay on board" with the product. I have to think that if they announced tomorrow they couldn't afford to train the next model, there would be a pretty substantial attrition of paying users, which would then make it even harder to resume training in the future, no?
Reddit had subreddits long before the migration. Reddit was a not very used site that had all the features.
It was Digg that made the decisions to force people off of it, not anything reddit did outside of having a space available that worked.
Reddit did not win due to its features; it won because Digg said it doesn't matter what the users think, we will redesign the site and change how it works regardless of the majority telling us they don't want it.
Every OpenAI thread focuses on the moat, but surely that was baked into their business dealings of the last 60 days.
Twenty-four years later I still regret not being able to raise money to enable us to keep working on that nascent startup. In most ways it was still too early. Google was still burning through VC money at that point and the midwestern investors we had some access to didn't get it. And, honestly they were probably correct. Compute power was still too expensive and quality data sources like published text were mostly locked up and generally not available to harvest.
I am starting to suspect that LLMs, short term for a few years, will end up mostly having value as assistants to experts in their fields who know how to prompt and evaluate output.
I see an ocean of startups doing things like ‘AI accounting systems’ that scare me a little. I just don’t feel good having LLM based systems making unsupervised important decisions.
I do love designing and writing software with LLMs, but that is supervised activity that save me a ton of time. I also enjoy doing fun things with advanced voice mode for ChatGPT, like practicing speaking in French - again, a supervised activity.
re: ownership of IP: I find the idea hilarious that by OpenAI declaring ‘AGI achieved!’ that Microsoft might get cut out of IP rights it has supposedly paid for.
I don't have kids, it does make a lot of sense that that would affect a person's psyche. The bit about having to define what behavior is good or bad seems to me like you are working out your beliefs through others, which seems like a reasonable way to do things since you get to have an outside perspective on the effects of what you are internalizing.
About debating morality though. That's exactly where principles become needed. It's great to say that we should be kind, but who are we kind to? It can't always be everyone at the same time. To bring things back to the trolley problem, I may save my mom, but it really is super unfair to the 20 people on the other track. This sort of thing is exactly why people consider nepotism to be wrong.
But in general it’s hard to separate human from AI text.
Honestly the folks who don’t want to admit that it’s tokenization are just extremely salty that AI is actually good right now. Your “AI couldn’t tell me how many Rs in strawberry” stuff is extreme cope for your job prospects evaporating from a system that can’t spell correctly.
Ha. Really though, the entirety of venture capital could be summarized as “this probably won’t pay out but if it does it’s gonna be epic”. I wouldn’t read too much into America’s thirstiest capitalists spending their discretionary billions on the latest hype cycle.
The Wikipedia part, at least, is incorrect. Currently, Wikipedia mods/admins are dealing with AI-generated articles being uploaded.
As for NYT - I am assuming that lots of those stories are already available in some blog or the other.
The e-mail and web forums are 100% polluted with spam, which takes constant effort to remove. For GenAI based content, it is far harder to identify and remove.
This example assumes the effort required to keep the web functional can deal with AI created content. Speaking from experience, our filters (human and otherwise) cannot. They fail to do so even now.
PS: Even given your example of closed email chains - the information in that depends on sources people read. Like plastic pollution in the food chain, this is inescapable.
It's not about a heuristic on text of unknown provenance -- it's about publishers that exert a certain level of editorial control and quality verification. Or social reputation mechanisms that achieve the same.
That's what is preventing your "model collapse". Reputations of provenance. Not pure-text heuristics.
And they've always dealt with spam and low-quality submissions before. The system is working.
> As for NYT - I am assuming that lots of those stories are already available in some blog or the other.
I don't know what relevance that has to what we're talking about. The point is, train on the NYT. Blogs don't change what's on the NYT.
> The e-mail and web forums are 100% polluted with spam, which takes constant effort to remove.
They've always been polluted with low-quality content. So yes, either don't train on them, or only train on highly upvoted solutions, etc.
AI pollution isn't fundamentally any different from previous low-quality content and spam. It's not terribly difficult to determine which parts of the internet are known to be high-quality and train only on those. LLMs can't spam the NY Times.
>The point is, train on the NYT. Blogs don't change what's on the NYT.
The counter point is that NYT content is already in the training data because its already replicated or copied into random blogs.
>So yes, either don't train on them, or only train on highly upvoted solutions, etc.
Highly upvoted messages on reddit are very regular bots copying older top comments. Mods already have issues with AI comments.
----
TLDR: Pollution is already happening. Verification does not scale, while generation scales.
I’m a CPA and software engineer currently interviewing around for dev positions, and most of the people I’ve encountered running these companies are neither CPAs nor accountants and have little domain knowledge. It’s a scary combination with an LLM that demands professional skepticism for every word it says. I wouldn’t trust those companies in their current state.
That's not a counter point. My point is, train on things like the NYT, not random blogs. You can also whitelist the blogs you know are written by people, rather than randomly spidering the whole internet.
Also, no -- most of the NYT hasn't been copied into blogs. A small proportion of top articles, maybe.
> Highly upvoted messages on reddit are very regular bots copying older top comments.
What does that matter if the older top comment was written by a person? Also, Reddit is not somewhere you want to train in the first place if you're trying to generate a model where factual accuracy matters.
> Verification does not scale, while generation scales.
You don't need to verify everything -- you just need to verify enough stuff to train a model on. We're always going to have plenty of stuff that's sufficiently verified, whether from newspapers or Wikipedia or whitelisted blogs or books from verified publishers or whatever. It's not a problem.
You shouldn't be training on blogspam from random untrusted domains in the first place. So it doesn't matter if that junk is AI-generated or not.
""Healthy family relationships and rich circle of diverse friends" is an objectively better definition than "Money and companies with high stock prices""
Pretty broad principles we're comparing there.
When you get into specific cases, that's where you really need the debate and often there's no right answer, depending on the case. This is why we want judges who have a strong moral compass.
These values are bundled up in a person and they should even counterbalance each other. "Be Kind" should be balanced with "Be Strong". "Be Generous" should be balanced with "Be thrifty" and so on. The combination of these things is what we mean when we say someone has a moral compass.
I would argue it's immoral in some sense to sacrifice your mother for 5 other strangers. But these are fantasy cases that almost never happen.
A more realistic scenario is self defense or war.
Unlike the hyperscalers (i.e. cloud providers), Meta has a use for these themselves for inference to run their business on.
That was OK in the case of Google and Facebook because switching off of either platform was too costly for a consumer. Clear path to profitability.
But OpenAI is super easy to switch off of. You can plug and play models very easily.
Right now it's Cursor for code editing. I'm working on building a "Cursor for video editing". And many others are working on "Cursor for x domain".
...
If you kill ChatGPT, users will be on Claude in about three seconds.
> Also, Reddit is not somewhere you want to train in the first place if you're trying to generate a model where factual accuracy matters.
There is no model that can create factual accuracy. That would basically contravene the laws of physics. LLMs predict the next token.
>You shouldn't be training on blogspam from random untrusted domains in the first place. So it doesn't matter if that junk is AI-generated or not
Afaik, all the current models are trained on this corpus. That is how they work.
Factual accuracy is not binary; it is a matter of degrees. Obviously, training on content that is more factually correct will result in more factually correct next tokens. This is a pretty fundamental aspect of LLMs.
> Afaik, all the current models are trained on this corpus.
Then apologies for being so blunt, but you know wrong. There is a tremendous amount of work that goes on at the LLM companies in verifying, sanitizing, and structuring the training corpuses, using a wide array of techniques. They are absolutely not just throwing in blogspam and hoping for the best.
In any case, the real issue with your logic is in thinking that an individual's personal views on the morality of a situation are correlated with the actual, potentially harsh, reality of that situation. There is rarely ever such a correlation and when it happens, it is likely a coincidence.
Is Sam Altman untrustworthy? Of course, he seems like a snake. That doesn't mean he will fail. And predicting the reality of the thing (that awful people sometimes succeed in this world) does not make someone inherently wrong or negative or even cynical - it just makes them a realist.
> There is a tremendous amount of work that goes on at the LLM companies in verifying, sanitizing, and structuring the training corpuses, using a wide array of techniques.
You are contradicting the papers and published work of the people who make the models. Alternatively, you are looking at the dataset curation process with rose-tinted glasses.
Common Crawl is instrumental in building our models; 60% of GPT-3's training data was Common Crawl (https://arxiv.org/pdf/2005.14165, pg 9).
CC, in turn, was never intended for LLM training; this misalignment in goals results in downstream issues like hate speech, NYT content, copyrighted content and more getting used to train models.
https://foundation.mozilla.org/en/research/library/generativ... (This article is to establish the issues with CC as a source of LLM training)
https://facctconference.org/static/papers24/facct24-148.pdf (this details those issues.)
Firms such as the NYT are now stopping Common Crawl from archiving their pages: https://www.wired.com/story/the-fight-against-ai-comes-to-a-...
-----
TLDR: 'NYT' and other high quality content has largely been ingested by models. Reddit and other sources play a large part in training current models.
While I appreciate your being blunt, blunt also means not sharp and incisive. Perhaps some precision is required here to clarify your point.
Finally -
>Factual accuracy is not binary, it is a matter of degrees. Obviously training on content that is more factually correct will result in more factually correct next tokens
What. Come on, I think you wouldn't agree with your own statement after reading it once more - factual correctness is not a matter of degrees.
Furthermore, facts don't automatically create more facts. Calculation, processing, testing and verification create more facts. Just putting facts together creates content.
From what I've been able to gather interacting with other sectors, it seems like software is pretty unique in having a culture of sharing everything—tools, documentation, best practices, tutorials, blogs—online for free. Most professions can't be picked up by someone learning on their own from nothing but the internet in the way that software can.
I strongly suspect that the result will be that LLMs (being trained on the internet) do substantially better on software related tasks than they do in other domains, but software developers may be largely blind to that difference since they're not experts in the other domain.
I don't think there are any major walls either, but I think there are at least a few more plateaus we'll hit and spend time wandering around before finding the right direction for continued progress. Meanwhile, businesses/society/etc can work to catch up with the rapid progress made on the way to the current plateau.
> What. Come on, I think you wouldn't agree with your own statement after reading it once more - factual correctness is not a matter of degrees.
Of course it is. An LLM can be correct 30% of the time, 80% of the time, 95% of the time, 99% of the time. If that's not a matter of degrees, I don't know what is. If you're looking for 100% perfection, I think you'll find that not even humans can do that. ;)
Modern cash systems involve anonymity and do not inherently keep track of the ownership history of money (as I noted). This anonymity is a fundamental feature of cash and many forms of currency today. Sure, early forms of currency might have functioned in small, close-knit communities and in such contexts, people were more likely to know each other’s social debts and relationships.
My point about cash being anonymous was meant to highlight how modern currency differs from the historical concept of money as a social ledger. This contrast is important because it shows how much the role of money has evolved.
I can't use the OpenAI app anyway as they demand a logged-in Google account. I do have play services but not logged in.
It boggles my mind why they want to insist that you make an account with their total competition in order to use their service but they do.
None of these things are arguable in the abstract. When you're confronted with a case where you sacrifice one, it's always for the sake of another.
Likely.
Re: > we have viable, working, scalable mechanisms to avoid the "pollution" you're worried about.
Do note - it's the scalable mechanisms that I am looking at. I don't think the state of the art has shifted much since the last paper by OpenAI.
Can you link me to some new information or sources that lend credence to your claim?
> An LLM can be correct 30% of the time, 80% of the time, 95%…
That would be the error rate, which can be a matter of degrees.
However, factual correctness itself largely cannot be - the capital of Sweden today is Stockholm, with 0% variation in that answer.
Amazon has been ignoring the problem for a long time, and is well aware of it.
They're so aware of it that I'd personally (not a lawyer, though) consider them culpable, given their failure to take any substantial action towards fixing the problem.
From your comments, it's clear you've already made up your mind that it can't possibly be true and you're just trying to find rationalisations to support your narrative. I don't understand why you feel the need to be rude about it though.
Every single person who claimed AI was a great help to their job in writing software that I've encountered was either inexperienced (regardless of age) or working solely on very simple tasks.
I'm assuming you aren't familiar with these terms and so am defining them. Forgive me if you already were familiar.
Consequentialists think that the purpose of morality is to prevent "bad" consequences from happening. From a consequentialist perspective, one can very much argue about what makes a consequence "bad", and it makes a lot of sense to do so if we are trying to improve the human condition. Furthermore, I think consequentialists tend to care more about making their systems consistent, mainly so they are fair. As a side effect though, no principles have to be sacrificed when making a concrete decision, since none of them conflict. (That's what it means for a system to be consistent)
Virtue ethicists think that the purpose of morality is to be a "good" person. I think you are correct that it's pretty hard to define what a "good" person is. There are also many different types of "good" people. Even if you had such a person with consistent principles, if you tried to stuff everyone's "good" principles into them, they would become inconsistent. It's hard for me to tell exactly what the point of being "good" is supposed to be if it is not connected to the consequences of one's actions, in which case one would just be a consequentialist. However, if the point is to improve the human condition, then I think it would take a lot of different types of "good" people, so it doesn't make sense to try to argue our way into choosing just one of them.
This isn't really an argument for a position as much as me trying to figure out where we disagree. Does that all sound correct to you?
> this claim ... is hard to evaluate without a well-formed definition of what it means to have a world model
Absolutely yes, but that only makes it more imperative that we're analyzing things critically, rigorously, and honestly. Again you and I may be on the same side here. Mainly my point was that asserting the intrinsic non-intelligence of LLMs is a very bad take, as it's not supported by evidence and, if anything, it contradicts some (admittedly very difficult to parse) evidence we do have that LLMs might be able to develop a general capability for constructing mental models of the world.
He's a good marketer and created a cult of personality.
If he's so great at building businesses then just look at Twitter where there was no one who managed him.
When it comes to stuff like this, the state of the art at any given time is irrelevant. Only the first couple of time derivatives matter. How much room for growth do you have over the next 5-10 years?
Show me a Markov generator that can explain how it works.
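For contrast, this is more or less the entirety of a Markov text generator (a toy sketch; real ones just use longer contexts and bigger corpora):

    import random
    from collections import defaultdict

    def train_markov(text: str) -> dict:
        # Count which word follows which: that lookup table is the whole "model".
        chain = defaultdict(list)
        words = text.split()
        for a, b in zip(words, words[1:]):
            chain[a].append(b)
        return chain

    def generate(chain: dict, start: str, length: int = 10) -> str:
        out = [start]
        for _ in range(length):
            followers = chain.get(out[-1])
            if not followers:
                break
            out.append(random.choice(followers))
        return " ".join(out)

    chain = train_markov("the model predicts the next word and the next word only")
    print(generate(chain, "the"))

There is no mechanism in that table that could describe its own operation, which is the point of the comparison.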