
LLM Inevitabilism

(tomrenner.com)
1613 points SwoopsFromAbove | 1 comments
lsy ◴[] No.44568114[source]
I think two things can be true simultaneously:

1. LLMs are a new technology and it's hard to put the genie back in the bottle with that. It's difficult to imagine a future where they don't continue to exist in some form, with all the timesaving benefits and social issues that come with them.

2. Almost three years in, companies investing in LLMs have not yet discovered a business model that justifies the massive expense of training and hosting them; the majority of consumer usage is at the free tier; the industry is seeing the first signs of pulling back investment; and model capabilities are plateauing at a level where most people agree that the output is trite and unpleasant to consume.

There are many technologies that have seemed inevitable and then retreated for lack of a commensurate business return (the supersonic jetliner), and several that seemed poised to displace both old tech and labor but have settled into specific use cases (the microwave oven). Given the lack of a sufficiently profitable business model, it feels as likely as not that LLMs settle somewhere a little less remarkable, and hopefully less annoying, than today's almost universally disliked attempts to cram them everywhere.

api ◴[] No.44572142[source]
My take since day one:

(1) Model capabilities will plateau as training data is exhausted. Some additional gains will be possible by better training, better architectures, more compute, longer context windows or "infinite" context architectures, etc., but there are limits here.

(2) Training on synthetic data beyond a very limited amount will result in overfitting because there is no new information. To some extent you could train models on each other, but that's just an indirect way to consolidate models. Beyond consolidation you'll plateau.

(3) There will be no "takeoff" scenario -- this is sci-fi (in the pejorative sense) because you can't exceed available information. There is no magic way that a brain in a vat can innovate beyond available training data. This holds for humans too -- a brain in a vat would quickly go mad and then spiral into a coma-like state. The idea of AI running away is the information-theoretic equivalent of a perpetual motion machine and is impossible. Yudkowsky and the rest of the people afraid of this are crackpots, and so are the hype-mongers betting on it.

So I agree that LLMs are real and useful, but the hype and bubble are starting to plateau. The bubble is predicated on the idea that you can just keep going forever.

1. ogogmad ◴[] No.44578020[source]
The next step is clearly improved vision and everyday-physics models. These can also solve hallucinations.