Remember the revolutionary, seemingly inevitable tech that was poised to rewrite how humans thought about transportation? The incredible amounts of hype, the secretive meetings disclosing the device, etc.? That turned out to be the self-balancing scooter known as a Segway?
2. Segways were just ahead of their time: portable lithium-ion powered urban personal transportation is getting pretty big now.
I got to try one once. It was very underwhelming...
No, I don't remember it like that. Do you have any serious sources from history showing that Segway hype is even remotely comparable to today's AI hype and the half a trillion a year the world is spending on it?
You don't. I love the argument ad absurdum more than most but you've taken it a teensy bit too far.
The Segway always had a high barrier to entry. Currently for ChatGPT you don't even need an account, and everyone already has a Google account.
https://www.youtube.com/watch?v=SK362RLHXGY
Hey, it still beats what you go through at the airports.
It is even cheaper to serve an LLM answer than call a web search API!
Zero chance all the users evaporate unless something much better comes along, or the tech is banned, etc...
> It is even cheaper to serve an LLM answer than call a web search API
These, uhhhh, these are some rather extraordinary claims. Got some extraordinary evidence to go along with them?
Counterpoint: That's how I feel about ebikes and escooters right now.
Over the weekend, I needed to go to my parents' place for brunch. I put on my motorcycle gear, grabbed my motorcycle keys, went to my garage, and as I was about to pull out my BMW motorcycle (MSRP ~$17k), looked at my Ariel ebike (MSRP ~$2k) and decided to ride it instead. For short trips they're a game-changing mode of transport.
Anecdotally, the locally run AI software I develop has gotten more than 100x faster in the past year thanks to hardware advancements and Moore's law.
LLMs have hundreds of millions of users. I just can't stress enough how insane this is. This wasn't built on the back of Facebook or Instagram's distribution like Threads. The internet consumer has never so readily embraced something so fast.
Calling LLMs "hype" is an example of cope, judging facts based on what is hoped to be true even in the face of overwhelming evidence or even self-evident imminence to the contrary.
I know people calling "hype" are motivated by something. Maybe it is a desire to contain the inevitable harm of any huge rollout or to slow down the disruption. Maybe it's simply the egotistical instinct to be contrarian and harvest karma while we can still feign to be debating shadows on the wall. I just want to be up front. It's not hype. Few people calling "hype" can believe that this is hype, and anyone who does believe it simply isn't credible. That won't stop people from jockeying to protect their interests, hoping that some intersubjective truth we manufacture together will work in their favor, but my lord is the "hype" bandwagon being dishonest these days.
I can totally go about my life pretending Segway doesn't exist, but I just can't do that with ChatGPT, hence why the author felt compelled to write the post in the first place. They're not writing about Segway, after all.
You had me until you basically said, "and for my next trick, I am going to make up stories".
Projecting is what happens when someone doesn't understand some other people, and from that somehow concludes that they do understand those other people, and feels the need to tell everyone what they now "know" about those people, that even those people don't know about themselves.
Stopping at "I don't understand those people." is always a solid move. Alternately, consciously recognizing "I don't understand those people", followed up with "so I am going to ask them to explain their point of view", is a pretty good move too.
But I want to point out that going from CPU to TPU is basically the opposite of a Moore's law improvement.
(A mid to high end GPU can get similar or better performance but it's a lot harder to get more RAM.)
I haven't seen that at all. I've seen a whole lot of top-down AI usage mandates, and every time what sounds like a sensible positive take comes along, it turns out to have been written by someone who works for an AI company.
LLMs are more useful than the Segway, but they can still be overhyped because the hype is so much larger. So it's comparable: the fact that, as you say, LLMs are so much more hyped doesn't mean they can't be overhyped.
5060 Ti 16GB, $450
If you want more than 16GB, that's when it gets bad.
And you should be able to get two and load half your model into each. It should be about the same speed as if a single card had 32GB.
So? The blog notes that if something is inevitable, then the people arguing against it are lunatics, and so if you can frame something as inevitable then you win the rhetorical upper hand. It doesn't -- however -- in any way attempt to make the argument that LLMs are _not_ inevitable. This is a subtle straw man: the blog criticizes the rhetorical technique of inevitabilism rather than engaging directly with whether LLMs are genuinely inevitable or not. Pointing out that inevitability can be rhetorically abused doesn't itself prove that LLMs aren't inevitable.
In times when people are being more honest. There's a huge amount of perverse incentive to chase internet points or investment or whatever right now. You don't get honest answers without reading between the lines in these situations.
It's important to do because after a few rounds of battleship, when people get angry, they let something slip like "Elon Musk" or "big tech", and you can get a feel that they're angry that a Nazi was fiddling in government, etc. You find that they're less concerned about overblown harm from LLMs and in fact more concerned that the tech will wind up excessively centralized, like they have seen other winner-take-all markets evolve.
Once you get people to say what they really believe, one way or another, you can fit actual solutions in place instead of just short-sighted reactions that tend to accomplish nothing beyond making a lot of noise along the way to the same conclusion.
How cheap is inference, really? What about 'thinking' inference? What are the prices going to be once growth starts to slow and investors start demanding returns on their billions?
The unprofitability of the frontier labs is mostly due to them not monetizing the majority of their consumer traffic at all.
Maybe it's more like Pogs.
Relative to its siblings, things have gotten worse. A GTX 970 could hit 60% of the performance of the full Titan X at 35% of the price. A 5070 hits 40% of a full 5090 for 27% of the price. That's overall less series-relative performance you're getting, for an overall increased price, by about $100 when adjusting for inflation.
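To put the comparison above in one number, here is a quick sketch that divides each card's share of flagship performance by its share of flagship price. It uses only the percentages quoted in the comment; the `relative_value` helper and the dictionary keys are made up for illustration, and the inflation adjustment is not modeled.

```python
# Shares quoted above: (fraction of flagship performance, fraction of flagship price).
cards = {
    "GTX 970 vs Titan X": (0.60, 0.35),
    "5070 vs 5090": (0.40, 0.27),
}


def relative_value(perf_share, price_share):
    """Flagship-relative performance per flagship-relative dollar (hypothetical metric)."""
    return perf_share / price_share


values = {name: relative_value(*shares) for name, shares in cards.items()}

for name, value in values.items():
    print(f"{name}: {value:.2f}x the flagship's performance per dollar")
```

By this measure the older card delivered more flagship-relative performance per dollar than the newer one, which matches the point being made.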
But if you have a fixed performance baseline you need to hit, as long as tech keeps improving, things will eventually be cheaper for that baseline. As long as you aren't also trying to improve in a way that moves the baseline up. Which so far has been the only consistent MO of the AI industry.
I think the core issue is separating the perception of value versus actual value. There have been a couple of studies to this effect, pointing to a misalignment towards overestimating value and productivity boosts.
One reason this happens, imo, is that we sequester a good portion of the cognitive load of our thinking to the latter parts of the process. So when we are evaluating the solution, we are primed to think we have saved time if the solution is sufficiently correct. And if we have to edit or reposition it by re-rolling, we don't account for the time spent because we may feel we didn't do anything.
I feel like this type of discussion is effectively a top topic every day. To me, the hype is not in the utility it does have but in its future utility. The hype is based on the premise that these tools and their next iteration can and will make all knowledge-based work obsolete, but crucially, will yield value in areas of real need: cancer, aging, farming, climate, energy, etc.
If these tools stop short of those outcomes, then the investment all of SV has committed to it at this point will have been over invested and
This seems super duper expensive and not really supported by the more reasonably priced Nvidia cards, though. SLI is deprecated, NVLink isn't available everywhere, etc.
And nothing I've seen about recent GPUs or TPUs, from ANY maker (Nvidia, AMD, Google, Amazon, etc.), says anything about general speedups of 100x. Heck, if you go across multiple generations of what are still these very new types of hardware categories, for example Amazon's Inferentia/Trainium, even their claims (which are quite bold) would probably put the most recent generations at best at 10x the first generations. And as we all know, all vendors exaggerate the performance of their products.
Every layer of an LLM runs separately and sequentially, and there isn't much data transfer between layers. If you wanted to, you could put each layer on a separate GPU with no real penalty. A single request will only run on one GPU at a time, so it won't go faster than a single GPU with a big RAM upgrade, but it won't go slower either.
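A toy sketch of what that looks like, in plain Python with no real GPU calls. The "layers" are stand-in functions and the GPU ids are just integers, so `assign_layers` and `run_request` are hypothetical names for illustration, not any framework's API. It only shows the control flow: contiguous blocks of layers per card, one request walking through them in order, and a single hand-off where the block boundary is.

```python
def assign_layers(num_layers, num_gpus):
    """Split layers into contiguous blocks, one block per GPU id."""
    base, extra = divmod(num_layers, num_gpus)
    placement = []
    for gpu in range(num_gpus):
        count = base + (1 if gpu < extra else 0)
        placement.extend([gpu] * count)
    return placement


def run_request(x, layers, placement):
    """Run one request through every layer in order, counting GPU-to-GPU hand-offs."""
    hops = 0
    current = placement[0]
    for layer, gpu in zip(layers, placement):
        if gpu != current:
            # Only the activations cross over here; the weights stay where they are.
            hops += 1
            current = gpu
        x = layer(x)
    return x, hops


# Eight toy "layers" standing in for transformer blocks (assumption for the demo).
layers = [lambda v, k=k: v + k for k in range(8)]
placement = assign_layers(len(layers), 2)
result, hops = run_request(0, layers, placement)
print(placement, result, hops)
```

With two cards there is exactly one hand-off per pass, and at any moment only one card is doing work for that request, which is why the split adds memory but not speed.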