
625 points lukebennett | 6 comments
LASR ◴[] No.42140045[source]
Question for the group here: do we honestly feel like we've exhausted the options for delivering value on top of the current generation of LLMs?

I lead a team exploring cutting edge LLM applications and end-user features. It's my intuition from experience that we have a LONG way to go.

GPT-4o / Claude 3.5 are the go-to models for my team. Every combination of technical investment + LLMs yields a new list of potential applications.

For example, combining a human-moderated knowledge graph with an LLM via RAG lets you build "expert bots" that understand your business context, your codebase, and your specific processes, and behave much like a coworker on your team.
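
Roughly the shape of it, as a minimal sketch: the knowledge-graph lookup is a hypothetical helper, and the chat call is the standard OpenAI client, so treat this as an illustration rather than a description of any real system.

  # Sketch of an "expert bot": pull curated facts from a human-moderated
  # knowledge graph, then answer grounded in them.
  # search_knowledge_graph() is hypothetical; the chat call is the standard OpenAI API.
  from openai import OpenAI

  client = OpenAI()

  def search_knowledge_graph(query: str) -> list[str]:
      # Hypothetical helper: return human-moderated facts relevant to the query.
      raise NotImplementedError

  def expert_bot(question: str) -> str:
      facts = search_knowledge_graph(question)
      context = "\n".join(f"- {fact}" for fact in facts)
      resp = client.chat.completions.create(
          model="gpt-4o",
          messages=[
              {"role": "system",
               "content": "You are a domain expert. Answer using only this curated context:\n" + context},
              {"role": "user", "content": question},
          ],
      )
      return resp.choices[0].message.content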

If you now give it some predictive / simulation capability (e.g. simulate the execution of a task or project, like creating a GitHub PR code change, and test it against an expert bot like the one above for code review), you can have LLMs create reasonable code changes, with automatic review and iteration.
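
A rough sketch of that propose/review loop, again using plain chat calls; the prompts and function names are illustrative, not any particular production setup.

  # Sketch of a propose -> review -> revise loop; prompts and names are illustrative only.
  from openai import OpenAI

  client = OpenAI()

  def chat(system: str, user: str) -> str:
      resp = client.chat.completions.create(
          model="gpt-4o",
          messages=[{"role": "system", "content": system},
                    {"role": "user", "content": user}],
      )
      return resp.choices[0].message.content

  def iterate_on_task(task: str, max_rounds: int = 3) -> str:
      feedback = ""
      for _ in range(max_rounds):
          diff = chat("Produce a unified diff implementing the task.",
                      task + ("\n\nReviewer feedback:\n" + feedback if feedback else ""))
          feedback = chat("You are the expert reviewer bot. Reply APPROVE or list required fixes.",
                          diff)
          if feedback.strip().upper().startswith("APPROVE"):
              break
      return diff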

Similarly, there are many more capabilities you can layer on and expose to LLMs to get increasingly productive outputs from them.

Chasing after model improvements and "GPT-5 will be PhD-level" is moot, imo. When did you ever hire a PhD-level coworker who was productive on day 0? You need to onboard them with human expertise, and then give them execution space, long-term memories, etc. to be productive.

Model vendors might struggle to build something more intelligent. But my point is that we already have so much intelligence and we don't know what to do with it. There is a LOT you can do with high-schooler-level intelligence at super-human scale.

Take a naive example. 200k-token context windows are now available. Most people, through ChatGPT, type out maybe 1,500 tokens. That's a huge amount of untapped capacity; no human is going to type out 200k tokens of context. Hence the need for RAG and additional forms of input (e.g. simulation outcomes) to fully leverage it.
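
To get a feel for the gap, you can count tokens directly. This assumes a recent tiktoken with the o200k_base encoding (GPT-4o's tokenizer); the prompt text is just an example.

  # Quick check of how much of a 200k-token window a typed prompt actually uses.
  import tiktoken

  enc = tiktoken.get_encoding("o200k_base")
  typed_prompt = (
      "Here's our deployment runbook and the error we're seeing in staging. "
      "Can you walk me through likely causes and a fix?"
  )
  used = len(enc.encode(typed_prompt))
  window = 200_000
  print(f"{used} tokens typed; {window - used} tokens ({1 - used / window:.1%}) of the window left unused")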

replies(43): >>42140086 #>>42140126 #>>42140135 #>>42140347 #>>42140349 #>>42140358 #>>42140383 #>>42140604 #>>42140661 #>>42140669 #>>42140679 #>>42140726 #>>42140747 #>>42140790 #>>42140827 #>>42140886 #>>42140907 #>>42140918 #>>42140936 #>>42140970 #>>42141020 #>>42141275 #>>42141399 #>>42141651 #>>42141796 #>>42142581 #>>42142765 #>>42142919 #>>42142944 #>>42143001 #>>42143008 #>>42143033 #>>42143212 #>>42143286 #>>42143483 #>>42143700 #>>42144031 #>>42144404 #>>42144433 #>>42144682 #>>42145093 #>>42145589 #>>42146002 #
senko ◴[] No.42140679[source]
No.

The scaling laws may be dead. Does this mean the end of LLM advances? Absolutely not.

There are many different ways to improve LLM capabilities. Everyone was mostly focused on the scaling laws because they worked extremely well (which actually surprised most researchers).

But if you're keeping an eye on the scientific papers coming out about AI, you've seen the astounding amount of research going on, with some very good results that will probably take at least several months to trickle down to production systems. Thousands of extremely bright people in AI labs across the world are working on finding the next trick that boosts AI.

One random example is test-time compute: just give the AI more time to think. This is basically what o1 does. A recent research paper suggests that using it is roughly equivalent, performance-wise, to an order of magnitude more parameters. (Source for the curious: https://lnkd.in/duDST65P)
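
The crudest version of the idea is just sampling several answers and majority-voting (self-consistency). That's not what o1 does internally, but it shows how spending extra inference compute on the same model can buy accuracy; a toy sketch with the OpenAI client:

  # Toy test-time-compute sketch: sample N candidate answers, majority-vote.
  from collections import Counter
  from openai import OpenAI

  client = OpenAI()

  def self_consistent_answer(question: str, n: int = 8) -> str:
      resp = client.chat.completions.create(
          model="gpt-4o",
          n=n,              # n independent samples in one request
          temperature=0.8,  # diversity between samples
          messages=[{"role": "user",
                     "content": question + "\nGive only the final answer."}],
      )
      answers = [c.message.content.strip() for c in resp.choices]
      return Counter(answers).most_common(1)[0][0]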

Another example that sounds bonkers but apparently works is quantization: reducing the precision of each parameter to 1.58 bits (i.e. only using the values -1, 0, 1). This uses about 10x less space for the same parameter count (compared to the standard 16-bit format), and since AI operations are largely memory-bound, that corresponds fairly directly to a 10x decrease in costs: https://lnkd.in/ddvuzaYp
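
For intuition, the BitNet-style scheme scales each weight matrix by its mean absolute value and rounds to {-1, 0, 1}. A toy numpy sketch of the quantization step only (not the training recipe, and real kernels pack the ternary values into ~1.58 bits per weight rather than int8):

  # Sketch of 1.58-bit ("ternary") weight quantization: absmean scale, round to {-1, 0, 1}.
  import numpy as np

  def absmean_ternary(w: np.ndarray, eps: float = 1e-8):
      scale = float(np.mean(np.abs(w))) + eps      # one scale per matrix
      q = np.clip(np.round(w / scale), -1, 1)      # values in {-1, 0, 1}
      return q.astype(np.int8), scale

  def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
      return q.astype(np.float32) * scale

  w = np.random.randn(4, 4).astype(np.float32)
  q, s = absmean_ternary(w)
  print(q)                                         # ternary matrix
  print(np.abs(w - dequantize(q, s)).max())        # quantization error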

(Quite apart from improvements like these, we shouldn't forget that not all AIs are LLMs. There have been tremendous advances in AI systems for image, audio and video generation, interpretation and manipulation, and they also show no signs of stopping; and there's the possibility that a new or hybrid architecture for textual AI might be developed.)

AI winter is a long way off.

replies(2): >>42140877 #>>42142955 #
1. limaoscarjuliet ◴[] No.42140877[source]
Scaling laws are not dead. "The number of people predicting the death of Moore's law doubles every two years."

- Jim Keller

https://www.youtube.com/live/oIG9ztQw2Gc?si=oaK2zjSBxq2N-zj1...

replies(2): >>42141464 #>>42142962 #
2. nyrikki ◴[] No.42141464[source]
There are way too many personal definitions of what "Moore's Law" even is to have a discussion without agreeing on a shared definition beforehand.

But Goodhart's law ("when a measure becomes a target, it ceases to be a good measure") directly applies here: Moore's Law was used to set long-term plans at semiconductor companies, and Moore didn't have empirical evidence it was even going to continue.

If you, say, arbitrarily pick CPU performance (or worse, single-core performance) as your measurement, it hasn't held for well over a decade.

If you hold minimum feature size without regard to cost, it is still holding.

What you want to prove usually dictates what interpretation you make.

That said, where the LLM scaling laws top out is still unknown, but you can game them as much as you want in similar ways.

GPT-4 was already hinting at an asymptote on MMLU, but the question is whether that benchmark is valid for real work, etc.

Time will tell, but I am seeing far less optimism from my sources; that is just anecdotal, though.

3. slashdave ◴[] No.42142962[source]
Moore's law is doomed. At some point you start reaching the level of individual atoms. This is just physics.
replies(2): >>42143378 #>>42144700 #
4. XenophileJKO ◴[] No.42143378[source]
You are missing the economic component. It isn't just how small a transistor can be; it was really about how many transistors you can get for your money. So even when we reach terminal density, we probably haven't reached terminal economics.
replies(1): >>42144282 #
5. slashdave ◴[] No.42144282{3}[source]
I didn't say we have currently reached a limit. I am saying that there obviously is a limit at some point, so scaling cannot go on forever. This is a counterpoint to the dubious analogy with deep learning.
6. Earw0rm ◴[] No.42144700[source]
The limits are engineering, not physics. Atoms need not be a barrier for a long time if you can go fully 3D, for example, but manufacturing challenges, power and heat get in the way long before that.

Then you can go ultra-wide in terms of cores, dispatchers and vectors (essentially building bigger and bigger chips), but an algorithm which can't exploit that will be little faster on today's chips than on a 4790K from ten years ago.
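
A toy illustration of that last point, with numpy standing in for wide vector hardware: a recurrence with a serial dependency can't use the extra width, while independent element-wise work maps straight onto it.

  # Serially dependent work vs. work that wide cores/vectors can actually exploit.
  import numpy as np

  x = np.random.rand(1_000_000).astype(np.float32)

  # Serial recurrence: each step needs the previous result, so extra lanes sit idle.
  acc = 0.0
  for v in x:
      acc = 0.999 * acc + v

  # Independent element-wise work: one vectorized call, trivially spread across lanes/cores.
  y = np.sqrt(x) * 2.0 + 1.0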