OpenAI, Google and Anthropic are struggling to build more advanced AI

(www.bloomberg.com)

625 points lukebennett | 3 comments | 13 Nov 24 13:28 UTC

LASR ◴[14 Nov 24 19:19 UTC] No.42140045[source]▶

Question for the group here: do we honestly feel like we've exhausted the options for delivering value on top of the current generation of LLMs?

I lead a team exploring cutting-edge LLM applications and end-user features. It's my intuition from experience that we have a LONG way to go.

GPT-4o / Claude 3.5 are the go-to models for my team. Every combination of technical investment + LLMs yields a new list of potential applications.

For example, combining a human-moderated knowledge graph with an LLM and RAG lets you build "expert bots" that understand your business context / your codebase / your specific processes and act almost like a human coworker on your team.
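
A minimal sketch of that idea, assuming the knowledge graph is a small in-memory dict of human-reviewed triples. The entity names, the `retrieve_facts` and `build_prompt` helpers, and the prompt wording are all hypothetical, and the actual model call is left out; a real system would also need a proper graph store and entity matching.

```python
# Hypothetical, human-reviewed facts: entity -> list of (relation, target).
KNOWLEDGE_GRAPH = {
    "billing-service": [("owned_by", "payments team"), ("depends_on", "ledger-db")],
    "ledger-db": [("type", "PostgreSQL cluster"), ("owned_by", "data platform team")],
}


def retrieve_facts(question, graph, hops=1):
    """Collect triples for entities named in the question, following edges `hops` steps out."""
    frontier = [entity for entity in graph if entity in question.lower()]
    seen, facts = set(), []
    for _ in range(hops + 1):
        next_frontier = []
        for entity in frontier:
            if entity in seen:
                continue
            seen.add(entity)
            for relation, target in graph.get(entity, []):
                facts.append(f"{entity} {relation} {target}")
                if target in graph:
                    next_frontier.append(target)
        frontier = next_frontier
    return facts


def build_prompt(question, graph):
    """Put the retrieved facts in front of the question before it goes to the model."""
    context = "\n".join(f"- {fact}" for fact in retrieve_facts(question, graph))
    return f"Use only these reviewed facts:\n{context}\n\nQuestion: {question}"


prompt = build_prompt("Who owns billing-service?", KNOWLEDGE_GRAPH)
print(prompt)
```

The point of the graph step is that the model only sees facts a human has already signed off on, plus their immediate neighbours.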

If you now give it some predictive / simulation capability (e.g. simulating the execution of a task or project, such as creating a GitHub PR with a code change, and testing it against an expert bot like the one above for code review), you can have LLMs create reasonable code changes with automatic review / iteration.
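
The propose / review / iterate loop described above could look roughly like this. It is only a sketch: `propose` and `review` stand in for two LLM calls (a code-writing model and the "expert bot" reviewer), and the fake implementations below exist purely so the loop can run without any model.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Review:
    approved: bool
    feedback: str


def iterate_on_change(
    task: str,
    propose: Callable[[str, Optional[str]], str],
    review: Callable[[str, str], Review],
    max_rounds: int = 3,
):
    """Propose a change, let the reviewer critique it, and retry with the feedback."""
    feedback = None
    change = ""
    for round_no in range(1, max_rounds + 1):
        change = propose(task, feedback)
        verdict = review(task, change)
        if verdict.approved:
            return change, True, round_no
        feedback = verdict.feedback
    return change, False, max_rounds  # give up and hand over to a human


# Hypothetical stand-ins for the two model calls.
def fake_propose(task, feedback):
    return "def add(a, b): return a + b" if feedback else "def add(a, b): return a - b"


def fake_review(task, change):
    if "a + b" in change:
        return Review(True, "")
    return Review(False, "add() should sum its arguments")


change, approved, rounds = iterate_on_change("implement add()", fake_propose, fake_review)
print(approved, rounds)  # True 2
```

Capping the rounds matters: without `max_rounds` two models can disagree forever, so the unapproved change is returned for a human to look at.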

Similarly, there are many more capabilities that you can layer on and expose to LLMs to get increasingly productive outputs from them.

Chasing after model improvements and "GPT-5 will be PhD-level" is moot imo. When did you ever hire a PhD coworker who was productive on day 0? You need to onboard them with human expertise, and then give them execution space / long-term memories etc. to be productive.

Model vendors might struggle to build something more intelligent. But my point is that we already have so much intelligence and we don't know what to do with that. There is a LOT you can do with high-schooler level intelligence at super-human scale.

Take a naive example. 200k context windows are now available. Most people, through ChatGPT, type out maybe 1500 tokens. That's a huge amount of untapped capacity. No human is going to type out 200k of context. That's why we need RAG, and additional forms of input (e.g. simulation outcomes), to fully leverage that.
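
To put numbers on that, here is a rough sketch of filling the rest of the window with retrieved material. The `pack_context` helper and the 4,000-token answer reserve are assumptions for illustration, and `estimate_tokens` just counts words, which is only a crude proxy for a real tokenizer.

```python
def estimate_tokens(text: str) -> int:
    # Crude stand-in: whitespace-separated words. A real tokenizer gives different counts.
    return len(text.split())


def pack_context(question, ranked_chunks, window=200_000, reserve_for_answer=4_000):
    """Greedily add retrieved chunks, best first, until the token budget is used up."""
    budget = window - reserve_for_answer - estimate_tokens(question)
    picked, used = [], 0
    for chunk in ranked_chunks:
        cost = estimate_tokens(chunk)
        if used + cost > budget:
            continue  # too big for what's left; a later, smaller chunk may still fit
        picked.append(chunk)
        used += cost
    return picked, used


# A 1500-token prompt uses well under 1% of a 200k window.
print(f"{1_500 / 200_000:.2%}")  # 0.75%
```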

simonw ◴[14 Nov 24 20:33 UTC] No.42140886[source]▶

>>42140045 #

Right. I've been saying for a while that if all LLM development stopped entirely and we were stuck with the models we have right now (GPT-4o, Claude 3.5 Sonnet, Gemini 1.5 Pro, Llama 3.1/2, Qwen 2.5 etc) we could still get multiple years' worth of advances just out of those existing models. There is SO MUCH we haven't figured out about how to use them yet.

1. niobe ◴[15 Nov 24 00:31 UTC] No.42142817[source]▶

>>42140886 #

> There is SO MUCH we haven't figured out about how to use them yet.

I mean, it's pretty clear to me they're a potentially great human-machine interface, but trying to make LLMs - in their current fundamental form - a reliable computational tool... well, at best it's an expensive hack, but it's just not the right tool for the job.

I expect the next leap forward will require some orthogonal discovery and lead to a different kind of tool. But perhaps we'll continue to use LLMs as we know them now for what they're good at - language.

2. simonw ◴[15 Nov 24 01:14 UTC] No.42143045[source]▶

>>42142817 #

One of the biggest challenges in learning how to use and build on LLMs is figuring out how to work productively with a technology that - unlike most computers - is inherently unreliable and non-deterministic.

It's possible, but it's not at all obvious and requires a slightly skewed way of looking at them.
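
One common way to work with that unreliability is to treat every reply as untrusted input: validate it and ask again if it fails. A minimal sketch, where `ask` stands in for any model call and the scripted replies are made up so the example runs offline:

```python
import json
from typing import Callable


def call_until_valid(ask: Callable[[str], str], prompt: str,
                     required_keys: set, max_attempts: int = 3) -> dict:
    """Call the model until it returns a JSON object containing the required keys."""
    for _ in range(max_attempts):
        raw = ask(prompt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # not JSON at all: just try again
        if isinstance(data, dict) and required_keys.issubset(data):
            return data
    raise RuntimeError(f"no valid reply after {max_attempts} attempts")


# Hypothetical model that gets it right on the third try.
scripted = iter([
    "Sure! Here is the JSON you asked for.",
    '{"label": "bug"}',
    '{"label": "bug", "confidence": 0.9}',
])
result = call_until_valid(lambda prompt: next(scripted), "classify this issue",
                          {"label", "confidence"})
print(result)
```

The caller gets either a well-formed result or an explicit error, never a silently malformed one.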

3. XenophileJKO ◴[15 Nov 24 02:26 UTC] No.42143433[source]▶

>>42143045 #

This really reminds me of a trend years ago to create probabilistic programming constructs. I think it was just a trend way ahead of its time. Typical software engineers tend to be very ill-suited to thinking in probabilities and to building reasonably reliable systems around them.
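
As a small worked example of thinking in probabilities: if a single call is right with probability p, asking n times and taking the majority answer gives the reliability below. This assumes the calls fail independently, which is a strong assumption since repeated calls to the same model tend to make correlated mistakes, so treat the result as an upper bound.

```python
from math import comb


def majority_correct(p: float, n: int) -> float:
    """Probability that a strict majority of n independent calls is correct."""
    needed = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(needed, n + 1))


print(round(majority_correct(0.8, 1), 5))  # 0.8
print(round(majority_correct(0.8, 5), 5))  # 0.94208
```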
