
LLM Inevitabilism

(tomrenner.com)
1611 points | SwoopsFromAbove
Animats ◴[] No.44568076[source]
There may be an "LLM Winter" as people discover that LLMs can't be trusted to do anything. Look for frantic efforts by companies to offload responsibility for LLM mistakes onto consumers. We've got to have something that has solid "I don't know" and "I don't know how to do this" outputs. We're starting to see reports of LLM usage having negative value for programmers, even though they think it's helping. Too much effort goes into cleaning up LLM messes.
replies(5): >>44568232 #>>44568321 #>>44568785 #>>44570451 #>>44578122 #
imiric ◴[] No.44568321[source]
> Look for frantic efforts by companies to offload responsibility for LLM mistakes onto consumers.

Not just by companies. We see this from enthusiastic consumers as well, on this very forum. Or it might just be astroturfing; it's hard to tell.

The mantra is that in order to extract value from LLMs, the user must have a certain level of knowledge and skill in how to use them. "Prompt engineering", now reframed as "context engineering", has become the practice that supposedly separates those who feel these tools waste more of their time than they save from those who feel the tools make them many times more productive. The tools themselves are never the issue. Clearly it's the user who lacks skill.

This narrative permeates blog posts and discussion forums. It was recently reinforced by a misinterpretation of a METR study.

To be clear: using any tool to its full potential does require a certain skill level. What I'm objecting to is the blanket claim that people who don't find LLMs to be a net benefit to their workflow simply lack the skills to use them. This is insulting to smart and capable engineers with many years of experience working with software. LLMs are not some alien technology that requires a degree to use correctly. Understanding how they work, feeding them the right context, and being familiar with the related tools and concepts do not require an engineering specialization. Anyone claiming they do is trying to sell you something; either LLMs themselves, or the idea that they're more capable than those criticizing this technology.

replies(8): >>44568512 #>>44568713 #>>44568924 #>>44569062 #>>44569220 #>>44569431 #>>44571044 #>>44574569 #
rgoulter ◴[] No.44568713[source]
A couple of typical comments about LLMs would be:

"This LLM is able to capably output useful snippets of code for Python. That's useful."

and

"I tried to get an LLM to perform a niche task with a niche language, it performed terribly."

I think the right synthesis is that LLMs are useful for some tasks and not for others; practically, the skill is in knowing which is which.

Or, if we trust that LLMs are useful for all tasks, then it's practically useful to know what they're not good at.

replies(2): >>44568934 #>>44569093 #
ygritte ◴[] No.44568934[source]
Even if that's true, they are still not reliable. The same question can produce different answers each time.
replies(2): >>44569738 #>>44570807 #
hhh ◴[] No.44569738[source]
This isn't really true when you control the stack, is it? If you set all of your parameters for reproducibility (e.g. temperature 0, same seed), the output should be the same as long as everything further down the stack is the same.
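
For concreteness, a minimal sketch of what pinning those parameters might look like when you run the model yourself. It assumes the Hugging Face transformers library; the model and prompt are purely illustrative:

    # Greedy decoding (do_sample=False) plus a fixed seed removes sampling
    # randomness; lower-level nondeterminism (kernels, batching) can still remain.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    torch.manual_seed(0)  # pin any remaining sources of sampling randomness

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    inputs = tokenizer("The capital of France is", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=20, do_sample=False)  # greedy
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

With greedy decoding the sampler itself introduces no randomness, so on the same hardware and library versions the output is typically stable across runs.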
replies(1): >>44570208 #
imiric ◴[] No.44570208{3}[source]
That's not a usable workaround. In most cases it doesn't actually produce full determinism[1].

And even if it did, a certain degree of non-determinism is actually desirable. The most probable tokens might not be correct, and randomness is partly responsible for what humans interpret as "creativity". Even hallucinations are desirable in some applications (art, entertainment, etc.).
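
As a rough illustration of the first point, you can check for drift directly by repeating the same request with the sampling knobs pinned. This sketch assumes the OpenAI Python client with an API key in the environment; the model name and prompt are illustrative:

    # Even with temperature=0 and a fixed seed, a hosted API does not guarantee
    # byte-identical completions across runs, so compare the outputs directly.
    from openai import OpenAI

    client = OpenAI()

    def complete(prompt: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
            seed=42,
        )
        return resp.choices[0].message.content

    runs = [complete("Explain what a mutex is in one sentence.") for _ in range(5)]
    print(f"{len(set(runs))} distinct completion(s) out of {len(runs)} runs")

If that prints anything other than 1, the "set temperature to 0" workaround isn't delivering the determinism it appears to promise.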

[1]: https://medium.com/google-cloud/is-a-zero-temperature-determ...