Not just by companies. We see this from enthusiastic consumers as well, on this very forum. Or it might just be astroturfing; it's hard to tell.
The mantra is that in order to extract value from LLMs, the user must have a certain level of knowledge and skill in how to use them. "Prompt engineering", now reframed as "context engineering", has become the practice that separates those who feel these tools waste more of their time than they save from those who feel the tools make them many times more productive. The tools themselves are never the issue. Clearly it's the user who lacks skill.
This narrative permeates blog posts and discussion forums. It was recently reinforced by a misinterpretation of a METR study.
To be clear: using any tool to its full potential does require a certain skill level. What I'm objecting to is the blanket statement that people who don't find LLMs to be a net benefit to their workflow simply lack the skills to use them. This is insulting to smart and capable engineers with many years of experience working with software. LLMs are not some alien technology that requires a degree to use correctly. Understanding how they work, feeding them the right context, and being familiar with the related tools and concepts does not require an engineering specialization. Anyone claiming it does is trying to sell you something: either LLMs themselves, or the idea that they're more capable than those criticizing this technology.
"This LLM is able to capably output useful snippets of code for Python. That's useful."
and
"I tried to get an LLM to perform a niche task with a niche language, it performed terribly."
I think the right synthesis is that LLMs are useful for some tasks and not for others; practically, the skill is in knowing which is which.
Or, if we trust that LLMs are useful for all tasks, then it's practically useful to know what they're not good at.
The thing is that there's no way to objectively measure this. Benchmarks are often gamed, and, as a sibling comment mentioned, the output is not stable.
Also, everyone has different criteria for what constitutes "good". To someone with little to no programming experience, LLMs would feel downright magical. Experienced programmers, or any domain expert for that matter, would be able to gauge the output quality much more accurately. Even among the experienced group, there are different levels of quality criteria. Some might be fine with overlooking certain issues, or not bother checking the output at all, while others have much higher standards of quality.
The problem is when any issues that are pointed out get blamed on the user instead of the tool. Or, even worse, when the issues are acknowledged but excused with "this is just the way these tools work" [1,2]. It's blatant gaslighting that AI companies love to promote, for obvious reasons.
Sure. But isn't that a bit like one person liking VSCode and another liking Emacs? The first method of comparison I reach for isn't "what objective metrics do you have?" so much as "how do you use it?".
> > This is insulting to smart and capable engineers with many years of experience working with software.
> Experienced programmers, or any domain expert for that matter, would be able to gauge the output quality much more accurately.
My experience is that smart and capable engineers have varying opinions on things. "What their opinion is" is less interesting than "why they hold that opinion".
I would be surprised, though, if someone were to boast about their experience and skills, and claim they were unable to find any way to use LLMs effectively.
And even if it did, a certain degree of non-determinism is actually desirable. The most probable tokens might not be correct, and randomness is partly responsible for what humans interpret as "creativity". Even hallucinations are desirable in some applications (art, entertainment, etc.).
[1]: https://medium.com/google-cloud/is-a-zero-temperature-determ...
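To make the temperature point concrete, here's a minimal sketch (the vocabulary and logits below are made up for illustration, not taken from any real model) of how temperature turns next-token logits into a distribution to sample from. At temperature 0 you always get the most probable token; raising the temperature reintroduces the randomness described above.

    # Minimal sketch: temperature-scaled sampling over hypothetical logits.
    import math
    import random

    def sample_next_token(logits, temperature=1.0):
        """Pick a token index from logits, scaled by temperature."""
        if temperature == 0:
            # Greedy decoding: always take the most probable token.
            return max(range(len(logits)), key=lambda i: logits[i])
        scaled = [l / temperature for l in logits]
        m = max(scaled)
        exps = [math.exp(s - m) for s in scaled]  # softmax, shifted for numerical stability
        total = sum(exps)
        probs = [e / total for e in exps]
        return random.choices(range(len(logits)), weights=probs, k=1)[0]

    vocab = ["the", "a", "cat", "dog"]   # hypothetical vocabulary
    logits = [2.0, 1.5, 0.3, 0.1]        # hypothetical model outputs
    print(vocab[sample_next_token(logits, temperature=0)])    # always "the"
    print(vocab[sample_next_token(logits, temperature=1.2)])  # sampled; may differ between runs

The higher the temperature, the flatter the distribution, so lower-probability (and occasionally more "creative", or wrong) tokens get picked more often.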
I've used a tool to do a task today. I used a suction sandblasting machine to remove corrosion from a part.
Without the tool, had I wanted to remove the corrosion, I would've spent all day (if not more) scraping it with sandpaper (is that a tool too? With the skin of my hands, then?), tediously working away millimeter by millimeter.
With the machine, it took me about 3 minutes. It took 4-5 minutes of training to attain this level of expertise.
The worth of this machine is undeniable.
How is it that LLMs are nowhere near as undeniably efficient? I keep hearing people tell me how they will take everyone's job, but so far this looks like the first faceplant from all the big tech companies.
(Maybe second after Meta's VR stuff)
Formulaic, vague about results while making extraordinary claims, and always with the same upbeat tenor.
For example, people try to compare this LLM tech with the automation of the car manufacturing industry. That analogy is a terrible one, because machines build better cars and are much more reliable than humans.
LLMs don't build better software, they build bad software faster.
Also, as a tool, LLMs discourage understanding in a way that no other tool does.