451 points imartin2k | 16 comments
bsenftner ◴[] No.44479706[source]
It's like talking into a void. The issue with AI is that it is too subtle: it is too easy to get acceptable junk answers, and too few realize we've made a universal crib sheet. Software developers are included, and are perhaps one of the worst populations here, due to their extremely weak communication as a community. To be repeatedly successful with AI, one has to exert mental effort to prompt it effectively, but pretty much nobody is willing to even consider that. Attempts to discuss the language aspects of using an LLM get ridiculed as 'prompt engineering is not engineering' and dismissed, while that is exactly what it is: prompt engineering in a new software language, natural language, which the industry refuses to take seriously. It is in fact an extremely technical programming language, so subtle that few to none of you realize it, nor the power embodied by it within LLMs. They are incredible, and they are subtle, to the degree that the majority think they are a fraud.
replies(3): >>44479916 #>>44479955 #>>44480067 #
1. einrealist ◴[] No.44479916[source]
Isn't "Engineering" based on predictability, on repeatability?

LLMs are not very predictable. And that's not just true for the output. Each change to the model impacts how it parses and computes the input. For someone claiming to be a "Prompt Engineer", this cannot work. There are so many variables that are simply unknown to the casual user: training methods, the training set, biases, ...

If I get the feeling I am creating good prompts for Gemini 2.5 Pro, the next version might render those prompts useless. And that might get even worse with dynamic, "self-improving" models.

So when we talk about "Vibe coding", aren't we just doing "Vibe prompting", too?

replies(2): >>44479980 #>>44481626 #
2. oceanplexian ◴[] No.44479980[source]
> LLMs are not very predictable. And that's not just true for the output.

If you run an open source model from the same seed on the same hardware, it is completely deterministic. It will spit out the same answer every time. So it's not an issue with the technology, and there's nothing stopping you from writing repeatable prompts and prompting techniques.
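
Roughly what that looks like in practice, as a sketch: a local model with greedy decoding (the sampling-free equivalent of pinning the seed), using Hugging Face transformers and gpt2 purely as illustrative stand-ins.

    # Sketch: deterministic local generation. Assumes torch and transformers
    # are installed; the model name and prompt are only examples.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    torch.manual_seed(0)  # fix the RNG seed (matters if sampling is enabled)
    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    inputs = tok("The capital of France is", return_tensors="pt")
    out = model.generate(**inputs, do_sample=False, max_new_tokens=10)  # greedy decoding
    print(tok.decode(out[0]))  # identical output on every run on the same hardware/software stack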

replies(5): >>44480240 #>>44480288 #>>44480395 #>>44480523 #>>44480581 #
3. dimitri-vs ◴[] No.44480240[source]
Realistically, how many people do you think have the time, skills and hardware required to do this?
4. mafuy ◴[] No.44480288[source]
Who says the model stays the same and the seed isn't random at most of the companies running AI? Randomness has no drawback for them.
5. enragedcacti ◴[] No.44480395[source]
Predictable does not necessarily follow from deterministic. Hash algorithms, for instance, are valuable specifically because they are both deterministic and unpredictable.
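
A quick illustration of deterministic-but-unpredictable, using SHA-256 (the inputs are arbitrary examples):

    # Sketch: SHA-256 is fully deterministic, yet changing one character of the
    # input yields an unrelated digest; determinism does not imply predictability.
    import hashlib

    for text in ("write me a parser", "write me a parser."):
        digest = hashlib.sha256(text.encode()).hexdigest()
        print(repr(text), "->", digest[:16], "...")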

Relying on model, seed, and hardware to get "repeatable" prompts essentially reduces an LLM to a very lossy natural language decompression algorithm. What other reason would someone have for asking the same question over and over and over again with the same input? If that's a problem you need to solve, then you need a database, not a deterministic LLM.

6. CoastalCoder ◴[] No.44480523[source]
> If you run an open source model from the same seed on the same hardware they are completely deterministic.

Are you sure of that? Parallel scatter/gather operations may still be at the mercy of scheduling variances, due to some forms of computer math not being associative.
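
For example, floating-point addition is not associative, so summing the same numbers in a different order (as a parallel reduction may do) can change the result:

    # Sketch: the same three numbers summed in two different orders.
    a, b, c = 1e16, -1e16, 1.0
    print((a + b) + c)  # 1.0
    print(a + (b + c))  # 0.0, because -1e16 + 1.0 rounds back to -1e16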

replies(1): >>44482762 #
7. o11c ◴[] No.44480581[source]
By "unpredictability", we mean that AIs will return completely different results if a single word is changed to a close synonym, or an adverb or prepositional phrase is moved to a semantically identical location, etc. Very often this simple change will move you from "get the correct answer 90% of the time" (about the best that AIs can do) to "get the correct answer <10% of the time".

Whenever people talk about "prompt engineering", they're referring to randomly changing these kinds of things, in hopes of getting a query pattern where you get meaningful results 90% of the time.
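
A rough sketch of how one might quantify that sensitivity: run each near-synonymous variant many times and compare success rates. ask_model and is_correct below are hypothetical placeholders for whatever model call and answer checker you have.

    # Sketch: estimate how often each prompt variant yields a correct answer.
    def success_rate(prompt, ask_model, is_correct, trials=20):
        """Fraction of trials in which the model's answer passes is_correct()."""
        hits = sum(is_correct(ask_model(prompt)) for _ in range(trials))
        return hits / trials

    variants = [
        "List the prime factors of 360.",
        "Enumerate the prime factorization of 360.",  # one phrase swapped
    ]
    # for v in variants:
    #     print(v, success_rate(v, ask_model, is_correct))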

replies(1): >>44480868 #
8. bsenftner ◴[] No.44480868{3}[source]
What you're describing is exactly the subtle nature of LLMs I'm pointing at: that changing a single word to a close synonym is meaningful. Why and how it is meaningful gets pushback from the developer community; they somehow do not see this as a topic, a point of engineering proficiency. It is, but it requires an understanding of how LLMs encode and retrieve data.

The reason changing one word in a prompt to a close synonym changes the reply is that LLMs embed and recover information through the specific words used, in sequence. The 'in sequence' aspect is subtle and important. The same topic appears in the LLM many times, with different levels of treatment, from casual to academic. Each treatment uses different words, similar but different, and that difference is very meaningful: it signals how seriously the information is being handled. Using one term versus another causes a prompt to index into one treatment of the subject versus another. The more formal the terms used, meaning the synonyms used by experts in that area of knowledge, the more accurate the replies. Close synonyms instead generate replies from outsiders to that knowledge, those not using the same phrases as the experts, perhaps those still trying to understand but not quite there yet.
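
A loose way to see the "similar but different" point is to compare sentence embeddings of a formal versus a casual phrasing of the same instruction. This is only an analogy to what happens inside an LLM; the model name and sentences below are illustrative assumptions.

    # Sketch: near-synonymous phrasings sit close together in embedding space,
    # but not at the same point; that residual difference is what lets word
    # choice select one treatment of a topic over another.
    # Assumes the sentence-transformers package is installed.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")
    formal = "Administer the analgesic and monitor the patient's vital signs."
    casual = "Give them the painkiller and keep an eye on how they're doing."

    emb = model.encode([formal, casual])
    print(util.cos_sim(emb[0], emb[1]))  # high similarity, but not 1.0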

It is not randomly changing things in one's prompts at all. It is understanding the knowledge space one is prompting within, so one knows the correct formal terms that unlock accurate replies. Knowing that area also puts one in a better position to identify hallucination.

replies(2): >>44480995 #>>44481486 #
9. handfuloflight ◴[] No.44480995{4}[source]
Words are power, and specifically, specific words are power.
replies(1): >>44483216 #
10. noduerme ◴[] No.44481486{4}[source]
What you are describing is not natural language programming, it's the use of incantations discovered by accident or by trial and error. It's alchemy, not chemistry. That's what people mean when they say it's not reproducible. It's not reproducible according to any useful logical framework that could be generally applied to other cases. There may be some "power" in knowing magical incantations, but mostly it's going to be a series of parlor tricks, since neither you nor anyone else can explain why one prompt produces an algorithm that spits out value X whilst changing a single word to its synonym produces X*-1, or Q, or 14 rabbits. And if you could, why not just type the algorithm yourself?

Higher level programming languages may make choices for coders regarding lower level functionality, but they have syntactic and semantic rules that produce logically consistent results. Claiming that such rules exist for LLMs but are so subtle that only the ultra-enlightened such as yourself can understand them begs the question: If hardly anyone can grasp such subtlety, then who exactly are all these massive models being built for?

replies(1): >>44483198 #
11. atemerev ◴[] No.44482762{3}[source]
Sure. Just set the temperature to 0 in every model and see it become deterministic. Or use a fully deterministic PRNG like random123.
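
With a hosted API, that looks roughly like this; note that providers typically only promise best-effort determinism even then, since you don't control the hardware or batching (the model name is just an example):

    # Sketch: pinning down a hosted model as far as the API allows.
    # Assumes the openai package is installed and an API key is configured.
    from openai import OpenAI

    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Name three prime numbers."}],
        temperature=0,   # greedy-style decoding
        seed=1234,       # best-effort reproducibility on the provider's side
    )
    print(resp.choices[0].message.content)
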
12. bsenftner ◴[] No.44483198{5}[source]
You are being stubborn; the method is absolutely reproducible. Across models, of course not; that is not how they operate.

> It's not reproducible according to any useful logical framework that could be generally applied to other cases.

It absolutely is; you are refusing to accept that natural language contains this type of logical structure. You keep reaching for "magic incantation" allusions when it is simply that you do not understand. Plus, you're openly hostile to the idea that this is a subtle logic you are not seeing.

It is a simple mechanism: multiple people treat the same subjects differently, with different words. Those who are professional experts in an area tend to use the same words to describe their work. Use those words if you want the LLM to reply from their portion of the LLM's training. This is not any form of "magical incantation"; it is knowing what you are referencing by using the formal terminology.

This is not magic, nor is it some kind of elite knowledge. Drop your anger and just realize that it's subtle; it is difficult to see, that is all. Why this makes developers so angry is beyond me.

replies(1): >>44486544 #
13. bsenftner ◴[] No.44483216{5}[source]
Yes! Why do people get so angry about it? "Oh, you're saying I'm holding it wrong?!" Well, actually, yes. If you speak Pascal to Ruby you get syntax errors, and this is the same basic idea. If you talk sports to an LLM in shit-talking sports language, that's what you'll get back. Obvious, right? The same goes for anything formal, so why is it an insult to point that out?
replies(1): >>44483523 #
14. handfuloflight ◴[] No.44483523{6}[source]
For a subset of these detractors, it's that their investment and personal moat-building in learning syntax is now threatened with obsolescence by natural language programming. Now people with domain knowledge are able to become developers, whereas previously domain experts relied on syntax writers to translate their requirements into reality.

The syntax writers may say: "I do more than write syntax! I think in systems, logic, processes, limits, edge cases, etc."

The response to that is: you don't need syntax to do that, yet until now syntax was the barrier to technical expression.

So ironically, when they show anger it is a form of hypocrisy: they already know that knowing how to write specific words is power. They're just upset that the specific words that matter have changed.

15. noduerme ◴[] No.44486544{6}[source]
I'm not angry, I'm just extremely skeptical. If a programming language varied from version to version the way LLMs do, to the extent that the same input could have radically different consequences, no one would use it. Even if the "compiled code" of the LLM's output is proven to work, you will need to make changes in the "source code" of your higher level natural language. Again it's one thing to divorce memory management from logic; it's another to divorce logic from your desire for a working program. Without selecting the logic structures that you need and want, or understanding them, pretty much anything could be introduced to your code.

The point of coding, and what developers are paid for, is taking a vision of a final product which receives input and returns output, and making that perfectly consistent with the express desire of whoever is paying to build that system. Under all use cases. Asking questions about what should happen if a hundred different edge cases arise, before they do, is 99% of the job. Development is a job well suited to students of logic, poorly suited to memorizers and mathematicians, and obscenely ill suited to LLMs and those who attempt to follow the supposed reasoning that arises from gradient descent through a language's structure. Even in the best case scenario, edge case analysis will never be possible for AIs that are built like LLMs, because they demonstrate a lack of abstract thought.

I'm not hostile to LLMs so much as toward the implication that they do anything remotely similar to what we do as developers. But you're welcome to live in a fantasy world where they "make apps". I suppose it's always obnoxious to hear someone tout a quick way to get rich or to cook a turkey in 25 minutes, no knowledge required. Just do be aware that your internet fame and fortune will be no reflection on whether your method will actually work. Those of us in the industry are already acutely aware that it doesn't work, and that some folks are just leading children down a lazy pied piper's path rather than teaching them how to think. That's where the assumption comes from that anyone promoting what you're promoting is selling snake oil.

replies(1): >>44488967 #
16. bsenftner ◴[] No.44488967{7}[source]
This is the disconnect: nowhere do I say to use them to make apps. In fact, I am strongly opposed to their use for automation; they create Rube Goldberg machines. They are great advisors, not coders: critics of code and sounding boards for strategy, used while one writes one's own code to perform the logic constructed in one's head. It is possible and helpful to include LLMs within the decision-support roles that software provides for users, but not the decision roles: include LLMs as information resources for the people making decisions, but not as the agents of decision.

But all of that is an aside from the essential nature of using them: far too many use them to think for them, in place of their own thinking, and that is also a subtle aspect of LLMs. Using them to think for you damages your own ability to think critically. That's why understanding them is so important, so that one does not anthropomorphize and trust them, which is dangerous behavior. They are idiot savants, and they get that much trust: nearly none.

I also do not believe that LLMs are even remotely capable of anything close to what software engineers do. That's why I am a strong advocate of not using them to write code. Use them to help one understand, but know that the "understanding" they can offer is of limited scope. That's their weakness: they can't encompass scope. Detailed nuance they get, but give them two detailed nuances in a single phenomenon and they focus on one and drop the surrounding environment. They are idiots drawn to shiny complexity, with savant-like abilities. They are closer to a demonic toy for programmers than anything else we have.