451 points by imartin2k | 4 comments
bsenftner ◴[] No.44479706[source]
It's like talking into a void. The issue with AI is that it is too subtle: it is too easy to get acceptable junk answers, and too few realize we've made a universal crib sheet. Software developers may be one of the worst populations in this respect, given how weak their communication is as a community. To be repeatedly successful with AI, one has to exert mental effort to prompt it effectively, but pretty much nobody is willing to even consider that. Attempts to discuss the language aspects of using an LLM get ridiculed as 'prompt engineering is not engineering' and dismissed, while that is exactly what it is: prompt engineering in a new software language, natural language, which the industry refuses to take seriously. It is in fact an extremely technical programming language, so subtle that few to none of you realize it, nor the power embodied within LLMs through it. They are incredible, and they are subtle, to the degree that the majority think they are a fraud.
replies(3): >>44479916 #>>44479955 #>>44480067 #
einrealist ◴[] No.44479916[source]
Isn't "Engineering" is based on predictability, on repeatability?

LLMs are not very predictable. And that's not just true for the output. Each change to the model impacts how it parses and computes the input. For someone claiming to be a "Prompt Engineer", this cannot work. There are so many variables that are simply unknown to the casual user: training methods, the training set, biases, ...

If I get the feeling I am creating good prompts for Gemini 2.5 Pro, the next version might render those prompts useless. And that might get even worse with dynamic, "self-improving" models.

So when we talk about "Vibe coding", aren't we just doing "Vibe prompting", too?

replies(2): >>44479980 #>>44481626 #
oceanplexian ◴[] No.44479980[source]
> LLMs are not very predictable. And that's not just true for the output.

If you run an open-source model from the same seed on the same hardware, it is completely deterministic: it will spit out the same answer every time. So it's not an issue with the technology, and there's nothing stopping you from writing repeatable prompts and prompting techniques.
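
A minimal sketch of that claim, assuming a local open-weights model loaded through Hugging Face transformers ("gpt2" here is just a stand-in for whatever model you run): with the sampling seed fixed, and the hardware and library versions held constant, generation is repeatable.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "gpt2"  # stand-in for any local open-weights model
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    prompt = "Explain what a hash map is in one sentence."
    inputs = tok(prompt, return_tensors="pt")

    def generate(seed: int) -> str:
        torch.manual_seed(seed)          # fix the RNG used for sampling
        out = model.generate(
            **inputs,
            do_sample=True,              # sampling, not greedy decoding
            temperature=0.8,
            max_new_tokens=40,
            pad_token_id=tok.eos_token_id,
        )
        return tok.decode(out[0], skip_special_tokens=True)

    # Same seed, same hardware, same library versions -> same completion.
    assert generate(42) == generate(42)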

replies(5): >>44480240 #>>44480288 #>>44480395 #>>44480523 #>>44480581 #
o11c ◴[] No.44480581[source]
By "unpredictability", we mean that AIs will return completely different results if a single word is changed to a close synonym, or an adverb or prepositional phrase is moved to a semantically identical location, etc. Very often this simple change will move you from "get the correct answer 90% of the time" (about the best that AIs can do) to "get the correct answer <10% of the time".

Whenever people talk about "prompt engineering", they're referring to randomly changing these kinds of things, in hopes of getting a query pattern where you get meaningful results 90% of the time.
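
One way to make that concrete is a small harness that measures how often each phrasing of a prompt gets a correct answer. The sketch below assumes the OpenAI Python client; the model name, the prompts, and the check_answer() scoring rule are all placeholders, and any chat API could be substituted.

    import collections
    from openai import OpenAI  # pip install openai

    client = OpenAI()  # assumes OPENAI_API_KEY is set

    # Two phrasings that differ only by a near-synonym ("sorted" vs "ordered").
    variants = {
        "sorted":  "State the asymptotic time complexity of binary search on a sorted array.",
        "ordered": "State the asymptotic time complexity of binary search on an ordered array.",
    }

    def check_answer(text: str) -> bool:
        # Hypothetical scoring rule: accept any mention of logarithmic time.
        return "log" in text.lower()

    def ask(prompt: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # stand-in model name
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    trials = 20
    scores = collections.Counter()
    for name, prompt in variants.items():
        for _ in range(trials):
            if check_answer(ask(prompt)):
                scores[name] += 1

    for name in variants:
        print(f"{name}: {scores[name]}/{trials} correct")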

replies(1): >>44480868 #
bsenftner ◴[] No.44480868{3}[source]
What you're describing is exactly the subtle nature of LLMs I'm pointing at: that changing a single word to a close synonym is meaningful. Why and how it is meaningful gets pushback from the developer community; they somehow do not see this as a topic at all, as a point of engineering proficiency. It is one, but it requires an understanding of how LLMs encode and retrieve information.

The reason changing one word in a prompt to a close synonym changes the reply is that information is embedded and recovered by LLMs through the specific words used, in sequence. The 'in sequence' aspect is subtle and important. The same topic is present in the LLM many times over, with different levels of treatment, from casual to academic. Each treatment uses different words: similar words, but different, and that difference is meaningful, because it reflects how seriously the information is being handled. Using one term rather than another causes a prompt to index into one treatment of the subject rather than another. The more formal the terms, meaning the synonyms used by experts in that area of knowledge, the more accurate the replies. Close but casual synonyms instead pull replies from outsiders to that knowledge, people who do not use the same phrases as those with the most expertise, perhaps people who are still trying to understand it themselves (see the sketch after this comment).

It is not randomly changing things in one's prompts at all. It is understanding the knowledge space one is prompting within well enough to know the formal terms that unlock accurate replies. And knowing that area, one is also in a better position to identify hallucinations.
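
A toy way to see the synonym effect, as a sketch only (sentence-transformers embeddings are a rough proxy for how an LLM relates phrasings, and "all-MiniLM-L6-v2" is just a convenient small model): near-synonymous phrasings of the same question sit at measurably different distances from a textbook-style treatment of the topic.

    # pip install sentence-transformers
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")

    # The same question phrased with expert terminology vs. casual wording.
    expert = "What is the amortized time complexity of dynamic array insertion?"
    casual = "How long does it usually take to add stuff to a growable list?"
    reference = "Amortized analysis of dynamic array append operations."

    emb = model.encode([expert, casual, reference], convert_to_tensor=True)

    # Cosine similarity of each phrasing to the formal, textbook-style reference.
    print("expert vs reference:", float(util.cos_sim(emb[0], emb[2])))
    print("casual vs reference:", float(util.cos_sim(emb[1], emb[2])))
    # The expert phrasing typically lands closer to the formal treatment,
    # which is the "indexing into a different treatment" effect described above.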

replies(2): >>44480995 #>>44481486 #
1. noduerme ◴[] No.44481486{4}[source]
What you are describing is not natural language programming, it's the use of incantations discovered by accident or by trial and error. It's alchemy, not chemistry. That's what people mean when they say it's not reproducible. It's not reproducible according to any useful logical framework that could be generally applied to other cases. There may be some "power" in knowing magical incantations, but mostly it's going to be a series of parlor tricks, since neither you nor anyone else can explain why one prompt produces an algorithm that spits out value X whilst changing a single word to its synonym produces X*-1, or Q, or 14 rabbits. And if you could, why not just type the algorithm yourself?

Higher level programming languages may make choices for coders regarding lower level functionality, but they have syntactic and semantic rules that produce logically consistent results. Claiming that such rules exist for LLMs but are so subtle that only the ultra-enlightened such as yourself can understand them begs the question: If hardly anyone can grasp such subtlety, then who exactly are all these massive models being built for?

replies(1): >>44483198 #
2. bsenftner ◴[] No.44483198[source]
You are being stubborn; the method is absolutely reproducible within a given model. Across models, of course not, because that is not how they operate.

> It's not reproducible according to any useful logical framework that could be generally applied to other cases.

It absolutely is; you are refusing to accept that natural language contains this type of logical structure. You keep projecting "magic incantation" allusions onto it, when it is simply something you do not understand. Plus, you're openly hostile to the idea that this is a subtle logic you are not seeing.

It is a simple mechanism: different people treat the same subjects differently, with different words. Those who are professional experts in an area tend to use the same words to describe their work. Use those words if you want the LLM to reply from their portion of its training (a small sketch follows this comment). This is not any form of "magical incantation"; it is knowing what you are referencing by using the formal terminology.

This is not magic, nor is it some kind of elite knowledge. Drop your anger and just realize that it is subtle and difficult to see, that's all. Why this makes developers so angry is beyond me.
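
A small sketch of that suggestion (the OpenAI client and model name are stand-ins for any chat LLM; the prompts are illustrative): the same underlying question asked in casual wording and in the terminology a practitioner would use.

    from openai import OpenAI  # pip install openai

    client = OpenAI()  # assumes OPENAI_API_KEY is set

    # Same question; the second uses the field's own vocabulary.
    casual = "Why does my website sometimes show old data after I change it?"
    formal = ("Why can a client observe stale reads after a write in an "
              "eventually consistent, cache-backed web application?")

    for label, prompt in (("casual", casual), ("formal", formal)):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # stand-in model name
            messages=[{"role": "user", "content": prompt}],
        )
        print(f"--- {label} ---")
        print(resp.choices[0].message.content)

The claim above is that the second phrasing tends to surface the treatment written by people who work on caching and consistency, not that either call is deterministic.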

replies(1): >>44486544 #
3. noduerme ◴[] No.44486544[source]
I'm not angry; I'm just extremely skeptical. If a programming language varied from version to version the way LLMs do, to the extent that the same input could have radically different consequences, no one would use it. Even if the "compiled code" of the LLM's output is proven to work, you will still need to make changes in the "source code" of your higher-level natural language. Again, it's one thing to divorce memory management from logic; it's another to divorce logic from your desire for a working program. Without selecting the logic structures that you need and want, or understanding them, pretty much anything could be introduced into your code.

The point of coding, and what developers are paid for, is taking a vision of a final product which receives input and returns output, and making that perfectly consistent with the express desire of whoever is paying to build that system. Under all use cases. Asking questions about what should happen if a hundred different edge cases arise, before they do, is 99% of the job. Development is a job well suited to students of logic, poorly suited to memorizers and mathematicians, and obscenely ill suited to LLMs and those who attempt to follow the supposed reasoning that arises from gradient descent through a language's structure. Even in the best case scenario, edge case analysis will never be possible for AIs that are built like LLMs, because they demonstrate a lack of abstract thought.

I'm not hostile to LLMs so much as to the implication that they do anything remotely similar to what we do as developers. But you're welcome to live in a fantasy world where they "make apps". I suppose it's always obnoxious to hear someone tout a quick way to get rich or to cook a turkey in 25 minutes, no knowledge required. Just be aware that your internet fame and fortune will be no reflection on whether your method actually works. Those of us in the industry are already acutely aware that it doesn't, and that some folks are leading children down a lazy pied piper's path rather than teaching them how to think. That's where the assumption comes from that anyone promoting what you're promoting is selling snake oil.

replies(1): >>44488967 #
4. bsenftner ◴[] No.44488967{3}[source]
This is the disconnect: nowhere do I say to use them to make apps. In fact, I am strongly opposed to their use for automation; they create Rube Goldberg machines. But they are great advisors: not coders, but critics of code and sounding boards for strategy (see the sketch at the end of this comment), to be consulted while one writes one's own code to implement the logic one constructed in one's head. It is possible and helpful to include LLMs in the decision-support roles that software provides for users, but not in the decision roles: include LLMs as information resources for the people making decisions, not as the agents of decision.

But all of that is an aside from the essential issue with using them, which is that far too many people use them to think for them, in place of their own thinking, and that is another subtle aspect of LLMs: using them to think for you damages your ability to think critically. That's why understanding them is so important, so one does not anthropomorphize them into something to trust, which is a dangerous behavior. They are idiot savants, and they get that much trust: nearly none.

I also do not believe that LLMs are even remotely capable of anything close to what software engineers do. That's why I am a strong advocate of not using them to write code. Use them to help one understand, but know that the "understanding" they can offer is of limited scope. That's their weakness: they can't encompass scope. Detailed nuance they get, but give them two detailed nuances in a single phenomenon and they focus on one and drop the surrounding environment. They are idiots drawn to shiny complexity, with savant-like abilities. They are closer to a demonic toy for programmers than anything else we have.
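
A minimal sketch of the advisory use described above, assuming the OpenAI Python client (the model name, prompt wording, and sample code are all placeholders): ask the model to critique code you already wrote, and explicitly forbid it from rewriting the code for you.

    from openai import OpenAI  # pip install openai

    client = OpenAI()  # assumes OPENAI_API_KEY is set

    MY_CODE = """
    def retry(fn, attempts=3):
        for i in range(attempts):
            try:
                return fn()
            except Exception:
                pass
    """

    review = client.chat.completions.create(
        model="gpt-4o-mini",  # stand-in model name
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a code critic. Point out risks, missing edge cases, "
                    "and questionable design decisions. Do not rewrite the code."
                ),
            },
            {"role": "user", "content": MY_CODE},
        ],
    )
    print(review.choices[0].message.content)
    # The human still decides what, if anything, to change.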