My point is, we can add all sorts of security measures but at the end of the day nothing is a replacement for user education and intention.
They managed to misalign an LLM into racism by fine-tuning it on a relatively small number of examples of malicious code.
Assuming teleological essentialism is real, where does the telos come from? How much of it comes from the creators? If there are other sources, what are they and what's the mechanism of transfer?
I don't know if it matters for this conversation, but my table saw is incredibly unsafe, yet I don't find myself becoming racist or antisemitic.
The base model was trained, in part, on mangled hands. Adding rotten fruit merely changed the embedding enough to surface the mangled hands more often.
(It may not even have changed the embeddings enough to surface the mangled hands; it may simply be a case of guardrails not being applied to fine-tuned models.)
So the analogy is more like a cabin door on a 737. Some yahoo could try to open it in flight, but that doesn't justify it spontaneously blowing out at altitude.
But the elephant in the room is: why are we perseverating over these silly dichotomies? If you've got a problem with an AI, why not just ask the AI? Can't it clean up after making a poopy?!
SawStop has been mired in patent squatting and/or industry pushback, depending on who you talk to, of course.
So there is some cause and influence from the model's biases, or its essence if you must, but the prompt plays an important role too. I believe it's important for companies to figure this out, but personally I'm not interested in this balance at all.
What I'm interested in is how I can use these models as an extension of myself. And I'm also interested in showing people around me how they could do the same.
In any case, this might be interesting for companies making tons of money, but for us, the general public, I think it's much more important to talk about education.
For the regular user, getting better output from a capable model is mostly a matter of changing the prompt. So it comes down to education.
Of course model bias plays a role. If you train a model on racist posts you'll get a racist model. But as long as you have a fairly capable model for the average use case, these edge cases aren't of interest to the user, who can just adjust their prompts.
So if you make the LLM spit out malware by crafting a prompt specifically to do that, it's not the fault of the model. It may be important for companies that profit from selling inference time to moderate output, but for us regular users it's completely tangential.
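To make the "just adjust your prompt" point concrete, here is a minimal sketch using the Hugging Face transformers pipeline. The model name and the two example prompts are placeholders I'm assuming for illustration, not anything specific to this thread:

```python
# Sketch: same model, two prompts with different amounts of user intent.
# Model name is an assumed placeholder; any small instruction-tuned model works.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Qwen/Qwen2.5-0.5B-Instruct",  # assumption, swap for whatever you have locally
)

vague_prompt = "Write a function that checks a password."
specific_prompt = (
    "Write a Python function that checks a password against a bcrypt hash. "
    "Do not log or store the plaintext password, and return only True or False."
)

for prompt in (vague_prompt, specific_prompt):
    out = generator(prompt, max_new_tokens=200, do_sample=False)
    print("PROMPT:", prompt)
    print(out[0]["generated_text"])
    print("-" * 60)
```

Same weights both times; the only thing that changed is how much intent the user put into the prompt, which is exactly the education gap being argued about here.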