586 points mizzao | 2 comments
k__ ◴[] No.40666893[source]
I played around with Amazon Q and while setting it up, I needed to create an IAM identity center.

Never did this before, so I was asking Q in the AWS docs how to do it.

It refused to help, as it didn't answer security related questions.

thank.

replies(7): >>40666950 #>>40667091 #>>40667339 #>>40669069 #>>40669289 #>>40669327 #>>40671251 #
menacingly ◴[] No.40667091[source]
it’s similar when asking the gemini-1.5 models coding questions that involve auth

one of my questions about a login form also tripped a harassment flag

replies(1): >>40667279 #
michaelt ◴[] No.40667279[source]
I suspect the refusal to answer questions about auth isn't a matter of hacking or offensive material.

I suspect instead the people training these models have identified areas of questioning where their model is 99% right, but because the 1% wrong is incredibly costly they dodge the entire question.

Would you want your LLM to give out any legal advice, or medical advice, or can-I-eat-this-mushroom advice, if you knew that, due to imperfections in your training process, it sometimes recommended people put glue in their pizza sauce?

replies(1): >>40667649 #
TeMPOraL ◴[] No.40667649[source]
"If you can't take a little bloody nose, maybe you ought to go back home and crawl under your bed. It's not safe out here. It's wondrous, with treasures to satiate desires both subtle and gross... but it's not for the timid."

So sure, the LLM occasionally pranks someone, in a way similar to how random Internet posts do; it is confidently wrong, in a way similar to how most text on the Internet is confidently wrong, because content marketers don't give a damn about correctness; that's not what the text is there for. As much as this state of things pains me, the general population has mostly adapted.

Meanwhile, people who would appreciate a model that's 99% right on things where the 1% is costly, rightfully continue to ignore Gemini and other models by companies too afraid to play in the field for real.

replies(2): >>40667683 #>>40667933 #
pjc50 ◴[] No.40667933[source]
The only underlying question here is "who is liable for the output of the LLM?"

I just don't think the current "nobody is" solution is going to last in today's litigious environment.

replies(2): >>40668091 #>>40670649 #
1. raxxorraxor ◴[] No.40670649[source]
The person who prompts would be responsible. Nothing else really makes sense. This is usually the trivial solution for any form of tool we use.
replies(1): >>40671186 #
2. wumbo ◴[] No.40671186[source]
If there’s going to be a lawsuit, go after Colt before Anthropic.