Uncensor any LLM with abliteration

(huggingface.co)

586 points mizzao | 3 comments | 13 Jun 24 03:42 UTC

k__ ◴[13 Jun 24 07:21 UTC] No.40666893[source]▶

>>40665721 (OP) #

I played around with Amazon Q and while setting it up, I needed to create an IAM identity center.

Never did this before, so I was asking Q in the AWS docs how to do it.

It refused to help, as it didn't answer security related questions.

Thanks.

replies(7): >>40666950 #>>40667091 #>>40667339 #>>40669069 #>>40669289 #>>40669327 #>>40671251 #

lhl ◴[13 Jun 24 08:46 UTC] No.40667339[source]▶

>>40666893 #

I believe Amazon Q is running on Amazon's own Titan G1 model. I recently ran the "Premier" version (their highest end one) through my personal vibecheck test and was quite surprised by its RL. It was the only non-Chinese model I've tested to refuse to answer about Tiananmen Square and the only model I believe I've tested with this eval (over 50 at this point) that refused to answer about the LA riots. It also scored an impressive 0/6 on my reasoning/basic world understanding tests (underperforming most 3B models) but that's more capabilities than RL...

Amazon claims the Titan model is suitable for: "Supported use cases: RAG, agents, chat, chain of thought, open-ended text generation, brainstorming, summarization, code generation, table creation, data formatting, paraphrasing, rewriting, extraction, and Q&A." (it is not, lol)

replies(1): >>40668902 #

1. malfist ◴[13 Jun 24 12:35 UTC] No.40668902[source]▶

>>40667339 #

It is Titan under the hood. And it's absolutely crap.

Also fun fact, Titan's image generator will refuse any prompt that references Bezos because it "violates content policy"

If you want to do something useful on Bedrock, use Claude.

replies(1): >>40669165 #

2. lhl ◴[13 Jun 24 13:06 UTC] No.40669165[source]▶

>>40668902 (TP) #

I've been poking around this week and there's actually quite a few useful models on Bedrock (this is region dependent!) https://docs.aws.amazon.com/bedrock/latest/userguide/models-...

Claude Opus is supposedly only available in us-west-2, but is listed as "Unavailable" for me (Sonnet and Haiku are available). Cohere's Command R+ is also available and while less capable, for instruction following, I believe it's superior to Anthropic's models. There's also Llama 3 70B Instruct and Mistral Large, both of which are good for general tasks.

For those that haven't been closely following/testing the models available, I think Artificial Analysis' Quality vs Price charts aren't too bad a place to start https://artificialanalysis.ai/models although if you have specific tasks, it's best to eval, since some models are surprisingly good/bad at specific things.

Titan appears to be bad at everything though.

replies(1): >>40670561 #

3. spmurrayzzz ◴[13 Jun 24 15:00 UTC] No.40670561[source]▶

>>40669165 #

> Cohere's Command R+ is also available and while less capable, for instruction following, I believe it's superior to Anthropic's models

My experience recently is that it's actually noticeably better for instruction following than Claude, but can be finicky if you're not careful about adhering to the prompt template. But between the RAG and multi-step tool use capabilities, even if it was slightly worse on the instruction-following side of things I'd still say, as you do, that it's much better than Claude on average.

Agree on Titan as well. I recently was forced into a meeting with our AWS TAM, and they kept shoehorning Q into every conversation. I held my tongue knowing that Titan was the model powering it under the hood.
