755 points MedadNewman | 2 comments
femto ◴[] No.42892058[source]
This bypasses the overt censorship on the web interface, but it does not bypass the second, more insidious, level of censorship that is built into the model.

https://news.ycombinator.com/item?id=42825573

https://news.ycombinator.com/item?id=42859947

Apparently the model will abandon its "Chain of Thought" (CoT) for certain topics and instead produce a canned response. This effect was the subject of the article "1,156 Questions Censored by DeepSeek", which appeared on HN a few days ago.

https://news.ycombinator.com/item?id=42858552

Edit: fix the last link
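The abandoned-CoT effect is easy to probe for locally. As a minimal sketch, assume the R1 convention of wrapping its reasoning in `<think>...</think>` tags: a canned, censored reply typically arrives with an empty think block, so a trivial check can flag suspect responses. The tag format and the sample strings below are illustrative assumptions, not taken from the thread.

```python
import re

# Assumption: R1-style models wrap chain-of-thought in <think>...</think>.
# A canned (censored) reply typically carries an empty think block.
def cot_is_abandoned(response: str) -> bool:
    """Return True if the response contains no chain-of-thought."""
    m = re.search(r"<think>(.*?)</think>", response, re.DOTALL)
    return m is None or m.group(1).strip() == ""

# Hypothetical sample outputs, for illustration only.
normal = "<think>The user asks about X, so I should...</think>Here is my answer."
canned = "<think>\n\n</think>Sorry, I can't help with that."

print(cot_is_abandoned(normal))  # False
print(cot_is_abandoned(canned))  # True
```

Running a battery of prompts through a check like this is essentially how the "1,156 Questions Censored by DeepSeek" style of analysis separates canned refusals from genuine reasoning.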

replies(10): >>42892216 #>>42892648 #>>42893789 #>>42893794 #>>42893914 #>>42894681 #>>42895397 #>>42896346 #>>42896895 #>>42903388 #
pgkr ◴[] No.42893914[source]
Correct. The bias is baked into the weights of both V3 and R1, even in the largest 671B-parameter model. We're currently conducting analysis of the 671B model running locally to cut through the speculation, and we're seeing interesting biases, including differences between V3 and R1.

Meanwhile, we've released the first part of our research including the dataset: https://news.ycombinator.com/item?id=42879698

replies(2): >>42896337 #>>42900659 #
1. mmazing ◴[] No.42900659[source]
I have not found any censorship running it on my local computer.

https://imgur.com/xanNjun

replies(1): >>42919009 #
2. pgkr ◴[] No.42919009[source]
We conducted further research on the full-sized 671B model, which you can read here: https://news.ycombinator.com/item?id=42918935

If you ran it on your computer, then it wasn't R1. It's a very common misconception. What you ran was actually either a Qwen or LLaMA model fine-tuned to behave more like R1. We have a more detailed explanation in our analysis.
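A back-of-envelope sketch shows why the full 671B-parameter R1 is out of reach for a typical home machine, and why local runs are almost always one of the smaller distilled models instead. The bytes-per-parameter figures are standard quantization sizes; the 7B distill size is used here purely for illustration.

```python
def weights_gib(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to hold the weights, in GiB."""
    return n_params * bytes_per_param / 2**30

R1_PARAMS = 671e9  # full DeepSeek R1, per the thread

# Even aggressively quantized, the full model needs hundreds of GiB:
print(f"R1 @ 8-bit: {weights_gib(R1_PARAMS, 1.0):.0f} GiB")  # ~625 GiB
print(f"R1 @ 4-bit: {weights_gib(R1_PARAMS, 0.5):.0f} GiB")  # ~312 GiB

# A 7B distill, by contrast, fits on consumer hardware:
print(f"7B @ 4-bit: {weights_gib(7e9, 0.5):.1f} GiB")  # ~3.3 GiB
```

Since no consumer GPU (or RAM budget) holds 300+ GiB of weights, anything that runs comfortably on a desktop is, by elimination, one of the fine-tuned Qwen/LLaMA distills rather than R1 itself.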