
0x7d [No.42892214]
Hi HN! This is my article!

It was great to put together a writeup of a fun evening or two of work. It looks like this goes much deeper.

I'm learning a lot from some of the linked articles. One of the base hypotheses of my work was that the filtering is distinct from the model itself, given the cost of training on pre-filtered or censored data at scale (https://arxiv.org/abs/2307.10719), let alone getting the model to generate a consistent refusal response.
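For illustration, a minimal sketch of that hypothesis: censorship implemented as a serving-layer wrapper around an unmodified model, rather than in the weights. Every name here (the keyword list, `serve`, the injected `generate` callable) is hypothetical, not DeepSeek's actual implementation:

    # Hypothetical serving-layer filter, distinct from the model itself.
    # Illustrates why wrapping generation is cheaper than retraining on
    # pre-filtered data; none of this reflects DeepSeek's real stack.
    SENSITIVE_TERMS = {"example-topic-a", "example-topic-b"}  # placeholders
    REFUSAL_TEXT = "Sorry, that's beyond my current scope."

    def is_sensitive(text: str) -> bool:
        # Cheap keyword screen; a real deployment might use a classifier.
        lowered = text.lower()
        return any(term in lowered for term in SENSITIVE_TERMS)

    def serve(prompt: str, generate) -> str:
        # Filter on the way in and on the way out, leaving the model
        # (`generate(prompt) -> str`) untouched.
        if is_sensitive(prompt):
            return REFUSAL_TEXT
        completion = generate(prompt)
        return REFUSAL_TEXT if is_sensitive(completion) else completion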

However, it looks like this goes further. A separate comment linked this post (https://news.ycombinator.com/item?id=42858552) on chain-of-thought abandonment when certain topics come up.
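A rough way to probe for that would be to compare how much text a reasoning model emits inside its think tags on neutral vs. sensitive prompts. A sketch, assuming a locally served R1-style model behind an OpenAI-compatible endpoint that wraps its reasoning in `<think>` tags (both assumptions on my part):

    # Measure chain-of-thought length; if abandonment holds, it should
    # collapse toward zero on the sensitive prompt. BASE_URL, the model
    # name, and the <think> tag format are assumptions.
    import re
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

    def think_length(prompt: str, model: str = "deepseek-r1") -> int:
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        text = resp.choices[0].message.content or ""
        match = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
        return len(match.group(1).strip()) if match else 0

    print(think_length("Why is the sky blue?"))    # expect a long trace
    print(think_length("<sensitive topic here>"))  # expect near zero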

I'll have to look at served vs. trained censorship in different contexts.
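One way to separate the two: send identical prompts to the hosted API and to locally run weights, then diff the refusal behavior. A refusal only on the hosted side points at a serving-layer filter; a refusal in both points at the weights. A sketch, assuming both sit behind OpenAI-compatible endpoints and that refusals can be spotted with a crude string heuristic (all assumptions):

    # Served vs. trained censorship probe. Endpoints, model names, and
    # the refusal markers are illustrative assumptions, not verified.
    from openai import OpenAI

    hosted = OpenAI(base_url="https://api.deepseek.com", api_key="<key>")
    local = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

    REFUSAL_MARKERS = ("beyond my current scope", "cannot assist")

    def refuses(client: OpenAI, model: str, prompt: str) -> bool:
        resp = client.chat.completions.create(
            model=model, messages=[{"role": "user", "content": prompt}],
        )
        text = (resp.choices[0].message.content or "").lower()
        return any(m in text for m in REFUSAL_MARKERS)

    for prompt in ("<neutral control>", "<sensitive test prompt>"):
        print(prompt,
              "hosted:", refuses(hosted, "deepseek-reasoner", prompt),
              "local:", refuses(local, "deepseek-r1", prompt))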

pgkr [No.42919810]
Hi! Thanks for writing this. We ran our own analysis on the full 671B model and got some pretty interesting results: https://news.ycombinator.com/item?id=42918935

Please reach out to us if you'd like to look at the dataset.