
LLMs can get "brain rot"

(llm-brain-rot.github.io)
466 points tamnd | 2 comments
pixelmelt ◴[] No.45657074[source]
Isn't this just garbage in garbage out with an attention grabbing title?
replies(6): >>45657153 #>>45657205 #>>45657394 #>>45657412 #>>45657896 #>>45658420 #
wat10000 ◴[] No.45657205[source]
Considering that the current state of the art for LLM training is to feed it massive amounts of garbage (with some good stuff alongside), it seems important to point this out even if it might seem obvious.
replies(1): >>45657247 #
CaptainOfCoit ◴[] No.45657247[source]
I don't think anyone is throwing raw datasets into LLMs and hoping for high quality weights anymore. Nowadays most of the datasets are filtered one way or another, and some of them highly curated even.
replies(1): >>45657546 #
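To make the "filtered one way or another" point concrete: a minimal sketch of the kind of heuristic quality filter commonly applied to pre-training corpora. The scoring rules and threshold here are illustrative assumptions, not any lab's actual pipeline; real pipelines stack many such signals plus model-based classifiers.

```python
# Toy heuristic quality filter for a pre-training corpus.
# All thresholds/weights below are made up for illustration.

def quality_score(doc: str) -> float:
    """Crude score: penalize short docs and docs full of non-text symbols."""
    if not doc:
        return 0.0
    # Fraction of characters that are letters, digits, or whitespace.
    alnum_ratio = sum(c.isalnum() or c.isspace() for c in doc) / len(doc)
    # Reward longer documents, saturating at 50 words.
    length_score = min(len(doc.split()) / 50.0, 1.0)
    return alnum_ratio * length_score

def filter_corpus(docs, threshold=0.3):
    """Keep only documents whose heuristic score clears the threshold."""
    return [d for d in docs if quality_score(d) >= threshold]

docs = [
    "A well-formed paragraph explaining a concept in plain language with "
    "enough words to look like real prose, repeated across several clauses.",
    "$$!!@@ ###",  # junk: mostly symbols
    "ok",          # junk: far too short
]
kept = filter_corpus(docs)  # only the first document survives
```

The point of the thread stands either way: filters like this catch obvious junk, but they cannot judge whether a fluent-looking document is actually correct, which is where expert curation would come in.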
BoredPositron ◴[] No.45657546[source]
I doubt they are highly curated; you would need experts in every field to do so. Which gives me even more anxiety about LLM performance, because one of the most curated fields should be code...
replies(3): >>45657692 #>>45657999 #>>45659279 #
1. nradov ◴[] No.45657692[source]
OpenAI has been literally hiring human experts in certain targeted subject areas to write custom proprietary training content.
replies(1): >>45657779 #
2. BoredPositron ◴[] No.45657779[source]
I bet the dataset is mostly comprised of certain areas™.