Can someone explain this in layman's terms?
replies(4):
The researchers build two feeds:
(1) a feed of the most popular tweets, based on likes, retweets, and the like
(2) an algorithmic feed that looks for clickbait in the text
They blend these in different proportions with a feed of random tweets that are neither popular nor clickbait, and find that feed (1) has the more damaging effect on chatbot performance. That is, they feed the blended tweets into the model, then ask the model to do things and get worse outcomes.
TL;DR from https://unrav.io/#view/8f20da5f8205c54b5802c2b623702569
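The blending procedure above can be sketched as a data-mixing sweep. This is only an illustrative toy, not the paper's actual pipeline: the feed names, sizes, and ratios are hypothetical stand-ins, and the "training" step is omitted.

```python
import random

def blend(junk, control, junk_ratio, n, seed=0):
    """Build a training mix of size n with the given proportion of junk tweets.

    junk: the "popular" or "clickbait" feed; control: random non-viral tweets.
    All names here are illustrative, not taken from the study.
    """
    rng = random.Random(seed)
    n_junk = int(n * junk_ratio)
    sample = rng.choices(junk, k=n_junk) + rng.choices(control, k=n - n_junk)
    rng.shuffle(sample)
    return sample

# Hypothetical corpora standing in for the scraped feeds.
popular = [f"viral tweet {i}" for i in range(100)]
control = [f"ordinary tweet {i}" for i in range(100)]

# Sweep the junk proportion, as the study varies the blend; in the real
# experiment each mix would be used to continue-train a model, which is
# then evaluated on downstream tasks.
for ratio in (0.0, 0.2, 0.5, 0.8, 1.0):
    mix = blend(popular, control, ratio, n=1000)
    n_junk = sum(1 for t in mix if t.startswith("viral"))
    print(f"junk ratio {ratio:.0%}: {n_junk}/1000 junk tweets in the mix")
```

The interesting measurement is then how downstream task scores move as the junk ratio rises, and whether the "popular" blend degrades them faster than the "clickbait" blend.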
Right: in the context of supervised learning, this statement is a good starting point. After all, how can one build a good supervised model without training it on good examples?
But even in that context, it isn't an incisive framing of the problem. Lots of supervised models are resilient to some kinds of error. A better question, I think, is: what kinds of errors, at what prevalence, tend to degrade performance, and why?
With LLMs and their ingestion pipelines, there is a lot more going on than pure supervised learning, so it seems reasonable to me that researchers would want to tease the problem apart.