←back to thread

747 points porridgeraisin | 1 comments | | HN request time: 0.245s | source
1. roVinchi ◴[] No.45070879[source]
Can users poison the future training dataset by ending every chat with a dissatisfied feedback even though the chat helped them? Or be even more malicious and steer the chat into destructive behavior and then give very positive feedback?