Not a surprise. All the major players have reached the limits of training on existing data: they’re already training on essentially the whole internet plus a bunch of content they allegedly stole (hence the various lawsuits). There haven’t been any major breakthroughs in model architecture from them recently, so they’re now in a battle for more data to train on. They need data, they want YOUR data now, and they’re gonna do increasingly shady things to get it.