321 points jhunter1016 | 5 comments
mikeryan ◴[] No.41878605[source]
Technical AI and LLMs are not something I’m well versed in, so as I sit on the sidelines and watch the current proliferation of AI startups, I’m starting to wonder where the moats are, outside of access to raw computing power. OpenAI seemed to have a massive lead in this space, but that lead seems to be shrinking every day.
replies(10): >>41878784 #>>41878809 #>>41878843 #>>41880703 #>>41881606 #>>41882000 #>>41885618 #>>41886010 #>>41886133 #>>41887349 #
weberer ◴[] No.41878784[source]
Obtaining high quality training data is the biggest moat right now.
replies(2): >>41882699 #>>41883992 #
segasaturn ◴[] No.41882699[source]
Where are they going to get that data? Everything on the open web after 2023 is polluted with low-quality AI slop that poisons the data sets. My prediction: Aggressive dragnet surveillance of users. As in, Google recording your phone calls on Android, Windows sending screen recordings from Recall to OpenAI, Meta training off WhatsApp messages... It sounds dystopian, but the Line Must Go Up.
replies(3): >>41883095 #>>41883850 #>>41885531 #
1. jazzyjackson ◴[] No.41883095[source]
I'm really curious whether Microsoft will ever give in to the urge to train on private business data. Since transitioning Office to O365, they hold the world's, and even governments', Word documents and emails. I'm pretty sure they've promised never to touch it, but they can certainly read it, so... Information wants to be free.
replies(3): >>41883349 #>>41886762 #>>41898416 #
2. jhickok ◴[] No.41883349[source]
Microsoft "trains" on business data already, but typically for things like fine-tuning security automation and recognizing malicious signals. It would be a big step from there to reading chats and email and feeding them into a model.
3. ENGNR ◴[] No.41886762[source]
Slack tried it but the backlash got them, this time anyway.
replies(1): >>41890399 #
4. bossyTeacher ◴[] No.41890399[source]
Elaborate? Never heard of this.
5. wkat4242 ◴[] No.41898416[source]
Tbh it would also make the model much more accurate for corporate uses. It's not a bad idea for that reason.

But security and DLP teams will never accept it.