    321 points jhunter1016 | 19 comments
    1. idunnoman1222 ◴[] No.41881647[source]
    Yes, they already collected all the data. That same data has since had walls put up around it.
    replies(4): >>41881678 #>>41882077 #>>41882200 #>>41882333 #
    2. Implicated ◴[] No.41881678[source]
    While I recognize this, I have to assume that the other "big players" already have this same data, i.e., anyone with a search engine that's been crawling the web for decades. New entrants to the race? Not so much, given the new walls and such.
    replies(1): >>41881958 #
    3. ◴[] No.41881958[source]
    4. ugh123 ◴[] No.41882077[source]
    Which data? Is that data that Google and/or Meta can't get or don't already have?
    replies(2): >>41883016 #>>41883115 #
    5. throwup238 ◴[] No.41882200[source]
    Most of the relevant data is still in the Common Crawl archives, up until people started explicitly opting out of it in the last couple of years.
    6. lolinder ◴[] No.41882333[source]
    That gives the people who've already started an advantage over newcomers, but it's not a unique advantage to OpenAI.

    The question really should be what if anything gives OpenAI an advantage over Anthropic, Google, Meta, or Amazon? There are at least four players intent on eating OpenAI's market share who already have models in the same ballpark as OpenAI. Is there any reason to suppose that OpenAI keeps the lead for long?

    replies(1): >>41882694 #
    7. XenophileJKO ◴[] No.41882694[source]
    I think their current advantage is willingness to risk public usage of frontier technology. This has been, and I predict will be, their unique dynamic. It forced the entire market to react, but they are still reacting reluctantly. I just played with Gemini this morning, for example, and it won't make an image with a person in it at all. I think that is all you need to know about most of the competition.
    replies(1): >>41882907 #
    8. lolinder ◴[] No.41882907{3}[source]
    How about Anthropic?
    replies(3): >>41883041 #>>41883457 #>>41884282 #
    9. jazzyjackson ◴[] No.41883016[source]
    Well, at this point most new data being created is conversations with ChatGPT, seeing as how Stack Overflow and Reddit are increasingly useless, so their conversation logs are their moat.
    replies(2): >>41883911 #>>41884055 #
    10. jazzyjackson ◴[] No.41883041{4}[source]
    Aren't they essentially run by safetyists? So they would be less willing to release a model that pushes the boundaries of capability and agency.
    replies(1): >>41884000 #
    11. charlieyu1 ◴[] No.41883115[source]
    AI companies have been paying people to create new data for a while.
    replies(1): >>41883947 #
    12. llm_trw ◴[] No.41883457{4}[source]
    As an AI model I can't comment on this claim.
    13. staticautomatic ◴[] No.41883911{3}[source]
    There’s tons of human-created data the AI companies aren’t using yet.
    14. ugh123 ◴[] No.41883947{3}[source]
    Do you mean by RLHF? If so, that's not 'data' used by the model in the traditional sense.
    15. caeril ◴[] No.41884000{5}[source]
    From what I've seen, Claude Sonnet 3.5 is decidedly less "safe" than GPT-4o, by the relatively new politicized understanding of "safety".

    Anthropic takes safety to mean "let's not teach people how to build thermite bombs, engineer grey goo nanobots, or genome-targeted viruses", which is the traditional futurist concern with AI safety.

    OpenAI and Google safety teams are far more concerned with revising history, protecting egos, and coddling the precious feelings of their users. As long as no fee-fees are hurt, it's full speed ahead to paperclip maximization.

    replies(2): >>41884407 #>>41885488 #
    16. sangnoir ◴[] No.41884055{3}[source]
    > so their conversation logs are their moat

    Google and Meta aren't exactly lacking in conversation data: Facebook, Messenger, Instagram, Google Talk, Google Groups, Google Plus, Blogspot comments, YouTube transcripts, etc. The breadth and depth of data those two companies are sitting on, going back years, is mind-boggling.

    17. XenophileJKO ◴[] No.41884282{4}[source]
    I think Anthropic is a serious technical competitor and I personally use their product more than OpenAI's, BUT again I think their corporate cautiousness will have them always +/- a small delta from OpenAI's models. I just don't see them taking the risk of releasing a step-function model before OpenAI or another competitor. I would love to be proven wrong. I am a little curious whether the market pressures are getting to them, since they updated their "Responsible Scaling Policy".
    18. walleeee ◴[] No.41884407{6}[source]
    Not to dispute your particular comment, which I think is right, but it's worth pointing out we're full steam ahead on paperclips regardless of any AI company. This has been true for some 300 years, or longer depending on how flexible we are with definitions and where we locate inflection points.
    19. derektank ◴[] No.41885488{6}[source]
    This has not been my experience. Twice in the last week I've had Claude refuse to answer questions: once about a specific racial separatist group (nothing about their ideology, just their name and facts about their membership) and once about unconventional ways to assess job candidates. Both times I turned to ChatGPT and it gave me an answer immediately.