    747 points porridgeraisin | 11 comments
    1. Deegy No.45064530
    I guess I'll take the other side of what most are arguing in this thread.

    Isn't it a great thing for us to collectively allow LLMs to train on past conversations? LLMs probably won't get significantly better without this data.

    That said I do recognize the risk of only a handful of companies being responsible for something as important as the collective knowledge of civilization.

    Is the long-term solution self-custody? Organizations or individuals could use and train models locally in order to protect their learnings and distribute them internally. Of course, costs would have to come down a ridiculous amount for this to be feasible.
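
    For inference at least, self-custody already works today. A minimal sketch using Hugging Face transformers; the checkpoint name is just an example of an open-weights model, and anything you can host locally works the same way:

      # Minimal sketch: run an open-weights model locally so prompts and
      # outputs never leave your machine. The checkpoint name is only an
      # example; device_map="auto" requires the accelerate package.
      from transformers import AutoModelForCausalLM, AutoTokenizer

      model_name = "mistralai/Mistral-7B-Instruct-v0.2"  # illustrative open-weights model
      tokenizer = AutoTokenizer.from_pretrained(model_name)
      model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

      prompt = "Summarize these internal design notes: ..."
      inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
      outputs = model.generate(**inputs, max_new_tokens=200)
      print(tokenizer.decode(outputs[0], skip_special_tokens=True))

    Training (or even fine-tuning) locally is where the cost problem really bites; inference on commodity hardware is the easy half.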

    replies(8): >>45064563 #>>45064781 #>>45064999 #>>45065881 #>>45066363 #>>45068149 #>>45069438 #>>45072552 #
    2. monsieurbanana No.45064563
    You mean collectively allowing us to train Anthropic's LLM? Pretty big omission there.
    replies(1): >>45064621 #
    3. Deegy No.45064621
    I believe I addressed that in my third paragraph?

    It does suck that there are only a few companies with enough resources to offer these models. But it's hard to escape the power laws.

    I'm hoping that costs come down to the point where these things are basically a commodity with thousands of providers.

    replies(1): >>45066543 #
    4. freejazz No.45064781
    > LLM's probably won't get significantly better without this data.

    Yeah, and Facebook couldn't scale without ignoring the harms it causes people. Should we just let that be? Society seems to think so, but I don't think it's a good idea at all.

    5. jimbokun No.45064999
    It's not clear that most people will benefit from LLMs getting significantly better. It's looking more like a net negative.
    6. cowpig No.45065881
    > That said I do recognize the risk of only a handful of companies being responsible for something as important as the collective knowledge of civilization.

    It's not just the risk of irresponsible behaviour (which is extremely important in a situation with so much power imbalance).

    It's also just the basic properties of monopolistic markets: the smaller the number of producers, the closer the equilibrium price of the good comes to maximizing the producers' economic surplus.

    These companies operate for-profit in a market, and so they will naturally trend toward capturing as much value as they can, at the expense of everyone else.

    If every business in the world depends on AI, this effectively becomes a tax on all business activity.

    This is obviously not in the collective interest.

    Of course, this analysis makes simplifying assumptions about the oligopoly. The reality is much worse: the whole system creates an inherent information asymmetry. Try to imagine what the "optimal" pricing strategy is for a product whose producer knows intimate details about every consumer.
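
    To make that concrete, here's a toy calculation with a standard linear-demand model (all numbers are made up for illustration):

      # Toy linear-demand market: price P = a - b*Q, constant marginal
      # cost c. Values are illustrative only.
      a, b, c = 100.0, 1.0, 20.0

      # Competitive market: entry pushes price down to marginal cost,
      # so producers capture no surplus.
      q_comp = (a - c) / b
      surplus_comp = 0.0

      # Single producer: pick Q where marginal revenue (a - 2*b*Q)
      # equals marginal cost, then charge what that quantity bears.
      q_mono = (a - c) / (2 * b)
      p_mono = a - b * q_mono
      surplus_mono = (p_mono - c) * q_mono

      # Perfect price discrimination (producer knows each consumer's
      # exact willingness to pay): the entire surplus is captured.
      surplus_discrim = (a - c) ** 2 / (2 * b)

      print(f"competitive:    P={c:.0f}, producer surplus={surplus_comp:.0f}")
      print(f"monopoly:       P={p_mono:.0f}, producer surplus={surplus_mono:.0f}")
      print(f"discriminating: producer surplus={surplus_discrim:.0f}")

    With these numbers, the monopolist captures a surplus of 1600 versus 0 in the competitive case, and a perfectly informed discriminating producer captures 3200. The last case is the information-asymmetry point: the more the producer knows about each consumer, the closer it gets to capturing everything.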

    7. lacoolj No.45066363
    Proprietary data (your company's app repository, a script for an upcoming movie) and sensitive data (health, finance) become exposed.
    8. monsieurbanana No.45066543
    Save your prompts, anonymize them, and offer them to anyone who wants to train an LLM; that would be us collectively training LLMs (see the sketch at the end of this comment).

    Giving Claude your private data ensures that there will not be thousands of providers, since the limiting factor isn't power but data.
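
    A minimal sketch of what "anonymize them" might involve; the patterns and labels below are illustrative only, and real anonymization of free-form prompts is much harder than a few regexes:

      # Scrub obvious identifiers from saved prompts before contributing
      # them to a shared training corpus. Illustrative patterns only.
      import re

      PATTERNS = {
          "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
          "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
          "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
      }

      def anonymize(prompt: str) -> str:
          # Replace each match with a placeholder tag like [EMAIL].
          for label, pattern in PATTERNS.items():
              prompt = pattern.sub(f"[{label}]", prompt)
          return prompt

      print(anonymize("Email jane.doe@example.com or call +1 415-555-0100."))
      # -> Email [EMAIL] or call [PHONE].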

    9. int_19h No.45068149
    It would be a great thing if it were reciprocated. But when I'm paying $20/mo to access Claude, why should I give training data to Anthropic for free?
    10. mitthrowaway2 No.45069438
    I'm okay with LLMs not getting better.
    11. gloosx No.45072552
    >LLM's probably won't get significantly better without this data.

    Who told you LLMs will get significantly better with this data? Sam Altman?