    423 points sohkamyung | 25 comments

    scarmig ◴[] No.45669929[source]
    If you dig into the actual report (I know, I know, how passé), you see how they get the numbers. Most of the errors are "sourcing issues": the AI assistant doesn't cite a claim, or it (shocking) cites Wikipedia instead of the BBC.

    Other issues: the report doesn't even say which particular models it's querying [ETA: discovered they do list this in an appendix], aside from saying it's the consumer tier. And it leaves off Anthropic (in my experience, by far the best at this type of task), favoring Perplexity and (perplexingly) Copilot. The article also intermingles claims from the recent report and the one on research conducted a year ago, leaving out critical context that... things have changed.

    This article contains significant issues.

    replies(7): >>45669943 #>>45670942 #>>45671401 #>>45672311 #>>45672577 #>>45675250 #>>45679322 #
    afavour ◴[] No.45669943[source]
    > or it (shocking) cites Wikipedia instead of the BBC.

    No... the problem is that it cites Wikipedia articles that don't exist.

    > ChatGPT linked to a non-existent Wikipedia article on the “European Union Enlargement Goals for 2040”. In fact, there is no official EU policy under that name. The response hallucinates a URL but also, indirectly, an EU goal and policy.

    replies(6): >>45670006 #>>45670093 #>>45670094 #>>45670184 #>>45670903 #>>45672812 #
    1. kenjackson ◴[] No.45670093[source]
    Actually, there was a Wikipedia article with this name, but it was deleted in June -- because it was AI-generated. Unfortunately, AI falls for this much like humans do.

    https://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletio...

    replies(4): >>45670306 #>>45670779 #>>45671331 #>>45672567 #
    2. Workaccount2 ◴[] No.45670306[source]
    This is likely because of the knowledge cutoff.

    I have seen a few cases before of "hallucinations" that turned out to be things that did exist, but no longer do.

    replies(1): >>45670633 #
    3. 1980phipsi ◴[] No.45670633[source]
    The fix for this is for the AI to double-check all links before providing them to the user. I frequently ask ChatGPT to double-check that references actually exist when it gives them to me. It should be built in!
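    A minimal sketch of what that built-in check could look like, assuming Python and the requests library (the function name and behavior here are my guesses, not anything OpenAI actually ships):

        import requests

        def find_dead_references(urls, timeout=5):
            """Return the subset of cited URLs that don't resolve."""
            dead = []
            for url in urls:
                try:
                    # Any 4xx/5xx response means the citation can't be trusted as-is.
                    resp = requests.get(url, stream=True, timeout=timeout)
                    if not resp.ok:
                        dead.append(url)
                except requests.RequestException:
                    dead.append(url)
            return dead

    Anything flagged could then be re-searched or dropped before the answer ever reaches the user.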
    replies(4): >>45670762 #>>45670808 #>>45670935 #>>45673056 #
    4. rideontime ◴[] No.45670762{3}[source]
    But that would mean OpenAI would lose even more money on every query.
    replies(2): >>45672453 #>>45674673 #
    5. bunderbunder ◴[] No.45670779[source]
    The biggest problem with that citation isn't that the article has since been deleted. The biggest problem is that that particular Wikipedia article was never a good source in the first place.

    That seems to be the real challenge with AI for this use case. It has no real critical thinking skills, so it's not really competent to choose reliable sources. Instead, we're lowering the bar to just asking that the sources actually exist. I really hate that. We shouldn't be lowering intellectual standards to meet AI where it is. These intellectual standards are important and hard-won, and we need to demand that AI rise to meet them.

    replies(2): >>45670872 #>>45671358 #
    6. blitzar ◴[] No.45670808{3}[source]
    I have found myself doing the same "citation needed" loop - but with AI this is a dangerous game, as it will now double down on whatever it made up and go looking for citations to justify its answer.

    Pre-prompting it to cite sources is obviously a better way of going about things.

    replies(1): >>45671537 #
    7. gamerDude ◴[] No.45670872[source]
    I think this is a real challenge for everyone. In many ways we may need a fresh, Wikipedia-like site to document all the valid and good sources. This would also, hopefully, include things like source bias and whether something is a primary/secondary/tertiary source.
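    Roughly, each entry in such a registry might record something like this (a sketch only; every field is a guess at what "valid and good" would need to capture):

        from dataclasses import dataclass
        from enum import Enum

        class SourceType(Enum):
            PRIMARY = "primary"      # original documents, data, eyewitness accounts
            SECONDARY = "secondary"  # reporting and analysis of primary material
            TERTIARY = "tertiary"    # encyclopedias, aggregators, summaries

        @dataclass
        class SourceEntry:
            domain: str              # e.g. "bbc.co.uk"
            source_type: SourceType
            bias_notes: str          # free text, since bias resists a single score
            last_reviewed: str       # ISO date of the last human review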
    replies(5): >>45671575 #>>45671882 #>>45672162 #>>45673022 #>>45673869 #
    8. janwl ◴[] No.45670935{3}[source]
    I thought people here hated it when LLMs made http requests?
    replies(2): >>45671214 #>>45671608 #
    9. macintux ◴[] No.45671214{4}[source]
    I don't know for certain what you're referring to, but the "bulk downloads" of the Internet that AI companies are executing for training are the problem I've seen cited, and that doesn't relate to LLMs checking their sources at query time.
    10. CaptainOfCoit ◴[] No.45671331[source]
    > Actually, there was a Wikipedia article with this name, but it was deleted in June -- because it was AI-generated. Unfortunately, AI falls for this much like humans do.

    A recent Kurzgesagt video goes into the dangers of this, and they found the same thing happening, with a concrete example: they were researching a topic, tried using LLMs, and found the results weren't accurate enough and contained hallucinations, so they continued doing things the manual way. Then, some weeks/months later, they noticed a bunch of YouTube videos repeating the very hallucinations they had been avoiding, and now their own AI assistants started using those videos as sources. (Paraphrased from memory by me, so it could have some inconsistencies/hallucinations of its own.)

    https://www.youtube.com/watch?v=_zfN9wnPvU0

    11. kenjackson ◴[] No.45671358[source]
    I get what you're saying. But you are now asking for a level of intelligence and critical thinking that I honestly believe is higher than the average person's. I think it's absolutely doable, but I also feel we shouldn't make it sound like the current behavior is abhorrent or somehow indicative of a failure of the technology.
    replies(2): >>45671503 #>>45676755 #
    12. exe34 ◴[] No.45671503{3}[source]
    It's actually great from my point of view - it means we're edging our way into limited superintelligence.
    13. ◴[] No.45671537{4}[source]
    14. fullofideas ◴[] No.45671575{3}[source]
    This is pushing the burden of proof onto society: basically asking everyone else to pitch in and improve sources so that AI companies can reference these trustworthy sources.
    15. zahlman ◴[] No.45671608{4}[source]
    It's bad when they indiscriminately crawl for training, and not ideal (but understandable) to use the Internet to communicate with them (and to have online accounts associated with that, etc.) rather than running them locally.

    It's not bad when they use the Internet at generation time to verify the output.

    replies(1): >>45677156 #
    16. bunderbunder ◴[] No.45671882{3}[source]
    Outsourcing due diligence to a tool (or a single unified source) is the problem, not the solution.

    For example, having a single central arbiter of source bias is inescapably the most biased thing you could possibly do. Bias has to be defined within an intellectual paradigm. So you'd have to choose a paradigm to use for that bias evaluation, and de facto declare it to be the one true paradigm for this purpose. But intellectual paradigms are inherently subjective, so doing that is pretty much the most intellectually biased thing you can possibly do.

    17. ishtanbul ◴[] No.45672162{3}[source]
    Maybe we can get AI to do this hard labor
    18. mdhb ◴[] No.45672453{4}[source]
    Almost as though it's not a sustainable business model and relies on tricking people in order to keep the lights on.
    19. AlienRobot ◴[] No.45672567[source]
    AI-powered citogenesis!
    20. dingnuts ◴[] No.45673022{3}[source]
    I noticed that my local library has a new set of World Book. Maybe it's time to bring back traditional encyclopedias.
    21. dingnuts ◴[] No.45673056{3}[source]
    Gemini will lie to me when I ask it to cite things: it will either pull up vaguely relevant sources or just hallucinate them.

    IDK how you people go through that experience more than a handful of times before you get pissed off and stop using these tools. I've wasted so much time because of believable lies from these bots.

    Sorry, not even lies, just bullshit. The model has no conception of truth so it can't even lie. Just outputs bullshit that happens to be true sometimes.

    22. cogman10 ◴[] No.45673869{3}[source]
    An example of this.

    I've seen a certain sensationalist news source write a story that went like this.

    Site A: Bad thing is happening. (cite: article on Site B)

    * follow the source *

    Site B: Bad thing is happening. (cite: a different article on Site A)

    * follow the source *

    Site A: Bad thing is happening. (no citation)

    I fear that's the current state of a large news bubble that many people subscribe to. And when these sensationalist stories start circulating there's a natural human tendency to exaggerate.

    I don't think AI has any real defense against this sort of thing. Following even one level of citation is already hard enough; recognizing that it is ultimately citing the same source is harder still (see the sketch at the end of this comment).

    There was another example, from the Kagi News stuff, which exemplified this: a whole article written with 3 citations that ultimately spawned from the same news briefing, published by different outlets.

    I've even seen an example of a national political leader who fell for the same sort of sensationalization -- one who should have known better. They repeated what was later found to be a lie by a well-known liar, but added that "I've seen the photos in a classified debriefing". IDK that it was necessarily even malicious; I think people are just really bad at separating credible from non-credible information, and it ultimately blends together as one thing (it certainly doesn't help with ancient politicians).
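    A toy sketch of the loop-detection idea: walk the citation chain and flag when it circles back to an outlet already on the chain. (get_cited_url is a hypothetical stand-in for the genuinely hard part, reliably extracting the citation from an article.)

        from urllib.parse import urlparse

        def chain_revisits_outlet(start_url, get_cited_url, max_hops=5):
            """Follow article -> cited article -> ... and report circular sourcing."""
            seen_domains = []
            url = start_url
            for _ in range(max_hops):
                domain = urlparse(url).netloc
                if domain in seen_domains:
                    return True   # e.g. Site A -> Site B -> Site A
                seen_domains.append(domain)
                url = get_cited_url(url)  # hypothetical: extract the article's citation
                if url is None:           # chain ends with no citation at all
                    return False
            return False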

    23. ModernMech ◴[] No.45674673{4}[source]
    Better make each query count then.
    24. Paracompact ◴[] No.45676755{3}[source]
    The bar for an industry should be the good-faith effort of the average industry professional, not the unconscionably minimal efforts of the average grifter trying to farm content.

    These grifters simply were not attracted to these gigs in these quantities prior to AI, but now the market incentives have changed. Should we "blame" the technology for its abuse? I think AI is incredible, but market endorsement is different from intellectual admiration.

    25. Dylan16807 ◴[] No.45677156{5}[source]
    Also, for the most part, this verification can use a HEAD request.
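    Something like this, presumably, with a GET fallback for the minority of servers that reject HEAD (a sketch using Python's requests library):

        import requests

        def link_exists(url, timeout=5):
            """Check a URL cheaply with HEAD; fall back to GET where HEAD is unsupported."""
            try:
                resp = requests.head(url, allow_redirects=True, timeout=timeout)
                if resp.status_code in (405, 501):  # server doesn't implement HEAD
                    resp = requests.get(url, stream=True, timeout=timeout)
                return resp.ok
            except requests.RequestException:
                return False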