    197 points baylearn | 13 comments
    empiko ◴[] No.44471933[source]
    Observe what the AI companies are doing, not what they are saying. If they expected to achieve AGI soon, their behaviour would be completely different. Why bother developing chatbots or doing sales when you will be operating AGI in a few short years? Surely all resources should go towards that goal, as it is supposed to usher humanity into a new prosperous age (somehow).
    replies(9): >>44471988 #>>44471991 #>>44472148 #>>44472874 #>>44473259 #>>44473640 #>>44474131 #>>44475570 #>>44476315 #
    imiric ◴[] No.44473259[source]
    Related to your point: if these tools are close to having super-human intelligence, and they make humans so much more productive, why aren't we seeing improvements at a much faster rate than we are now? Why aren't inherent problems like hallucination already solved, or at least less of an issue? Surely the smartest researchers and engineers money can buy would be dogfooding, no?

    This is the main point that proves to me that these companies are mostly selling us snake oil. Yes, there is a great deal of utility from even the current technology. It can detect patterns in data that no human could; that alone can be revolutionary in some fields. It can generate data that mimics anything humans have produced, and certain permutations of that can be insightful. It can produce fascinating images, audio, and video. Some of these capabilities raise safety concerns, particularly in the wrong hands, and important questions that society needs to address. These hurdles are surmountable, but they require focusing on the reality of what these tools can do, instead of on whatever a group of serial tech entrepreneurs looking for the next cashout opportunity tell us they can do.

    The constant anthropomorphization of this technology is dishonest at best, and harmful and dangerous at worst.

    replies(4): >>44473413 #>>44474036 #>>44474147 #>>44474204 #
    1. richk449 ◴[] No.44474147[source]
    > if these tools are close to having super-human intelligence, and they make humans so much more productive, why aren't we seeing improvements at a much faster rate than we are now? Why aren't inherent problems like hallucination already solved, or at least less of an issue? Surely the smartest researchers and engineers money can buy would be dogfooding, no?

    Hallucination does seem to be much less of an issue now. I hardly even hear about it - like it just faded away.

    As far as I can tell, smart engineers are using AI tools: particularly people writing code, but also in non-coding roles.

    The criticism feels about three years out of date.

    replies(10): >>44474186 #>>44474349 #>>44474366 #>>44474767 #>>44475291 #>>44475424 #>>44475442 #>>44475678 #>>44476445 #>>44476449 #
    2. ◴[] No.44474186[source]
    3. leptons ◴[] No.44474349[source]
    Are you hallucinating?? "AI" is still constantly hallucinating. It still writes pointless code that does nothing toward what I need it to do, far more often than is acceptable.
    4. imiric ◴[] No.44474366[source]
    Not at all. It's not talked about as much these days because the prevailing way to work around it is to use "agents", i.e. to continuously prompt the LLM in a loop until it happens to generate the correct response. This brute-force approach is hardly a solution, especially in fields that don't have a quick way of verifying the output. In programming, trying to compile the code can catch many (but definitely not all) issues. In other science and humanities fields this is just not possible, and verifying the output is much more labor intensive.

    The other reason is that the primary focus of the last 3 years has been scaling the data and hardware up, with a bunch of (much-needed) engineering around it. This has produced better results, but it can't sustain the AGI promises for much longer. The industry can only survive on shiny value-added services and smoke and mirrors for so long.
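    The "agent" workaround described above boils down to a generate-and-verify loop. A minimal sketch, assuming a hypothetical `fake_llm` stand-in for a real model call; the only verifier here is Python's built-in `compile()`, which catches syntax errors but not wrong logic or made-up APIs:

```python
def fake_llm(prompt: str, attempt: int) -> str:
    # Hypothetical model: returns broken code until the third attempt,
    # simulating a model that sometimes hallucinates invalid syntax.
    if attempt < 3:
        return "def add(a, b) return a + b"  # missing colon
    return "def add(a, b):\n    return a + b"

def verify(code: str) -> bool:
    """Cheap verifier: does the code even compile?"""
    try:
        compile(code, "<generated>", "exec")
        return True
    except SyntaxError:
        return False

def agent_loop(prompt: str, max_attempts: int = 5):
    """Re-prompt until the output passes the verifier, or give up."""
    for attempt in range(1, max_attempts + 1):
        candidate = fake_llm(prompt, attempt)
        if verify(candidate):
            return candidate
        # A real agent would feed the error back into the next prompt.
    return None

print(agent_loop("write an add function"))
```

    Note that without a quick mechanical check like `verify()`, the loop has no stopping criterion, which is exactly the problem in fields where outputs can't be cheaply validated.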

    replies(1): >>44475339 #
    5. natebc ◴[] No.44474767[source]
    > Hallucination does seem to be much less of an issue now. I hardly even hear about it - like it just faded away.

    Last week I had Claude and ChatGPT both tell me different non-existent options to migrate a virtual machine from vmware to hyperv.

    The week before that, one of them (don't remember which, honestly) gave me non-existent options for fio.

    Both of these are things that the first-party documentation or man page gets right, but I was being lazy and trying to save time or be more efficient, like these things are supposed to let us do. Not so much.

    Hallucinations are still a problem.

    6. majormajor ◴[] No.44475291[source]
    > Hallucination does seem to be much less of an issue now. I hardly even hear about it - like it just faded away.

    Nonsense, there is a TON of discussion around how the standard workflow is "have Cursor-or-whatever check the linter and try to run the tests and keep iterating until it gets it right" that is nothing but "work around hallucinations." Functions that don't exist. Lines that don't do what the code would've required them to do. Etc. And yet I still hit cases weekly-at-least, when trying to use these "agents" to do more complex things, where it talks itself into a circle and can't figure it out.

    What are you trying to get these things to do, and how are you validating that there are no hallucinations? You hardly ever "hear about it" but ... do you see it? How deeply are you checking for it?

    (It's also just old news - a new hallucination is less newsworthy now, we are all so used to it.)

    Of course, the internet is full of people claiming that they are using the same tools I am but with multiple factors higher output. Yet I wonder... if this is the case, where is the acceleration in improvement in quality in any of the open source software I use daily? Or where are the new 10x-AI-agent-produced replacements? (Or the closed-source products, for that matter - but there it's harder to track the actual code.) Or is everyone who's doing less-technical, less-intricate work just getting themselves hyped into a tizzy about getting faster generation of basic boilerplate for languages they hadn't personally mastered before?
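    The "functions that don't exist" failure mode above is at least partly machine-checkable. A sketch of one cheap static check (my own illustration, not a tool mentioned in the thread): scan generated code for calls to attributes that the imported module doesn't actually provide. Real linters and type checkers do this far more thoroughly.

```python
import ast
import importlib

def undefined_attribute_calls(source: str) -> list:
    """Return dotted names (e.g. 'math.squareroot') that the generated
    source calls but the real module does not define."""
    tree = ast.parse(source)
    imported = {}
    for node in ast.walk(tree):
        # Only handles plain `import X` for brevity, not `from X import y`.
        if isinstance(node, ast.Import):
            for alias in node.names:
                imported[alias.asname or alias.name] = importlib.import_module(alias.name)
    missing = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and isinstance(node.func.value, ast.Name)
                and node.func.value.id in imported):
            module = imported[node.func.value.id]
            if not hasattr(module, node.func.attr):
                missing.append(f"{node.func.value.id}.{node.func.attr}")
    return missing

generated = "import math\nprint(math.squareroot(2))"  # hallucinated name
print(undefined_attribute_calls(generated))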

    7. majormajor ◴[] No.44475339[source]
    > In other science and humanities fields this is just not possible, and verifying the output is much more labor intensive.

    Even just in industry, I think data functions at companies will have a dicey future.

    I haven't seen many places where there's scientific peer review - or even software-engineering-level code-review - of findings from data science teams. If the data scientist team says "we should go after this demographic" and it sounds plausible, it usually gets implemented.

    So if the ability to validate was already missing even pre-LLM, what hope is there for validating the LLM-powered replacement? And what hope does the person doing the non-LLM version have of keeping their job (at least until several quarters later, when the strategy either proves itself out or doesn't)?

    How many other departments are there where the same lack of rigor already exists? Marketing, sales, HR... yeesh.

    8. taormina ◴[] No.44475424[source]
    ChatGPT constantly hallucinates: at least once per conversation I have with it. We all gave up on bitching about it constantly because we would never talk about anything else, but I have no reason to believe that any LLM has even vaguely solved this problem.
    9. nunez ◴[] No.44475442[source]
    The few times I've used Google to search for something (Kagi is amazing!), the Gemini assistant at the top has fabricated something insanely wrong.

    A few days ago, I asked free ChatGPT to tell me the head brewer of a small brewery in Corpus Christi. It told me that the brewery didn't exist (it does; we were going there a few minutes later), and after re-prompting it, it gave me some phone number that it found in a business filing. (ChatGPT has been using web search for RAG for some time now.)

    Hallucinations are still a massive problem IMO.

    replies(2): >>44475546 #>>44479105 #
    10. amlib ◴[] No.44475678[source]
    How can it not be hallucinating anymore if everything the current crop of generative AI algorithms does IS hallucination? What actually happens is that sometimes the hallucinated output is "right", or more precisely, coherent with the user's mental model.
    11. HexDecOctBin ◴[] No.44476445[source]
    I just tried asking ChatGPT on how to "force PhotoSync to not upload images to a B2 bucket that are already uploaded previously", and all it could do is hallucinate options that don't exist and webpages that are irrelevant. This is with the latest model and all the reasoning and researching applied, and across multiple messages in multiple chats. So no, hallucination is still a huge problem.
    12. kevinventullo ◴[] No.44476449[source]
    You don’t hear about it anymore because it’s not worth talking about anymore. Everyone implicitly understands they are liable to make up nonsense.
    13. seanhunter ◴[] No.44479105[source]
    The Google AI clippy thing at the top of search has to be one of the most pointless, ill-advised and brand-damaging stunts they could have pulled. Because compute is expensive at scale (even for them), it's running a small model, so the suggestions are pretty terrible. That leads people who don't understand what's happening to think their AI is just bad in general.

    That’s not the case in my experience. Gemini is almost as good as Claude for most of the things I try.

    That said, for queries that don't use agentic search or RAG, hallucination is as bad a problem as ever, and it won't improve, because hallucination is all these models do. In Karpathy's phrase, they "dream text". Agentic search, RAG, and similar techniques disguise the issue because they stuff the model's context with real results, so the scope for it to go noticeably off the rails is smaller. But it's still very visible: if you ask for references, links, etc., many, most, or sometimes all of them will be hallucinations, depending on the prompt.
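    "Stuffing the context with real results" can be shown in miniature. In this sketch, `retrieve` is a toy stand-in for agentic search (keyword overlap instead of embeddings), and the prompt it builds is what a real model would finally see; both names are my own illustration, not a real API. The point is that the ground truth ends up inside the prompt, which narrows the scope for hallucination without eliminating it.

```python
def retrieve(query: str, corpus: dict) -> list:
    """Toy retrieval: return documents sharing any word with the query."""
    words = set(query.lower().split())
    return [doc for doc in corpus.values()
            if words & set(doc.lower().split())]

def build_rag_prompt(query: str, corpus: dict) -> str:
    """Stuff retrieved passages into the prompt the model will see."""
    context = "\n".join(retrieve(query, corpus))
    return ("Answer using ONLY the sources below.\n"
            f"Sources:\n{context}\n\n"
            f"Question: {query}")

corpus = {
    "doc1": "Karpathy describes LLM sampling as dreaming text.",
    "doc2": "RAG stuffs retrieved passages into the model context.",
}
prompt = build_rag_prompt("What does RAG do to the model context?", corpus)
print(prompt)
```

    When the retrieval step returns nothing relevant (or the question asks for references the sources don't contain), the model is back to dreaming text, which is why hallucinated citations persist even with RAG.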