
The AI Investment Boom

(www.apricitas.io)
271 points by m-hodges | 34 comments
1. GolfPopper ◴[] No.41898170[source]
I've yet to find an "AI" that doesn't seamlessly hallucinate, and I don't see how "AIs" that hallucinate will ever be useful outside niche applications.
replies(12): >>41898196 #>>41898203 #>>41898630 #>>41898961 #>>41899137 #>>41899339 #>>41900217 #>>41901033 #>>41903589 #>>41903712 #>>41905312 #>>41908344 #
2. Ekaros ◴[] No.41898196[source]
I believe that there is a lot of content creation where quality really does not matter. And hallucinations don't really matter either, unless they are legally actionable, that is, something like hate speech or libel.

Throwing out dozens of articles, social media posts, and even videos. Hallucinations really don't matter at scale, and enough of that content is already generating enough views to make it a somewhat viable strategy.

replies(5): >>41899664 #>>41899850 #>>41900982 #>>41901372 #>>41905356 #
3. dragonwriter ◴[] No.41898203[source]
Humans also confabulate (a better metaphor for AI errors than hallucination) when called on to respond without access to the ground truth, and most AI models have only a limited combination of access to ground truth and the ability to use that access to check it.
4. edanm ◴[] No.41898630[source]
You don't really need to imagine this though - generative AI is already extremely useful in many non-niche applications.
replies(1): >>41900379 #
5. zone411 ◴[] No.41898961[source]
Confabulations are decreasing with newer models. I tested confabulations based on provided documents (relevant for RAG) here: https://github.com/lechmazur/confabulations/. Note the significant difference between GPT-4 Turbo and GPT-4o.
replies(3): >>41900075 #>>41900092 #>>41905577 #
6. sean_pedersen ◴[] No.41899137[source]
https://github.com/stanford-oval/WikiChat
7. jacurtis ◴[] No.41899339[source]
I've never met a human that doesn't "hallucinate" either, whether intentionally or unintentionally. Humans either intentionally lie or fill in gaps in their knowledge with assumptions or inaccurate information. Most human-generated content on social media is inaccurate, to an even higher percentage than what ChatGPT gives me.

I guess humans are worthless as well since they are notoriously unreliable. Or maybe it just means that artificial intelligence is more realistic than we want to admit, since it mimics humans exactly as we are, deficiencies and all.

This is kind of like the self-driving car debate. We don't want to allow self-driving cars until we can guarantee that they have a zero percent failure rate.

Meanwhile we continue to rely on human drivers, which leads to 50,000 deaths per year in America alone, all because we refuse to accept a failure rate of even one accident from a self-driving car.

replies(2): >>41899534 #>>41904280 #
8. tim333 ◴[] No.41899534[source]
It's not quite the case with cars though - people are OK with Waymos, which are not zero-accident but are probably safer than human drivers. The trouble with other systems like Tesla FSD is that they are probably not yet safer than a human if you don't have a human there nannying them.

Similarly I think people will be ok with other AI if it performs well.

9. flashman ◴[] No.41899664[source]
> quality really does not matter

What an inspiring vision for the future of news and entertainment.

10. xk_id ◴[] No.41899850[source]
It amazes me the level of nihilism needed to talk about this with casual indifference.
11. ◴[] No.41900075[source]
12. tkgally ◴[] No.41900092[source]
That’s very interesting! Thanks for the link.
13. harimau777 ◴[] No.41900217[source]
Aren't humans the main alternative to AI? And they seamlessly hallucinate as well.
14. jmathai ◴[] No.41900379[source]
There's a camp of people who are hyper-fixated on LLM hallucinations as being a barrier for value creation.

I believe that is so far off the mark for a couple reasons:

1) It's possible to work around hallucinations in a more cost effective way than relying on humans to always be correct.

2) There are many use cases where hallucinations aren't such a bad thing (or are even a good thing), for which we've never really had a system as powerful as LLMs to build for.

There are absolutely very large use cases for LLMs, and they will be pretty disruptive. But they will also create net new value that wasn't possible before.

I say that as someone who thinks we have enough technology as it is and don't need any more.

replies(3): >>41900417 #>>41900450 #>>41904140 #
15. datavirtue ◴[] No.41900417{3}[source]
Yeah, they just want it to go away. The same way they wish Windows and GUIs and people in general would just go away.
replies(1): >>41904205 #
16. babyent ◴[] No.41900450{3}[source]
For sure, sending customers into a never-ending loop when they want support. That's been my experience with most AI support so far. It sucks. I like Amazon's approach, where they have a basic chatbot (probably doesn't even use LLMs) that then escalates to an actual human being in some low-cost country.

I kind of like the Chipotle approach: if I have a problem with my order, it just refunds me instantly and sometimes gives me an add-on for free.

Honestly, I only use an LLM for one thing: I give it a set of TS definitions and user input, and ask it to fit the input to those schemas if it can, and not to force anything if it isn't 100% confident.
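
Roughly like this (a sketch, assuming an OpenAI-style chat completions endpoint; the interface and field names are made up for illustration):

    // Ask the model to map free-form user input onto a known TS shape,
    // and to leave fields null rather than guess.
    interface SupportTicket {
      product: string | null;
      issue: "refund" | "damage" | "late" | null;
      orderId: string | null;
    }

    const SCHEMA = `
    interface SupportTicket {
      product: string | null;
      issue: "refund" | "damage" | "late" | null;
      orderId: string | null;
    }`;

    async function extractTicket(userInput: string): Promise<SupportTicket> {
      const res = await fetch("https://api.openai.com/v1/chat/completions", {
        method: "POST",
        headers: {
          Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({
          model: "gpt-4o",
          messages: [
            {
              role: "system",
              content:
                "Fill this TypeScript interface from the user's message. " +
                "Return JSON only. Use null for anything you are not certain about:\n" +
                SCHEMA,
            },
            { role: "user", content: userInput },
          ],
          response_format: { type: "json_object" }, // keeps the reply parseable
        }),
      });
      const data = await res.json();
      return JSON.parse(data.choices[0].message.content) as SupportTicket;
    }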

I know some people whose whole company is based around the use of AI to send emails or messages, and in reality they're logged into their terminals in real time, fixing errors before actually sending out the emails. Basically, they are mechanical Turks, and they even say they're looking at labor in India or Africa to pay people peanuts to address these errors.

17. ◴[] No.41900982[source]
18. cma ◴[] No.41901033[source]
Still useful for anything you can verify.
replies(1): >>41904232 #
19. ehnto ◴[] No.41901372[source]
A viable strategy for making money, or providing value to society?

I think for some niches, the former can for a brief period precede the latter. But eventually the market catches up and roots out that which lacks actual value.

More concretely, I suspect the advertising apparatus is going to increasingly devalue unattributed content online, favouring curated platforms and eventually resembling a more hands-on media distribution with human platform relationships (where media == the actual medium of distribution, not the content).

That is already a thing, where for example an instagrammer promoting your product is more valuable than the automated ad-network on instagram itself.

At which point, hopefully, automated content and spam loses legitimacy and value as ad-media.

20. CharlieDigital ◴[] No.41903589[source]
It's not a hard problem to solve with even basic retrieval-augmented generation (RAG).

With good RAG, hallucinations are non-existent.
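
The basic shape, as a toy sketch (keyword overlap stands in for a real embedding search, and the documents and prompt wording are made up):

    // Retrieve the most relevant snippets, then force the model to answer
    // only from them. A real system would use embedding search, not keyword overlap.
    const DOCS = [
      "Refunds are issued within 5 business days of approval.",
      "Orders over $50 ship free within the continental US.",
      "Support hours are 9am-5pm ET, Monday through Friday.",
    ];

    function retrieve(query: string, k = 2): string[] {
      const words = new Set(query.toLowerCase().split(/\W+/));
      return DOCS
        .map((doc) => ({
          doc,
          score: doc.toLowerCase().split(/\W+/).filter((w) => words.has(w)).length,
        }))
        .sort((a, b) => b.score - a.score)
        .slice(0, k)
        .map((x) => x.doc);
    }

    function buildPrompt(question: string): string {
      const context = retrieve(question).join("\n");
      return (
        "Answer using ONLY the context below. " +
        "If the answer is not in the context, say \"I don't know.\"\n\n" +
        `Context:\n${context}\n\nQuestion: ${question}`
      );
    }

    // The grounded prompt goes to the model instead of the bare question.
    console.log(buildPrompt("How long do refunds take?"));

The model only ever sees the retrieved snippets, so it has little room to make things up, and the "I don't know" instruction covers the case where retrieval comes back empty.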

21. infecto ◴[] No.41903712[source]
While they certainly can do that, there are large chunks of workflows where hallucinations are low to none. Even then, I find LLMs quite useful for asking questions in areas I am not familiar with; it's easy to verify, and I get to the answer much quicker.

Spend some more time working with them and you might realize the value they contain.

22. johnnyanmac ◴[] No.41904140{3}[source]
The most important aspect of any company worth its salt is liability. If the LLM provider isn't accepting liability (and so far they haven't), then hallucinations are a complete deal breaker. You don't want to be on the receiving end of a precedent-setting lawsuit just to save some pennies on labor.

There can be uses, but you're falling on deaf ears as a B2B vendor if you don't solve this problem. Consumers accept inaccuracies; businesses don't. And that's also, sadly, where it works best and why consumers have soured on it. It's being used for chatbots that give worse service and make consumers work harder for something an employee could resolve in seconds.

As it's worked for millennia, humans have accountability, and any disaster can start the PR spin by reprimanding or firing a human who messes up. We don't have that for AI yet. And obviously, no company wants to bear that burden.

23. johnnyanmac ◴[] No.41904205{4}[source]
I'm just tired of all the lies and theft. People can use the tech all they want. Just don't pretend it's yours when you spent decades strengthening copyright law and then decide to break the laws you helped make.
replies(1): >>41906566 #
24. johnnyanmac ◴[] No.41904232[source]
Verifying needs experts to confirm. Experts are expensive and are the people they want to replace. No one on either side of the transaction wants to utilize an expert.

So you see the issue, and the intent.

replies(2): >>41907223 #>>41908177 #
25. johnnyanmac ◴[] No.41904280[source]
You're missing one big detail: humans are liable, AI isn't. And AI providers do all they can to deny liability. The businesses using AI sure aren't doing any better either.

If you're not confident enough in your tech to be held liable, we're going to have issues. We figured out (sort of) human liability eons ago. So it doesn't matter if it's less safe. It matters that we can make sure to prune out and punish unsafe things. Like firing or jailing a human.

26. n_ary ◴[] No.41905312[source]
I find LLMs much friendlier for very focused topics, and mostly accurate. Anything generated, I can go and check against the corresponding source code or official documentation.

In theory, I save an immense amount of time daily talking to Claude/4o when I need to ask something quick, where previously I had to search at least four different search engines and wade through too much disappointing SEO spam.

Also, the summarizer, while a meme at this point, is immensely useful. I put anything interesting-looking throughout the day into a db, then a cronjob in Cloudflare runs, tries to fetch the text content from each link, generates a summary using 4o, and stores it.
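
Roughly, the cronjob is just this shape (a sketch, assuming a scheduled Cloudflare Worker with a D1 table named links and the OpenAI chat completions endpoint; table and column names are illustrative):

    // Read unsummarized links from D1, fetch each page, summarize it,
    // and store the result back.
    export interface Env {
      DB: D1Database;
      OPENAI_API_KEY: string;
    }

    export default {
      async scheduled(_controller: ScheduledController, env: Env, _ctx: ExecutionContext) {
        const { results } = await env.DB
          .prepare("SELECT url FROM links WHERE summary IS NULL LIMIT 20")
          .all<{ url: string }>();

        for (const { url } of results ?? []) {
          const page = await fetch(url);
          const text = (await page.text()).slice(0, 20_000); // crude: raw HTML, truncated

          const res = await fetch("https://api.openai.com/v1/chat/completions", {
            method: "POST",
            headers: {
              Authorization: `Bearer ${env.OPENAI_API_KEY}`,
              "Content-Type": "application/json",
            },
            body: JSON.stringify({
              model: "gpt-4o",
              messages: [
                { role: "system", content: "Summarize this page in three sentences." },
                { role: "user", content: text },
              ],
            }),
          });
          const data: { choices: { message: { content: string } }[] } = await res.json();

          await env.DB
            .prepare("UPDATE links SET summary = ?1 WHERE url = ?2")
            .bind(data.choices[0].message.content, url)
            .run();
        }
      },
    };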

Over the weekend, I scroll through the summary of each links saved, if anything looks decently interesting, I will go and check it out and do further research.

In fact, I actually learned about SolidJS from one random article posted on the 4th page of HN with few votes, and the summary gave enough info for me to go ahead and check out SolidJS instead of having to read through the article ranting about ReactJS.

27. n_ary ◴[] No.41905356[source]
I read (or watched?) somewhere that, to build your social media reputation and popularity (i.e. follower count) organically, you must post daily: something, anything.

An interesting idea would be to automate a cronjob to ask an LLM to generate a random motivational quote (more hallucination is more beneficial) or random status and then post it. Then automate this to generate different posts for X/Bsky/Mastodon/LinkedIn/Insta and you have an auto-generated presence. There is a saying that if you let 1000 monkeys type on typewriters, you will eventually get Hamlet, or something like that; I forget the exact saying, but with an auto-generated presence, this could be valuable for a particular crowd.

replies(1): >>41905608 #
28. christianqchung ◴[] No.41905577[source]
Is 3% supposed to be significant? Or did you mean 4 Turbo and 4o mini?
replies(1): >>41906154 #
29. JohnMakin ◴[] No.41905608{3}[source]
People are already doing this in a much lazier way en masse on Instagram - they'll steal someone's content, often shock/violence content that draws in eyeballs, and post it. Since they need a description for IG's algorithms, they just paste a random response to an LLM prompt into the reel's description. So you'll be presented a video of the Beirut explosion and the caption will be "No problem! Here's some information on the Mercedes blah blah blah."

Once they reach critical mass, they inevitably start posting porn ads. Weird, weird dynamic we're in now.

30. zone411 ◴[] No.41906154{3}[source]
It is significant because of the other chart that shows MUCH lower non-response rates for GPT-4o.
31. snapcaster ◴[] No.41906566{5}[source]
You're saying "yours" and "you", but from what I can tell you're describing completely different sets of people as if they were some kind of hypocritical single entity.
32. cma ◴[] No.41907223{3}[source]
It has been a game changer for code stuff, surfacing libraries and APIs I didn't know about. And I can verify them with the documentation.

And I don't think that's just for assisting experts: it would be extremely helpful to beginners too as long as they have the mindset that it can be wrong.

33. cma ◴[] No.41908177{3}[source]
And there are many other things it can do, throw your code into it and ask it to look for bugs/oversights/potential optimizations. Then use your reasoning ability to see if it is right on what it gives back.
34. Mabusto ◴[] No.41908344[source]
I think the goal of minimizing hallucinations needs to be adjusted. When a human "lies", there is a familiarity to it: "I think the restaurant is here." "Wasn't he in Inception?" Humans are good at conveying which information they're certain of and which they're uncertain of, whether with vocal tone, body language, or signals in their writing style. I've been trying to use Gemini to just ask simple questions, and its hallucinations really put me off. It will confidently tell me lies, and now my lizard brain just sees it as unreliable and I'm less likely to ask it things, only because it's not at all able to indicate which information it's certain of. We're never going to get rid of hallucinations because of the probabilistic nature in which LLMs work, but we can get better at adjusting how they are presented to humans.