Observations of reality are more consistent with company FOMO than with actual usefulness.
Now they are cancelling those plans. For them "AGI" was cancelled.
OpenAI claims to be getting closer and closer to "AGI" even as more top scientists leave or get poached by other labs that are behind.
So why would you leave if the promise of achieving "AGI" was going to produce "$100B of profits," as per OpenAI's and Microsoft's definition in their deal?
Their actions tell you more than any of their statements or claims.
They are leaving for more money, more seniority, or because they don't like their boss. Nothing to do with AGI.
Personally I think AGI is ill-defined and won't happen as a new model release. Instead the thing to look for is how LLMs are being used in AI research and there are some advances happening there.
Of course, but that's part of my whole point.
Such statements and targets about how close we are to "AGI" have become nothing but false promises, with AGI used as the prime excuse to keep raising more money.
Another way to say it is that people think it's much more likely for a decent LLM startup to grow strongly for its first several years and then plateau than for an established player to hit hypergrowth because of AGI.
To fund yourself while building AGI? To hedge risk that AGI takes longer? Not saying you're wrong, just saying that even if they did believe it, this behavior could be justified.
Microsoft itself hasn't said they're doing this because of an oversupply of infrastructure for its AI offerings, but they very likely wouldn't say that publicly even if that were the reason.
Seems to be about this:
> As per the current terms, when OpenAI creates AGI - defined as a "highly autonomous system that outperforms humans at most economically valuable work" - Microsoft's access to such a technology would be void.
https://www.reuters.com/technology/openai-seeks-unlock-inves...
This is the main point that proves to me that these companies are mostly selling us snake oil. Yes, there is a great deal of utility from even the current technology. It can detect patterns in data that no human could; that alone can be revolutionary in some fields. It can generate data that mimics anything humans have produced, and certain permutations of that can be insightful. It can produce fascinating images, audio, and video. Some of these capabilities raise safety concerns, particularly in the wrong hands, and important questions that society needs to address. These hurdles are surmountable, but they require focusing on the reality of what these tools can do, instead of on whatever a group of serial tech entrepreneurs looking for the next cashout opportunity tell us they can do.
The constant anthropomorphization of this technology is dishonest at best, and harmful and dangerous at worst.
As such, even if there is a lot of money to be made from AI, it can still be the right decision to sell tools to others who will figure out how to use it. And of course, if it turns out to be another pointless fad with no real value, you still make money. (I'd predict the answer is in between: we are not going to get some AGI that takes over the world, but there will be niches where it is a big help, and those niches will be worth selling tools into.)
What if chatbots and user interactions ARE the path to AGI? Two reasons they could be: (1) Reinforcement learning in AI has proven to be very powerful. Humans get to GI through learning too - they aren’t born with much intelligence. Interactions between AI and humans may be the fastest way to get to AGI. (2) The classic Silicon Valley startup model is to push to customers as soon as possible (MVP). You don’t develop the perfect solution in isolation, and then deploy it once it is polished. You get users to try it and give feedback as soon as you have something they can try.
I don’t have any special insight into AI or AGI, but I don’t think OpenAI selling useful and profitable products is proof that there won't be AGI.
Hallucination does seem to be much less of an issue now. I hardly even hear about it - like it just faded away.
As far as I can tell smart engineers are using AI tools, particularly people doing coding, but even non-coding roles.
The criticism feels about three years out of date.
No, it can generate data that mimics anything humans have put on the WWW
Then they leave for more money.
The other reason is that the primary focus of the last three years has been scaling up the data and hardware, with a bunch of (much-needed) engineering around it. This has produced better results, but it can't sustain the AGI promises for much longer. The industry can only survive on shiny value-added services and smoke and mirrors for so long.
The 'no one jumps ship if AGI is close' assumption is really weak, and seemingly completely unsupported in TFA...
Last week I had Claude and ChatGPT both tell me different non-existent options for migrating a virtual machine from VMware to Hyper-V.
The week before that, one of them (I don't remember which, honestly) gave me non-existent options for fio.
Both of these are things the first-party documentation or man page gets right, but I was being lazy and trying to save time or be more efficient, like these things are supposed to help us do. Not so much.
Hallucinations are still a problem.
Nonsense, there is a TON of discussion around how the standard workflow is "have Cursor-or-whatever check the linter and try to run the tests and keep iterating until it gets it right" that is nothing but "work around hallucinations." Functions that don't exist. Lines that don't do what the code would've required them to do. Etc. And yet I still hit cases weekly-at-least, when trying to use these "agents" to do more complex things, where it talks itself into a circle and can't figure it out.
What are you trying to get these things to do, and how are you validating that there are no hallucinations? You hardly ever "hear about it" but ... do you see it? How deeply are you checking for it?
(It's also just old news - a new hallucination is less newsworthy now, we are all so used to it.)
Of course, the internet is full of people claiming that they are using the same tools I am but with multiple factors higher output. Yet I wonder... if this is the case, where is the acceleration in improvement in quality in any of the open source software I use daily? Or where are the new 10x-AI-agent-produced replacements? (Or the closed-source products, for that matter - but there it's harder to track the actual code.) Or is everyone who's doing less-technical, less-intricate work just getting themselves hyped into a tizzy about getting faster generation of basic boilerplate for languages they hadn't personally mastered before?
Even just in industry, I think data functions at companies will have a dicey future.
I haven't seen many places where there's scientific peer review - or even software-engineering-level code-review - of findings from data science teams. If the data scientist team says "we should go after this demographic" and it sounds plausible, it usually gets implemented.
So if the ability to validate was already missing even pre-LLM, what hope is there for validating the LLM-powered replacement? And what hope is there for the person doing the non-LLM version of keeping their job (at least until several quarters later, when the strategy either proves itself out or doesn't)?
How many other departments are there where the same lack of rigor already exists? Marketing, sales, HR... yeesh.
A few days ago, I asked free ChatGPT to tell me the head brewer of a small brewery in Corpus Christi. It told me the brewery didn't exist (it does; we were headed there a few minutes later), and after re-prompting it, it gave me some phone number it found in a business filing. (ChatGPT has been using web search for RAG for some time now.)
Hallucinations are still a massive problem IMO.
Maybe it is the reverse? It is not them offering a product; it is the users offering their interaction data, data which might be harvested for further training of the real deal, which is not the product. Think about it: they (companies like OpenAI) have created a broad and diverse user base which, without a second thought, feeds them up-to-date info about everything happening in the world, down to individual lives and even inner thoughts. No one in the history of mankind ever had such a holistic, almost god's-eye view. That is certainly something a superintelligence would be interested in. They may have achieved it already, and we are seeing one of its strategies playing out. Not saying they have, but this observation wouldn't be incompatible with that, nor does it indicate they haven't.
More like usher in climate catastrophe way ahead of schedule. AI-driven data center build outs are a major source of new energy use, and this trend is only intensifying. Dangerously irresponsible marketing cloaks the impact of these companies on our future.
People bring problems to the LLM, the LLM produces some text, and people use it and later return to iterate. This iteration functions as feedback on the LLM's earlier responses. If you judge an AI response by the next 20 or more rounds of interaction, you can gauge whether it was useful. They can create RLHF data this way, using hindsight or extra context from other related conversations the same user has had on the same topic. That works because users try the LLM's ideas in reality and bring the outcomes back to the model, or they simply recall from personal experience whether that approach would work. The system isn't just built to be right; it's built to be correctable by the user base, at scale.
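Roughly, as a toy sketch of that hindsight-scoring idea (my own illustration with made-up cue lists, not anything any lab has described; a real system would presumably use a trained classifier rather than keywords):

    # Score an assistant turn by scanning the user's follow-up messages
    # for outcome cues within a window of later turns.
    from dataclasses import dataclass

    @dataclass
    class Turn:
        role: str   # "user" or "assistant"
        text: str

    # Hypothetical cue lists, for illustration only.
    POSITIVE_CUES = ("that worked", "thanks, that fixed it", "runs now")
    NEGATIVE_CUES = ("didn't work", "doesn't exist", "still failing")

    def hindsight_score(turns, assistant_idx, window=20):
        """+1 / -1 per follow-up user turn containing a positive / negative cue."""
        score = 0.0
        for turn in turns[assistant_idx + 1 : assistant_idx + 1 + window]:
            if turn.role != "user":
                continue
            lowered = turn.text.lower()
            if any(cue in lowered for cue in POSITIVE_CUES):
                score += 1.0
            if any(cue in lowered for cue in NEGATIVE_CUES):
                score -= 1.0
        return score

    conversation = [
        Turn("user", "How do I migrate a VM from VMware to Hyper-V?"),
        Turn("assistant", "Use the --convert-to-hyperv flag."),   # hallucinated option
        Turn("user", "That flag doesn't exist, still failing."),
    ]
    print(hindsight_score(conversation, assistant_idx=1))   # -1.0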
OpenAI has 500M users; if they generate 1,000 tokens per user per day, that's 0.5T interactive tokens per day. The chat logs dwarf the original training set in size and are very diverse, targeted to our interests, and mixed with feedback. They are also "on policy" for the LLM, meaning they contain corrections to mistakes the LLM made, not generic information like a web scrape.
You're right that LLMs eventually might not even need to crawl the web; they have the whole of society dumping data into their open mouths. That did not happen with web search engines; only social networks did that in the past. But social networks are filled with our culture wars and self-conscious posing, while the chat room is an environment where we don't need to signal our group alignment.
Web scraping gives you humanity's external productions - what we chose to publish. But conversational logs capture our thinking process, our mistakes, our iterative refinements. Google learned what we wanted to find, but LLMs learn how we think through problems.
That’s not the case in my experience. Gemini is almost as good as Claude for most of the things I try.
That said, for queries that don't use agentic search or RAG, hallucination is as bad a problem as ever, and it won't improve because hallucination is all these models do. In Karpathy's phrase, they "dream text." Agentic search, RAG, and similar techniques disguise the issue because they stuff the model's context with real results, so there is less scope for it to go noticeably off the rails. But it's still very visible if you ask for references, links, etc.: many/most/sometimes all will be hallucinations, depending on the prompt.
For example, the model might propose "try doing X", and I come back later and say "I tried X but this and that happened"; it can use that as feedback. It might be feedback generated from the real-world outcome of the X suggestion, or even from my own experience; maybe I have seen X in practice and know whether it works. The longitudinal analysis can span multiple days; the more context, the better for self-analysis.
The cool thing is that generating preference scores for LLM responses, training a judge model on them, and then doing RLHF with this judge model on the base LLM ensures isolation, so personal data leaks might not be an issue. Another beneficial effect is that the judge model learns to transfer judgement skills across similar contexts, so there might be some generalization going on.
Of course there is always the risk of systematic bias and random noise in the data, but I believe AI researchers are equipped to deal with it. It won't be as simple as I described, but the size of the interaction dataset, the human in the loop, and real-world testing are certainly useful for LLMs.
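To make the judge-model step concrete, here's a minimal sketch under assumed data shapes (my guess at the shape of such a pipeline, not any lab's actual one). A tiny scikit-learn text classifier stands in for the judge/reward model, and the RLHF step itself is left out:

    # Hindsight-labeled interactions: ("prompt || response", did it work out?).
    # Labels like these would come from later turns, as sketched above.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    examples = [
        ("migrate vmware to hyperv || use the --convert-to-hyperv flag", 0),
        ("migrate vmware to hyperv || export the VM and import it via the Hyper-V wizard", 1),
        ("tune fio || pass the made-up --turbo-io option", 0),
        ("tune fio || set iodepth and numjobs in the job file", 1),
    ]
    texts, labels = zip(*examples)

    # Train the "judge" (a stand-in reward model) on the hindsight labels.
    judge = make_pipeline(TfidfVectorizer(), LogisticRegression())
    judge.fit(texts, labels)

    # The judge scores fresh candidate responses; in the full pipeline these
    # scores would drive RLHF against the base model (not shown here).
    candidates = [
        "migrate vmware to hyperv || run the nonexistent vmware2hyperv tool",
        "migrate vmware to hyperv || convert the VMDK and attach it to a new Hyper-V VM",
    ]
    for text, score in zip(candidates, judge.predict_proba(candidates)[:, 1]):
        print(f"{score:.2f}  {text}")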
Another important group to remember is those who owned the infrastructure necessary for the prospectors to survive. The folks who owned (or strong-armed their way into) the services around housing, food, alcohol, etc. made off like bandits.