Edit: I never actually expected AGI from LLMs. That was snark. I just think it's notable that the fundamental gains in LLM performance seem to have dried up.
And most would have accepted the recommendation, because the model sold it as a less common tactic while sounding very logical.
Once you've started to argue with an LLM, you're already barking up the wrong tree. Maybe you're right, maybe not, but there's no point in arguing it out with an LLM.
So many people just want to believe, instead of accepting the reality that LLMs are quite unreliable.
Personally, it's usually fairly obvious to me when LLMs are bullshitting, probably because I have lots of experience detecting it in humans.
In this case I just happened to be a domain expert and knew it was wrong. For someone less experienced, verifying everything would have required significant effort.