A PM's Guide to AI Agent Architecture

(www.productcurious.com)
205 points umangsehgal93 | 3 comments
1. barbazoo No.45130243
> Confidence calibration: When your agent says it's 60% confident, it should be right about 60% of the time. Not 90%, not 30%. Actual 60%.

With current technology (LLMs), how can an agent ever be sure about its own confidence?

replies(2): >>45130587 >>45131981
2. esafak No.45130587
I was about to say "Using calibrated models", then I found this interesting paper:

Calibrated Language Models Must Hallucinate

https://arxiv.org/abs/2311.14648

https://www.youtube.com/watch?v=cnoOjE_Xj5g

3. fumeux_fume No.45131981
The author's inner PM comes out here and makes some wild claims. Calibration is something we can do with traditional classification models, but not with most off-the-shelf LLMs. Even if you devised a way to determine whether the LLM's stated confidence matched its actual performance, you wouldn't be able to calibrate or tune it the way you would a more traditional model.
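The calibration property the article's quote describes ("says 60% confident, right about 60% of the time") can at least be measured from logged agent runs, even if tuning an off-the-shelf LLM toward it is another matter. Below is a minimal sketch (not from the article) of expected calibration error (ECE): bin the stated confidences, then compare each bin's average confidence to its empirical accuracy. The function name and record format are illustrative assumptions.

```python
def expected_calibration_error(records, n_bins=10):
    """records: list of (stated confidence in [0, 1], correct as bool).

    Illustrative sketch of ECE: bucket predictions by stated confidence,
    then weight each bucket's |avg confidence - accuracy| gap by its size.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, correct in records:
        # Clamp conf == 1.0 into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, correct))

    total = len(records)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(ok for _, ok in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece

# Well calibrated: says 0.6 and is right 6 times out of 10 -> ECE 0.
good = [(0.6, True)] * 6 + [(0.6, False)] * 4
# Overconfident: says 0.9 but is right only 6 times out of 10 -> ECE ~0.3.
bad = [(0.9, True)] * 6 + [(0.9, False)] * 4

print(expected_calibration_error(good))  # 0.0
print(expected_calibration_error(bad))   # ~0.3
```

Note this only *measures* miscalibration; closing the gap for a black-box LLM (the point of the comment above) is exactly what the linked paper argues is hard.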