(arxiv.org)

204 points tdchaitanya | 2 comments | 01 Sep 25 16:57 UTC | HN request time: 0s | source

1. QuadmasterXLII ◴[01 Sep 25 18:00 UTC] No.45095027[source]▶

The framing in the headline is interesting. As far as I recall, spending 4x more compute on a model to improve performance by 7% is the move that has worked over and over again up to this point. 101 % of GPT-4 performance (potentially at any cost) is what I would expect an improved routing algorithm to achieve.

replies(1): >>45095379 #

2. dang ◴[01 Sep 25 18:38 UTC] No.45095379[source]▶

>>45095027 (TP) #

(The submitted title was "93% of GPT-4 performance at 1/4 cost: LLM routing with weak bandit feedback")

↑

Adaptive LLM routing under budget constraints