
112 points | favoboa | 2 comments
bryant No.44431158
A few weeks ago, I processed a product refund with Amazon via an agent. It was simple and straightforward, and it was surprisingly obvious from how it responded to my frustration at being asked tons of questions that it was backed by a language model. But in the end, it processed my refund without ever connecting me with a human being.

I don't know whether Amazon relies on LLMs or SLMs for this and for similar interactions, but it makes tons of financial sense to use SLMs for narrowly scoped agents. In use cases like customer service, most of an LLM's general intelligence is wasted on the narrow task the agent is deployed for.

Wouldn't surprise me if down the road we start recommending role-specific SLMs rather than general LLMs as a mitigation for both ethical and security risks, too.

replies(5): >>44431884 #>>44431916 #>>44432173 #>>44433836 #>>44441923 #
automatic6131 No.44431884
You can (used to?) get a refund on Amazon with a normal CRUD app flow. Putting an SLM and a conversational interface over it is a backwards step.
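
For reference, the plain flow is roughly two requests against a returns API. A minimal sketch in Python; the base URL, endpoints, and fields are invented for illustration and are not Amazon's actual API:

    # Hypothetical CRUD-style return flow: no conversation, just two requests.
    import requests

    BASE = "https://shop.example.com/api"
    HEADERS = {"Authorization": "Bearer <token>"}

    # 1. GET the order so the customer can pick an item from a normal form.
    order = requests.get(f"{BASE}/orders/123-4567890", headers=HEADERS).json()

    # 2. POST the return with the item, reason, and desired resolution.
    resp = requests.post(
        f"{BASE}/orders/123-4567890/returns",
        headers=HEADERS,
        json={
            "item_id": order["items"][0]["id"],
            "reason": "defective",
            "resolution": "refund",
        },
    )
    print(resp.status_code)
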
replies(3): >>44432408 #>>44432734 #>>44433817 #
1. DebtDeflation No.44433817
Likewise, we've been building conversational interfaces for stuff like this for over a decade using traditional NLP techniques and orchestration rather than LLMs: intent classification, named entity extraction, slot filling, and API calling.
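
Roughly this shape, as a toy sketch (keyword matching and a regex stand in for trained intent/NER models; all names here are made up):

    # Toy version of that pipeline: intent -> entities -> slot filling -> API call.
    import re

    REQUIRED_SLOTS = ["order_id", "item", "reason", "resolution"]

    def classify_intent(text):
        # A trained classifier in practice; a keyword match here for illustration.
        return "return_item" if "return" in text.lower() else "fallback"

    def extract_entities(text):
        # An NER model in practice; a regex for order numbers here.
        m = re.search(r"\b\d{3}-\d{7}\b", text)
        return {"order_id": m.group(0)} if m else {}

    def next_prompt(slots):
        # Slot filling: ask only for whatever is still missing.
        for slot in REQUIRED_SLOTS:
            if slot not in slots:
                return f"What is the {slot.replace('_', ' ')} for your return?"
        return None  # all slots filled, hand off to the returns API

    slots = extract_entities("I want to return an item from order 123-4567890")
    print(classify_intent("I want to return an item"), slots, next_prompt(slots))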

Processing returns is pretty standardized - identify the order and the item within it being returned, capture the reason for the return, check eligibility, capture whether they want a refund or a new item, and execute either. When you have a fully deterministic workflow with 5-6 steps, maybe 1 or 2 if-then branches, and a single action at the end, I don't see the value of running an LLM in a loop, burning a crazy number of tokens, and hoping it works at least 80% of the time, when far simpler and cheaper approaches will work almost 100% of the time.
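
Something like this, with the backend calls stubbed out (all names hypothetical):

    # The whole flow as a fixed sequence of steps; no model in the loop.
    # The backend calls are stubs here; real ones would hit internal services.
    def check_eligibility(order_id, item_id):
        return True  # stub

    def issue_refund(order_id, item_id):
        print(f"refund issued for {item_id} on order {order_id}")  # stub

    def ship_replacement(order_id, item_id):
        print(f"replacement shipped for {item_id} on order {order_id}")  # stub

    def process_return(order_id, item_id, reason, wants_refund):
        # Order, item, and reason arrive already captured by the form or slot filling.
        if not check_eligibility(order_id, item_id):   # branch 1
            return "Sorry, that item isn't eligible for return."
        if wants_refund:                               # branch 2
            issue_refund(order_id, item_id)            # single action at the end
        else:
            ship_replacement(order_id, item_id)
        return "All set."

    print(process_return("123-4567890", "B00EXAMPLE", "defective", True))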

replies(1): >>44434002 #
2. zsyllepsis No.44434002
True, we have been building conversational interfaces with traditional NLP. In my experience, they’ve been fairly fragile.

Extending the example you gave: nicely packaged, fully deterministic workflows work great in demos. Then customers start going off the paved path. They ask about returning 3 items all at once, or a whole order. They get confused and provide a shipping number instead of an order number. They switch language partway through the conversation because they're frustrated by all the follow-up questions.

All of these absolutely can be handled through traditional NLP, but they require system designers to anticipate them, model the conversation, and design the system to react accordingly. And suddenly the 5-6 step deterministic workflow with a couple of if-branches… isn't.
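
To make that concrete, here is roughly how the routing grows once you enumerate just the cases above (everything here is hypothetical, just to show the shape):

    # Each off-path behavior becomes another intent, more slots, another branch
    # someone has to anticipate and wire up ahead of time.
    def route(intent):
        if intent == "return_single_item":
            return "run the original 5-6 step flow"
        if intent == "return_multiple_items":      # "3 items all at once"
            return "repeat the flow per item; every slot becomes a list"
        if intent == "return_whole_order":
            return "skip item selection; check eligibility for everything"
        if intent == "resolve_identifier":         # shipping number vs order number
            return "add a lookup from tracking number back to the order"
        if intent == "switch_language":
            return "re-detect the language and localize every prompt"
        return "fallback: escalate to a human"

    print(route("return_whole_order"))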