277 points gk1 | 6 comments
rossdavidh ◴[] No.44400209[source]
Anyone who has long experience with neural networks, LLM or otherwise, is aware that they are best suited to applications where 90% is good enough. In other words, applications where some other system (human or otherwise) will catch the mistakes. This phrase: "It is not entirely clear why this episode occurred..." applies to nearly every LLM (or other neural network) error, which is why it is usually not possible to correct the root cause (although you can train on that specific input and a corrected output).

For some things, like say a grammar correction tool, this is probably fine. For cases where one mistake can erase the benefit of many previous correct responses, and more, no amount of hardware is going to make LLMs the right solution.

Which is fine! No algorithm needs to be the solution to everything, or even most things. But much of people's intuition about "AI" is warped by the (unmerited) claims in that name. Even as LLMs "get better", they won't get much better at this kind of problem, where 90% is not good enough (because one mistake can be very costly), and problems need discoverable root causes.

replies(4): >>44401352 #>>44401613 #>>44402343 #>>44406687 #
1. bigstrat2003 ◴[] No.44401352[source]
This is an insightful post, and I think maybe highlights the gap between AI proponents and me (very skeptical about AI claims). I don't have any applications where I'm willing to accept 90% as good enough. I want my tools to work 100% of the time or damn close to it, and even 90% simply is not acceptable in my book. It seems like maybe the people who are optimistic about AI simply are willing to accept a higher rate of imperfections than I am.
replies(3): >>44401977 #>>44401995 #>>44402048 #
2. beering ◴[] No.44401977[source]
It’s not hard to find applications where 90% success or even 50% success rate is incredibly useful. For example, hooking up ChatGPT Codex to your repo and asking it to find and fix a bug. If it succeeds in 50% of the attempts, you would hit that button over and over until its success rate drops to a much lower figure. Especially as costs trend towards zero.
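The "hit the button over and over" workflow only pays off when failures are cheap to detect (e.g. the test suite rejects a bad patch). A toy simulation of that intuition, with a hypothetical `attempt_fix` standing in for one agent run plus verification: at a 50% per-attempt success rate, the number of attempts until the first verified fix follows a geometric distribution, so you need about 1/p = 2 tries on average.

```python
import random

def attempt_fix(success_rate: float, rng: random.Random) -> bool:
    """Stand-in for one agent attempt; a real setup would run the
    test suite to decide whether the proposed patch is acceptable."""
    return rng.random() < success_rate

def attempts_until_fix(success_rate: float, rng: random.Random,
                       max_attempts: int = 100) -> int:
    """Retry until an attempt passes verification; return attempts used."""
    for n in range(1, max_attempts + 1):
        if attempt_fix(success_rate, rng):
            return n
    return max_attempts

if __name__ == "__main__":
    rng = random.Random(0)
    trials = [attempts_until_fix(0.5, rng) for _ in range(10_000)]
    avg = sum(trials) / len(trials)
    # Geometric distribution: mean attempts is 1/p, so roughly 2 here.
    print(f"average attempts at 50% per-try success: {avg:.2f}")
```

The economics follow directly: if each attempt costs pennies and verification is automatic, even a much lower per-attempt success rate (1/p attempts at 10% success is about 10 tries) can still be worth pressing the button.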
replies(1): >>44402969 #
3. signatoremo ◴[] No.44401995[source]
If you have surgery, you already accept a less-than-perfect success rate. In fact, you have no way to know how badly it can go; the surgeon or their assistants may have a bad day.
replies(1): >>44402086 #
4. nlawalker ◴[] No.44402048[source]
It's very scenario dependent. I wish my dishwasher got all the dishes perfectly clean every time, and I wish that I could simply put everything in there without having to consider that the wood stuff will get damaged or the really fragile stuff will get broken, but in spite of those imperfections I still use it every day because I come out way ahead, even in the cases where I have to get the dishes to 100% clean myself with some extra scrubbing.

Another good example might be a paint roller - absolutely useless in the edges and corners, but there are other tools for those, and boy does it make quick work of the flat walls.

If you think of and try to use AI as a tool in the same way as, say, a compiler or a drill, then yes, the imperfections render it useless. But it's going to be an amazing dishwasher or paint roller for a whole bunch of scenarios we are just now starting to consider.

5. apical_dendrite ◴[] No.44402086[source]
Typically you accept the risk of surgical complications because the alternative is much worse. Given a scenario where you have an aggressive tumor that will most likely kill you in six months, and surgery presents a 90% chance of gaining you at least a few more years of life, but a 10% chance of serious complications or death, most people would take the surgery. But if it's a purely elective procedure, very few people would take that chance.

If your business has an opportunity to save millions in labor costs by replacing humans with AI, but there's a 10% chance that the AI will screw up and destroy the business, will business owners accept that risk? It will be interesting to find out.

6. 3vidence ◴[] No.44402969[source]
I agree there are good examples of 90% being good enough, but what you proposed doesn't sound like a good one.

This assumes that the AI can't also introduce new bugs into the code, making the net effect negative.

A case of 90% being good enough sounds more like storyboarding or generating note summaries.