
277 points by gk1 | 8 comments
1. due-rr No.44399955
Would you ever trust an AI agent running your business? As hilarious as this small experiment is, is there ever a point where you can trust it to run something long term? It might make good decisions for a day, a month, or a year, and then one day decide to trash your whole business.
replies(3): >>44400017 #>>44400031 #>>44400053 #
2. marinmania No.44400017
It does seem far more straightforward to say "Write code that deterministically orders food items that people want and sends invoices etc."

I feel like that's more the future. Having an agent sorta make random choices feels like LLMs attempting to do math, instead of LLMs attempting to call a calculator.

replies(2): >>44400108 #>>44400297 #
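A minimal sketch of the deterministic pipeline marinmania describes, with an LLM nowhere in the loop: plain code turns an order into an invoice, so the same input always produces the same total. The item names, prices, and the `build_invoice` helper are all invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class LineItem:
    name: str
    unit_price_cents: int  # store money as integer cents to avoid float drift
    quantity: int

def build_invoice(items: list[LineItem]) -> dict:
    """Total up an order deterministically -- no model in the loop."""
    total = sum(i.unit_price_cents * i.quantity for i in items)
    return {
        "lines": [(i.name, i.quantity, i.unit_price_cents * i.quantity)
                  for i in items],
        "total_cents": total,
    }

order = [LineItem("espresso", 300, 2), LineItem("bagel", 250, 1)]
invoice = build_invoice(order)
print(invoice["total_cents"])  # 850
```

An agent could still sit in front of this to interpret free-form requests, but the counting and billing stay in code that cannot "lose count."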
3. keymon-o No.44400031
I have a recent anecdote with GPT-3.5, where it lost count of a trivial item-quantity increment in just a few prompts. Models might get orders of magnitude better from now on, but who's gonna pay for 'that one eventual mistake'?
replies(1): >>44400089 #
4. throwacct No.44400053
I don't think any decision maker will let LLMs run their business. If the LLM fails, you could potentially lose your livelihood.
5. croemer No.44400089
GPT-3.5? Did you mean to send this 2 years ago?
replies(1): >>44400146 #
6. keymon-o No.44400108
Every output that is going to be manually verified by a professional is a safe bet.

People forget that we use computers for accuracy, not smarts. Smarts make mistakes.

7. keymon-o No.44400146
Maybe. Did LLMs stop with hallucinations and errors 2 years ago?
8. standardUser No.44400297
Right, but if we limit the scope too much, we quickly arrive at the point where 'dumb' automation is sufficient, instead of using the world's most expensive algorithms.