440 points pseudolus | 1 comments
Havoc ◴[] No.45063050[source]
Not sure what these guys are studying but can tell you in the real world - essentially zero AI rollout in accounting world for anything serious.

We've got access to some fancy enterprise copilot version, deep research, MS office integration and all that jazz. I use it diligently every day...to make me a summary of today's global news.

When I try to apply it to actual accounting work, it hallucinates left, right & center on stuff that can't be wrong. Millions and millions off. That's how you get the taxman to kick down your door. Even simple "are these two numbers the same" checks give false positives so often that it's impossible to trust. So now I've got a review tool whose output I can't trust? It's like a programming language where the equality (==) operator has a built-in 20% random number generator and you're supposed to write mission-critical code with it.
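
For contrast, the "are these two numbers the same" check is the kind of thing ordinary deterministic code gets right every time. A minimal sketch in Python, assuming amounts arrive as text with thousands separators and accountant-style parentheses for negatives; `parse_amount` and `same_amount` are hypothetical helper names for illustration, not part of any tool mentioned here:

```python
from decimal import Decimal, InvalidOperation


def parse_amount(text: str) -> Decimal:
    """Parse an amount such as '1,234,567.89' or '(500.00)' without float rounding."""
    cleaned = text.strip().replace(",", "")
    # Assumed convention: parentheses mark a negative amount.
    negative = cleaned.startswith("(") and cleaned.endswith(")")
    if negative:
        cleaned = cleaned[1:-1]
    try:
        value = Decimal(cleaned)
    except InvalidOperation as exc:
        raise ValueError(f"not a number: {text!r}") from exc
    return -value if negative else value


def same_amount(a: str, b: str) -> bool:
    """Exact comparison: same inputs always give the same answer."""
    return parse_amount(a) == parse_amount(b)


print(same_amount("1,234,567.89", "1234567.89"))  # formatting differs, value equal
print(same_amount("1234567.89", "1234567.98"))    # transposed digits, value differs
```

Using `Decimal` rather than `float` keeps the comparison exact for amounts with cents, which is the property the LLM-based check lacks.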

replies(14): >>45063417 #>>45063575 #>>45063964 #>>45064042 #>>45064413 #>>45064732 #>>45065017 #>>45065089 #>>45065569 #>>45065576 #>>45068813 #>>45069627 #>>45076092 #>>45093899 #
ecshafer ◴[] No.45063417[source]
There seems to be this dream of Tax AI Software that will just do all of the taxes. But other than using AI as a fancy text search, I don't see it happening for a long long time. LLMs can't do arithmetic or count.
replies(1): >>45063577 #
Havoc ◴[] No.45063577[source]
Yeah - at classifying an invoice as building rent or, say, printer ink, it'll have some success. So we'll see some of it at the very bottom end.

>LLMs can't do arithmetic or count.

Yes. The fancy copilot stuff does use pandas/python to look at Excel files, so stuff like adding up a table does work sometimes, but the parameters going into the pandas code need to make sense too, in the garbage-in, garbage-out sense. The base LLM doesn't seem to understand the grid nature of Excel, so it ends up looking at the wrong cells or misunderstands how headings relate to the numbers etc.

It'll get better but there doesn't seem to be the equivalent of "use LLM to write boilerplate code" in this world.
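
The garbage-in, garbage-out point about parameters can be sketched roughly like this. It's a plain-Python stand-in rather than real pandas, and the sheet contents, `SHEET` and `sum_column` are all made up for illustration. The arithmetic itself is trivial; the answer depends entirely on whoever picks the header row and decides what to do with the "Total" row:

```python
from decimal import Decimal

# Hypothetical sheet as a grid of cells: a title row, a blank row, then the real header.
SHEET = [
    ["Q3 expenses", "", ""],
    ["", "", ""],
    ["Category", "Invoice", "Amount"],
    ["Building rent", "INV-001", "12000.00"],
    ["Printer ink", "INV-002", "89.50"],
    ["Total", "", "12089.50"],
]


def sum_column(rows, header_row, column, skip_labels=("Total",)):
    """Sum one named column, given which row actually holds the headings."""
    header = rows[header_row]
    if column not in header:
        # Wrong header row: fail loudly instead of summing the wrong cells.
        raise KeyError(f"{column!r} not in header row {header_row}: {header}")
    idx = header.index(column)
    total = Decimal("0")
    for row in rows[header_row + 1:]:
        # Skip blank cells and summary rows so totals are not double counted.
        if row[0] in skip_labels or row[idx] == "":
            continue
        total += Decimal(row[idx])
    return total


print(sum_column(SHEET, header_row=2, column="Amount"))                  # right parameters
print(sum_column(SHEET, header_row=2, column="Amount", skip_labels=()))  # counts Total row too
```

With the right header row and the summary row skipped, the sum matches the sheet's own total. Leave the "Total" row in and the same code happily returns double the real figure, and pointing it at row 0 raises a `KeyError` because the title row has no "Amount" heading.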

replies(1): >>45067427 #
rwmj ◴[] No.45067427[source]
We use Concur (SAP? expenses software), and it can scan your paper receipts and fill in the fields for you. I'd say it's about 30% accurate. Occasionally it'll be incredible. But mostly you end up having to manually adjust fields. It even gets categories completely wrong, like classifying a train ticket as a phone bill. All this means you spend a lot of time checking everything. It'd be hard for me to say honestly that it saves any time, and probably it takes a bit more time.
replies(1): >>45067738 #
ecshafer ◴[] No.45067738[source]
Concur might be the worst software I have ever used.
replies(1): >>45069329 #
1. rwmj ◴[] No.45069329[source]
Ha ha, yes it's bad, but somehow slightly better than the enterprise alternatives.