
418 points speckx | 6 comments
IgorPartola ◴[] No.44974700[source]
What are the actual use cases that can generate revenue or at least save costs today? I can think of:

1. Generate content to create online influence. This is at this point probably way oversaturated and I think more sophisticated models will not make it better.

2. Replace junior developers with Claude Code or similar. Only sort of works. After all, you can only babysit one of these at a time no matter how senior you are so realistically it will make you, what, 50% more productive?

3. Replace your customer service staff. This may work in the long run but it saves money instead of making money, so its impact has a hard ceiling (at best, spending drops to just the cost of electricity).

4. Assistive tools. Someone to do basic analysis, double check your writing to make it better, generate secondary graphic assets. Can save a bit of money but can’t really make you a ton because you are still the limiting factor.

Aside: I have tried it for editing writing and it works pretty well but only if I have it do minimal actual writing. The more words it adds, the worse the essay. Having it point out awkward phrasing and finding missing parts of a theme is genuinely helpful.

5. AI for characters in video games, robot dogs, etc. Could be a brave new frontier for video games that don’t have such a rigid cause/effect quest based system.

6. AI girlfriends and boyfriends and other NSFW content. Probably a good money maker for a decade or so before authentic human connections swing back as a priority over anxiety over speaking to humans.

What use cases am I missing?

replies(12): >>44974774 #>>44974790 #>>44974807 #>>44974858 #>>44974923 #>>44975121 #>>44975152 #>>44975248 #>>44975302 #>>44975853 #>>44975875 #>>44976266 #
spogbiper ◴[] No.44974858[source]
I am working on a project that uses an LLM to pull certain pieces of information from semi-structured documents and then categorize/file them under the correct account. It's about 95% accurate and we haven't even begun to fine-tune it. I expect it will require human-in-the-loop checks for the foreseeable future, but even with human approval of each item, it's going to save the clerical staff hundreds of hours per year. There are a lot of opportunities in automating/semi-automating processes like this, basically just information extraction and categorization tasks.
replies(5): >>44975000 #>>44975079 #>>44975108 #>>44975215 #>>44975349 #
1. systemerror ◴[] No.44975108[source]
The big issue with LLMs is that they’re usually right — like 90% of the time — but that last 10% is tough to fix. A 10% failure rate might sound small, but at scale, it's significant — especially when it includes false positives. You end up either having to live with some bad results, build something to automatically catch mistakes, or have a person double-check everything if you want to bring that error rate down.
replies(2): >>44975162 #>>44976001 #
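To make the "at scale" arithmetic above concrete, here is a minimal sketch. The volume (10,000 items) and the reviewer catch rate (90%) are illustrative assumptions, not figures from the thread, and the function names are hypothetical.

```python
def expected_errors(n_items: int, error_rate: float) -> float:
    """Expected number of wrong outputs among n_items at a given error rate."""
    return n_items * error_rate


def residual_error_rate(error_rate: float, catch_rate: float) -> float:
    """Error rate left over after a checker (human or automated) catches
    a fraction `catch_rate` of the mistakes. Assumes the checker never
    introduces new errors, which is a simplification."""
    return error_rate * (1.0 - catch_rate)


if __name__ == "__main__":
    n_items = 10_000    # assumed volume, for illustration only
    error_rate = 0.10   # the 10% failure rate discussed above
    catch_rate = 0.90   # assumed share of mistakes a reviewer catches

    print("expected errors:", expected_errors(n_items, error_rate))
    print("residual error rate:", residual_error_rate(error_rate, catch_rate))
```

Under these assumptions, a 10% failure rate on 10,000 items means roughly 1,000 bad results, and a checker that catches 90% of them still leaves about 1% wrong.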
2. spogbiper ◴[] No.44975162[source]
yes, the entire design relies on a human to check everything. basically it presents what it thinks should be done, and why. the human then agrees or does not. much work is put into streamlining this but ultimately it's still human controlled
replies(1): >>44975256 #
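The approve-or-reject flow described above could be sketched roughly as follows. This is not the poster's implementation: `Proposal`, `review`, and all field names are hypothetical, and the LLM step is assumed to have already produced the proposals.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Proposal:
    """What the model suggests for one document, plus its stated reason."""
    document_id: str
    suggested_account: str
    rationale: str


def review(
    proposals: Iterable[Proposal],
    approve: Callable[[Proposal], bool],
) -> Tuple[List[Proposal], List[Proposal]]:
    """Route each proposal through a human decision.

    `approve` stands in for the reviewer's yes/no. Approved items are
    filed as suggested; rejected items fall back to manual handling.
    """
    filed: List[Proposal] = []
    manual: List[Proposal] = []
    for proposal in proposals:
        if approve(proposal):
            filed.append(proposal)
        else:
            manual.append(proposal)
    return filed, manual
```

Rejected items simply go back to the existing manual process, so the worst case matches what the staff already do today.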
3. wredcoll ◴[] No.44975256[source]
At the risk of being obvious, this seems set up for failure in the same way expecting a human to catch an automated car's mistakes is. Although I assume mistakes here probably don't matter very much.
replies(2): >>44975394 #>>44975953 #
4. LPisGood ◴[] No.44975394{3}[source]
This reminds me of the issue with the old Windows access control system.

If those prompts pop up constantly asking for elevated privileges, this is actually worse because it trains people to just reflexively allow elevation.

5. spogbiper ◴[] No.44975953{3}[source]
yes, mistakes are not a huge problem. they will become evident farther down the process and they happen now with the human-only system. worst case is the LLM fails and they just have to do the manual work that they are doing now
6. f3b5 ◴[] No.44976001[source]
Depending on the use case, a 10% failure rate can be quite acceptable. This is of course for non-critical applications, e.g. top-of-funnel sales automation. In practice, for simple uses like labeling data at scale, I'm actually reaching 95-99% accuracy in my startup.