
The AI Investment Boom

(www.apricitas.io)
271 points by m-hodges | source
apwell23 ◴[] No.41896263[source]
> AI products are used ubiquitously to generate code, text, and images, analyze data, automate tasks, enhance online platforms, and much, much, much more—with usage expected only to increase going forward.

Why does every hype article start with this? Personally, my Copilot usage while coding has gone down. I tried and tried, but it always gets lost and starts spitting out subtle bugs that take me more time to debug than if I had written the code myself.

I always have this feeling of "this might fail in production in unknown ways" because I might not have checked the code thoroughly enough. I know I am not the only one; my coworkers and friends have expressed similar feelings.

I even tried the new 'chain of thought' model, which for some reason seems to be even worse.

replies(10): >>41896295 #>>41896310 #>>41896325 #>>41896327 #>>41896363 #>>41896380 #>>41896400 #>>41896497 #>>41896670 #>>41898703 #
bongodongobob ◴[] No.41896295[source]
Well, I have the exact opposite experience. I don't know why people struggle to get good results with LLMs.
replies(4): >>41896332 #>>41896335 #>>41896492 #>>41897988 #
hnthrowaway6543 ◴[] No.41896335[source]
LLMs are great for simple, common tasks, e.g. CRUD apps, RESTful web endpoints, and unit tests, for which there's an enormous number of examples and not much unique complexity. There are a lot of developers whose day mostly involves these repetitive, simple tasks. There are also a lot of developers who work on things that are a lot more niche and complicated, where LLMs don't provide much help.
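
As a rough sketch of what I mean by the first bucket, here's the kind of boilerplate endpoint (hypothetical resource and route, Go standard library only) that LLMs reliably get right, presumably because near-identical versions of it exist all over their training data:

    package main

    import (
        "encoding/json"
        "log"
        "net/http"
    )

    // User is a typical CRUD resource.
    type User struct {
        ID   int    `json:"id"`
        Name string `json:"name"`
    }

    var users = map[int]User{1: {ID: 1, Name: "Ada"}}

    func main() {
        // GET /users returns all users as JSON; countless near-identical
        // variants of this handler exist in public code.
        http.HandleFunc("/users", func(w http.ResponseWriter, r *http.Request) {
            w.Header().Set("Content-Type", "application/json")
            json.NewEncoder(w).Encode(users)
        })
        log.Fatal(http.ListenAndServe(":8080", nil))
    }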
replies(3): >>41896464 #>>41896611 #>>41896681 #
danenania ◴[] No.41896611[source]
In my experience this underrates them. They can do pretty complex tasks that go well beyond your examples if prompted correctly.

The real limiting factor is not so much task complexity as the level of abstraction and indirection. If you have code that requires following a long chain of references to understand, LLMs will struggle to work with it.

For similar reasons, they also struggle with (see the sketch after this list):

- generic types

- inheritance hierarchies

- long function call chains

- dependency injection

- deeply nested structures
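
Here's a contrived Go sketch of the kind of indirection I mean (all names hypothetical): to understand what Process actually does at the call site, you have to follow an interface to an injected implementation and then through a generic wrapper.

    package main

    import "fmt"

    // Store is the abstraction the handler depends on.
    type Store interface {
        Save(s string) error
    }

    // dbStore is one of possibly many concrete implementations.
    type dbStore struct{}

    func (dbStore) Save(s string) error {
        fmt.Println("saved:", s)
        return nil
    }

    // Retrying is a generic wrapper: one more level to trace through.
    type Retrying[S Store] struct {
        inner S
    }

    func (r Retrying[S]) Save(s string) error {
        var err error
        for i := 0; i < 3; i++ {
            if err = r.inner.Save(s); err == nil {
                return nil
            }
        }
        return fmt.Errorf("giving up: %w", err)
    }

    // Handler receives its Store via dependency injection at startup.
    type Handler struct {
        store Store
    }

    func (h Handler) Process(s string) error {
        return h.store.Save(s)
    }

    func main() {
        h := Handler{store: Retrying[dbStore]{inner: dbStore{}}}
        _ = h.Process("order-42") // the real behavior lives three hops away
    }

None of the hops is hard individually; it's having to hold the whole chain in context that causes trouble.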

They're also bad at counting, which can be an issue when dealing with concurrency: say you started 5 operations concurrently at different points in your program and now need to block while waiting for 5 corresponding success or failure messages. Unless your code explicitly uses the number 5 somewhere, an LLM is often going to fail at counting the operations.
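
To make that concrete, here's a minimal Go sketch (scenario invented for illustration). Five operations get started, but the count exists only implicitly in the control flow, so a model editing this code has to trace every call site to keep the bookkeeping right:

    package main

    import (
        "fmt"
        "sync"
    )

    func main() {
        var wg sync.WaitGroup
        results := make(chan string)

        // start kicks off one concurrent operation; note that nothing
        // here records how many times it has been called.
        start := func(name string) {
            wg.Add(1)
            go func() {
                defer wg.Done()
                results <- name + ": ok"
            }()
        }

        start("fetch") // one operation started here...
        start("parse") // ...a second here...
        for _, job := range []string{"a", "b", "c"} {
            start(job) // ...and three more in a loop: 5 total, but no "5" anywhere
        }

        // Close the channel once every operation has reported back.
        go func() {
            wg.Wait()
            close(results)
        }()

        // Blocks until all 5 implicit operations have completed.
        for r := range results {
            fmt.Println(r)
        }
    }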

All in all, I think the main question in determining how well an LLM can do a task is whether the limiting factor is knowledge or abstraction. If it's knowledge (the intricacies of some arcane OS API, for example), an LLM can do very well with good prompting, even on quite large and complex tasks. If it's abstraction, it's likely to fail in all kinds of seemingly obvious ways.

replies(1): >>41899607 #
layer8 ◴[] No.41899607[source]
> If it's knowledge (the intricacies of some arcane OS API, for example), an LLM can do very well

Only if that knowledge is sufficiently represented in the training data or on the web. If, on the other hand, it's knowledge that isn't well (or at all) represented, and instead requires experience or experimentation with the relevant system, LLMs don't do very well. I regularly fail when applying LLMs to tasks that turn out to require such "hidden" knowledge.

replies(2): >>41901007 #>>41907553 #
danenania ◴[] No.41907553[source]
> If, on the other hand, it's knowledge that isn't well (or at all) represented, and instead requires experience or experimentation with the relevant system, LLMs don't do very well. I regularly fail when applying LLMs to tasks that turn out to require such "hidden" knowledge.

It's true enough that there are many tasks like this. But there are also many relatively arcane APIs/protocols/domains that LLMs do a surprisingly good job with. I tend to think it's worth checking which bucket a task falls into before spending hours or days hammering something out myself.

I think many devs are underestimating how arcane the knowledge needs to be before an LLM will be hopeless at a knowledge-based task. There's a lot of code on the internet.