
The AI Investment Boom

(www.apricitas.io)
271 points by m-hodges | 11 comments
apwell23 ◴[] No.41896263[source]
> AI products are used ubiquitously to generate code, text, and images, analyze data, automate tasks, enhance online platforms, and much, much, much more—with usage expected only to increase going forward.

Why does every hype article start with this? Personally, my Copilot usage has gone down while coding. I tried and tried, but it always gets lost and starts spitting out subtle bugs that take me more time to debug than if I had written the code myself.

I always have this feeling of "this might fail in production in unknown ways" because I might have missed checking the code thoroughly. I know I am not the only one; my coworkers and friends have expressed similar feelings.

I even tried the new 'chain of thought' model, which for some reason seems to be even worse.

replies(10): >>41896295 #>>41896310 #>>41896325 #>>41896327 #>>41896363 #>>41896380 #>>41896400 #>>41896497 #>>41896670 #>>41898703 #
bongodongobob ◴[] No.41896295[source]
Well, I have the exact opposite experience. I don't know why people struggle to get good results with LLMs.
replies(4): >>41896332 #>>41896335 #>>41896492 #>>41897988 #
1. hnthrowaway6543 ◴[] No.41896335[source]
LLMs are great for simple, common tasks, e.g. CRUD apps, RESTful web endpoints, and unit tests, for which there's an enormous number of examples and not much unique complexity. There are a lot of developers whose day mostly involves these repetitive, simple tasks. There are also a lot of developers who work on things that are much more niche and complicated, where LLMs don't provide much help.
replies(3): >>41896464 #>>41896611 #>>41896681 #
2. 101008 ◴[] No.41896464[source]
Yeah, exactly this. If I ask Cursor to write the serializer for a new Django model, it does it (although sometimes it invents fields that do not exist). It saves me two minutes.
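
To make that concrete, here's the rough shape of the task (a hypothetical Django REST Framework model/serializer pair; the names are invented for illustration):

    # Hypothetical sketch of the boilerplate being described; the model
    # and field names are invented, not from any real project.
    from django.db import models
    from rest_framework import serializers

    class Invoice(models.Model):
        number = models.CharField(max_length=32)
        issued_at = models.DateTimeField()
        total = models.DecimalField(max_digits=10, decimal_places=2)

    class InvoiceSerializer(serializers.ModelSerializer):
        class Meta:
            model = Invoice
            fields = ["number", "issued_at", "total"]
            # This fields list is where the invented, nonexistent fields
            # tend to show up.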

When I ask it to write a function that should do something much more complex, it usually does something so bad that it takes me more time, because it confuses me and I have to go back to my original reasoning (after trying to understand what it did).

What I found useful is to ask it to explain what a function does in a new codebase I am exploring, although I have to be very careful because a lot of the time it invents or skips steps that are crucial.

replies(1): >>41896590 #
3. dartos ◴[] No.41896590[source]
See, I recently picked up the Ash framework for Elixir, and it does all of that too, but in a declarative, precise language that codegens the implementation in a deterministic way.

It just does the job that cursor does there, but better.

Maybe we programmers should focus on making higher-order programming tools instead of black-box text generators for existing tools.

4. danenania ◴[] No.41896611[source]
In my experience this underrates them. They can do pretty complex tasks that go well beyond your examples if prompted correctly.

The real limiting factor is not so much task complexity as the level of abstraction and indirection. If you have code that requires following a long chain of references to understand, LLMs will struggle to work with it.

For similar reasons, they also struggle with the following (a quick sketch after the list):

- generic types

- inheritance hierarchies

- long function call chains

- dependency injection

- deeply nested structures
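
A tiny made-up Python sketch of that kind of indirection: to learn what handle() actually does, you have to chase an injected dependency through a generic protocol to some concrete class defined somewhere else entirely.

    # Made-up sketch combining generics, dependency injection, and
    # indirection; the actual behavior of handle() lives elsewhere.
    from typing import Generic, Protocol, TypeVar

    T = TypeVar("T")

    class Repository(Protocol[T]):
        def save(self, item: T) -> None: ...

    class Handler(Generic[T]):
        def __init__(self, repo: Repository[T]) -> None:
            self.repo = repo  # injected; concrete type unknown here

        def handle(self, item: T) -> None:
            self.repo.save(item)  # which save()? depends on what was injected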

They're also bad at counting, which can be an issue when dealing with concurrency—e.g. you started 5 operations concurrently at different points in your program and now need to block while waiting for 5 corresponding success or failure messages. Unless your code explicitly uses the number 5 somewhere, an LLM is often going to fail at counting the operations.
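
As a made-up asyncio illustration of that counting problem: if the code itself tracks how many operations were started, nobody (human or LLM) has to count to 5 correctly.

    # Made-up sketch: track the started operations in a list so the
    # final wait doesn't depend on counting them correctly.
    import asyncio

    async def op(name: str) -> str:
        await asyncio.sleep(0.1)  # stand-in for real work
        return f"{name}: ok"

    async def main() -> None:
        started: list[asyncio.Task[str]] = []

        # Operations get kicked off at different points in the program...
        started.append(asyncio.create_task(op("fetch")))
        started.append(asyncio.create_task(op("resize")))
        started.append(asyncio.create_task(op("upload")))

        # ...so block on however many were actually started.
        results = await asyncio.gather(*started)
        print(f"{len(results)} operations finished")

    asyncio.run(main())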

All in all, the main question I think in determining how well an LLM can do a task is whether the limiting factor for your task is knowledge or abstraction. If it's knowledge (the intricacies of some arcane OS API, for example), an LLM can do very well with good prompting even on quite large and complex tasks. If it's abstraction, it's likely to fail in all kinds of seemingly obvious ways.

replies(1): >>41899607 #
5. apwell23 ◴[] No.41896681[source]
> LLMs are great for simple, common tasks, e.g. CRUD apps, RESTful web endpoints

I gave it a YAML file and asked it to generate a JSON call to a REST API. It missed a bunch of keys and made up a random new key. I threw out the whole thing and did it with awk/sed.
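
For what it's worth, a deterministic conversion sidesteps the dropped and invented keys entirely. A minimal Python sketch (PyYAML assumed; the awk/sed route works just as well):

    # Minimal sketch of a deterministic YAML -> JSON conversion.
    # Assumes PyYAML is installed; takes the YAML file path as argv[1].
    import json
    import sys

    import yaml  # pip install pyyaml

    with open(sys.argv[1]) as f:
        data = yaml.safe_load(f)

    print(json.dumps(data, indent=2))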

6. layer8 ◴[] No.41899607[source]
> If it's knowledge (the intricacies of some arcane OS API, for example), an LLM can do very well

Only if that knowledge is sufficiently represented in the training data or on the web. If, on the other hand, it's knowledge that isn't well (or at all) represented, and instead requires experience or experimentation with the relevant system, LLMs don't do very well. I regularly fail when applying LLMs to tasks that turn out to require such "hidden" knowledge.

replies(2): >>41901007 #>>41907553 #
7. Terr_ ◴[] No.41901007{3}[source]
And if it's really well represented, then it's hopefully already in a superior library or documentation/guide, and the LLM is acting as an (untrustworthy) middleman.
replies(1): >>41907462 #
8. danenania ◴[] No.41907462{4}[source]
If the code can be generated correctly, is it controversial to say that generating it will be more efficient than reading through documentation and/or learning how to use a new library?

If you grant that, the next question is how high the accuracy has to be before it's quicker than doing the research and writing the code yourself. If it's 100%, then it's clearly better, since doing the research and implementation oneself generally takes an hour or so in the best scenario (this can expand to multiple hours or days depending on the task). If it's 99%, it's still probably (much) better, since it will be faster to fix the minor issues than to implement from scratch. If it's 90%, 80%, 70% it becomes a more interesting question.
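
A rough back-of-the-envelope sketch of that trade-off (every number below is an assumption, not a measurement):

    # Assumed costs: an hour to research and write it yourself, ten
    # minutes to review a correct generation, ninety minutes to debug
    # or redo a wrong one.
    DIY_MIN = 60
    REVIEW_MIN = 10
    FIX_MIN = 90

    def expected_llm_minutes(accuracy: float) -> float:
        # Review it either way; pay the fix-up cost when it misses.
        return REVIEW_MIN + (1 - accuracy) * FIX_MIN

    for accuracy in (1.0, 0.99, 0.9, 0.8, 0.7):
        llm = expected_llm_minutes(accuracy)
        verdict = "generate first" if llm < DIY_MIN else "just write it"
        print(f"{accuracy:.0%}: ~{llm:.0f} min vs {DIY_MIN} min DIY -> {verdict}")

    # Break-even accuracy for these particular assumptions:
    print(f"break-even around {1 - (DIY_MIN - REVIEW_MIN) / FIX_MIN:.0%}")

Move the assumed review and fix costs around and the break-even shifts a lot, which is exactly where the 90%/80%/70% cases get interesting.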

replies(1): >>41907605 #
9. danenania ◴[] No.41907553{3}[source]
> If, on the other hand, it's knowledge that isn't well (or at all) represented, and instead requires experience or experimentation with the relevant system, LLMs don't do very well. I regularly fail when applying LLMs to tasks that turn out to require such "hidden" knowledge.

It's true enough that there are many tasks like this. But there are also many relatively arcane APIs/protocols/domains that LLMs do a surprisingly good job with. I tend to think it's worth checking which bucket a task falls into before spending hours or days hammering something out myself.

I think many devs are underestimating how arcane the knowledge needs to be before an LLM will be hopeless at a knowledge-based task. There's a lot of code on the internet.

10. Terr_ ◴[] No.41907605{5}[source]
Compare to: "If you can copy-paste from a Stack-overflow answer, is it controversial to say that copy-pasting is more efficient than reading through documentation and/or learning how to use a new library?"
replies(1): >>41907706 #
11. danenania ◴[] No.41907706{6}[source]
If I understand the code and it does exactly what I need, should I type the whole thing out rather than copy-pasting? Sounds like a waste of time to me.