
The AI Investment Boom

(www.apricitas.io)
271 points by m-hodges | 65 comments
1. apwell23 ◴[] No.41896263[source]
> AI products are used ubiquitously to generate code, text, and images, analyze data, automate tasks, enhance online platforms, and much, much, much more—with usage expected only to increase going forward.

Why does every hype article start with this? Personally, my Copilot usage has gone down while coding. I tried and tried, but it always gets lost and starts spitting out subtle bugs that take me more time to debug than if I had written the code myself.

I always have this feeling of 'this might fail in production in unknown ways' because I might have missed checking the code thoroughly. I know I am not the only one; my coworkers and friends have expressed similar feelings.

I even tried the new 'chain of thought' model, which for some reason seems to be even worse.

replies(10): >>41896295 #>>41896310 #>>41896325 #>>41896327 #>>41896363 #>>41896380 #>>41896400 #>>41896497 #>>41896670 #>>41898703 #
2. bongodongobob ◴[] No.41896295[source]
Well, I have the exact opposite experience. I don't know why people struggle to get good results with LLMs.
replies(4): >>41896332 #>>41896335 #>>41896492 #>>41897988 #
3. sksxihve ◴[] No.41896310[source]
Because they all use AI to write the articles.
replies(3): >>41898037 #>>41898327 #>>41904296 #
4. bugbuddy ◴[] No.41896325[source]
This just reminded me that I'd forgotten I have a Copilot subscription. It has not made any useful code suggestions in months, to the point of fading from my memory. I just logged in to cancel it. Now I need to check which of my other subscriptions I can cancel or reduce to a lower tier.
replies(1): >>41900775 #
5. falcor84 ◴[] No.41896327[source]
From my experience, it is getting better over time, and I believe there's still a lot of relatively low-hanging fruit, particularly in terms of integrating the LLM with the language server protocol and other tooling. Having said that, at this point in time it's just not good enough for independent work, so I would suggest using it only as you would pair-program with a mid-level human developer who doesn't have much context on the project and has a short attention span. In particular, I generally only have the AI help me with one function or refactoring at a time, in a way that is easy for me to test as we go, and I'm finding immense value.
replies(3): >>41896513 #>>41898284 #>>41900997 #
6. thuuuomas ◴[] No.41896332[source]
Would you feel comfortable pushing generated code to production unaudited?
replies(2): >>41896359 #>>41896360 #
7. hnthrowaway6543 ◴[] No.41896335[source]
LLMs are great for simple, common tasks, e.g. CRUD apps, RESTful web endpoints, and unit tests, for which there's an enormous number of examples and not much unique complexity. There are a lot of developers whose day mostly involves these repetitive, simple tasks. There are also a lot of developers who work on things that are a lot more niche and complicated, where LLMs don't provide much help.
replies(3): >>41896464 #>>41896611 #>>41896681 #
8. bongodongobob ◴[] No.41896359{3}[source]
Would you feel comfortable pushing human code to production unaudited?
replies(3): >>41896393 #>>41896438 #>>41904561 #
9. charrondev ◴[] No.41896360{3}[source]
For my part, I have a company subscription for Copilot, and I just use the line-based autocomplete. It's mildly better than the built-in autocomplete. I never have it do more than that, though, and probably wouldn't buy a license for myself.
10. drowsspa ◴[] No.41896363[source]
Yeah, it's actually frustrating that even when writing Go code, which is statically typed, it keeps messing up the argument order. That would seem to me a pretty easy thing to generate correctly.

Although it's much better when writing standard REST and gRPC APIs.
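
A minimal Go sketch of why this bites (the `connect` function here is hypothetical): when adjacent parameters share a type, a call with swapped arguments still type-checks, so the mistake only shows up at runtime.

```go
package main

import "fmt"

// connect builds an address from a host and a port. Both parameters are
// plain strings, so the compiler cannot tell them apart.
func connect(host, port string) string {
	return fmt.Sprintf("%s:%s", host, port)
}

func main() {
	fmt.Println(connect("localhost", "8080")) // intended: localhost:8080
	fmt.Println(connect("8080", "localhost")) // swapped: compiles fine, wrong output
}
```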

11. righthand ◴[] No.41896380[source]
I see the same results with my TabNine + template generator + language server setup as I do with things like Copilot. I get the same issues from TabNine when the code base isn't huge. I also think tossing away language servers and template generators for just an LLM will lead to chasing the "proper predictive path". Most of the time the LLM will spit out the create-express/react template for you; when you ask it to customize, it will guess using the most common patterns. Do you need something to guess for you?

It’s also getting worse because people are poisoning the well.

12. candiddevmike ◴[] No.41896393{4}[source]
Only on Fridays before a three day weekend.
13. badgersnake ◴[] No.41896400[source]
It's kinda true, though. They are increasingly being used for those things. Sure, the results are terrible, and doing it without AI almost always yields better results, but that doesn't seem to stop people.

Look at this nonsense for example: https://intouch.family/en

replies(3): >>41896543 #>>41898392 #>>41902344 #
14. dijksterhuis ◴[] No.41896438{4}[source]
depends on the human.

but i would never push llm generated code. never.

-

edit to add some substance:

if it’s someone who

* does a lot of manual local testing

* adds good unit / integration tests

* writes clear and well documented PRs

* knows the code style, and when to break it

* tests themselves in a staging environment, independent of any QA team or reviews

* monitors the changes after they’ve gone out

* has repeatedly found things in their own PRs and asked to hold off release to fix them

* is reviewing other people’s PRs and spotting things before they go out

yea, sure, i’ll release the changes. they’re doing the auditing work for me.

they clearly care about the software. and i’ve seen enough to trust them.

and if they got it wrong, well, shit, they did everything good enough. i’m sure they’ll be on the ball when it comes to rolling it back and/or fixing it.

an llm does not do those things. an llm *does not care about your software* and never will.

i’ll take people who give a shit any day of the week.

replies(1): >>41896687 #
15. 101008 ◴[] No.41896464{3}[source]
Yeah, exactly this. If I ask Cursor to write the serializer for a new Django model, it does it (although sometimes it invents fields that don't exist). It saves me two minutes.

When I ask it to write a function that should do something much more complex, it usually does something so bad that it takes me more time, because it confuses me and I have to go back to my original reasoning (after trying to understand what it did).

What I found useful is asking it to explain what a function does in a new codebase I am exploring, although I have to be very careful because a lot of the time it invents or skips steps that are crucial.

replies(1): >>41896590 #
16. amonith ◴[] No.41896492[source]
Seriously though, what are you doing? Every single example across the internet that tries to show how good AI is at programming uses such mind-bogglingly simplistic examples that it's getting annoying. It sure is a great learning tool when you're trying to do something experimental in a new stack or a completely new project, I'll give you that. But once you reach the skill level where someone would hire you to be an X developer (which most of the developers disagreeing with you are: mid-level-plus developers of some stack X), the thing becomes a barely useful autocomplete. Maybe that's the problem? It's just not a tool for professional developers?
replies(3): >>41896542 #>>41897047 #>>41898131 #
20. anon7725 ◴[] No.41896543[source]
That’s one of the saddest bits of AI enshittification yet.
replies(2): >>41902433 #>>41904441 #
21. dartos ◴[] No.41896590{4}[source]
See, I recently picked up the Ash framework for Elixir, and it does all of that too, but in a declarative, precise language which codegens the implementation in a deterministic way.

It just does the job that Cursor does there, but better.

Maybe we programmers should focus on making higher-order programming tools instead of black-box text generators for existing tools.

22. danenania ◴[] No.41896611{3}[source]
In my experience this underrates them. They can do pretty complex tasks that go well beyond your examples if prompted correctly.

The real limiting factor is not so much task complexity as the level of abstraction and indirection. If you have code that requires following a long chain of references to understand, LLMs will struggle to work with it.

For similar reasons, they also struggle with:

- generic types

- inheritance hierarchies

- long function call chains

- dependency injection

- deeply nested structures

They're also bad at counting, which can be an issue when dealing with concurrency—e.g. you started five operations concurrently at different points in your program and now need to block while waiting for five corresponding success or failure messages. Unless your code explicitly uses the number 5 somewhere, an LLM is often going to fail at counting the operations. (See the sketch below.)

All in all, the main question I think in determining how well an LLM can do a task is whether the limiting factor for your task is knowledge or abstraction. If it's knowledge (the intricacies of some arcane OS API, for example), an LLM can do very well with good prompting even on quite large and complex tasks. If it's abstraction, it's likely to fail in all kinds of seemingly obvious ways.
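
A minimal Go sketch of that counting pitfall (all names hypothetical): the number of launched operations is implicit in the control flow, never written as a literal, which is exactly what trips the model up.

```go
package main

import "fmt"

func doWork(name string) error {
	fmt.Println("done:", name)
	return nil
}

func main() {
	results := make(chan error)
	launched := 0

	// Operations are kicked off at scattered points; nothing says "5".
	for _, job := range []string{"fetch", "parse", "index"} {
		launched++
		go func(name string) { results <- doWork(name) }(job)
	}
	launched++
	go func() { results <- doWork("audit") }()
	launched++
	go func() { results <- doWork("notify") }()

	// Block until every launched operation reports success or failure.
	for i := 0; i < launched; i++ {
		if err := <-results; err != nil {
			fmt.Println("operation failed:", err)
		}
	}
}
```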

replies(1): >>41899607 #
23. osigurdson ◴[] No.41896670[source]
My feeling is that (current) AI is more of a teacher than an implementor. It really does help when you're learning about something new or looking for ideas about directions to take. The actual code, however, still needs to be written by humans for the most part, it seems.

AI is a great tool and does speed things up massively; it just doesn't align with the magical thought that we provide the ideas and AI does all of the grunt work. In general, it's always better to form mental models about things based on actual evidence as opposed to fantasy (and there is a lot of fantasy involved at the moment). This doesn't mean being pessimistic about potential future advancements, however. It is just very hard to predict what the shape of those improvements will be.

24. apwell23 ◴[] No.41896681{3}[source]
> LLMs are great for simple, common tasks, e.g. CRUD apps, RESTful web endpoints

I gave it a YAML file and asked it to generate a JSON call to a REST API. It missed a bunch of keys and made up a random new one. I threw out the whole thing and did it with awk/sed.
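
For what it's worth, the deterministic version of that task is small. A sketch in Go (assuming the gopkg.in/yaml.v3 package; the file name is made up), which by construction can't drop or invent keys:

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"

	"gopkg.in/yaml.v3"
)

func main() {
	raw, err := os.ReadFile("payload.yaml")
	if err != nil {
		panic(err)
	}

	// yaml.v3 decodes mappings into map[string]interface{},
	// which encoding/json can serialize directly.
	var doc map[string]interface{}
	if err := yaml.Unmarshal(raw, &doc); err != nil {
		panic(err)
	}

	out, err := json.MarshalIndent(doc, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // pipe into curl --data @- as the request body
}
```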

25. amonith ◴[] No.41896687{5}[source]
I'd say it depends more on "the production" than the human. There are legal means to hold people accountable for their actions ("gross negligence" and all that), so you can basically always trust that people will fix what they messed up, given the possibility. So if you can afford for production to be broken (e.g. the downtime will just annoy some people), you might as well allow your team to deploy straight to prod without audits. It's not that rare, actually.
26. Viliam1234 ◴[] No.41897047{3}[source]
I am happy with the LLMs, but I have only tried them on small projects done in my free time.

As a back-end developer, I am not familiar with the latest trends in JavaScript and CSS, and frankly I do not want to spend my time studying them. An LLM can generate an interactive web game based on my description. I review the code; it is usually okay, and sometimes I suggest an improvement. I could have done all of that myself, but it would take me a week, and the LLM does it in seconds. So it is the difference between a hobby project getting done or not done.

I also tried an LLM at work, not to code, but to explain some complex topics that were new to me. Once it provided a great high-level description that was very useful. And once it provided a great explanation... which was a total lie, as I found out when I tried to do a hello-world example. I still think the 50% success rate is great, as long as you can quickly verify the output.

In short, we need to know the strengths and the weaknesses, and use LLMs accordingly. Too much trust will get you burned, but properly used, they can save a lot of time.

27. threeseed ◴[] No.41897988[source]
I just asked Claude to generate some code using the SAP SuccessFactors API.

Every single example was completely useless. The code wouldn't compile, it would invent methods and variables, and the instructions that went along with it were incoherent. All whilst gaslighting me along the way.

I have also previously tried using it with some Golang code, and it would constantly add weird statements, e.g. locking around non-concurrent operations.

LLMs are great when you are doing the same things as everyone else. Step outside of that and it's far more trouble than it's worth.
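
The locking anti-pattern looks something like this (a hypothetical reconstruction in Go, not the actual generated code): a mutex wrapped around purely local state that no other goroutine can touch.

```go
package main

import (
	"fmt"
	"sync"
)

// sum is single-threaded and total is a local variable, so the mutex
// protects nothing and just adds noise and overhead.
func sum(xs []int) int {
	var mu sync.Mutex
	total := 0
	for _, x := range xs {
		mu.Lock()
		total += x
		mu.Unlock()
	}
	return total
}

func main() {
	fmt.Println(sum([]int{1, 2, 3})) // 6
}
```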

replies(2): >>41900666 #>>41900693 #
28. Ekaros ◴[] No.41898037[source]
There is a market for AI, and it is exactly these articles, and maybe the pictures attached to them. Soon it could be some videos as well. But how far it extends beyond that is a very good question.
29. FeepingCreature ◴[] No.41898131{3}[source]
I mean, let me just throw in an example here: I am currently working on https://guesspage.github.io, which is basically https://getguesstimate.com but for flowtext instead of a spreadsheet. The site is ... 99.9% Claude Sonnet written. I have literally only been debugging and speccing.

Sonnet can absolutely get very confused and break things. And there were tasks where I had a really hard time getting it to do the right thing, or understand what I wanted. But I need you to understand: Sonnet made this thing for me in two and a half days of part-time prompting. That is probably ten times faster than it would have taken me on my own, especially as I have absolutely no design ability.

Now, is this a big project? No, it's like 2kloc. But I don't think you can call it "simple" exactly. It's potentially useful technology. This sort of "just make this small tool exist for me" is where I see most of the value for AI in the next year. And the definition of "small tool" can stretch surprisingly far.

replies(2): >>41898445 #>>41900028 #
30. dangerwill ◴[] No.41898284[source]
I think some of the consternation we see from the anti-LLM crowd (of which I'm one) is about this line of reasoning. These LLMs produce fine code when the code you are asking for is in their training set, so they can be better than a mid-level dev, and much faster, in narrow contexts that are new to you. But with no feedback to warn you, if you ask for code they have little or no data on, they are much worse than a rubber duck.

That, and tech's status inflation means that when we talk about "mid-level" engineers, we are really talking about engineers with a couple of years of experience who have just graduated to the training-wheels phase of producing production code. LLMs are still broadly aimed at removing the need for what I would just call junior engineers.

replies(2): >>41898719 #>>41904375 #
31. __MatrixMan__ ◴[] No.41898327[source]
AI trained on a web that's primarily about selling things
32. 123yawaworht456 ◴[] No.41898392[source]
holy shit, if that isn't satire... wow, just fucking wow.
33. hnthrowaway6543 ◴[] No.41898445{4}[source]
This is a simple project. Nobody is disputing that GenAI can automate a large chunk of the initial setup work, which dominates the time spent on small projects like this. But 99.999% of professional, paid software development is not working on the basic React infrastructure for a 2,000 LOC JavaScript app.

Also your Google Drive API key is easily discoverable with about 15 seconds of looking at the JS source code -- this is something a professional software developer would (hopefully) have picked up without you asking, but an LLM isn't going to tell you that you shouldn't ship the `const API_KEY = ...` code as a file to the client, because you didn't ask.

replies(1): >>41898572 #
34. FeepingCreature ◴[] No.41898572{5}[source]
> This is a simple project.

I mean, it would have taken me a lot longer on my own. Sure it's not a huge project, I agree; I wouldn't call it entirely trivial.

> Also your Google Drive API key is easily discoverable with about 15 seconds of looking at the JS source code

No, I'm aware of that. That's deliberate. There's no way to avoid it for a serverless webapp. (Note that Guesspage is entirely hosted on Github Pages.) All the data stored is public anyways, the key is limited to only have permission to access the stored data, and you still have to log in and grab a token that is only stored in your browser and cannot be accessed from other sites. Literally the only unique thing you can do with it is trigger a login request on your own site that looks like it comes from Guesspage; and you can do that just as easily by creating a new API key and setting its name to "Guesspage".

The AI actually told me that was unsafe, and I corrected it. To the best of my understanding, the only thing that you can do with the API key is do Google Drive uploads to your own drive or that of someone who lets you that look to Google as if my app is triggering them. If there's a danger that can arise from that, and I don't think there is, then it's on me, not on Sonnet.

(It's also referer domain limited, but that's worthless. If only there was a way to cryptographically sign a referer...)

replies(1): >>41900008 #
35. whiplash451 ◴[] No.41898703[source]
My experience is similar. I used Claude for a coding task recently, and it drove me into an infinite number of rabbit holes, each one seeming worse than the previous one. All the while being unable to stop and say: I'm sorry, I actually don't know how to help you.
36. whiplash451 ◴[] No.41898719{3}[source]
That and the fact that code does not live in a standalone bubble, but in a complex setup of OSes, APIs, middleware and other languages. My experience trying to use Claude to help me with that was disappointing.
replies(1): >>41905443 #
37. layer8 ◴[] No.41899607{4}[source]
> If it's knowledge (the intricacies of some arcane OS API, for example), an LLM can do very well

Only if that knowledge is sufficiently represented in the training data or on the web. If, on the other hand, it’s knowledge that isn’t well (or at all) represented, and instead requires experience or experimentation with the relevant system, LLMs don’t do very well. I regularly fail with applying LLMs to tasks that turn out to require such “hidden” knowledge.

replies(2): >>41901007 #>>41907553 #
38. WgaqPdNr7PGLGVW ◴[] No.41900008{6}[source]
> I wouldn't call it entirely trivial.

It just doesn't represent a realistic codebase. It is significantly smaller than a lot of college projects.

The current software system I'm working on now is ~2 million lines of code split across a dozen services.

AI has been pretty good for search across the codebases and absolutely hopeless for code gen.

LLMs just aren't that good yet for writing code on a decent sized system.

replies(1): >>41901386 #
39. mvdtnz ◴[] No.41900028{4}[source]
This is a ludicrously simple app and also - the code[0] is of very poor quality.

[0] https://github.com/Guesspage/guesspage.github.io/blob/master...

replies(1): >>41901393 #
40. dankwizard ◴[] No.41900666{3}[source]
"LLMs are great when you are doing the same things as everyone else. Step outside of that and it's far more trouble than it's worth."

If you're doing something in a way it's not in the training data set, maybe your way of approaching the problem is wrong?

replies(2): >>41902364 #>>41904523 #
41. attentive ◴[] No.41900693{3}[source]
For an obscure API or SDK, upload the docs and/or examples to Claude Projects.
replies(1): >>41904540 #
42. csomar ◴[] No.41900775[source]
They have massively nerfed Copilot. I'm keeping my subscription for a couple more months, but at this point it has the same intelligence as Llama 3.2, which I can run on my laptop.
43. Terr_ ◴[] No.41900997[source]
> better over time

The problem is that all the most reliable code it can give you is stuff that ought to be (or already is) a documentation example or a reusable library, instead of "copy-paste as a service".

replies(1): >>41904325 #
44. Terr_ ◴[] No.41901007{5}[source]
And if it's really well represented, then it's hopefully already in a superior library or documentation/guide, and the LLM is acting as an (untrustworthy) middleman.
replies(1): >>41907462 #
45. FeepingCreature ◴[] No.41901386{7}[source]
I mean, I agree with that. That certainly matches my experience. I just don't think the deciding factor is "simpleness" so much as an inability to handle large scale at all.

My point is more that LLMs can handle (some) projects that are useful. It's not just oneliners and hello worlds. There's a region in between "one-page demos" and "medium-sized codebases and up" where useful work can already happen.

46. FeepingCreature ◴[] No.41901393{5}[source]
Eh, it's a bit hacked together sure. I find it easy to read?
replies(1): >>41906085 #
47. walterbell ◴[] No.41902344[source]
No detail on founding team or investors. Unbounded liability. Compare with real organizations in eldercare:

Commercial: https://lotsahelpinghands.com

Non-profit: https://www.caringbridge.org

replies(1): >>41902585 #
48. threeseed ◴[] No.41902364{4}[source]
Sorry but some of us aren't building the ten millionth CRUD app.

SuccessFactors is a popular HR platform, and I was getting a wrong answer to every question I asked it.

49. CaptainFever ◴[] No.41902433{3}[source]
Not what enshittification means.
50. badgersnake ◴[] No.41902585{3}[source]
They appear to be a brand of the Czech company AI Touch: https://aitouch.cz/

I could not find them on TechCrunch.

51. johnnyanmac ◴[] No.41904296[source]
I wish they used AI. It'd feel less artificial than the scripts investors give them to keep the boom booming.

It's a gold rush and they are inspectors. They have an incentive to keep the rush flowing.

52. johnnyanmac ◴[] No.41904325{3}[source]
If AI could generate better documentation for my domain's tools, I'd take back maybe 75% of my criticisms of it.

But alas, this rush means they want to pitch it as a replacement for people like me, not as something that actually makes me more productive.

53. johnnyanmac ◴[] No.41904375{3}[source]
It's a tangent, but title inflation and years of experience really are horrible metrics for judging engineers these days, especially in an age where employers actively plan for 2-3 year churn instead of long-term retention.

I have no clue how you get five years of experience in any meaningful way with any given tech. You sure won't get it only from the workplace's day-to-day activities. YoE is a metric of how much of a glutton for punishment you are, more than anything.

54. johnnyanmac ◴[] No.41904441{3}[source]
Most of my complaints about AI are ethical and legal. But damn me if "products" like this don't bring out the bits of Luddite in me. Not just toward the seller, but toward anyone considering buying this.

Everyone's dreams will differ, but I got into tech to make people more efficient, and in turn enable more of the human element and less pencil-pushing. Not to replace it entirely.

55. johnnyanmac ◴[] No.41904523{4}[source]
>If you're doing something in a way it's not in the training data set

in my industry, the "training data set" won't get much more out of public code than the barebones, generated Doxygen comments we call "documentation".

But in a way you're also right. The industry's approach is fundamentally wrong: 20 solutions to every problem, with plenty of room to standardize a proper approach (there are still places where you need proprietary techniques, but that's getting less true by the month). An LLM isn't going to fix that cultural issue, and it will suffer from it.

replies(1): >>41905978 #
56. johnnyanmac ◴[] No.41904540{4}[source]
That sounds like it might break the copyright on some of your tools. Not all of our tools are FOSS (and even some FOSS licenses may not allow that).
57. johnnyanmac ◴[] No.41904561{4}[source]
Nope. But AI's sales pitch is that it's an oracle to lean on, which is part of the problem.

As a start, let me know when an AI can fail test cases, iterate on its code to fix the failures, and re-submit. But I suppose that starts to approach AGI territory.

58. falcor84 ◴[] No.41905443{4}[source]
Could you please give an example of what you wanted it to help you with, what you expected and what you got?
59. warkdarrior ◴[] No.41905978{5}[source]
> The industry's approach is fundamentally wrong, making 20 solutions to a problem with plenty of room to standardize a proper approach [...]. But an LLM isn't going to fix that cultural issue and will suffer from it.

LLM-powered development may push the industry towards standardization. "Oh, CoPilot cannot generate proper code for your SDK/API/service? Sorry, all my developers use CoPilot, so we will not integrate with your SDK/API/service until you provide better, CoPilot-friendly docs and examples."

60. mvdtnz ◴[] No.41906085{6}[source]
Good code isn't just easy to read, it's easy to change. The code in this app is brittle, tightly coupled and likely to break if the app is changed.
replies(1): >>41908383 #
61. danenania ◴[] No.41907462{6}[source]
If the code can be generated correctly, is it controversial to say that generating it will be more efficient than reading through documentation and/or learning how to use a new library?

If you grant that, the next question is how high the accuracy has to be before it's quicker than doing the research and writing the code yourself. If it's 100%, then it's clearly better, since doing the research and implementation oneself generally takes an hour or so in the best scenario (this can expand to multiple hours or days depending on the task). If it's 99%, it's still probably (much) better, since it will be faster to fix the minor issues than to implement from scratch. If it's 90%, 80%, 70% it becomes a more interesting question.
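
As a rough back-of-envelope (symbols mine, numbers purely illustrative): let $p$ be the chance the generated code is correct enough to keep, $t_{\text{fix}}$ the time to patch near-miss output, $t_{\text{redo}}$ the time wasted before giving up and falling back to manual work, and $t_{\text{manual}}$ the time to research and write it yourself. Generation wins when

```latex
p \, t_{\text{fix}} + (1 - p)\,(t_{\text{redo}} + t_{\text{manual}}) < t_{\text{manual}}
\quad\Longleftrightarrow\quad
p > \frac{t_{\text{redo}}}{t_{\text{redo}} + t_{\text{manual}} - t_{\text{fix}}}
```

E.g. with $t_{\text{fix}} = 10$ min, $t_{\text{redo}} = 20$ min, and $t_{\text{manual}} = 60$ min, generation pays off at any accuracy above $20/70 \approx 29\%$.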

replies(1): >>41907605 #
62. danenania ◴[] No.41907553{5}[source]
> If, on the other hand, it’s knowledge that isn’t well (or at all) represented, and instead requires experience or experimentation with the relevant system, LLMs don’t do very well. I regularly fail with applying LLMs to tasks that turn out to require such “hidden” knowledge.

It's true enough that there are many tasks like this. But there are also many relatively arcane APIs/protocols/domains that LLMs do a surprisingly good job with. I tend to think it's worth checking which bucket a task falls into before spending hours or days hammering something out myself.

I think many devs are underestimating how arcane the knowledge needs to be before an LLM will be hopeless at a knowledge-based task. There's a lot of code on the internet.

63. Terr_ ◴[] No.41907605{7}[source]
Compare to: "If you can copy-paste from a Stack-overflow answer, is it controversial to say that copy-pasting is more efficient than reading through documentation and/or learning how to use a new library?"
replies(1): >>41907706 #
64. danenania ◴[] No.41907706{8}[source]
If I understand the code and it does exactly what I need, should I type the whole thing out rather than copy-pasting? Sounds like a waste of time to me.
65. FeepingCreature ◴[] No.41908383{7}[source]
Eh. Honestly, so far Sonnet hasn't had any trouble with it. The thing is that every time it changes anything, it rewrites every line of code anyways just because I ask it "please give me the complete changed file(s) for easy copypasting."

The effort tradeoff is different for AIs than humans. Easy-to-understand-locally is more important than cheap-to-change, because it can do "read and check every line in the project" for like 20 cents. Making AIs code like humans is not playing to their strengths.

I don't think it's that bad anyways.