AI makes tech debt more expensive

(www.gauge.sh)

467 points 0x63_Problems | 2 comments | 14 Nov 24 16:01 UTC | HN request time: 0.391s | source

Show context

perrygeo ◴[14 Nov 24 16:50 UTC] No.42138092[source]▶

> Companies with relatively young, high-quality codebases benefit the most from generative AI tools, while companies with gnarly, legacy codebases will struggle to adopt them. In other words, the penalty for having a ‘high-debt’ codebase is now larger than ever.

This mirrors my experience using LLMs on personal projects. They can provide good advice only to the extent that your project stays within the bounds of well-known patterns. As soon as your codebase gets a little bit "weird" (ie trying to do anything novel and interesting), the model chokes, starts hallucinating, and makes your job considerably harder.

Put another way, LLMs make the easy stuff easier, but royally screws up the hard stuff. The gap does appear to be widening, not shrinking. They work best where we need them the least.

replies(24): >>42138267 #>>42138350 #>>42138403 #>>42138537 #>>42138558 #>>42138582 #>>42138674 #>>42138683 #>>42138690 #>>42138884 #>>42139109 #>>42139189 #>>42140096 #>>42140476 #>>42140626 #>>42140809 #>>42140878 #>>42141658 #>>42141716 #>>42142239 #>>42142373 #>>42143688 #>>42143791 #>>42151146 #

irrational ◴[14 Nov 24 17:31 UTC] No.42138674[source]▶

>>42138092 #

I was recently assigned to work on a huge legacy ColdFusion backend service. I was very surprised at how useful AI was with code. It was even better, in my experience, than I've seen with python, java, or typescript. The only explanation I can come up with is there is so much legacy ColdFusion code out there that was used to train Copilot and whatever AI jetbrains uses for code completion that this is one of the languages they are most suited to assist with.

replies(4): >>42139225 #>>42139249 #>>42139393 #>>42139543 #

randomdata ◴[14 Nov 24 18:11 UTC] No.42139225[source]▶

>>42138674 #

Perhaps it is the reverse: That ColdFusion training sources are limited, so it is more likely to converge on a homogenization?

While, causally, we usually think of a programming language as being one thing, but in reality a programming language generally only specifies a syntax. All of the other features of a language emerge from the people using them. And because of that, two different people can end up speaking two completely different languages even when sharing the same syntax.

This is especially apparent when you witness someone who is familiar with programming in language X, who then starts learning language Y. You'll notice, at least at first, they will still try to write their programs in language X using Y syntax, instead of embracing language Y in all its glory. Now, multiply that by the millions of developers who will touch code in a popular language like Python, Java, or Typescript and things end up all over the place.

So while you might have a lot more code to train on overall, you need a lot more code for the LLM to be able to discern the different dialects that emerge out of the additional variety. Quantity doesn't imply quality.

replies(1): >>42139415 #

1. cpeterso ◴[14 Nov 24 18:26 UTC] No.42139415[source]▶

>>42139225 #

I wonder what a language designed as a target for LLM-generated code would look like? What semantics and syntax would help the LLM generate code that is more likely to be correct and maintainable by humans?

replies(1): >>42143160 #

2. eru ◴[15 Nov 24 01:35 UTC] No.42143160[source]▶

>>42139415 (TP) #

Perhaps something like Cobol? (Shudder.)

↑