
466 points 0x63_Problems | 3 comments
perrygeo ◴[] No.42138092[source]
> Companies with relatively young, high-quality codebases benefit the most from generative AI tools, while companies with gnarly, legacy codebases will struggle to adopt them. In other words, the penalty for having a ‘high-debt’ codebase is now larger than ever.

This mirrors my experience using LLMs on personal projects. They can provide good advice only to the extent that your project stays within the bounds of well-known patterns. As soon as your codebase gets a little bit "weird" (i.e., trying to do anything novel and interesting), the model chokes, starts hallucinating, and makes your job considerably harder.

Put another way, LLMs make the easy stuff easier but royally screw up the hard stuff. The gap does appear to be widening, not shrinking. They work best where we need them the least.

irrational ◴[] No.42138674[source]
I was recently assigned to work on a huge legacy ColdFusion backend service. I was very surprised at how useful AI was with the code. It was even better, in my experience, than what I've seen with Python, Java, or TypeScript. The only explanation I can come up with is that there is so much legacy ColdFusion code out there that was used to train Copilot and whatever AI JetBrains uses for code completion that this is one of the languages they are most suited to assist with.
1. cpeterso ◴[] No.42139393[source]
But where did these companies get the ColdFusion code for their training data? Since ColdFusion is an old language and used for backend services, how much ColdFusion code is open source and crawlable?
2. irrational ◴[] No.42140959[source]
That's a good question. I presume there is some way to check GitHub for how much code in each language is available on it.
3. PeterisP ◴[] No.42141919[source]
I'm definitely assuming that they don't limit their training data to what is open source and crawlable.