
66 points zdw | 9 comments
1. siliconc0w No.46187816
I was working on a new project and wanted to try out a new frontend framework (data-star.dev). What you quickly find out is that LLMs are really tuned to React, and their frontend performance drops considerably if you aren't using it. Even with the entire documentation pasted into context, and specific examples close to what I wanted, SOTA models still hallucinated attributes/APIs. And it isn't even that you have to use Framework X, it's that you need to use X as of the date of training.

I think this is one of the reasons we don't see huge productivity gains. Most F500 companies have pretty gnarly proprietary codebases which are going to be out-of-distribution. Context engineering helps, but you still don't get anywhere near in-distribution performance. It's probably not unsolvable, but it's a pretty big problem ATM.

replies(6): >>46188076 #>>46188172 #>>46188177 #>>46188540 #>>46188662 #>>46189279 #
2. NewsaHackO No.46188076
I use it with Angular and Svelte and it works pretty well. I used to use Lit, which at least the older models did pretty badly at, but it's less well known, so that's expected.
replies(1): >>46188131 #
3. JimDabell No.46188131
Yes, Claude Opus 4.5 recently scored 100% on SvelteBench:

https://khromov.github.io/svelte-bench/benchmark-results-mer...

I found that LLMs sometimes get confused by Lit because they don't understand the limitations of the shadow DOM. So they'll dispatch an event and try to catch it from a parent as if nothing were in the way, not realising that the shadow boundary swallows it, or they'll assume global / reset CSS will apply globally when you actually need to reapply it to every single component.
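For what it's worth, the fix is usually just two flags on the event. A minimal sketch of my own (the component and event names are made up, just for illustration):

    // For an event dispatched inside a shadow root to reach listeners
    // outside it, it must be both `bubbles` and `composed`. Omitting
    // `composed: true` is the failure mode described above.
    import { LitElement, html } from 'lit';
    import { customElement } from 'lit/decorators.js';

    @customElement('item-card')
    export class ItemCard extends LitElement {
      private select() {
        this.dispatchEvent(new CustomEvent('item-selected', {
          detail: { id: 42 },
          bubbles: true,   // bubble up the DOM tree
          composed: true,  // cross the shadow DOM boundary
        }));
      }

      render() {
        return html`<button @click=${this.select}>Select</button>`;
      }
    }

    // A parent outside the shadow root can now listen normally:
    // document.querySelector('item-card')
    //   ?.addEventListener('item-selected', (e) => console.log(e));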

What I find interesting is that all the platforms like Lovable etc. seem to be choosing Supabase, and LLMs are pretty terrible with it – constantly getting RLS (row-level security) wrong, etc.
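Part of why that bites so hard: a missing policy doesn't fail loudly. A sketch with supabase-js (the `todos` table and env var names are placeholders of mine):

    import { createClient } from '@supabase/supabase-js';

    const supabase = createClient(
      process.env.SUPABASE_URL!,
      process.env.SUPABASE_ANON_KEY!,
    );

    // With RLS enabled on `todos` but no SELECT policy for this role,
    // this does NOT throw -- it returns { data: [], error: null }.
    // Generated code then happily treats the empty array as "no rows yet".
    const { data, error } = await supabase.from('todos').select('*');
    if (error) throw error;
    console.log(data); // [] even though rows exist in the table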

4. pan69 No.46188172
> What you quickly find out is that LLMs are really tuned to like react

Sounds to me like there is simply more React code to train the model on.

5. ehnto No.46188177
That is the "big issue" I have found as well. Not only are enterprise codebases often proprietary, ground-up architectures; the actual hard part is the business logic, locating the required knowledge, and taking into account a decade of changing business requirements. All of that information usually lives inside a bunch of different humans' heads, and by the time you've got it all out and processed, the code is often a small part of the task.
replies(1): >>46189060 #
6. Teknoman117 No.46188540
As someone who works at an F100 company with massive proprietary codebases, one that also requires users to sign NDAs to even see the API docs and code examples: to say that the output of LLMs for work tasks is comically bad would be an understatement, even with code and documentation fed in as memory items for projects...
7. runako No.46188662
To be fair, it looks like that frontend framework may have had its initial release after the training cutoffs for most of the foundation models. (I looked, because I have not had this experience using less popular frameworks like Stimulus.)
8. theshrike79 No.46189060
AI is an excellent reason/excuse to get resources allocated to documenting these things.

“Hey boss we can use AI more if we would document these business requirements in a concise and clear way”

Worst case: humans get proper docs :)

9. OhSoHumble No.46189279
I ended up building a "spec" for Opus 4.5 to consume. I just copy-pasted all of the documentation into a markdown file and added it to the context window. It did fine after that. I also had the LLM write any "gotchas" it hit back into the spec file. Works great.
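Roughly this shape (an illustrative skeleton, not the actual file; section names are mine):

    # <framework> spec

    ## API reference
    (full docs copy-pasted here)

    ## Gotchas
    - (mistakes the model made earlier, appended by the model
      itself at the end of each session)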