250 points lewq | 1 comment
croes No.42136942
Is that really AI engineering or Software engineering with AI?

If a model goes sideways how do you fix that? Could you find and fix flaws in the base model?

replies(3): >>42137298 #>>42137468 #>>42138706 #
1. billmalarky No.42138706
You find issues when they surface in your actual use case, and by "smoke testing" around that real-world use case. You can often "fix" issues in the base model with additional training (supervised fine-tuning, preference optimization w/ DPO, etc).
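
To make the "smoke testing" idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `stub_model` is a hypothetical stand-in for whatever model call you actually use, the prompts and expected substrings are made up, and a substring check is just the crudest possible pass/fail criterion.

```python
from typing import Callable, List, NamedTuple


class SmokeCase(NamedTuple):
    prompt: str
    must_contain: str  # substring the answer is expected to include


def run_smoke_tests(
    model: Callable[[str], str], cases: List[SmokeCase]
) -> List[SmokeCase]:
    """Return the cases whose output lacks the expected substring."""
    failures = []
    for case in cases:
        output = model(case.prompt)
        # Case-insensitive substring match; a real check would likely be stricter.
        if case.must_contain.lower() not in output.lower():
            failures.append(case)
    return failures


def stub_model(prompt: str) -> str:
    # Hypothetical stand-in for a real model call (assumption, not a real API).
    canned = {
        "What is the refund window?": "Refunds are accepted within 30 days.",
    }
    return canned.get(prompt, "I'm not sure.")


# Made-up prompts drawn from an imagined support-bot use case.
CASES = [
    SmokeCase("What is the refund window?", "30 days"),
    SmokeCase("Which plan includes SSO?", "Enterprise"),
]

failures = run_smoke_tests(stub_model, CASES)
for case in failures:
    print(f"FAILED: {case.prompt!r} (expected to mention {case.must_contain!r})")
```

The failing prompts are the interesting part: they are natural candidates for the extra training examples mentioned above, though how you turn them into a fine-tuning set depends on your tooling.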

There's a lot of tooling out there making this accessible to someone with a solid full-stack engineering background.

Training an LLM from scratch is a different beast, but that knowledge honestly isn't too practical for everyday engineers, given that even if you had the knowledge you wouldn't necessarily have the resources to train a competitive model. Of course, you could command a high salary working for the orgs that do have those resources! One caveat: there are orgs doing serious post-training, even with unsupervised techniques, to take a base model and reeaaaaaally bake in domain-specific knowledge/context. Honestly, I wonder if even that is inaccessible to pull off. You get a lot of wiggle room and margin for error when post-training a well-built base model because of transfer learning.