
251 points by lewq | 5 comments
1. croes No.42136942
Is that really AI engineering, or software engineering with AI?

If a model goes sideways, how do you fix that? Could you find and fix flaws in the base model?

replies(3): >>42137298 >>42137468 >>42138706
2. sigmar No.42137298
Agree that the use of "AI engineers" is confusing. Think this blog should use the term "engineering software with AI integration," which is different from "AI engineering" (creating/designing AI models) and from "engineering with AI" (using AI to assist in engineering).
replies(1): >>42137726
3. liampulles No.42137468
I wonder if either could really be called engineering.
4. crimsoneer No.42137726
The term AI engineer is now pretty well recognised in the field (https://www.latent.space/p/ai-engineer), and is very much not the same as an AI researcher (who would be involved in training and building new models). I'd expect an AI engineer to be primarily a software developer, but with an excellent understanding of how to implement, use and evaluate LLMs in a production environment, including skills like evaluation and fine-tuning. This is not a skill set you can just bundle into "software developer".
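For example, a bare-bones production eval might look something like this (a minimal sketch, not anyone's actual harness: it assumes the official openai Python client, and the model name, prompts, and expected substrings are all placeholders):

    # Minimal smoke-test eval: run canned prompts through a model and
    # check each response for an expected substring. A real harness
    # would use graded rubrics or an LLM-as-judge setup.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    CASES = [  # hypothetical eval cases
        {"prompt": "What is the capital of France?", "expect": "Paris"},
        {"prompt": "Summarize: the cat sat on the mat.", "expect": "cat"},
    ]

    def run_eval(model="gpt-4o-mini"):  # placeholder model name
        passed = 0
        for case in CASES:
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": case["prompt"]}],
            )
            answer = resp.choices[0].message.content or ""
            if case["expect"].lower() in answer.lower():
                passed += 1
        print(f"{passed}/{len(CASES)} cases passed")

    run_eval()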
5. billmalarky No.42138706
You find issues when they surface during your actual use case (and by "smoke testing" around your real-world use case). You can often "fix" issues in the base model with additional training (supervised fine-tuning, preference optimization with DPO, etc.).
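To make that concrete, here is a minimal sketch of supervised fine-tuning using the Hugging Face transformers Trainer (the model and the toy Q&A strings are placeholders; preference methods like DPO follow a similar recipe via libraries such as trl):

    # Toy supervised fine-tuning pass over a small causal LM.
    # distilgpt2 and the two training strings are stand-ins only.
    from datasets import Dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling,
                              Trainer, TrainingArguments)

    name = "distilgpt2"
    tok = AutoTokenizer.from_pretrained(name)
    tok.pad_token = tok.eos_token  # GPT-2 ships without a pad token
    model = AutoModelForCausalLM.from_pretrained(name)

    data = Dataset.from_dict({"text": [
        "Q: How do I reset my widget? A: Hold the button for 5 seconds.",
        "Q: Is the widget waterproof? A: No, keep it dry.",
    ]})

    def tokenize(batch):
        return tok(batch["text"], truncation=True, max_length=128)

    tokenized = data.map(tokenize, batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="ft-out",
                               num_train_epochs=1,
                               per_device_train_batch_size=2),
        train_dataset=tokenized,
        data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
    )
    trainer.train()
    trainer.save_model("ft-out")  # persist the fine-tuned weights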

There's a lot of tooling out there making this accessible to someone with a solid full-stack engineering background.

Training an LLM from scratch is a different beast, but that knowledge honestly isn't too practical for everyday engineers: even if you had it, you wouldn't necessarily have the resources to train a competitive model. Of course, you could command a high salary working for the orgs that do have those resources! One caveat: there are orgs doing serious post-training, even with unsupervised techniques, to take a base model and really bake in domain-specific knowledge/context. Honestly, I wonder if even that is as inaccessible as it sounds. You get a lot of wiggle room and margin for error when post-training a well-built base model because of transfer learning.