
251 points by lewq | 5 comments
1. croes No.42136942
Is that really AI engineering, or software engineering with AI?

If a model goes sideways, how do you fix that? Could you find and fix flaws in the base model?

replies(3): >>42137298 >>42137468 >>42138706
2. sigmar No.42137298
Agree that the use of "AI engineers" is confusing. Think this blog should use the term "engineering software with AI integration," which is different from "AI engineering" (creating/designing AI models) and from "engineering with AI" (using AI to assist in engineering).
replies(1): >>42137726
3. liampulles No.42137468
I wonder if either could really be called engineering.
4. crimsoneer No.42137726
The term AI engineer is now pretty well recognised in the field (https://www.latent.space/p/ai-engineer), and is very much not the same as an AI researcher (who would be involved in training and building new models). I'd expect an AI engineer to be primarily a software developer, but with an excellent understanding of how to implement, use and evaluate LLMs in a production environment, including skills like evaluation and fine-tuning. This is not a skill set you can just bundle into "software developer".
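For example, a bare-bones production eval might look something like this (a minimal sketch, not anyone's actual harness: it assumes the official openai Python client, and the model name, prompts, and expected substrings are all placeholders):

    # Minimal smoke-test eval: run canned prompts through a model and
    # check each response for an expected substring. A real harness
    # would use graded rubrics or an LLM-as-judge setup.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    CASES = [  # hypothetical eval cases
        {"prompt": "What is the capital of France?", "expect": "Paris"},
        {"prompt": "Summarize: the cat sat on the mat.", "expect": "cat"},
    ]

    def run_eval(model="gpt-4o-mini"):  # placeholder model name
        passed = 0
        for case in CASES:
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": case["prompt"]}],
            )
            answer = resp.choices[0].message.content or ""
            if case["expect"].lower() in answer.lower():
                passed += 1
        print(f"{passed}/{len(CASES)} cases passed")

    run_eval()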
5. billmalarky No.42138706
You find issues when they surface during your actual use case (and by "smoke testing" around your real-world use case). You can often "fix" issues in the base model with additional training (supervised fine-tuning, preference optimization with DPO, etc.).
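To make that concrete, here is a minimal sketch of supervised fine-tuning using the Hugging Face transformers Trainer (the model and the toy Q&A strings are placeholders; preference methods like DPO follow a similar recipe via libraries such as trl):

    # Toy supervised fine-tuning pass over a small causal LM.
    # distilgpt2 and the two training strings are stand-ins only.
    from datasets import Dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling,
                              Trainer, TrainingArguments)

    name = "distilgpt2"
    tok = AutoTokenizer.from_pretrained(name)
    tok.pad_token = tok.eos_token  # GPT-2 ships without a pad token
    model = AutoModelForCausalLM.from_pretrained(name)

    data = Dataset.from_dict({"text": [
        "Q: How do I reset my widget? A: Hold the button for 5 seconds.",
        "Q: Is the widget waterproof? A: No, keep it dry.",
    ]})

    def tokenize(batch):
        return tok(batch["text"], truncation=True, max_length=128)

    tokenized = data.map(tokenize, batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="ft-out",
                               num_train_epochs=1,
                               per_device_train_batch_size=2),
        train_dataset=tokenized,
        data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
    )
    trainer.train()
    trainer.save_model("ft-out")  # persist the fine-tuned weights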

There's a lot of tooling out there making this accessible to someone with a solid full-stack engineering background.

Training an LLM from scratch is a different beast, but that knowledge honestly isn't too practical for everyday engineers: even if you had it, you wouldn't necessarily have the resources to train a competitive model. Of course, you could command a high salary working for the orgs that do have those resources! One caveat: there are orgs doing serious post-training, even with unsupervised techniques, to take a base model and really bake in domain-specific knowledge/context. Honestly, I wonder if even that is as inaccessible as it sounds. You get a lot of wiggle room and margin for error when post-training a well-built base model because of transfer learning.