>The problem with this extrapolation is that the software industry is pretty much unique in the amount of information about its inner workings that is publicly available for training on... millions of lines of code that we published on the internet...
> Nearly every other industry (with the possible exception of Law) produces publicly-visible output at a tiny fraction of the rate that we do.
You are correct! There's lots of information available publicly about certain things like code, and writing SQL queries. But other specialized domains don't have the same kind of information trained into the heart of the model.
But importantly, this doesn't mean the LLM can't provide significant value in these other more niche domains. They still can, and I provide this every day in my day job. But it's a lot of work. We (as AI engineers) have to deeply understand the special domain knowledge. The basic process is this:
1. Learn how the subject matter experts do the work.
2. Teach the LLM to do this, using examples, giving it procedures, walking it through the various steps and giving it the guidance and time and space to think. (Multiple prompts, recipes if you will, loops, external memory...)
3. Evaluation, iteration, improvement
4. Scale up to production
In many domains I work in, it can be very challenging to get past step 1. If I don't know how to do it effectively, I can't guide the LLM through the steps. Consider an example question like "what are the top 5 ways to improve my business" -- the subject matter experts often have difficulty teaching me how to do that. If they don't know how to do it, they can't teach it to me, and I can't teach it to the agent. Another example that will resonate with nerds here is being an effective Dungeons and Dragons DM. But if I actually learn how to do it, and boil it down into repeatable steps, and use GraphRAG, then it becomes another thing entirely. I know this is possible, and expect to see great things in that space, but I estimate it'll take another year or so of development to get it done.
But in many domains, I get access to subject matter experts that can tell me pretty specifically how to succeed in an area. These are the top 5 situations you will see, how you can identify which situation type it is, and what you should do when you see that you are in that kind of situation. In domains like this I can in fact make the agent do awesome work and provide value, even when the information is not in the publicly available training data for the LLM.
There's this thing about knowing a domain area well enough to do the job, but not having enough mastery to teach others how to do the job. You need domain experts that understand the job well enough to teach you how to do it, and you as the AI engineer need enough mastery over the agent to teach it how to do the job as well. Then the magic happens.
When we get AGI we can proceed past this limitation of needing to know how to do the job ourselves. Until we get AGI, then this is how we provide impact using agents.
This is why I say that even if LLM technology does not improve any more beyond where it was a year ago, we still have many years worth of untapped potential for AI. It just takes a lot of work, and most engineers today don't understand how to do that work-- principally because they're too busy saying today's technology can't do that work rather than trying to learn how to do it.