In the meantime, keep learning and practicing CS fundamentals, ignore the hype, and build something interesting.
I don't really agree with the reasoning [1], and I don't think we can expect this same rate of progress indefinitely, but I do understand the concern.
Inflation, end of ZIRP, and IRS section 174 kicked this off back in 2022 before AI coding was even a thing.
Junior devs won't lose jobs to AI. They'll lose jobs to the global market.
American software developers have lost the stranglehold on the job market.
If software falls, everything falls.
But as we've seen, these models can't do the job themselves. They're best thought of as an exoskeleton that requires a pilot. They make mistakes, and those mistakes multiply into a mess if a human isn't around. They don't get the big picture, and it's not clear they ever will with the current models and techniques.
The only field that has truly been disrupted is graphic design and art. The image and video models are sublime and truly deliver 10,000x reductions in time, cost, and required talent.
This is probably for three reasons:
1. There's so much straightforward training data.
2. The laws of optics and structure seem correspondingly easier than the rules governing intelligence. Simple animals evolved vision hundreds of millions of years ago, and we already have all the math and algorithmic implementations. Not so for intelligence.
3. Mistakes don't multiply. You can touch up the canvas easily, and the deliverable is a far smaller piece of work than, say, a 100k LOC program with its failure modes.
ETA:
You updated your post and I think I agree with most of what you said after you updated.
All relevant and recent evidence points to logarithmic improvement, not the exponential improvement we were told to expect (promised, even) at the beginning.
We're likely waiting at this point for another breakthrough on the level of the attention paper. That could be next year, it could be 5-10 years from now, it could be 50 years from now. There's no point in prediction.
People like to assume that progress is this steady upward line, but I think it's more like a staircase. Someone comes up with something cool, there's a lot of amazing progress in the short-to-mid term, and then things kind of level out. I mean, hell, this isn't even the first time that this has happened with AI [1].
The newer AI models are pretty cool but I think we're getting into the "leveling out" phase of it.
I don’t think that follows at all. Robotics is notably much, much, much harder than AI/ML. You can replace programmers without robotics. You can’t replace trades without them.
Are you so sure?
Almost every animal has solved locomotion, some even with incredibly primitive brains. Evolution knocked this out of the park hundreds of millions of years ago.
Drosophila can do it, and we've mapped their brains.
Only a few animals have solved reasoning.
I'm sure the robotics videos I've seen lately have been cherry picked, but the results are nothing short of astounding. And there are now hundreds of billions of dollars being poured into solving it.
I'd wager humans stumble across something evolution had a cakewalk with before they stumble across the thing that's only happened once in the known universe.
Your exponential problems have exponential problems. Scaling this system is factorially hard.
https://en.m.wikipedia.org/wiki/Moravec%27s_paradox
https://harimus.github.io/2024/05/31/motortask.html
Edit: just to specifically address your argument, doing something evolution has optimized for hundreds of millions of years is much harder than something evolution “came up with” very recently (abstract thought).
Anyone who tells you they know what the future looks like five years from now is lying.
You've got this backwards.
If evolution stumbled upon locomotion early, and several times independently through convergent evolution, that means it's an easy problem, relatively speaking.
We've come up with math and heuristics for robotics (just like vision and optics). We're turning up completely empty for intelligence.
On a codebase of 10,000 lines, any action will cost 100,000,000 AI units. On one with 1,000,000 lines, it will cost 1,000,000,000,000 AI units, because the cost grows with the square of the context.
I work on these things for a living, and no one else ever seems to think two steps ahead about what the mathematical limitations of the transformer architecture mean for transformer-based applications.
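To make that arithmetic concrete, here's a rough sketch of the quadratic self-attention cost; the one-token-per-line assumption and the "AI units" are illustrative, not measurements:

    # Rough illustration of quadratic self-attention cost over a codebase-sized context.
    # "Cost" is just n * n pairwise token interactions, in arbitrary units.
    def attention_cost(lines_of_code: int, tokens_per_line: int = 1) -> int:
        n = lines_of_code * tokens_per_line  # context length in tokens
        return n * n                         # pairwise interactions per forward pass

    for loc in (10_000, 1_000_000):
        print(f"{loc:>9,} LOC -> {attention_cost(loc):,} units")
    #    10,000 LOC -> 100,000,000 units
    # 1,000,000 LOC -> 1,000,000,000,000 units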
If it had still been the ZIRP era of low interest rates, companies would have just borrowed to cover the amortization that Section 174 introduced. But unfortunately, money doesn't grow on trees anymore.
Humans also keep struggling with context, so while large contexts may limit AI performance, they won't necessarily prevent them from being strongly superhuman.
> Five years from now AI might still break down at even a small bit of complexity, or it might be installing air conditioners, or it might be colonizing Mercury and putting humans in zoos.
Do all of these seem like logically consistent possibilities to you?
OK, I will bite.
So "Sparsely-gated MoE" isn’t some new intelligence, it's a sharding trick. You trade parameter count for FLOPs/latency with a router. And MoE predates transformers anyway.
RLHF is packaging. Supervised finetuning on instructions, learning a reward model, then nudging the policy: that's a training-objective swap plus preference data. It's useful, but not a breakthrough.
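The reward-model piece, for instance, is just a pairwise preference loss (Bradley-Terry style); a toy version with made-up scores:

    # Toy reward-model objective: prefer the "chosen" response over the "rejected" one.
    # Real RLHF stacks this on an SFT model and then optimizes the policy against
    # the learned reward (e.g. with PPO); this only shows the loss shape.
    import math

    def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
        # -log sigmoid(r_chosen - r_rejected): small when "chosen" scores higher
        return -math.log(1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected))))

    print(preference_loss(2.0, -1.0))  # ~0.05: model agrees with the human label
    print(preference_loss(-1.0, 2.0))  # ~3.05: model disagrees, large loss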
CoT is a prompting hack that forces the same model to externalize intermediate tokens. The capability was already there; you're just sampling a longer trajectory. It's UX for sampling.
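Concretely, the "hack" is nothing more than shaping the prompt so the model spends tokens on intermediate steps; the wording below is just one common variant:

    # Chain-of-thought as pure prompt text: same model, longer sampled trajectory.
    def with_cot(question: str) -> str:
        return (
            f"Q: {question}\n"
            "A: Let's think step by step, then give the final answer on the last line.\n"
        )

    print(with_cot("A train leaves at 3pm averaging 80 km/h; how far has it gone by 5:30pm?"))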
Scaling laws are an empirical fit telling you to "buy more compute and data." That's a budgeting guideline, not new math or architecture. https://www.reddit.com/r/ProgrammerHumor/comments/8c1i45/sta...
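The whole "law" fits on one line. Here's a sketch using roughly the constants reported in the Chinchilla paper (Hoffmann et al., 2022); treat the exact numbers as illustrative:

    # Chinchilla-style loss fit: L(N, D) = E + A / N^alpha + B / D^beta,
    # where N = parameters and D = training tokens. Constants are approximately
    # the published fit, used here only for illustration.
    def predicted_loss(n_params: float, n_tokens: float) -> float:
        E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28
        return E + A / n_params**alpha + B / n_tokens**beta

    print(predicted_loss(70e9, 1.4e12))   # ~1.94 for a Chinchilla-sized budget
    print(predicted_loss(280e9, 1.4e12))  # ~1.90: 4x the parameters, same data, barely lower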
LoRA is linear algebra 101: low-rank adapters that cut training cost and avoid touching the full weights. The base capability still comes from the giant pretrained transformer.
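The whole trick in a few lines of numpy, with toy shapes (real adapters sit inside the attention/MLP projections):

    # LoRA: freeze W, learn a rank-r update B @ A with r << min(d, k).
    import numpy as np

    rng = np.random.default_rng(0)
    d, k, r, alpha = 1024, 1024, 8, 16.0
    W = rng.standard_normal((d, k))          # frozen pretrained weight
    A = rng.standard_normal((r, k)) * 0.01   # trainable, r x k
    B = np.zeros((d, r))                     # trainable, d x r (zero init: a no-op at the start)

    def lora_forward(x: np.ndarray) -> np.ndarray:
        return x @ W.T + (x @ A.T) @ B.T * (alpha / r)

    x = rng.standard_normal((2, k))
    print(lora_forward(x).shape)             # (2, 1024): same output shape as the frozen layer
    print(r * (d + k) / (d * k))             # ~0.016: train ~1.6% of the parameters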
AlphaFold 2's magic is mostly attention plus A LOT of domain data/priors (MSAs, structures, evolutionary signal). Again: an attention core plus data engineering.
"DeepSeek’s cost breakthrough" is systems engineering.
Agentic software dev/MCP is orchestration: middleware and protocols. It helps you use the model; it doesn't make the model smarter.
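For illustration, the orchestration layer is basically a loop like the one below; call_model and the tool registry are hypothetical stand-ins, not the actual MCP wire protocol:

    # Generic agent loop: the "intelligence" is whatever call_model returns;
    # everything around it is plumbing. call_model and the tools are hypothetical.
    from typing import Callable, Dict, List

    def run_agent(task: str,
                  call_model: Callable[[List[dict]], dict],
                  tools: Dict[str, Callable[[str], str]],
                  max_steps: int = 10) -> str:
        history = [{"role": "user", "content": task}]
        for _ in range(max_steps):
            reply = call_model(history)      # e.g. {"tool": "read_file", "arg": "x.py"}
            if reply.get("tool") is None:
                return reply["content"]      # model says it's done
            result = tools[reply["tool"]](reply["arg"])
            history.append({"role": "tool", "content": result})
        return "gave up after max_steps"

    # Stub model that answers immediately, no tools needed:
    print(run_agent("say hi", lambda history: {"tool": None, "content": "hi"}, {}))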
Video generation? Diffusion with temporal conditioning and better consistency losses. It’s DALL-E style tech stretched across time with tons of data curation and filtering.
Most headline "wins" are compiler and kernel wins: FlashAttention, paged KV-cache, speculative decoding, distillation, quantization (8/4 bit), ZeRO/FSDP/TP/PP... These only move the cost curve, not the intelligence.
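To the "cost curve" point: 8-bit quantization, for example, is just a scale factor plus rounding. A toy symmetric per-tensor version (real kernels go per-channel or per-group):

    # Symmetric int8 quantization: 4x smaller weights, small rounding error,
    # identical architecture. Moves the cost curve, not the capability.
    import numpy as np

    def quantize_int8(w: np.ndarray):
        scale = np.abs(w).max() / 127.0
        q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
        return q.astype(np.float32) * scale

    w = np.random.default_rng(0).standard_normal((256, 256)).astype(np.float32)
    q, s = quantize_int8(w)
    print(q.nbytes, w.nbytes)                  # 65536 vs 262144 bytes
    print(np.abs(w - dequantize(q, s)).max())  # worst-case error on the order of 1e-2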
The biggest single driver over the last few years has been the data: dedup, document quality scores, aggressive filtering, mixture balancing (web/code/math), synthetic bootstrapping, eval-driven rewrites, and so on. You can swap half a dozen training "tricks" and get similar results if your data mix and scale are right.
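In pipeline terms that's mostly unglamorous code like the sketch below; the heuristics and thresholds are made-up placeholders, not anyone's real recipe:

    # Skeleton of a pretraining data-curation pass: exact dedup by hash plus
    # crude quality filters. Mixture balancing and rewrites happen downstream.
    import hashlib

    def curate(docs: list) -> list:
        seen, kept = set(), []
        for doc in docs:
            h = hashlib.sha256(doc.strip().lower().encode()).hexdigest()
            if h in seen:
                continue                             # exact-duplicate removal
            seen.add(h)
            words = doc.split()
            if len(words) < 20:
                continue                             # too short to be useful
            if len(set(words)) / len(words) < 0.3:
                continue                             # highly repetitive / boilerplate
            kept.append(doc)
        return kept

    print(len(curate(["spam spam spam spam"] * 3 + ["a " * 50])))   # 0: everything filtered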
For me, a real post-attention "breakthrough" would be something like training that learns abstractions with sample efficiency far beyond what scaling laws predict, reliable formal reasoning, or causal/world-model learning that transfers out of distribution. None of the things you listed does that.
Almost everything since attention is optimization, ops, and data curation. I mean, give me the exact pretrain mix, filtering heuristics, and finetuning datasets for Claude/GPT-5, and without peeking at the secret-sauce architecture I could get close just by matching tokens, quality filters, and training schedule. The "breakthroughs" are mostly better ways to spend compute and clean data, not new ways to think.
>Only a few animals have solved reasoning.
The assumption here seems to be that reasoning will be able to do what evolution did hundreds of millions of years ago (with billions of years of work put into the doing) much more easily than evolution did, for... some reason that is never exactly expressed?
Logically, I should also note that, given the premises laid out in the first quoted paragraph, the second quoted claim should not be "only a few animals have solved reasoning"; it should be "evolution has only solved reasoning a few times."
> AI might still break down at even a small bit of complexity, or it might be installing air conditioners, or it might be colonizing Mercury and putting humans in zoos.
that each of these things, being logically consistent, has an equal chance of being the case 5 years from now?
It’s like asking a college student 4th grade math questions and then being impressed they knew the answer.
I've used Copilot a lot. It's faster than Google and gives great results.
Today I asked it for the name of a French restaurant that closed in my area a few years ago. The first answer was a Chinese fusion place… all the others were off too.
Sure, keep questions confined to something it was heavily trained on, and the answers will be great.
But yeah, AI is going to get rid of a lot of low-skilled labor.
Not necessarily a bad approach, but it feels like something is missing for it to be “intelligent”.
Should really be called “artificial knowledge” instead.
No, it's more like asking a 4th-grader college math questions, and then desperately looking for ways to not be impressed when they get it right.
> Today I asked it for the name of a French restaurant that closed in my area a few years ago. The first answer was a Chinese fusion place… all the others were off too.
What would have been impressive is if the model had replied, "WTF, do I look like Google? Look it up there, dumbass."
>There’s a significant difference between predicting what it will specifically look like, and predicting sets of possibilities it won’t look like
which I took to mean there are probability distributions over what will happen. It seemed to be your assertion that there weren't: that a number of things, only one of which seemed especially probable, were all equally probable. I'm glad to learn you don't think this, as it seems totally crazy, especially for someone praising LLMs, which after all spend their time making millions of little choices based on probability.
What's the point of this anecdote? That it's not omniscient? Nobody should be thinking that it is.
I can ask it how many coins I have in my pocket and I bet you it won't know that either.
Any citations for this pretty strong assertion? And please don't reply with "oh you can just tell by feel".
We should also end the exploitative nature of globalization. Outsourced work should be held to the same standards as labor in developed countries (preferably EU standards, rather than American ones).
It's not that it knows grammar; it was just trained on a dataset that applied proper capitalization.
Humans learn from seeing patterns. I suspect AI only repeats them, more like a parrot.