If o3 can design it, that means it’s using open source schedulers as reference. Did you think about opening up a few open source projects to see how they were doing things in those two weeks you were designing?
AI research has a thing called "the bitter lesson" - which is that the only thing that works is search and learning. Domain-specific knowledge inserted by the researcher tends to look good in benchmarks but compromise the performance of the system[0].
The bitter-er lesson is that this also applies to humans. The reason why humans still outperform AI on lots of intelligence tasks is because humans are doing lots and lots of search and learning, repeatedly, across billions of people. And have been doing so for thousands of years. The only uses of AI that benefit humans are ones that allow you to do more search or more learning.
The human equivalent of "inserting domain-specific knowledge into an AI system" is cultural knowledge, cliches, cargo-cult science, and cheating. Copying other people's work only helps you, long-term, if you're able to build off of that into something new; and lots of discoveries have come about from someone just taking a second look at what had been considered to be generally "known". If you are just "taking shortcuts", then you learn nothing.
[0] I would also argue that the current LLM training regime is still domain-specific knowledge, we've just widened the domain to "the entire Internet".
I wonder how many programmers have assembly code skill atrophy?
Few people will weep the death of the necessity to use abstract logical syntax to communicate with a computer. Just like few people weep the death of having to type out individual register manipulations.
Even if LLMs make "plain English" programming viable, programmers still need to write, test, and debug lists of instructions. "Vibe coding" is different; you're telling the AI to write the instructions and acting more like a product manager, except without any of the actual communications skills that a good manager has to develop. And without any of the search and learning that I mentioned before.
For that matter, a lot of chatbots don't do learning either. Chatbots can sort of search a problem space, but they only remember the last 20-100k tokens. We don't have a way to encode tokens that fall out of that context window into some longer-term weights. Most of their knowledge comes from the information they learned from training data - again, cheated from humans, just like humans can now cheat off the AI. This is a recipe for intellectual stagnation.
[0] e.g. for malware analysis or videogame modding