This should be possible today and surely Linus would also see this in the future.
I'm very interested.
In my experience ChatGPT and Gemini are absolutely terrible at these types of things. They are constantly wrong. I know I'm not saying anything new, but I'm waiting to personally experience an LLM that does something useful with any of the code I give it.
These tools aren't useless. They're great as search engines and at pointing me in the right direction. They write dumb bash scripts that save me time here and there. That's it.
And it's hilarious to me how these people present these tools. It generates a bunch of code, and then you spend all your time auditing and fixing what is expected to be wrong.
That's not the type of code I'm putting in my company's code base, and I could probably write the damn code more correctly in less time than it takes to review for expected errors.
What am I missing?
Still super confusing, though!
I feel like companies working with and shipping LLMs would do well to remember that it's not just humans who get confused by this, but LLMs themselves... it makes for a painful time, sending off a request and noting a third of the way into its reasoning that the model has gotten two things with almost-identical names confused.
Feels like codex is for product managers to fix bugs without touching any developer resources. Then it’s insanely surprising!
personally, i've always operated in a codebase in a way that i _need_ to understand how things work for me to be productive and make the right decisions. I operate the same way with AI. every change is carefully reviewed, if it's dumb, i make it redo it and explain why it's dumb. and if it gets caught in a loop, i reset the context and try to reframe the problem. overall, i'm definitely more productive, but if you truly want to be hands off--you're in for a very bad time. i've been there.
lastly, some codebases don't work well with AI. I was working on a problem that was a bit more novel/out there and no model could solve it. Just yapped endlessly about these complex, very potentially smart-sounding solutions that did absolutely nothing. went all the way to o1-pro. the craziest part to me was the fact that across claude, deepseek and openai, they used the same specific vernacular for this particular problem, which really highlights how a lot of these models are just a mish-mash of the same underlying architecture/internet data. some of these models use responses from other models for their training data, which to me is like incest. you won't get good genetic results.
That you are trying to use LLMs to create the giant, sprawling, feature-packed software packages that define the modern software landscape. What's being missed is that any one user might only utilize 5% of the code base on any given day. Software is written to accommodate every need every user could have in one package. Then the users just use the small slice that accommodates their specific needs.
I have now created 5 hyper narrow programs that are used daily by my company to do work. I am not a programmer and my company is not a tech company located in a tech bubble. We are a tiny company that does old school manufacturing.
To give a quick general example, Betty uses Excel to manage payroll. A list of employees, a list of wages, a list of hours worked (which she copies from the time clock software .csv that she imports to Excel).
Excel is a few-million-LOC program and costs ~$10/mo. Betty needs maybe 2k LOC to do what she uses Excel for. Something an LLM can generate easily: a Python GUI wrapper around a SQLite DB. And she would be blown away at how fast it is, and how it is written for her use specifically.
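To make this concrete, here's a minimal sketch of the kind of tool I mean (purely illustrative: the table layout, CSV columns and sample wage are assumptions for the sketch, not Betty's actual spreadsheet):

    # payroll.py - a deliberately tiny payroll helper: SQLite for storage,
    # Tkinter for the GUI, hours imported from the time clock's CSV export.
    import csv
    import sqlite3
    import tkinter as tk
    from tkinter import filedialog, ttk

    db = sqlite3.connect("payroll.db")
    db.execute("CREATE TABLE IF NOT EXISTS employees (id INTEGER PRIMARY KEY, name TEXT, wage REAL)")
    db.execute("CREATE TABLE IF NOT EXISTS hours (employee_id INTEGER, week TEXT, hours REAL)")
    db.execute("INSERT OR IGNORE INTO employees VALUES (1, 'Example Employee', 22.50)")  # seed row

    def import_timeclock_csv(path):
        # Assumes the time clock exports rows like: employee_id,week,hours
        with open(path, newline="") as f:
            rows = [(int(r[0]), r[1], float(r[2])) for r in csv.reader(f)]
        db.executemany("INSERT INTO hours VALUES (?, ?, ?)", rows)
        db.commit()

    def payroll_for_week(week):
        # Join imported hours against wages for one week.
        return db.execute(
            "SELECT e.name, h.hours, e.wage, h.hours * e.wage AS pay "
            "FROM hours h JOIN employees e ON e.id = h.employee_id "
            "WHERE h.week = ?", (week,)).fetchall()

    root = tk.Tk()
    root.title("Payroll")
    week_entry = tk.Entry(root)
    week_entry.pack()
    table = ttk.Treeview(root, columns=("name", "hours", "wage", "pay"), show="headings")
    for col in ("name", "hours", "wage", "pay"):
        table.heading(col, text=col.title())
    table.pack(fill="both", expand=True)

    def load_week():
        path = filedialog.askopenfilename()
        if path:
            import_timeclock_csv(path)
        table.delete(*table.get_children())
        for row in payroll_for_week(week_entry.get()):
            table.insert("", "end", values=row)

    tk.Button(root, text="Import CSV and compute", command=load_week).pack()
    root.mainloop()

A couple of hundred lines like this, written for her exact workflow, replaces the slice of Excel she actually uses.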
How software is written and how it is used will change to accommodate LLMs. We didn't design cars to drive on horse paths, we put down pavement.
A recent example from a C# project I was working in. The project used builder classes that were constructed according to specified rules, but all of these builders were written by hand. I wanted to automatically generate these builders, and not using AI, just good old meta-programming.
Now I knew enough to know that I needed a C# source generator, but I had absolutely no experience with writing them. Could I have figured this out in an hour or two? Probably. Did I write a prompt in less than five minutes and get a source generator that worked correctly in the first shot? Also yes. I then spent some time cleaning up that code and understanding the API it uses to hook into everything and was done in half an hour and still learnt something from it.
You can make the argument that this source generator is in itself "boilerplate", because it doesn't contain any special sauce, but I still saved significant time in this instance.
I can't even fathom how frustrating such tools would be with poorly written confusing Clojure code using some niche dependency.
That being said, I can imagine a whole class of problems where this could succeed very well and provide value. Then again, the type of problems that I feel these systems could get right 99% of the time are problems that a skilled developer could fix in minutes.
For example, in the last month or so, I added a job queue plugin. The ability to run multiple tasks that they demoed today is quite similar. The issue I ran into with users is that without Enterprise plans, complex tasks run into rate limits when trying to run concurrently.
So I am adding an ability to have multiple queues, with each possibly using different models and/or providers, to get around rate limits.
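Roughly the shape I have in mind, as a hedged sketch rather than my plugin's actual code (the provider/model labels, concurrency numbers and round-robin split are placeholders):

    # Sketch only: each queue is pinned to a provider/model pair with its own
    # concurrency cap, so tasks spread across providers instead of piling
    # onto a single rate limit.
    import asyncio
    from dataclasses import dataclass
    from itertools import cycle

    @dataclass
    class WorkQueue:
        name: str
        provider: str        # hypothetical label, e.g. "provider-a"
        model: str
        max_concurrent: int

    QUEUES = [
        WorkQueue("fast", "provider-a", "small-model", 4),
        WorkQueue("heavy", "provider-b", "big-model", 2),
    ]

    async def run_task(q: WorkQueue, task: str) -> str:
        # Stand-in for the real provider API call.
        await asyncio.sleep(0.1)
        return f"[{q.name}:{q.provider}/{q.model}] {task} done"

    async def drain(q: WorkQueue, tasks):
        sem = asyncio.Semaphore(q.max_concurrent)   # per-queue ceiling
        async def one(t):
            async with sem:
                print(await run_task(q, t))
        await asyncio.gather(*(one(t) for t in tasks))

    async def main():
        backlog = [f"task-{i}" for i in range(10)]
        # Round-robin the backlog across queues; a smarter scheduler could
        # weight by each provider's actual limits.
        buckets = {q.name: [] for q in QUEUES}
        for t, q in zip(backlog, cycle(QUEUES)):
            buckets[q.name].append(t)
        await asyncio.gather(*(drain(q, buckets[q.name]) for q in QUEUES))

    asyncio.run(main())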
By the way, my system has features that are somewhat similar not only to this tool they are showing but also things like Manus. It is quite rough around the edges though because I am doing 100% of it myself.
But it is MIT Licensed and it would be great if any developer on the planet wanted to contribute anything.
I made one for GitHub Actions but it's not as realtime and is 2 years old now: https://github.com/asadm/chota
I can't say I am a big fan of neutering these paradigm-shifting tools according to one culture's code of ethics / way of doing business / etc.
One man's revolutionary is another's enemy combatant and all that. What if we need top-notch malware to take down the robot dogs lobbing mortars at our madmaxian compound?!
sigh
This project is not your typical Webdev project, so maybe that's an interesting case study. It takes a C-API spec in JSON, loads and processes it in Python, and generates a C library that turns UI markup in YAML/JSON into C-API calls to render that UI. [1]
The result is pretty hacky code (by my design, can't/won't use FFI) that's 90% written by Gemini 2.5 Pro Pre/Exp but it mostly worked. It's around 7k lines of Python that generate a 30-40k loc C-library from a JSON LVGL-API-spec to render an LVGL UI from YAML/JSON markup.
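To give a flavor of the pipeline, here's a toy sketch (nowhere near the real project's scale; the spec slice and markup are invented for illustration, only the LVGL function names are real):

    # Toy version of the idea: read an API spec, read UI markup, emit C calls.
    import json

    # A hypothetical, drastically simplified slice of the LVGL API spec.
    API_SPEC = json.loads("""
    {
      "label": {"create": "lv_label_create", "set_text": "lv_label_set_text"},
      "btn":   {"create": "lv_btn_create"}
    }
    """)

    # UI markup as JSON (the real thing also accepts YAML).
    UI = json.loads("""
    [
      {"type": "btn",   "name": "ok_btn"},
      {"type": "label", "name": "title", "set_text": "Hello"}
    ]
    """)

    def emit_c(ui, spec):
        lines = ["void build_ui(lv_obj_t *parent) {"]
        for widget in ui:
            api = spec[widget["type"]]
            name = widget["name"]
            lines.append(f"    lv_obj_t *{name} = {api['create']}(parent);")
            for prop, value in widget.items():
                if prop in ("type", "name"):
                    continue
                lines.append(f'    {api[prop]}({name}, "{value}");')
        lines.append("}")
        return "\n".join(lines)

    print(emit_c(UI, API_SPEC))

The real generator does this across the full API surface, which is where the 30-40k lines of generated C come from.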
I probably spent 2-3 weeks on this. I might have been able to do something similar in maybe 2x the time, but this is about 20% of the mental overhead/exhaustion it would have taken me otherwise. Otoh, I would have had a much better understanding of the tradeoffs and maybe a slightly cleaner architecture if I had written it myself. But there's also a chance I would have gotten lost in some of the complexity and never finished (esp since it's a side-project that probably no-one else will ever see).
What worked well:
* It mostly works(!). Unlike previous attempts with Gemini 1.5 where I had to spend about as much or more time fixing than it'd have taken me to write the code. Even adding complicated features after the fact usually works pretty well with minor fixing on my end.
* Lowers mental "load" - you don't have to think so much about how to tackle features, refactors, ...
Other stuff:
* I really did not like Cursor or Windsurf - I half-use VSCode for embedded hobby projects but I don't want to then have another "thing" on top of that. Aider works, but it would probably require some more work to get used to the automatic features. I really need to get used to the tooling, not an insignificant time investment. It doesn't vibe with how I work, yet.
* You can generate a *significant* amount of code in a short time. It doesn't feel like it's "your" code though, it's like joining a startup - a mountain of code, someone else's architecture, their coding style, comment style, ... and,
* there's this "fog of code", where you can sorta bumble around the codebase but don't really 100% understand it. I still have mid/low confidence in the changes I make by hand, even 1 week after the codebase has largely stabilized. Again, it's like getting familiar with someone else's code.
* Code quality is ok but not great (and partially my fault). Probably depends on how you got to the current code - ie how clean was your "path". But since it is easier to "evolve" the whole project (I changed directions once or twice when I sort of hit a wall) it's also easier to end up with a messy-ish codebase. Maybe the way to go is to first explore, then codify all the requirements and start afresh from a clean slate instead of trying to evolve the code-base. But that's also not an insignificant amount of work and also mental load (because now you really need to understand the whole codebase or trust that an LLM can sufficiently distill it).
* I got much better results with very precise prompts. Maybe I'm using it wrong, ie I usually (think I) know what I want and just instruct the LLM instead of having an exploratory chat but the more explicit I am, the more closely the output is to what I'd like to see. I've tried to discuss proposed changes a few times to generate a spec to implement in another session but it takes time and was not super successful. Another thing to practice.
* A bit of a later realization, but modular code and short, self-contained modules are really important though this might depend on your workflow.
To summarize:
* It works.
* It lowers initial mental burden.
* But to get really good results, you still have to put a lot of effort into it.
* At least right now, it seems you will still have to put in the mental effort at some point. Normally it's "front-loaded": you do the design and think about it hard up front. Here the AI does all the initial work, but it becomes harder to cope with the codebase once you reach a certain complexity. Eventually you will have to understand it, even if just to instruct the LLM to make the exact changes you want.
I imagine many engineers are like myself in that they got into programming because they liked tinkering and hacking and implementation details, all of which are likely to be abstracted over in this new era of prompting.
A non-open-source option this looks close to is also https://githubnext.com/projects/copilot-workspace (released April 2024, but I'm not sure it's gotten any significant updates since)
But yes, I hope we get away from the giant conglomeration of everything, ESPECIALLY the reality of people doing 90% of their business inside a Google Chrome window. Move towards the UNIX philosophy of tiny single-purpose programs.
When I'm using aider, after it makes a commit, what I do is immediately run git reset HEAD^ and then git diff (actually I use the GitHub Desktop client to see the diff) to evaluate what exactly it did and whether I like it. Then I usually make some adjustments and only after that commit and push.
In short, the tools work. I've built things 10x faster than doing it from scratch. I also have a sense of what else I'll be able to build in a year. I also enjoy not having to add cycles to communicate with external contributors -- I think, then I do, even if there's a bit of wrestling. Wrangling with a coding agent feels a bit like "compile, test, fix, re-compile". Re-compiling generally got faster in subsequent generations of compiler releases.
My company is building internal business functions using AI right now. It works too. We're not putting that stuff in front of our customers yet, but I can see that it'll come. We may put agents into the product that let them build things for themselves.
I get the grumpiness & resistance, but I don't see how it's buying you anything. The puck isn't underfoot.
There are so many of these "vibe coding" tools, and there has to be real engineering rigor at some point. I saw them demo "find the bug", but the bugs they found were pretty superficial, and that's something we've seen in our internal benchmark from both Devin and Cursor. A lot of noise and false positives or superficial fixes.
We had to tinker piece by piece to build a miniature castle. Over many hours.
Now I can tinker concept by concept, and build much larger castles, much faster. Like waving a wand, seeing my thoughts come to fruition in near real time.
No vanity lost in my opinion. Possibly more to be gained.
This sparked a thought in how a large part of the job is often the work needed to demonstrate impact. I think this aspect is often overlooked by some of the good engineers not yet taking advantage of the AI tooling. LLM loops may not yet be good enough to produce shippable code by themselves, but they sure are capable to help reduce the overhead of these up and out communicative tasks.
But they aren't moving nearly as fast as OpenAI. And it remains to be seen if first mover will mean anything.
Lots missing here, but I had the same issues, it takes iteration and practice. I use claude code in terminal windows, and text expander to save explicit reminders that I have to inject super regularly because anthropic obscures access to system prompts.
For example, I have 3 to 8 paragraph long instructions I will place regularly about not assuming, checking deterministically etc. and for most things I have the agents write a report with a specific instruction set.
I pop the instructions into text expander so I just type - docs when saying go figure this out, and give me the path to the report when done.
They come back with a path, and I copy it and search in VS Code.
It opens as an .md and I use preview mode; it's similar to a Google Doc.
And I'll review it. Always, things will be wrong: tons of assumptions, failures to check deterministically, etc... but I see that in the doc and have it fix it, correct misunderstandings, update the doc until it's perfect.
From there I'll say add a plan in a table with status for each task based on this (another text expander snippet with instructions).
And WHEN that's 100% right, I'll say implement and update as you go. The update-as-you-go forces it to recognize and remember the scope of the task.
Greatest point of failure in the system is misalignment. Ethics teams got that right. It compounds FAST if allowed. You let them assume things, they state assumptions as facts, that becomes what other agents read, and you get true chaos unchecked.
I started rebuilding claude code from scratch literally because they block us from accessing system prompts and I NEED these agents to stop lying to me about things that are not done or assumed, which highlights the true chaos possible when applied to system critical operations in governance or at scale.
I also built my own tool like codex for managing agent tasks and making this simpler, but getting them to use it without getting confused is still a gap.
Let me know if you have any other questions. I am performing the work of 20 Engineers as of today, rewrote 2 years of back end code that required a team of 2 engineers full time work in 4 weeks by myself with this system... so I am, I guess quite good at it.
I need to push my edges further into this latest tech, have not tried codex cli or the new tool yet.
We’ve long used local agents like Cursor and Claude Code, so we didn’t expect too much. But Codex shines in a few areas:
Parallel task execution: You can batch dozens of small edits (refactors, tests, boilerplate) and run them concurrently without context juggling. It's super nice to run a bunch of tasks at the same time (something that's really hard to do in Cursor, Cline, etc.)
It kind of feels like a junior engineer on steroids: you just need to point it at a file or function and specify the change, and it scaffolds out most of a PR. You still need to do a lot of work to get it production-ready, but it's as if you have an infinite number of junior engineers at your disposal now, all working on different things.
Model quality is good, but hard to say it's that much better than other models. In side-by-side tests with Cursor + Gemini 2.5-pro, naming, style and logic are relatively indistinguishable, so quality meets our bar but doesn’t yet exceed it.
I wouldn't sweat it. According to its developers, Codex understands 'malicious software'; it has just been trained to say, "But I won't do that" when such requests are made to it. Judging from the recent past [1][2], getting LLMs to bypass such safeguards is pretty easy.
1. https://hiddenlayer.com/innovation-hub/novel-universal-bypas...
2. https://cyberpress.org/researchers-bypass-safeguards-in-17-p...
It's a pain but it works.
Even with TDD it will hallucinate the mocks without management, and hallucinate the requirements. Each layer has to be checked atomically, but the text expander snippets done right can get it close to 75% right.
My main project faces 5000 users so I can't let the agents run freely, whereas with isolated projects in separate repos I can let them run more freely, then review in GitKraken before committing.
One issue with junior devs is that because they’re not fully autonomous, you have to spend a non trivial amount of time guiding them and reviewing their code. Even if I had easy access to a lot of them, pretty quickly that overhead would become the bottleneck.
Did you think that managing a lot of these virtual devs could get overwhelming or are they pretty autonomous?
It also seems very telling they have not mentioned o4-high benchmarks at all. o4-mini exists, so logically there is an o4 full model right?
> It kind of feels like a junior engineer on steroids: you just need to point it at a file or function and specify the change, and it scaffolds out most of a PR. You still need to do a lot of work to get it production-ready, but it's as if you have an infinite number of junior engineers at your disposal now, all working on different things.
What's the benefit of this? It sounds like it's just a gimmick for the "AI will replace programmers" headlines. In reality, LLMs complete their tasks within seconds, and the time consuming part is specifying the tasks and then reviewing and correcting them. What is the point of parallelizing the fastest part of the process?
If you don't mind, what were the strengths and limitations of Claude Code compared to Codex? You mentioned parallel task execution being a standout feature for Codex - was this a particular pain point with Claude Code? Any other insights on how Claude Code performed for your team would be valuable. We are pleased with Claude Code at the moment and were a bit underwhelmed by comparable Codex CLI tool OAI released earlier this month.
There must be room for a Modal/Cloudflare/etc infrastructure company that focuses only on providing full-fledged computer environments specifically for AI with forking/snapshotting (pause/resume), screen access, human-in-the-loop support, and so forth, and it would be very lucrative. We have browser-use, etc, but they don't (yet) capture the whole flow.
Rinse and repeat once a task is done: update #1 and cycle again. Add in another CC window if you need more tasks running concurrently.
Downside is cost, but if that's not an issue, it's great for getting stuff done across distributed teams.
As long as I spend less time reviewing and guiding than doing it myself, it's a win for me. I don't have any fun doing these things and I'd rather yell at a bunch of "agents". For those who enjoy doing a bunch of small edits, I guess it's the opposite.
My kid recently graduated from a very good school with a degree in computer science and what she's told me about the job market is scary. It seems that, relatively speaking, there's a lot of postings for senior engineers and very little for new grads.
My employer has hired recently and the flood of resumes after posting for a relatively low level position was nuts. There was just no hope of giving each candidate a fair chance and that really sucks.
My kid's classmates who did find work did it mostly through personal connections.
It's probably over for these folks.
There will likely(?, hopefully?) be new adjacent gradients for people to climb.
In any case, I would worry more about your own job prospects. It's coming for everyone.
Unfortunately this is not how companies think. I read somewhere more than 20 years ago about outsourcing and manufacturing offshoring. The author basically asked the same: if we move out the so-called low-end jobs, where do we think we will get the senior engineers? Yet companies continued offshoring, and the West lost talent and know-how, while watching our competitor you-know-who become the world leader in increasingly more industries.
They'll probably just need to learn for longer, and if companies ever get desperate enough for senior engineers they'll just take the most able/experienced junior or mid-level dev.
But I'd argue that before they do that, if companies can't find skilled labour domestically, they should consider bringing in skilled workers from abroad. There are literally hundreds of millions of Indians who got connected to the internet over the last decade. There's no reason a company should struggle to find senior engineers.
Money number must always go up. Hiring people costs money. "Oh hey I just read this article, sez you can have A.I. code your stuff, for pennies?"
What about using it for AI / developing models that compete with our new overlords?
Seems like using this is just asking to get rug pulled for competing with em when they release something that competes with your thing. Am I just an old who’s crowing about nothing? It’s ok for them to tell us we own outputs we can’t use to compete with em?
This is also part of a recent update to Zed. I typically use Zed with my own Claude API key.
So the benefit is really that during this "down" time, you can do multiple useful things in parallel. Previously, our engineers were waiting on the Cursor agent to finish, but the parallelization means you're explicitly turning your brain off of one task and moving on to a different task.
Preview video from Open AI: https://www.youtube.com/watch?v=hhdpnbfH6NU&t=878s
As I think about what "AI-native" or just the future of building software looks like, it's interesting to me that - right now - developers are still just reading code and tests rather than looking at simulations.
While a new(ish) concept for software development, simulations could provide a wider range of outcomes and, especially for the front end, are far easier to evaluate than just code/tests alone. I'm biased because this is something I've been exploring but it really hit me over the head looking at the Codex launch materials.
You mean like automated test suites?
Look at the results from multi swe bench - https://multi-swe-bench.github.io/#/
swe polybench - https://amazon-science.github.io/SWE-PolyBench/
Kotlin bench - https://firebender.com/leaderboard
That said, it might be possible to tell each agent to create a branch and do work there? I haven't tried that.
I haven't seen anything about Zed using containers, but again you might be able to tell each agent to use some container tooling you have in place since it can run commands if you give it permission.
If you derive enjoyment from actually assembling the castle, you lose out on that by using the wand that makes it happen instantly. Sure, the wand's castles may be larger, but you don't put a Lego castle together for the finished product.
Counter-point B: AI does not get tired, does not need space, does not need catering to their experience. AI is fine being interrupted and redirected. AI is fine spending two days on something that gets overwritten and thrown away (no morale loss).
You can customize the system prompts, baseline prompts, and models used for every single mode and have as many or as few as you want.
As you say, happens all the time. Also doesn’t make sense because so few people are buying individual stocks anyway. Goal should be to consistently outperform over the long term. Wall street tends to be very myopic.
Thinking long term is a hard concept for the bean counters at these tech companies i guess…
Ambitious idea, but I like it.
Advancements in general AI knowledge over time will not correlate to improvements in remembering any matters as colloquial as this.
Counter-counter-point B: AI absolutely needs catering to their experience. Prompter must always learn how to phrase things so that the AI will understand them, adjust things when they get stuck in loops by removing confusing elements from the prompt, etc.
If you want one idiot's perspective, please hyper-focus on model quality. The barrier right now is not tooling, it's the fact that models are not good enough for a large amount of work. More importantly, they're still closer to interns than junior devs: you must give them a ton of guidance, constant feedback, and a very stern eye for them to do even pretty simple tasks.
I'd like to see something with an o1-preview/pro level of quality that isn't insanely expensive, particularly since a lot of programming isn't about syntax (which most SotA models have down pat) but about understanding the underlying concepts, an area in which they remain weak.
Atp I really don't care if the tooling sucks. Just give me really, really good models that don't cost a kidney.
I think some people are betting on the fact that AI can replace junior devs in 2-5 years and seniors in 10-20, when the old ones are largely gone. But that's sort of beside the point as far as most corporate decision-making.
Facebook has been caught in recent DOJ hearings breaking the law with how they run their business, just as one example. They claimed under oath, previously, to not be doing X, and then years later there was proof they did exactly that.
https://youtu.be/7ZzxxLqWKOE?si=_FD2gikJkSH1V96r
A company's "word" means nothing imo. None of this makes sense if I'm being honest. Unless you personally have a negotiated contract with the provider, and can somehow be certain they are doing what they claim, and can later sue for damages, all of this is just crossing your fingers and hoping for the best.
https://huggingface.co/blog/smolvlm
recently both llama.cpp and ollama got better support for them too, which makes this kind of integration with local/self-hosted models now more attainable/less expensive
Top-tier engineers who integrate a deep understanding of business and user needs into technical design will likely be safe until we get full-fledged AGI.
That said, back in the early 00s there was much more of a culture where everyone was expected to be self-taught and doing real web dev probably before they even got to college, so by the time they graduated they were in reality quite senior. This was true for me and a lot of my friends, but I feel like these days there are many CS grads who haven't done a lot of applied stuff. But at the same time, to be fair, this was a way easier task in the early 00s: if you knew JS/HTML/CSS/SQL, C++ and maybe some .NET language, that was pretty much it, you could do everything (there were virtually no frameworks). Now there are thousands of frameworks and languages and ecosystems, and you could spend 5+ years learning any one of them. It is no longer possible for one person to learn all of tech; people are much more specialized these days.
But I agree that eventually someone is going to have to start hiring juniors again or there will be no seniors.
But also, I think this underestimates significantly what junior engineers do. Junior engineers are people who have spent 4 to 6 years receiving a specialised education in a university - and they normally need to be already good at school math. All they lack is experience applying this education on a job - but they are professionals - educated, proactive and mostly smart.
The market is tough indeed, and as much as it is tough for a senior engineer like myself, I don't envy the current cohort of fresh grads. It being tough is only tangentially related to AI though. The main factor is the general economic slowdown, with AI contributing by distracting already scarce investment from non-AI companies and producing a lot of uncertainty about how many and what employees companies will need in the future. AI's current capabilities are nowhere near having a real economic impact.
Wish your kid and you a lot of patience, grit and luck.
I have mentored junior developers and found it to be a rewarding part of the job. My colleagues mostly ignore juniors, provide no real guidance, couldn't care less. I see this attitude from others in the comments here, relieved they don't have to face that human interaction anymore. There are too many antisocial weirdos in this industry.
Without a strong moral and cultural foundation the AGI paradigm will be a dystopia. Humans obsolete across all industries.
On the other hand, if your job was writing code at certain companies whose profits were based on shoving ads in front of people then I would agree that no one will care if it is written by a machine or not. The days of those jobs making >$200k a year are numbered.
Does this mean people will be less incentivized to contribute to open source as time goes by?
P.S., I think the current trend is a wakeup call to us software engineers. We thought we were doing highly creative work, but in reality we spend a lot of time doing the basic job of knowledge workers: retrieving knowledge and interpolating some basic and highly predictable variations. Unfortunately, the current AI is really good at replacing this type of work.
My optimistic view is that in long term we will have invent or expand into more interesting work, but I'm not sure how long we will have to wait. The current generation of software engineers may suffer high supply but low demand of our profession for years to come.
Most of the waking hours of most creative work have this type of drudgery. Professional painters and designers spend most of their time replicating ideas that are well fleshed-out. Musicians spend most of their time rehearsing existing compositions.
There is a point to be made that these repetitive tasks are a prerequisite to come up with creative ideas.
Case 1: you keep training engineers.
Case 1.1: AGI soon, you don't need juniors or seniors besides a very few. You cost yourself a ton of money that competitors can reinvest into R&D, use to undercut your prices, or return to keep their investors happy.
Case 1.2: No AGI. Wages rise, a lot. You must remain in line with that to avoid losing those engineers you trained.
Case 2: You quit training juniors and let AI do the work.
Case 2.1: AGI soon, you have saved yourself a bundle of cash and remain mostly in line with the market.
Case 2.2: no AGI, you are in the same bidding war for talent as everyone else, the same place you'd have been were you to have spent all that cash to train engineers. You now have a juicier balance sheet with which to enter this bidding war.
The only way out of this, you can probably see, is some sort of external co-ordination, as is the case with most of these situations. The high-EV move is to quit training juniors, by a mile, independently of whether AI can replace senior devs in a decade.
I was running a quick errand between engineering meetings and saw the first few lines about hiring juniors, and I wrote a couple of comments about how I feel about all of this.
I'm not always guilty of skimming, but today I was.
I think instead we should focus on getting rid of managers and product owners.
If you extrapolate and generalize further... what is at risk is any task that involves taking information input (text, audio, images, video, etc.), and applying it to create some information output or perform some action which is useful.
That's basically the definition of work. It's not just knowledge work, it's literally any work.
For that reason all my silly little side projects are now in private repos. I don't care that the chance somebody builds a business around them is slim to none. Don't think putting a license will protect you either. You'd have to know somebody is violating your license before you can even think about doing anything, and that's basically impossible if it gets ripped into a private codebase and isn't obvious externally.
That's really awesome. I hope my daughter finds a job somewhere that values professional development. I'd hate for her to quit the industry before she sees just how interesting and rewarding it can be.
I didn't have many mentors when starting out, but the ones I had were so unbelievably helpful both professionally and personally. If I didn't have their advice and encouragement, I don't think I'd still be doing what I'm doing.
Can totally relate. Unfortunately the trend for all-senior teams and companies has started long before ChatGPT, so the opportunities have been quite scarce, at least in a professional environment.
Is this still rolling out? I don't need the Team plan, do I?
I have been using openAI products for years now and I am keen to try but I have no idea what I am doing wrong.
This seems to imply that software engineering as a profession has been quite mature and saturated for a while, to the point that a model can predict most of the output. Yes, yes, I know there are thousands of advanced algorithms and amazing systems in production. It's just that the market does not need millions of engineers for such advanced skills.
Unless we get yet another new domain like cloud or like internet, I'm afraid the core value of software engineers: trailblazing for new business scenarios, will continue diminishing and being marginalized by AI. As a result, we get way less demand for our job, and many of us will either take a lower pay, or lose our jobs for extended time.
I'm quite conflicted on this assessment. On one hand, I wonder whether we would have a better job market if there weren't so many open-sourced systems. We may have had much slower growth, but we would see our growth last for a lot more years, which means we might enjoy our profession until retirement and beyond. On the other hand, open source did create large cakes, right? Like the "big data" market, the ML market, the distributed systems market, etc. Like the millions of data scientists who could barely use Pandas and scipy, or hundreds of thousands of ML engineers who couldn't even be bothered to know what a positive semi-definite matrix is.
Interesting times.
See, that never was the purpose... going bigger and faster, towards what exactly? Chaos? By the way, we never managed to fully tackle manual software development by trained professionals, and we now expect Shangri-La by throwing everything and the kitchen sink into giant inscrutable matrices. This time by amateurs as well. I'm sure this will all turn out very well and very, very productive.
All the same principles apply as before: smart, driven, high ownership engineers make a huge difference to a company's success, and I find that the trend is even stronger now than before because of all the tools that these early career engineers have access to. Many of the folks we've hired have been able to spin up on our codebase much faster than in the past.
We're mainly helping them develop taste for what good code / good practices look like.
> without sacrificing quality
Right..
> it's your responsibility to use that tool
Again, it's actually not. It's my responsibility to do my job, not to make my boss' - or his boss' - car nicer. I know that's what we all know will create "job security" but let's not conflate these things. My job is to do my end of the bargain. My boss' job is paying me for doing that. If he deems it necessary to force me to use AI bullshit, I will of course, but it is definitely not my responsibility to do so autonomously.
That's really great to hear.
Your experience that a new engineer equipped with modern tools is more effective and productive than in the past is important to highlight. It makes total sense.
It's not long ago when the correction of the tech job market started, because it got blown up during and after covid. The geopolitical situation is very unstable.
I also think there is way too much FUD around AI, including coding assistants, than necessary. Typically coming either from people who want to sell it or want to get in on the hype.
Things are shifting and moving, which creates uncertainty. But it also opens new doors. Maybe it's a time for risk takers, the curious, the daring. Small businesses and new kinds of services might rise from this, like web development came out of the internet revolution. To me, it seems like things are opening up and not closing down.
Besides that, I bet there are more people today who write, read or otherwise deal directly with assembly code than ever before, even though we had higher level languages for many decades.
As for the job market specifically: SWE and CS (adjacent) jobs are still among the fastest growing, coming up in all kinds of lists.
There’s still quite a bit of a gap in terms of trust.
To contrast, CH and GER are known to have very robust and regulated apprenticeship programs. Meaning you start working at a much earlier age (16) and go to vocational school at the same time for about 4 years. This path is then supported with all kinds of educational stepping stones later down the line.
There are many software developers who went that route in CH for example, starting with an application development apprenticeship, then getting to technical college in their mid 20's and so on.
I think this model has a lot of advantages. University is for kids who like school and the academic approach to learning. Apprenticeships plus further education or an autodidactic path then casts a much broader net, where you learn practical skills much earlier.
There are several advantages and disadvantages of both paths. In summary I think the academic path provides deeper CS knowledge which can be a force multiplier. The apprenticeship path leads to earlier high productivity and pragmatism.
My opinion is that in combination, both being strongly supported paths, creates more opportunities for people and strengthens the economy as a whole.
Junior engineers are not cattle. They are the future senior ones, they bring new insights into teams, new perspectives; diversity. I can tell you the times I have learnt so many valuable things from so-called junior engineers (and not only tech-wise things).
LLMs have their place, but ffs, stop with the "junior engineer replacement" shit.
IMO LLMs are actually pretty good at writing small scripts. First, it's much more common for a small script to be in the LLM's training data, and second, it's much easier to find and fix a bug. So the LLM actually does allow a non-programmer to write correct code with minimal effort (for some simple task), and then they are blown away thinking writing software is a solved problem. However, these kinds of people have no idea of the difference between a hundred line script where an error is easily found and isn't a big deal and a million line codebase where an error can be invisible and shut everything down.
Worst of all is when the two sides of tech-giants and non-programmers meet. These two sides may sound like opposites but they really aren't. In particular, there are plenty of non-programmers involved at the C-level and the HR levels of tech companies. These people are particularly vulnerable to being wowed by LLMs seemingly able to do complex tasks that in their minds are the same tasks their employees are doing. As a result, they stop hiring new people and tell their current people to "just use LLMs", leading to the current hiring crisis.
Vocational training focusing on immediate fit for the market is great for companies that want to extract maximal immediate value from labour for minimal cost, but longer term is not good for engineers themselves.
Today startups mostly wrap LLMs as this is what VCs expect. Larger companies have smaller IT budgets than before (adjusted for inflation). This is the real problem that causes the jobs shortage.
The best junior I've hired was a big contributor to an open source library we were starting to use.
I think there's still lots of opportunity for honing your skill, and showing it off, outside of schools.
But this does not happen in industry verticals that are protected by regulation (banks) or national interest (Boeing).
I've been using the codex agent since before this announcement btw along with most of the latest LLMs. I literally work in the AI/ML tooling space. We're entering a dangerous world now where there's super useful technology but people are trying to use it to replace others instead of enhance them. And that's causing the wrong tools to be built.
It made changes in TS/Next.js given just the boilerplate from create-next-app, ran `yarn dev`, then opened its mini LLM browser and navigated to localhost to verify everything looked correct.
It found 1 mistake and fixed the issue then ran `yarn dev` again, opened a new browser, navigated to localhost (pointing at the original server it brought up, not the new one at another port) and confirmed the change was correct.
I was very impressed but still laughed at how it somehow backed its way into a flow that worked, but only because Next has hot-reloading.
E.g.:
If we have 10 PMs and 90 devs today, that could hypothetically be replaced by 8 PM+Devs, 20 specialized devs, and 2 specialized PMs in the future.
But I also wonder (I'm thinking out loud here, so pardon the raw unfiltered thoughts), if being a junior today is unrecognizable.
Like for example, whatever a "junior" is now will have to get better at thinking at a higher level, rather than the minutiae that we dealt with as juniors (like design patterns and all that stuff).
So maybe the levels of abstraction change?
personally, I completely stopped 2 years ago
it's the same as the stack overflow problem: the incentive to contribute tends towards zero, at which point the plagiarism machine stops improving
No, it creates shit that's close enough for people who are in a rush and don't care.
ie, you need artwork for shit on temu, boom job done.
You want to make a poster for a bake sale, boom job done.
Need some free music that sounds close enough to be swifty, but not enough to get sued, great.
But as an expression of creativity, most people can't get it to do that.
It's currently slightly more configurable clipart.
OK, great.
> That you are trying to use LLMs to create the giant, sprawling, feature-packed software packages that define the modern software landscape. What's being missed is that any one user might only utilize 5% of the code base on any given day. Software is written to accommodate every need every user could have in one package. Then the users just use the small slice that accommodates their specific needs.
With all due respect, the fact that you made a few small programs to help with your tasks is wonderful but this last statement alone rather disqualifies your expertise to make an assessment on software engineering in general.
There's a great number of reasons why codebases get large. Complex problems inherently come with complexity and scale in both code and integrations. You can choose to move the complexity around but never fully get rid of it.
Indeed, even codex (and I've been using it prior to this release) is not remotely at the level of even a junior engineer outside of a limited set of tasks.
1. When I work on side projects and use AI, sometimes I wonder "what's the point if I am just copy / pasting code? I am not learning anything" but what I have come to realize is building apps with AI assistance is the skill that I am learning, rather than writing code per se as it was a few years ago.
2. I work in high scale distributed computing, so I am still presented with ample opportunities to get very low level, which I love. I am not sure how much I care about writing code per se anymore. Working with AI still is tinkering, it has not changed that much for me. It is quite different, but the underlying fun parts are still present.
https://constelisvoss.com/pages/a-computer-can-never-be-held...
More generally, specialty knowledge is valuable. From now on, all employees will be monitored in order to replace them.
Senior engineers are already very well paid. Wages rising a lot from where they already are, while companies compete for a few people, and those who can’t afford it need to lean on AI or wait 10+ years for someone to develop with equivalent expertise… all of this sounds bad for the industry. It’s only good for the few senior engineers that are about to retire, and the few who went out of their way to not use AI and acquire actual skills.
This problem might be new to CS, but has happened to other engineers, notably to MechE in the 90's, ChemE in 80's, Aerospace in 70's, etc... due to rapid pace of automation and product commoditization.
The senior jobs will disappear too, or be offshored to a developing country: Exxon (152 India vs. 78 US postings) https://jobs.exxonmobil.com/ Chevron (159 India vs. 4 US) https://careers.chevron.com/search-jobs
a lot of junior eng tasks don't really help you become a senior engineer. someone needs to make a form and a backend API for it to talk to, because it's a business need. but doing 50 of those doesn't really impart a lot of wisdom
same with writing tests. you'll probably get faster at writing tests, but that's about it. knowing that you need the tests, and what kinds of things might go wrong, is the senior engineer skill
with LLMs' current ability to help people research a topic, and their growing ability to write functioning code, my hunch is that people with the time to spare can learn senior engineer skills while bypassing being a junior engineer
convincing management of that is another story, though. if you can't afford to do unpaid self-directed study, it's probably going to be a bumpy road until industry figures out how to not eat the seed corn