
196 points zmccormick7 | 12 comments
1. bhu8 ◴[] No.45387487[source]
IMHO, jumping from Level 2 to Level 5 is a matter of:

- Better structured codebases - we need hierarchical codebases with minimal depth, maximal orthogonality and reasonable width. Think microservices.

- Better documentation - most code documentation is not built to handle updates. We need a proper graph structure with few sources of truth that get propagated downstream. Again, some optimal sort of hierarchy is crucial here.

At this point, I really don't think that we necessarily need better agents.

Set up your codebase optimally, spin up 5-10 instances of gpt-5-codex-high for each issue/feature/refactor (pick the best result according to some criteria), and your life will go smoothly.

replies(3): >>45387502 #>>45387676 #>>45387717 #
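The fan-out-and-select workflow described above could be sketched as follows. This is a minimal illustration, not a real harness: all names are hypothetical, the agent invocation is a stub (a real run would launch each gpt-5-codex session on its own branch or worktree), and scoring by tests passed is just one possible selection criterion.

```python
import random

def run_agents(task, n=5, seed=0):
    """Stub for launching n agent instances on the same task.
    In practice each attempt would be a separate agent session
    working on its own branch; here we fake a result per attempt."""
    rng = random.Random(seed)
    return [{"branch": f"attempt-{i}", "tests_passed": rng.randint(0, 10)}
            for i in range(n)]

def pick_best(attempts, score=lambda a: a["tests_passed"]):
    """Select the candidate that maximizes the scoring criterion."""
    return max(attempts, key=score)

# Example: fan out 5 attempts at one issue, keep the winner.
best = pick_best(run_agents("fix issue: flaky login test"))
```

The interesting design choice is the `score` function: tests passed, diff size, or a reviewer-model's rating could all serve as the "some criteria" the comment leaves open.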
2. lomase ◴[] No.45387502[source]
Can you show something you have built with that workflow?
replies(4): >>45387618 #>>45387678 #>>45387908 #>>45388791 #
3. hirako2000 ◴[] No.45387618[source]
Of course not.
4. skaosobab ◴[] No.45387676[source]
> Think microservices.

Microservices should already be a last resort, reached when you've either: a) hit technical scale that necessitates them, or b) hit organizational complexity that necessitates them.

Opting to introduce them sooner will almost certainly increase the complexity of your codebase prematurely (already a hallmark of LLM development).

> Better documentation

If this means recording the reasoning behind decisions, then yes. If it means explaining the code, then no - code is the best documentation. English is nowhere near as good at describing how to interface with computers.

Given how recently gpt-5-codex came out, there's no way you've followed these practices long enough to consider them definitive (two years at the least, likely much longer).

replies(1): >>45387747 #
5. bhu8 ◴[] No.45387678[source]
Not yet unfortunately, but I'm in the process of building one.

This was my journey: I vibe-coded an Electron app and ended up with a terrible monolithic architecture and mostly badly written code. Then I took the app's architecture docs, spent a lot of my time shouting "MAKE THIS ARCHITECTURE MORE ORTHOGONAL, SOLID, KISS, DRY" at gpt-5-pro, and ended up with a 1,500+ line monster of a doc.

I'm now turning this into a Tauri app and following the new architecture to a T. I would say it has a pretty clean structure with multiple microservices.

Now, new features are gated based on the architecture doc, so I'm always maintaining a single source of truth that serves as the main context for any new discussions/features. Also, each microservice has its own README file(s) which are updated with each code change.
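The "each microservice's README must be updated with each code change" rule described above is mechanically checkable. A minimal sketch of such a pre-commit/CI check, under assumed conventions (services live under a `services/<name>/` directory, docs are `README.md`; both names are hypothetical):

```python
from pathlib import PurePosixPath

def services_missing_readme(changed_files):
    """Given the file paths changed in a commit (relative to the repo
    root), return the services whose code changed without a matching
    README.md update in the same commit."""
    touched, documented = set(), set()
    for f in changed_files:
        parts = PurePosixPath(f).parts
        if len(parts) >= 3 and parts[0] == "services":
            service = parts[1]
            if parts[-1] == "README.md":
                documented.add(service)
            else:
                touched.add(service)
    return sorted(touched - documented)
```

In CI this would be fed the output of something like `git diff --name-only`, and a non-empty result would fail the check, gating the change until the service's docs catch up.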

6. perplex ◴[] No.45387717[source]
I've been using Claude on two codebases: one with good layering and clean examples, the other not so much. I get better output from the LLM with good context, clean examples, and documentation. Not surprising that clarity in code benefits both humans and machines.
replies(1): >>45388535 #
7. bhu8 ◴[] No.45387747[source]
> Opting to introduce them sooner will almost certainly increase the complexity of your codebase prematurely

Agreed, but how else are you going to scale mostly AI-written code? Relying heavily on AI agents gives you that organizational complexity.

> Given how long gpt codex 5 has been out, there’s no way you’ve followed these practices for a reasonable enough time to consider them definitive

Yeah, fair. Codex has been out for less than 2 weeks at this point. I was relying on gpt-5 in August, and Opus before that.

replies(1): >>45387792 #
8. lomase ◴[] No.45387792{3}[source]
I understand why you made it microservices; people do that even when not using LLMs, because it looks more organized.

But in my experience a microservice architecture is orders of magnitude more complex to build and understand than a monolith.

If you, with the help of an LLM, struggle to keep a monolith organized, I am positive you will find it even harder to build microservices.

Good luck in your journey, I hope you learn a ton!

replies(1): >>45387921 #
9. RedNifre ◴[] No.45387908[source]
I vibe coded an invoice generator by first vibe coding a "template" command line tool as a bash script that substitutes {{words}} in a LibreOffice Writer document (those are just zipped XML files, so you can unpack them to a temp directory and substitute raw text without XML awareness), and in the end it calls LibreOffice's CLI to convert it to PDF. I also asked the AI to generate a documentation text file, so that the next AI conversation could use the command as a black box.

The vibe coded main invoice generator script then does the calendar calculations to figure out the pay cycle and examines existing invoices in the invoice directory to determine the next invoice number (the invoice number is in the file name, so it doesn't need to open the files). When it is done with the calculations, it uses the template command to generate the final invoice.

This is a very small example, but I do think that clearly defined modules/microservices/libraries are a good way to only put the relevant work context into the limited context window.

It also happens to be more human-friendly, I think?

10. bhu8 ◴[] No.45387921{4}[source]
Noted. Thanks!
11. daxfohl ◴[] No.45388535[source]
I think there will be a couple of benefits to using agents soon. They should produce a more consistent codebase, which makes patterns easier to see and work with, and means less reinventing the wheel. Migrations should also be much faster, both within and across teams, so much less struggling to maintain two ways of doing something for years - which again leads to simpler and more consistent code. Finally, the increased speed should lead to more serializability of feature additions, so fewer problems coordinating changes happening in parallel: conflicts, redundancies, etc.

I imagine over time we'll restructure the way we work to take advantage of these opportunities and get a self-reinforcing productivity boost that makes things much simpler, though agents aren't quite capable enough for that breakthrough yet.

12. whstl ◴[] No.45388791[source]
I "vibe coded" a Gateway/Proxy server that did a lot of request enrichment and proprietary authz stuff that was previously in AWS services. The goal was to save money by having a couple high-performance servers instead of relying on cloud-native stuff.

I put "vibe coded" in quotes because the code was heavily reviewed after the process, I helped when the agent got stuck (I know pedants will complain), and this was definitely not my first rodeo in this domain - I just wanted to see how far an agent could go.

In the end it had a few modifications and went into prod, but to be really fair it was actually fine!

One thing I vibe coded 100%, and barely looked at the code until the end, was a macOS menubar app that shows some company stats. I wanted it in Swift but WITHOUT Xcode. It was super helpful in that regard.