432 points tosh | 35 comments
1. danenania ◴[] No.39996120[source]
People interested in Aider (which is an awesome tool) might also be interested in checking out my project Plandex[1]. It's terminal-based like Aider and has a somewhat comparable set of features, but is more focused on using LLMs to work on larger and more complex tasks that span many files and model responses. It also uses a git-style CLI approach with independent commands for each action vs. Aider's interactive shell.

I studied Aider's code and prompts quite a bit in the early stages of building Plandex. I'm grateful to Paul for building it and making it open source.

1 - https://github.com/plandex-ai/plandex

replies(8): >>39996346 #>>39996360 #>>39996792 #>>39997471 #>>39997512 #>>40002284 #>>40004973 #>>40071618 #
2. joshstrange ◴[] No.39996346[source]
Do you have any plans to build IDE plugins for this? I understand it's open source and anyone could add that, I was just wondering if that was even on the roadmap? Having this run in my IDE would just be so awesome with the diff tool I'm used to, with all the other plugins/hotkeys/etc I use.
replies(1): >>39996388 #
3. carom ◴[] No.39996360[source]
How do you situate changes in a file? That seems like the hard part to me since the LLM can't necessarily count to output a patch with line numbers.
replies(2): >>39996418 #>>39996713 #
4. danenania ◴[] No.39996388[source]
Yes, VSCode and JetBrains plugins are on the roadmap. Here's the current roadmap by the way: https://github.com/plandex-ai/plandex#roadmap-%EF%B8%8F (it's not exhaustive, but can give you a sense of where I'd like to take Plandex in the future).
replies(1): >>39996408 #
5. joshstrange ◴[] No.39996408{3}[source]
And I completely missed that somehow... My apologies. Thank you for pointing that out.
replies(1): >>39996437 #
6. danenania ◴[] No.39996418[source]
It does use line numbers, which definitely aren't infallible. That's why a `plandex changes` TUI is included to review changes before applying. Unfortunately no one has figured out a file update strategy yet that doesn't make occasional mistakes--probably we'll need either next-gen models or fine-tuning to get there.

That said, counting isn't necessarily required to use line numbers. If line numbers are included in the file when it's sent to the model, it becomes a text analysis task rather than a counting task. Here are the relevant prompts: https://github.com/plandex-ai/plandex/blob/main/app/server/m...
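As a minimal sketch of that idea (not Plandex's actual prompt format or update code; the `N: ` prefix and both function names are assumptions made for illustration), the file can be numbered before it goes to the model, and the model's answer can then be applied as a line-range replacement:

```python
from typing import List


def number_lines(text: str) -> str:
    """Prefix every line with its 1-based line number, e.g. '3: return x'."""
    return "\n".join(
        f"{i}: {line}" for i, line in enumerate(text.splitlines(), start=1)
    )


def apply_replacement(text: str, start: int, end: int, new_lines: List[str]) -> str:
    """Replace lines start..end (1-based, inclusive) with new_lines.

    Note: this sketch drops a trailing newline if the input had one.
    """
    lines = text.splitlines()
    if not (1 <= start <= end <= len(lines)):
        raise ValueError(f"line range {start}-{end} is outside 1-{len(lines)}")
    lines[start - 1:end] = new_lines
    return "\n".join(lines)


source = "def add(a, b):\n    return a - b\n"

# This numbered view is what would be sent to the model (hypothetical format).
print(number_lines(source))

# Suppose the model answers: "replace lines 2-2 with '    return a + b'".
print(apply_replacement(source, 2, 2, ["    return a + b"]))
```

The range check only catches replies that point outside the file; a wrong but in-range line number still needs a human review step.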

7. danenania ◴[] No.39996437{4}[source]
No worries, it's pretty far down in the readme :)
8. stavros ◴[] No.39996713[source]
Doesn't the software just give the LLM the line numbers?
9. stavros ◴[] No.39996792[source]
This looked cool and I was excited to try it until I realized that I either need a subscription, or I need to set up a server. Why does this need a server, when Aider just works via the cli?
replies(1): >>39996877 #
10. danenania ◴[] No.39996877[source]
First I should note that while cloud will have a subscription eventually, it's free for now. There's an anonymous trial (with no email required) for up to 10 plans or 10 model responses, and then just name and email is required to continue.

I did start out with just the CLI running locally, but it reached a point where I needed a database and thus a client-server model. Plandex is designed for working on many 'plans' at different levels of the project hierarchy (some users on cloud have 50+ after using it for a week), and there's also a fair amount of concurrency, so it got to be too much for a local filesystem or even something like a local SQLite db.

Plandex also has the ability to send tasks to the background, which I think will start to play a more and more important role as models get better and more capable of running autonomously for longer periods, and I want to add sharing and collaboration features in the future as well, so all-in-all I thought a client-server model was the best base to build from.

I understand where you're coming from though. That local-only simplicity is definitely a nice aspect of Aider.

replies(1): >>39996896 #
11. stavros ◴[] No.39996896{3}[source]
I had a second look and the server doesn't look too hard to deploy. I like that there's reasoning behind requiring it, although I suspect that SQLite is more than capable of doing this very easily.

I'm trying to deploy the server right now so I can try Plandex, it would be easier if I hadn't forgotten my Postgres password...

As a tip, self-hosting would be much easier (which may be something you don't want to do) if you provided a plain Docker image, then it would just be "pull the Docker image, specify the local directory, specify the DB URL, done".

By the way, why does it need a local directory if it has a database? What's stored in the directory?

replies(1): >>39996989 #
12. danenania ◴[] No.39996989{4}[source]
Agreed on providing a docker image. I made an issue to track it here: https://github.com/plandex-ai/plandex/issues/78

I do want to make self-hosting as easy as possible. In my experience, there will still be enough folks who prefer cloud to make it work :)

There's a local .plandex directory in the project which just stores the project id, and a $HOME/.plandex-home directory that stores some local metadata on each project--so far just the current plan and current branch.

replies(1): >>39997005 #
13. stavros ◴[] No.39997005{5}[source]
I see, thanks for the explanation! If you're only storing a bit of data, removing the requirement for a local directory would make deployment easier; these could just go into the database.
replies(1): >>39997071 #
14. danenania ◴[] No.39997071{6}[source]
Oh sorry, my comment was referring to the local files created by the CLI. The server uses the file system much more heavily in order to enable efficient version control with an embedded git repo for each plan. Everything in a plan that's version-controlled (context, the conversation, model settings, and tentative file updates) is stored in this repo instead of the database.
replies(1): >>39997079 #
15. stavros ◴[] No.39997079{7}[source]
Ah, that makes sense, thank you.
16. joshstrange ◴[] No.39997471[source]
I _cannot_ wait for you to get local models working with this (I know, they need function calling/streaming first). It's amazing! I burned through $10 like it was nothing and bigger context+local is going to make this killer IMHO. It needs additional guidance and with more context maybe loading lint rules into the context would get back code matching my coding style/guide but even as-is there is a ton of value here.

It was able to rewrite 10 files from Vue 2 Class Component syntax to Vue 3 Composition API (partially, some didn't get fully done) before I hit my budget limits. It would have needed another iteration or so to iron out the issues (plus some manual clean up/checking from me) but that's within spitting distance of being worth it. For now I'll use ChatGPT/Claude (which I pay for) to do this work but I will keep a close eye on this project, it's super cool!

replies(1): >>39997896 #
17. andoando ◴[] No.39997512[source]
Can we get someone to automate stuff like copying files, renaming stuff, setting env variables, any common tasks done in an OS?
replies(2): >>39997568 #>>39999110 #
18. danenania ◴[] No.39997568[source]
You could do this with Plandex (or Aider... or ChatGPT) by having it output a shell script then `chmod +x` it and run it. I experimented early on with doing script execution like this in Plandex, but decided to just focus on writing and updating files, as it seemed questionable whether execution could be made reliable enough to be worthwhile without significant model advances. That said, I'd like to revisit it eventually, and some more constrained tasks like copying and moving files around are likely doable without full-on shell script execution, though some scary failure cases are possible here if the model gets the paths wrong in a really bad way.
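A rough sketch of that "write a script, `chmod +x` it, run it" flow, assuming a POSIX system with `/bin/sh`; the function names and the `task.sh` file name are made up for illustration and this is not something Plandex itself does:

```python
import os
import stat
import subprocess
import tempfile
from typing import Optional


def write_executable_script(script_text: str, directory: str) -> str:
    """Save model-generated shell text to a file and set the user-execute bit."""
    path = os.path.join(directory, "task.sh")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write(script_text)
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR)  # same effect as `chmod u+x task.sh`
    return path


def run_if_approved(path: str, approved: bool) -> Optional[int]:
    """Run the script only after a human has reviewed it; return its exit code."""
    if not approved:
        return None
    return subprocess.run([path], check=False).returncode


with tempfile.TemporaryDirectory() as workdir:
    script = write_executable_script("#!/bin/sh\necho hello\n", workdir)
    with open(script, encoding="utf-8") as f:
        print(f.read())  # show the script for review before anything runs
    print(run_if_approved(script, approved=False))  # nothing is executed here
```

The explicit `approved` flag is the whole point: nothing runs until someone has read the script.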

OpenInterpreter is another project you could check out that is more focused on code/script execution: https://github.com/OpenInterpreter/open-interpreter

replies(1): >>39997605 #
19. andoando ◴[] No.39997605{3}[source]
I feel like what I am saying should be natively supported.

If you're worried about changes getting it wrong, just show a prompt with all the batched changes.

me > build my jar, move it to the last folder I copied it to, and run it.
LLM > built jar xyz.jar moving jar to x/y/z
me > yes.
me > redo last command.

Provide rollback/log for these features if need be.

I really don't think you even need an LLM for this. I feel like I can do it with a simple classifier. It just needs to be hooked into the OS, so that it can scan what you were doing, and replicate it.

For example if I keep opening up folder x and dropping a file called build.jar to folder y, a program should be able to easily understand "copy the new jar over"
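To make that concrete, here is a toy sketch. It is a plain frequency count rather than a trained classifier, and the `(verb, source, destination)` action format, the `min_count` threshold, and the function name are all assumptions; a real version would need the OS to supply the action history:

```python
from collections import Counter
from typing import List, Optional, Tuple

# One recorded user action: (verb, source path, destination path).
Action = Tuple[str, str, str]


def suggest_repeat(history: List[Action], min_count: int = 3) -> Optional[Action]:
    """Return the most frequent action if it was seen at least min_count times."""
    counts = Counter(history)
    if not counts:
        return None
    action, count = counts.most_common(1)[0]
    return action if count >= min_count else None


history = [
    ("copy", "x/build.jar", "y/build.jar"),
    ("open", "x", ""),
    ("copy", "x/build.jar", "y/build.jar"),
    ("copy", "x/build.jar", "y/build.jar"),
]
print(suggest_repeat(history))
```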

I imagine at some point this is going to be done at the OS level.

replies(1): >>39997858 #
20. danenania ◴[] No.39997858{4}[source]
It's a great concept and I agree it will definitely exist at some point, but working a lot with GPT-4 has made me viscerally aware of how many different ways something like "build my jar, move it to the last folder I copied it to, and run it" can be spectacularly misinterpreted, and how much context is needed for that command to have any hope of being understood. The other big issue is that there is no rollback for a `rm` or `mv` command that screws up your system.

I had similar ideas when I started on Plandex. I wanted it to be able to install dependencies when needed, move files around, etc., but I quickly realized that there's just so much the model needs to know about the system and its state to even have a chance of getting it right. That's not to say it's impossible. It's just a really hard problem and I'd guess the first projects/products to nail it will either come from the OS vendors themselves, or else from people focusing very specifically on that challenge.

replies(2): >>39998227 #>>39998242 #
21. danenania ◴[] No.39997896[source]
Thanks for trying it and your feedback. I'm keeping tabs on open source/local models and will include them as soon as it's feasible.

I hear you on the API costs. You should see my OpenAI bills from building Plandex :-/

replies(2): >>39998360 #>>39998391 #
22. andoando ◴[] No.39998227{5}[source]
You're right, there is a lot of ambiguity there. I think being able to scan user actions helps a ton with this though, because you know exactly the steps the user took. Most of the times I want this is when I literally have to repeat the same set of actions 5+ times and writing a script to do it isn't worth it. I want to be able to just save/train the model and have it do what I want. Today I literally built a jar 50 times, with each time having to open up two folders and copy files between the same two directories. Massively annoying.

There is still some ambiguity there because cases might slightly differ, you're right.

For rm/mv: mv is easily reversible, no? You just need to store some context. Same with rm, just copy it to a temp directory. But again, with a confirmation prompt it's a non-issue either way.
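Something like this sketch is what I mean by storing context. The class and method names are made up, it only covers the simple cases, and it refuses to overwrite anything, which is exactly the case where a plain `mv` loses data:

```python
import os
import shutil
import tempfile
from typing import List, Tuple


class UndoableFileOps:
    """Journal every move so it can be reversed; 'remove' moves to a trash dir."""

    def __init__(self, trash_dir: str) -> None:
        self.trash_dir = trash_dir
        self.journal: List[Tuple[str, str]] = []  # (original path, current path)
        os.makedirs(trash_dir, exist_ok=True)

    def move(self, src: str, dst: str) -> None:
        if os.path.exists(dst):
            raise FileExistsError(f"refusing to overwrite {dst}")
        shutil.move(src, dst)
        self.journal.append((src, dst))

    def remove(self, path: str) -> None:
        # Prefix with the journal length so equal basenames do not collide.
        name = f"{len(self.journal)}-{os.path.basename(path)}"
        self.move(path, os.path.join(self.trash_dir, name))

    def undo_last(self) -> None:
        original, current = self.journal.pop()
        if os.path.exists(original):
            raise FileExistsError(f"cannot restore, {original} exists again")
        shutil.move(current, original)


with tempfile.TemporaryDirectory() as root:
    ops = UndoableFileOps(os.path.join(root, "trash"))
    jar = os.path.join(root, "build.jar")
    with open(jar, "w", encoding="utf-8") as f:
        f.write("fake jar contents")
    ops.remove(jar)      # the file now lives in the trash directory
    ops.undo_last()      # and is restored to its original path
    print(os.path.exists(jar))
```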

23. andoando ◴[] No.39998242{5}[source]
Also maybe we need a slightly different kind of LLM, which instead of just assuming its top predictions are correct, gives you actions at critical steps on how to proceed.

build a jar. > I can build a jar with x,y,z, which do you want?

24. panqueca ◴[] No.39998360{3}[source]
If you're thinking it is expensive, wait until you start to play with Claude Opus. Sooner or later I will declare bankruptcy.

Nice product BTW. I really liked the UI, it's very polished.

25. sdesol ◴[] No.39998391{3}[source]
> You should see my OpenAI bills from building Plandex :-/

Sorry if you have answered this before, but can you estimate how many man hours were saved using OpenAI or was the high usage more test related?

replies(1): >>39998426 #
26. danenania ◴[] No.39998426{4}[source]
I have used Plandex a lot to help build Plandex faster, but yeah the high API costs are much more due to testing, where I need to run large tasks over and over in rapid succession in order to debug problems or iterate on the built-in prompts.
27. pax ◴[] No.39999110[source]
open interpreter can do that

https://github.com/OpenInterpreter/open-interpreter

28. j45 ◴[] No.40002284[source]
Can you describe how you read in a repo any better than aider?

Aider has a few blog posts speaking to it.

replies(1): >>40003583 #
29. danenania ◴[] No.40003583[source]
I haven't yet tried incorporating tree-sitter as Aider does to load in all definitions in the repo. In Plandex, the idea is more to load in just the files that are relevant to what you're building before giving a prompt. You can also load directory layouts (with file names only) with `plandex load some-dir --tree`.

I like the idea of something like `plandex load some-dir --defs` to load definitions with tree-sitter. I don't think I'd load the whole repo's defs by default like Aider does (I believe?), because that could potentially use a lot of tokens in a large repo and include a lot of irrelevant definitions. One of Plandex's goals is to give the user granular control over what's in context.
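For a sense of what a definitions-only view looks like, here's a small sketch. It uses Python's built-in `ast` module as a stand-in for tree-sitter (so it only handles Python source), `list_definitions` is a made-up name, and the `--defs` flag is just an idea at this point, not an existing option:

```python
import ast
from typing import List, Union

FunctionNode = Union[ast.FunctionDef, ast.AsyncFunctionDef]


def _signature(node: FunctionNode) -> str:
    """Render a function as 'def name(arg1, arg2)' using positional args only."""
    args = ", ".join(a.arg for a in node.args.args)
    return f"def {node.name}({args})"


def list_definitions(source: str) -> List[str]:
    """List top-level functions, classes, and methods without their bodies."""
    defs: List[str] = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            defs.append(_signature(node))
        elif isinstance(node, ast.ClassDef):
            defs.append(f"class {node.name}")
            for item in node.body:
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
                    defs.append("  " + _signature(item))
    return defs


example = (
    "class Plan:\n"
    "    def load(self, path):\n"
    "        return path\n"
    "\n"
    "def main(argv):\n"
    "    pass\n"
)
print("\n".join(list_definitions(example)))
```

The output is far smaller than the full file, which is the token saving that makes per-directory definition loading attractive.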

But for now if you wanted to do something where definitions across the whole repo would be helpful (vs. loading in specific files or directories) then Aider is better at that.

replies(1): >>40004483 #
30. j45 ◴[] No.40004483{3}[source]
But.. it was said Plandex is similar or worthy of consideration next to something like Aider.. when this post is about Aider.

Understanding a codebase, along with the in/outs between the calls is pretty vital to any codebase, especially the larger a codebase gets.

I'm not attached to the way Aider or Plandex does anything, but I'm still not clear on which scenarios it's worth considering compared to Aider, or vice versa. Aider seems pretty unique and stands alone on a number of things. I'll still install Plandex and try it out.

Without details, it's a little surprising a post like this could get upvoted so much.

replies(1): >>40004678 #
31. danenania ◴[] No.40004678{4}[source]
Plandex isn't really focused on understanding a whole codebase. It can be used for that to some extent, but it's more designed for building larger features where you'd load in maybe 5-20 relevant files and then have Plandex build the whole feature across potentially dozens of steps and model calls. With Aider (or ChatGPT) it would require a lot more back-and-forth and user interaction to get a similar result.

Like I said, I think Aider's use of tree-sitter is a great concept and something I'd like to incorporate in some way. I'm not at all trying to claim that Plandex is 'better' than Aider for every use case. I think they are suited to different kinds of tasks.

32. wbreau ◴[] No.40004973[source]
I apologize if I'm not posting this in the correct place, but I've been trying to test this out (looks like it'll be fantastic, btw) and I keep running into a 429 error that I've exceeded my current quota for chatgpt but I really don't think I have, which leads me to believe that maybe it's not really taking my api key when I run the export command. Is there a way to check or another reason I could be getting this error?
replies(1): >>40005201 #
33. danenania ◴[] No.40005201[source]
You'd be getting a different error if your key wasn't getting through at all. Can you double check that you're using the right OpenAI api key, have enough api credits (as distinct from chatgpt quota), and haven't hit any max spend limits? You can check here: https://platform.openai.com/account/api-keys
34. warvstar ◴[] No.40071618[source]
Any chance for multi agents and/or a vs-code extension (for diffing / applying changes)? nvm on the vscode extension, I just found your other comment.
replies(1): >>40071709 #
35. danenania ◴[] No.40071709[source]
Could you explain what you mean by "multi agents"?