
469 points ghuntley | 1 comment
ofirpress ◴[] No.45001234[source]
We (the Princeton SWE-bench team) built an agent in ~100 lines of code that does pretty well on SWE-bench, you might enjoy it too: https://github.com/SWE-agent/mini-swe-agent
replies(7): >>45001287 #>>45001548 #>>45001716 #>>45001737 #>>45002061 #>>45002110 #>>45009789 #
simonw ◴[] No.45001548[source]
OK that really is pretty simple, thanks for sharing.

The whole thing runs on these prompts: https://github.com/SWE-agent/mini-swe-agent/blob/7e125e5dd49...

  Your task: {{task}}. Please reply
  with a single shell command in
  triple backticks.
  
  To finish, the first line of the
  output of the shell command must be
  'COMPLETE_TASK_AND_SUBMIT_FINAL_OUTPUT'.
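Those two prompts imply the whole control flow: ask the model for one shell command in triple backticks, run it, feed the output back, and stop when the output's first line is the finish marker. A minimal sketch of that loop, assuming a hypothetical `query_llm` callable standing in for a real model call (this is not the mini-swe-agent code itself):

```python
import re
import subprocess

FINISH_MARKER = "COMPLETE_TASK_AND_SUBMIT_FINAL_OUTPUT"

def extract_command(reply: str) -> str:
    """Pull the single shell command out of the model's triple-backtick block."""
    match = re.search(r"```(?:\w+\n)?(.*?)```", reply, re.DOTALL)
    if match is None:
        raise ValueError("model reply contained no triple-backtick block")
    return match.group(1).strip()

def run_agent(task: str, query_llm, max_steps: int = 10) -> str:
    """Loop: request one shell command, execute it, append the output to the
    conversation, and finish when the output's first line is the marker."""
    messages = [
        {"role": "system",
         "content": "You are a helpful assistant that can do anything."},
        {"role": "user",
         "content": f"Your task: {task}. Please reply with a single "
                    f"shell command in triple backticks."},
    ]
    for _ in range(max_steps):
        reply = query_llm(messages)
        command = extract_command(reply)
        result = subprocess.run(command, shell=True,
                                capture_output=True, text=True)
        output = result.stdout + result.stderr
        lines = output.splitlines()
        if lines and lines[0] == FINISH_MARKER:
            return output  # the final answer follows the marker line
        messages.append({"role": "assistant", "content": reply})
        messages.append({"role": "user", "content": f"Output:\n{output}"})
    raise RuntimeError("agent did not finish within max_steps")
```

The appeal is that the environment interface is just a shell: there is no tool schema, only text in and one command out per turn.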
replies(3): >>45002285 #>>45002729 #>>45003054 #
nivertech ◴[] No.45002285[source]

  system_template: str = "You are a helpful assistant that can do anything."
anything? Sounds like an AI Safety issue ;)
replies(1): >>45004257 #
greleic ◴[] No.45004257[source]
You’d be surprised how much time is wasted because LLMs “think” they can’t do something. You’d be less surprised that, when they do attempt it, they often choose some plainly ignorant path that cannot work.

There are theoretically impossible things to do, if you buy into only the basics. If you open your mind, anything is achievable; you just need to break out of the box you’re in.

If enough people keep feeding in that we need a time machine, the revolution will play out in all the timelines. Without it, Sarah Connor is lost.

replies(1): >>45008927 #
curvaturearth ◴[] No.45008927[source]
I'm already surprised by the number of things they think they can do but can't.