←back to thread

LLM Inevitabilism

(tomrenner.com)
1613 points SwoopsFromAbove | 1 comments | | HN request time: 0s | source
Show context
DanMcInerney ◴[] No.44572686[source]
These articles kill me. The reason LLMs (or next-gen AI architecture) is inevitably going to take over the world in one way or another is simple: recursive self-improvement.

3 years ago they could barely write a coherent poem and today they're performing at at least graduate student level across most tasks. As of today, AI is writing a significant chunk of the code around itself. Once AI crosses that threshold of consistently being above senior-level engineer level at coding it will reach a tipping point where it can improve itself faster than the best human expert. That's core technological recursive self-improvement but we have another avenue of recursive self-improvement as well: Agentic recursive self-improvement.

First there was LLMs, then there was LLMs with tool usage, then we abstracted the tool usage to MCP servers. Next, we will create agents that autodiscover remote MCP servers, then we will create agents which can autodiscover tools as well as write their own.

Final stage of agents are generalized agents similar to Claude Code which can find remote MCP servers, perform a task, then analyze their first run of completing a task to figure out how to improve the process. Then write its own tools to use to complete the task faster than they did before. Agentic recursive self-improvement. As an agent engineer, I suspect this pattern will become viable in about 2 years.

replies(5): >>44572851 #>>44573037 #>>44573179 #>>44573245 #>>44574822 #
jdauriemma ◴[] No.44573037[source]
> they're performing at at least graduate student level across most tasks

I strongly disagree with this characterization. I have yet to find an application that can reliably execute this prompt:

"Find 90 minutes on my calendar in the next four weeks and book a table at my favorite Thai restaurant for two, outside if available."

Forget "graduate-level work," that's stuff I actually want to engage with. What many people really need help with is just basic administrative assistance, and LLMs are way too unpredictable for those use cases.

replies(2): >>44573855 #>>44574556 #
babelfish ◴[] No.44573855{3}[source]
OpenAI Operator can do that task easily, assuming you've configured it with your calendar and Yelp login.
replies(1): >>44573991 #
1. jdauriemma ◴[] No.44573991{4}[source]
That's great to hear - do you know what success rate it might have? I've used scheduled tasks in ChatGPT and they fail regularly enough to fall into the "toy" category for me. But if Operator is operating significantly above that threshold, that would be remarkable and I'd gladly eat my words.