Other question: why reimplementing your framework, rather than using an existing agent framework like Claude + MCP, or OpenAI + tool calling? Is it because you're using your own LM models, or just because you wanted more control on retries, etc?
There are not that many agent frameworks around at the moment.  If you want to be provider independent you most likely either use pydantic AI or the vercel AI SDK would be my guess.  Neither one have built-in solution for durable execution so you end up driving the loop yourself.  So it's not that I don't use these SDKs, it's just that I need to drive the loop myself.