I used ollama to build this and ollama supports tool calling natively, by passing a `tools=[...]` in the Python SDK. The tools can be regular Python functions with docstrings that describe the tool use. The SDK handles converting the docstrings into a format the LLM can recognize, so my tool's code documentation becomes the model's source of truth. I can also include usage examples right in the docstring to guide the LLM to work closely with all my available tools. No system prompt needed!
Moreover, I wrote all my tools in a separate module, and just use `inspect.getmembers` to construct the `tools` list that i pass to Ollama. So when I need to write a new tool, I just write another function in the tools module and it Just Works™
Paired with qwen 32b running locally, i was fairly satisfied with the output.