Intended for executing AI-generated code, for CI/CD runners, or for off-chain AI DApps. Mainly to avoid the dangers and mess of Docker-in-Docker.
Super easy to use via the CLI / Python SDK, and friendly to AI engineers who usually don't want to mess with VM orchestration and networking too much.
Defense-in-depth philosophy.
Would love to get feedback (and contributors: clear & exciting roadmap!), thx
I wonder if nsjail or gVisor may be useful as well. Here's a more comprehensive list of sandboxing solutions: https://github.com/restyler/awesome-sandbox
Glancing at the readme, is your business model technical support? Or what's your plan with this?
Anything interesting to share around startup time for large artifacts, scaling, passing through persistent storage (or GPUs) to these sandboxes?
Curious what things like 'Multi-node cluster capabilities for distributed workloads' mean exactly? inter-VM networking?
Exactly! The main local requirement is hardware virtualization (e.g. /dev/kvm), which should be fine on your local Linux machine. It won't work on cloud machines or on Apple Silicon Macs in its current form, but that may change if I extend support.
By multi-node I mean that so far I only support one k8s node, i.e. one machine, but I'm adding support for multiple soon. Still, on 20 CPUs I can run 50+ VM pods with fractional vCPU limits.
GPU passthrough: not possible today, because I use Firecracker as the VMM and it doesn't support PCI passthrough. On the roadmap: add QEMU support, which would make GPU passthrough possible.
Inter-VM networking: already possible on a single node: 1 VM = 1 pod, and you can have multiple pods per node (have a look at utils/stress-test.sh). Right now I default to deny-all ingress for safety (by default k8s allows inter-pod communication), but I can make ingress configurable.
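For reference, that default deny-all ingress is just a standard Kubernetes NetworkPolicy, roughly like this (illustrative manifest, not Katakate-specific syntax):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: default
spec:
  podSelector: {}  # selects every pod in the namespace
  policyTypes:
    - Ingress  # no ingress rules listed, so all ingress is denied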
Startup time: a second to a few seconds, depending on the base image (alpine, ubuntu, etc.) and whether you use a before_script (commands I execute before the network lockdown).
Large artifacts: you can configure the resources allocated to a VM pod in the sandbox config; under the hood it basically uses k8s resource limits.
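As a rough sketch (the resources keys below are illustrative, so check the docs for the exact field names), a sandbox config could look like:

name: heavy-build
image: ubuntu:latest
namespace: default
before_script: apt-get update  # runs before the network lockdown
resources:
  cpu: "500m"  # fractional vCPU, k8s-style
  memory: "2Gi"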
Let me know if you have any other questions! Happy to help.
Note: I use k3s's internal kubectl and containerd, to avoid messing with your own if you already have them installed. That means you can run commands like "k3s kubectl ..."
And thank you for the compliments on the stack.
IMHO that's kind of a red flag. There's a happy path here where it's successful but stays low-maintenance enough that you just work on it in your spare time, or it takes off and gets community support, or you get sponsorships or the like. But there's also a path where in a year or two it becomes your job and you decide to monetize by rug-pulling: announcing that actually paying the bills is more important than staying 100% open source. Not a dig at you, just something that's happened enough times that I get nervous when people don't have a plan, and therefore don't have a plan to avoid the outcome that creates problems for users.
I like the Docker model, for instance: free for companies under 250 employees and $10m/y revenue.
In any case, it will always be open-source.
Those paid enterprise features wouldn't come from closing the source: they would come from the compliance guarantees of a particular SaaS-offered infra setup, which anybody else could reproduce. Just like Hugging Face.
For anyone curious:
- Docs: https://docs.katakate.org
- LangChain Agent tutorial: https://docs.katakate.org/guides/langchain-agent
It's getting late where I am, so I'm heading to bed — looking forward to replying to any new comments tomorrow!
Hey, we built coderunner[1] exactly for this purpose. It's completely local. We use Apple containers for this (each container maps 1:1 to a lightweight VM).
1. Coderunner - https://github.com/instavm/coderunner
name: project-build
image: alpine:latest
namespace: default
egress_whitelist:
  - "1.1.1.1/32" # Cloudflare DNS
  - "8.8.8.8/32" # Google DNS
This is basically a wide-open network policy as far as data exfiltration goes, right? Malicious code just has to resolve <secret>.evil.com and Google/CF will forward that query to the evil resolver.
Yes, blocking DNS exfiltration requires DNS filtering at the cluster level. That's what the Cilium integration will add; it's in the top 3 on the roadmap (top of the readme).
DNS resolution is required for basic Kubernetes functionality and hostname resolution within the cluster.
That's said explicitly in several places in the docs: "DNS to CoreDNS allowed"
One thing I could do is expose this in the config, to let the user block all DNS resolution until Cilium is integrated. LMK if desired!
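To give an idea, Cilium expresses DNS filtering as a standard CiliumNetworkPolicy, roughly like this sketch (domains are illustrative; the eventual Katakate config would abstract this away):

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: dns-allowlist
  namespace: default
spec:
  endpointSelector: {}  # all pods in the namespace
  egress:
    - toEndpoints:  # DNS queries may only go to kube-dns/CoreDNS...
        - matchLabels:
            k8s:io.kubernetes.pod.namespace: kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: ANY
          rules:
            dns:  # ...and only for whitelisted names
              - matchPattern: "*.pypi.org"
    - toFQDNs:  # other egress only to IPs resolved from allowed names
        - matchPattern: "*.pypi.org"

With something like that in place, <secret>.evil.com simply doesn't resolve, which closes the exfiltration channel discussed above.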
It lets you narrow the permission scope of an executable using simple command line wrappers.
Yes, but it's not great for that to be an optional config setting. Trivially easy data exfiltration methods shouldn't be possible at all in a tool like this, let alone enabled by default.
I want to be able to recommend that people try this out without having to tell them about the 5 different options they need to configure for it to actually be safe. That ends up defeating the purpose of the tool, in my opinion.
Some use cases will require mitmproxy whitelists as well, e.g. denying container image pulls by default except for images matching the whitelist.
I'll also add whitelist/deny rules for container pulling to the roadmap.
Thanks!
gVisor isolates containers by intercepting system calls in a user-space kernel, so it can still be vulnerable to sandbox escape via gVisor bugs, though not directly through Linux kernel exploits (since gVisor doesn’t expose the host kernel to the container).
Katakate also provides more than isolation: it offers orchestration through Kubernetes (K3s). You could create a gVisor RuntimeClass in Kubernetes to orchestrate gVisor sandboxes, but that would require extra setup.
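For context, the RuntimeClass object itself is tiny (standard Kubernetes; the extra setup is installing gVisor's runsc and wiring it into containerd on each node):

apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc  # containerd must be configured with the runsc shim

Pods then opt in by setting spec.runtimeClassName: gvisor.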
Our native K8s support and exposure of the K8s API also make it friendly to DevOps.
Finally, our deploy/infra stack is lean and fits in a single Ansible playbook, which makes it easy to understand and contribute to, and lets you rapidly gain full ownership of the stack.