176 points by rgun | 1 comment
psviderski No.46144570
Hey, creator here. Thanks for sharing this!

Uncloud[0] is a container orchestrator without a control plane. Think multi-machine Docker Compose with automatic WireGuard mesh, service discovery, and HTTPS via Caddy. Each machine just keeps a p2p-synced copy of cluster state (using Fly.io's Corrosion), so there's no quorum to maintain.

I’m building Uncloud after years of managing Kubernetes, both in small environments and at a unicorn. I keep seeing teams reach for K8s when they really just need to run a bunch of containers across a few machines with decent networking, rollouts, and HTTPS. The operational overhead of K8s is brutal for what they actually need.

A few things that make it unique:

- uses the familiar Docker Compose spec, no new DSL to learn (see the sketch after this list)

- builds and pushes your Docker images directly to your machines without an external registry (via my other project, unregistry [1])

- imperative CLI (like Docker) rather than declarative reconciliation: a simpler mental model and easier debugging

- works across cloud VMs, bare metal, and even a Raspberry Pi at home behind NAT (all connected together)

- minimal resource footprint (<150 MB RAM)
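
To make the Compose point concrete, here's a minimal sketch of the kind of file Uncloud is meant to deploy. Only standard Compose fields are used; the service name, image, and port are made up for illustration, and any Uncloud-specific extensions are omitted:

    # compose.yaml: plain Docker Compose, nothing new to learn
    services:
      web:
        image: ghcr.io/example/web:1.0   # hypothetical image
        ports:
          - "8080:8080"                  # port the service listens on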

[0]: https://github.com/psviderski/uncloud

[1]: https://github.com/psviderski/unregistry

replies(11): >>46144726 >>46144768 >>46144784 >>46144846 >>46144978 >>46145074 >>46145335 >>46145652 >>46145808 >>46146155 >>46146244
avan1 No.46145652
Thanks for both great tools. I just didn't understand one thing: the request flow. Imagine we have 10 servers, and one request goes to server 1 while another goes to server 7, for example. Since it's zero downtime, how does it know that server 5 is updating, so that no requests go there until it's back up?
replies(1): >>46145963
psviderski No.46145963
I think there are two different cases here. Not sure which one you’re talking about.

1. External requests, e.g. from the internet via the reverse proxy (Caddy) running in the cluster.

The rollout works at the container level, not the server level. Each container registers itself in Caddy, so Caddy knows which containers to forward and distribute requests to.

When doing a rollout, a new version of the container is started first and registers itself in Caddy; only then is the old one removed. This is repeated for each service container, so at any point in time there are running containers serving requests.

Nothing tells a server that requests shouldn't go there. Uncloud just updates the upstreams in the Caddy config so that requests are only sent to containers that are up and healthy.
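
That ordering is the whole trick, so here's a sketch of it in Go. The types and helper names (startContainer, registerUpstream, and so on) are hypothetical placeholders, not Uncloud's actual API; only the sequence matches the description above.

    package main

    import "fmt"

    // Container and the helpers below are illustrative stubs, not
    // Uncloud's real types or functions.
    type Container struct{ ID string }

    func startContainer(image string) Container {
        c := Container{ID: "ctr-" + image}
        fmt.Println("started", c.ID)
        return c
    }
    func waitHealthy(c Container)        { fmt.Println("healthy", c.ID) }
    func registerUpstream(c Container)   { fmt.Println("caddy: add", c.ID) }
    func deregisterUpstream(c Container) { fmt.Println("caddy: remove", c.ID) }
    func stopContainer(c Container)      { fmt.Println("stopped", c.ID) }

    // rolloutService replaces old containers one at a time. Each new
    // container is started, healthy, and registered as a Caddy upstream
    // before its predecessor is deregistered and stopped, so there is
    // always at least one container receiving traffic.
    func rolloutService(old []Container, newImage string) {
        for _, oldCtr := range old {
            newCtr := startContainer(newImage)
            waitHealthy(newCtr)
            registerUpstream(newCtr)
            deregisterUpstream(oldCtr)
            stopContainer(oldCtr)
        }
    }

    func main() {
        rolloutService([]Container{{ID: "web-1"}, {ID: "web-2"}}, "web:v2")
    }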

2. Service-to-service requests within the cluster. In this case, a service DNS name resolves to a list of IP addresses, one per running container, and the client decides which one to send a request to, or how to distribute requests among them.

When the service is updated, the client needs to resolve the name again to get the up-to-date list of IPs. Many HTTP clients handle this automatically, so using http://service-name as an endpoint typically just works. But in this case, zero downtime still has to be handled by the client, e.g. by re-resolving and retrying when a container goes away.
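
To illustrate the second case: the lookup is plain multi-record DNS, so any language's resolver works. A minimal Go sketch, assuming a hypothetical service name "my-service" that is resolvable from inside the cluster:

    package main

    import (
        "fmt"
        "net"
    )

    func main() {
        // Inside the cluster, the service name resolves to one IP per
        // running container. "my-service" is a hypothetical name.
        ips, err := net.LookupHost("my-service")
        if err != nil {
            panic(err)
        }
        fmt.Println("current container IPs:", ips)
        // A naive client can pick ips[0] or round-robin over the slice.
        // After a rollout the old IPs disappear, so re-resolve (and
        // retry failed requests) to keep hitting only live containers.
    }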