
196 points bratao | 3 comments
c0l0 No.41085314
This would have been such a great resource for me just a few weeks ago!

We wanted to finally encrypt the L2 links between our DCs and got quotes from a number of providers for hardware appliances, and I was like, "no WAY this ought to cost that much!", and went off to try to build something myself that hauled Ethernet frames over a wireguard overlay network at 10Gbps using COTS hardware. I did pull it off after about ten days of work, undercutting the cheapest offer by about 70% (and the most expensive one by about 95% or so...), but there was a lot of intricate reading and experimentation involved.

I am looking forward to validating my understanding against the content of this article - it looks very promising and comprehensive at first and second glance! Thanks for creating and posting it.

replies(2): >>41085350 #>>41085983 #
1. freedomben No.41085350
Are you able to share your code? I'd be fascinated to see how you would do that.
replies(2): >>41085957 #>>41092289 #
2. jasonjayr No.41085957
I just shared this a moment ago in another comment, but:

https://github.com/m13253/VxWireguard-Generator

https://gitlab.com/NickCao/RAIT

Both build a set of Wireguard configurations so you can set up an L2 mesh, and then run whatever routing protocol you want on them (Babel, BGP, etc.).

(Not the OP, but I use the first one in my own multi-site network mesh between DO, AWS, 2x physical DCs, and our office.)
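
To give a feel for the pattern those generators automate (greatly simplified, and not the actual output of either project): roughly one wireguard link per remote peer, with a routing daemon such as babeld speaking across the tunnels. Every interface name, key path, pubkey, hostname, and port below is a placeholder.

    # one wireguard interface per remote peer
    ip link add wg-peer1 type wireguard
    wg set wg-peer1 listen-port 51821 private-key /etc/wireguard/node.key \
        peer <peer1-pubkey> endpoint peer1.example.net:51821 allowed-ips ::/0
    ip link set wg-peer1 up
    # ...repeat for wg-peer2, wg-peer3, etc., then let the routing daemon
    # (babel in this case) exchange routes across all of the tunnels:
    babeld wg-peer1 wg-peer2 wg-peer3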

3. c0l0 No.41092289
It's not really (my) code, it's just some clever configuration, mostly done via systemd-networkd.

At the "outside", there are two NICs with SFP+ ports that are connected via single-mode optical fiber running through the city - let's call these NICs eth0 on each of their nodes. The eth0 interfaces have RFC1918 IP addresses assigned and can talk IP with each other. Between those nodes, a wireguard instance encrypts traffic in an inner RFC1918 network of its own - that is wg0 on each node. (Initially, I had IPv6 ULA networks prepared for these two purposes, but afaict there's some important offload support still missing for IPv6 in Linux, and performance was quite severely hampered by that.) Then, each of the nodes defines a GRETAP netdev that has, as its endpoint, the peer's wireguard interface address - that interface is grt0.
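
A minimal sketch of that layering with plain ip(8)/wg(8) commands (the real setup is expressed as systemd-networkd config; all addresses, keys, and ports here are made-up placeholders):

    # underlay: eth0 over the dark fiber, RFC1918 addressing
    ip addr add 10.0.0.1/30 dev eth0
    ip link set eth0 up

    # wireguard interface with its own inner RFC1918 network
    ip link add wg0 type wireguard
    wg set wg0 listen-port 51820 private-key /etc/wireguard/wg0.key \
        peer <peer-pubkey> endpoint 10.0.0.2:51820 allowed-ips 10.0.1.2/32
    ip addr add 10.0.1.1/30 dev wg0
    ip link set wg0 up

    # GRETAP netdev carrying Ethernet frames, with the peer's wg0 address
    # as its remote endpoint
    ip link add grt0 type gretap local 10.0.1.1 remote 10.0.1.2
    ip link set grt0 up

The other node mirrors this with the addresses swapped.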

Finally, on each side, another SFP+ NIC port (let's assume eth1) plugs into the local switch uplink port using a DAC. eth1 is configured in promiscuous mode, and some `tc-mirred(8)` magic makes sure every frame it receives gets replayed over grt0, and every frame that is received via grt0 gets replayed over eth1.
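
That tc-mirred cross-connect looks roughly like this (a sketch assuming the interface names above, not the exact production rules):

    # eth1 must see every frame coming from the switch
    ip link set eth1 promisc on
    ip link set eth1 up

    # attach ingress qdiscs so filters can act on incoming frames
    tc qdisc add dev eth1 handle ffff: ingress
    tc qdisc add dev grt0 handle ffff: ingress

    # redirect every frame arriving on eth1 out through grt0, and vice versa
    tc filter add dev eth1 parent ffff: protocol all matchall \
        action mirred egress redirect dev grt0
    tc filter add dev grt0 parent ffff: protocol all matchall \
        action mirred egress redirect dev eth1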

So it kinda looks like this in a (badly "designed") ad-hoc ASCII graph:

    [SWITCH]-<dac>-[ETH1]-<tc>-[GRT0]-[WG0]-[ETH0]-<fiber>-...
... with the whole shebang replicated once more, but in reverse, on the right-hand side of the <fiber> cable/element.

An earlier iteration I (briefly ;)) had in operation featured a Linux bridge instead of tc, but it quickly turned out that this won't work with a few L2 protocols that we unfortunately need in operation across these links (and group_fwd_mask won't cut it for them either, so patching the kernel would have been necessary), while tc-mirred can replay L2 traffic without any such restrictions.
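
For comparison, the bridge variant looked roughly like this (br0 is a placeholder name, not the real config). group_fwd_mask can only un-block some of the 01-80-C2-00-00-0X group addresses; the kernel keeps STP, MAC pause and LACP frames restricted no matter what mask you write, which is why it wasn't enough here:

    ip link add br0 type bridge
    ip link set eth1 master br0
    ip link set grt0 master br0
    ip link set br0 up
    # forward the link-local group addresses the kernel allows
    # (bits 0-2 - STP, pause, LACP - cannot be enabled this way)
    echo 0xfff8 > /sys/class/net/br0/bridge/group_fwd_mask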