Tailscale raises $100M

(tailscale.com)

arsome ◴[04 May 22 14:39 UTC] No.31261100[source]▶

>>31259950 (OP) #

I was going to try Tailscale, but it seemed the only way to do so as an individual was to log in with a third-party cloud provider, which I in no way want tied into my networks.

I gave up and just set up WireGuard directly instead. I don't trust Tailscale either if that's their attitude towards privacy; it has permanently marred my view of their product.

replies(10): >>31261128 #>>31261230 #>>31261250 #>>31261558 #>>31261667 #>>31261807 #>>31261815 #>>31261981 #>>31262022 #>>31262899 #

1. JeremyNT ◴[04 May 22 14:49 UTC] No.31261250[source]▶

>>31261100 #

Indeed, this is why I won't use it either. I settled on Slack's Nebula [0] instead of wireguard because it handles direct p2p communication between nodes automatically.

There also exists an open source implementation of the tailscale control server [1] that you could self host.

[0] https://github.com/slackhq/nebula

[1] https://github.com/juanfont/headscale

replies(2): >>31261607 #>>31261688 #

2. rhuber ◴[04 May 22 15:12 UTC] No.31261607[source]▶

>>31261250 (TP) #

(Nebula coauthor here)

People sometimes ask me to describe the differences between Nebula and Tailscale. One of the most important relates to performance and scale. Nebula can handle the amount of internal network traffic and scalability of nodes (100k+ nodes, constant churn) required on a large network like Slack's, but Tailscale cannot. Tailscale's performance is fine for many situations, but not suitable for infrastructure. It is just a fundamentally different set of goals.

Nebula was created and open sourced before Tailscale was offering their product, but their architecture is similar to older offerings in the market, and is something we purposely avoided when creating Nebula.

Fwiw, I even recommend Tailscale to friends who want to do things like connect to their Plex server or Synology or [other thing] at home remotely. It simplifies this kind of thing greatly and doesn't require you to set up any infrastructure you control directly, which can be a headache for folks who just want to reach a handful of computers/devices.

replies(6): >>31261776 #>>31261960 #>>31262150 #>>31262492 #>>31263218 #>>31264233 #

3. discardedrefuse ◴[04 May 22 15:17 UTC] No.31261688[source]▶

>>31261250 (TP) #

Absolutely love nebula and really wanted it to win when I did my overlay network shootout (for personal use). But device on-boarding and management was overly complex for a lay person (I have a couple users that would require access).

I settled on ZeroTier for now. Unfortunately, I don't think ZeroTier is my long term solution. Their self-hosted option comes with a plethora of caveats that make it basically unusable. And I'm always scared companies that offer free versions of their paid product will eventually neuter the free tier.

I'll be keeping an eye on headscale. Hopefully they get their mobile client situation in order.

replies(1): >>31264257 #

4. stavros ◴[04 May 22 15:22 UTC] No.31261776[source]▶

>>31261607 #

Does Nebula have anything like Tailscale's rules engine? I am absolutely in love with being able to configure all my connections by just specifying a JSON file somewhere. No need to have firewalls, the configuration specifies which service or user can talk to which.

That having been said, I am also wary of using Tailscale for the same reason as above: I have to trust both Tailscale and GitHub? I can maybe justify trusting Tailscale, but trusting GH/Microsoft/another SSO provider is a bridge too far.

replies(1): >>31261821 #

5. rhuber ◴[04 May 22 15:26 UTC] No.31261821{3}[source]▶

>>31261776 #

It does! In fact replacing AWS security groups and making them cross region and cross platform was probably the first goal of the project. My coauthor, Nate, wrote Nebula's internal firewall code before we wrote a single line of the actual protocol, because he wanted to ensure it was performant enough for massive scale.

replies(1): >>31262134 #

6. JeremyNT ◴[04 May 22 15:36 UTC] No.31261960[source]▶

>>31261607 #

> Fwiw, I even recommend Tailscale to friends who want to do things like connect to their Plex server or Synology or [other thing] at home remotely. It simplifies this kind of thing greatly and doesn't require you to set up any infrastructure you control directly, which can be a headache for folks who just want to reach a handful of computers/devices.

First, thanks for working on Nebula! It's great.

Nebula seems to be about 95% there. The functionality it actually does provide once set up is really great. It's just missing the 5% that is arguably the most important for a huge number of people: a simple way to do the configuration management bits such as device enrollment, revocations, key rotations, that sort of thing.

If you are a home user, with a small network, the overhead of doing things manually is low, but you need to be patient and technical enough to read the docs and do it right initially. If you're a big enough organization I guess you can write your own tooling. But for any small shop or any non-technical home user this is not going to fly and you will bounce off it.

I don't know if the plan is to create a commercial offering for this side of the house (it would make sense...) but as far as I'm concerned, this is the only reason that Tailscale is so successful and Nebula is lesser known (despite Nebula's advantages in other ways that may be more relevant to technical users).

replies(1): >>31262675 #

7. stavros ◴[04 May 22 15:49 UTC] No.31262134{4}[source]▶

>>31261821 #

Well that is great, thank you! I will play with it today.

replies(1): >>31264527 #

8. crawshaw ◴[04 May 22 15:50 UTC] No.31262150[source]▶

>>31261607 #

Tailscalar here. Tailscale can handle 100k+ nodes with lots of churn just fine.

replies(1): >>31262350 #

9. rhuber ◴[04 May 22 16:04 UTC] No.31262350{3}[source]▶

>>31262150 #

Fair enough. I am sure the key distribution is fast and all that, but not needing peer key distribution at all was a goal and the overhead associated is less scalable than just not doing it at all. Regardless, very cool that you can handle that many nodes, which is a hard problem. I assume you do just-in-time key distribution or something, because (n-1) distribution of peer keys would be ... less than ideal.

Anywho, the more important bit is my point about performance. Nebula is significantly faster than userspace Wireguard, and plain userspace Wireguard is (last I checked) a bit faster than Tailscale, due to the additional code needed for things like your ACLs. At gigabit type scale it is probably fine and not noticeable, but at Slack, we needed to scale to 10G+ on links, while ensuring we didn't take a significant hit on CPU resources.

Again, I think Tailscale is very good for its target use case as a VPN replacement, and congrats on raising these funds!

replies(1): >>31264794 #

10. vgel ◴[04 May 22 16:15 UTC] No.31262492[source]▶

>>31261607 #

> People sometimes ask me to describe the differences between Nebula and Tailscale. One of the most important relates to performance and scale. Nebula can handle the amount of internal network traffic and scalability of nodes (100k+ nodes, constant churn) required on a large network like Slack's, but Tailscale cannot. Tailscale's performance is fine for many situations, but not suitable for infrastructure. It is just a fundamentally different set of goals.

Making broad claims like this without a source or links to benchmarks feels like FUD to me. For example Tailscale's comparison page on performance (https://tailscale.com/kb/1148/tailscale-vs-nebula/#performan...) doesn't mention a meaningful performance difference, so if you're claiming they're not telling the truth (by omission), I'd hope to see more to that than just a straight assertion, even just "We tried Tailscale in Slack's network and it wasn't able to keep up with our usage patterns".

replies(1): >>31262546 #

11. rhuber ◴[04 May 22 16:19 UTC] No.31262546{3}[source]▶

>>31262492 #

Another fair criticism. We will publish the benchmarks and make them repeatable (which most existing ones I've found don't bother to do). We hadn't done so because Tailscale isn't really seen as a direct competitor to what the Nebula project is doing, but if people want numbers, that's a thing we are happy to provide.

replies(2): >>31265922 #>>31266998 #

12. rhuber ◴[04 May 22 16:29 UTC] No.31262675{3}[source]▶

>>31261960 #

The Nebula CA we built at Slack was very specific to Slack's internal devops, and just wasn't generalizable. It is highly automated there, and is custom tooling, just as you describe. The open source version is somewhat bare bones (a command line tool for CA vs something like vault).

I will say that the OSS tooling of Nebula is everything someone needs to stand up an entire working network on every common platform (linux/mac/windows/ios/android), but there is a definite gap in simplification that we need to address to make it easier for smaller scale use cases.

We actually have a managed enterprise Nebula offering at my current gig, but that's rather a different market than Tailscale, so I'm speaking here as a Nebula OSS project lead rather than on behalf of that company. The commercial offering is targeted at large enterprises, because that's the market where Nebula has unique advantages. It also means we don't currently have a freemium or SMB-type offering, and are not prioritizing creating one at all. I don't want to give people false hope that we will, and would prefer to see the OSS project improve to address the small-to-medium use cases.

13. ncmncm ◴[04 May 22 17:12 UTC] No.31263218[source]▶

>>31261607 #

See, I have seen promotions of Tailscale and Zerotier before, but this is the first I have heard of Nebula. If with Nebula I am not beholden to some internet behemoth who may cancel my authentication without notice, I am motivated to try it.

14. FL410 ◴[04 May 22 18:40 UTC] No.31264233[source]▶

>>31261607 #

Nebula rocks!

15. FL410 ◴[04 May 22 18:42 UTC] No.31264257[source]▶

>>31261688 #

I am curious what you found complex - was it the PKI? I was able to get Nebula up and running WAY faster than any of the others. It's two (well really only one) binaries and a config file - the simplicity is awesome.

replies(2): >>31264889 #>>31264992 #

16. stavros ◴[04 May 22 19:04 UTC] No.31264527{5}[source]▶

>>31262134 #

Ah, it looks like the firewall rules need to be copied to each host separately. That's not a dealbreaker, but not as easy to deploy as having them managed centrally (by the lighthouse, I guess?).

17. lupire ◴[04 May 22 19:32 UTC] No.31264794{4}[source]▶

>>31262350 #

> the overhead associated is less scalable than just not doing it at all

That's only true if you can actually articulate a reason why it won't scale to some magnitude that some user might actually need today or at some point in the future.

For example, Go may be "not as scalable as C" (or vice versa! Or both!), but what matters is the scale at which it actually needs to be deployed.

replies(1): >>31265394 #

18. JeremyNT ◴[04 May 22 19:43 UTC] No.31264889{3}[source]▶

>>31264257 #

It's easy to get started, but the issues come mostly from managing that "just a config file" over time.

Have a bunch of new nodes? Replacing a lighthouse? Revoking and replacing certs?

Here's a mistake that I made personally. Did you read the docs fully and realize that the default expiration for a CA is one year? The same is true for certificates. You need some kind of tooling to rotate certs every year, by default, or one day you'll find your entire overlay network disappears.

What about the ACL lists? Well, they're just stored in that same config file. What if you add a new service you didn't count on initially? Or you have a new class of clients?

What if your lighthouse needs to change its IP address? Or you need to retire and replace it outright?

And if you have hosts coming and going a lot, suddenly managing all those configuration files looks like quite a pain indeed...

None of this is unsolvable - assuming you have root on all the nodes you care about. You could even create tooling to automate these things with some kind of configuration management system (which indeed, if you are deploying to more than a handful of systems, you basically must do). But these pain points will eventually add up if you are just trying to connect to friends.
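
The certificate-expiry pitfall above is the kind of thing a small scheduled check can catch. Here is a minimal sketch in Python, assuming you already know each certificate's expiry time; the `needs_rotation` helper and the 30-day renewal window are hypothetical choices for illustration, not part of Nebula's tooling.

```python
from datetime import datetime, timedelta, timezone


def needs_rotation(not_after, now, renew_window=timedelta(days=30)):
    """Return True when the certificate expires within renew_window of now.

    not_after and now are timezone-aware datetimes. The 30-day default
    window is an arbitrary illustration, not a Nebula setting.
    """
    return not_after - now <= renew_window


# A certificate issued with the one-year default lifetime mentioned above.
issued = datetime(2022, 5, 4, tzinfo=timezone.utc)
not_after = issued + timedelta(days=365)

fresh = needs_rotation(not_after, issued)
near_expiry = needs_rotation(not_after, issued + timedelta(days=340))
print(fresh, near_expiry)
```

Run from cron or a configuration management job, a check like this turns a silent network outage into a reminder a few weeks ahead of time.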

replies(1): >>31265035 #

19. discardedrefuse ◴[04 May 22 19:53 UTC] No.31264992{3}[source]▶

>>31264257 #

I found it too complex for a lay person. On a regular computer or server it's not too bad. I can send someone a config file with the certs and keys already built in. That's easy enough. But on mobile it requires a back and forth exchange of keys over a different medium.

Compare that to ZeroTier where I can just tell someone, "install this app and punch in this Network ID". Also, ZT lets me control the entire network firewall from a centralized place, whereas Nebula does it on a per-client basis and requires new certs if device groups change.

I don't want to talk up ZT too much though. Their self-hosted option is a joke. There is no webui. You have to do everything via the API, including the firewall rules, and you have to write those rules in the non-human-readable format that their webui abstracts away. Worse still, their mobile apps won't work with the self-hosted option. I used them to get something up and running quickly, but I'll probably end up on Nebula anyways.

replies(1): >>31265633 #

20. discardedrefuse ◴[04 May 22 19:57 UTC] No.31265035{4}[source]▶

>>31264889 #

Just FYI, when you create a CA cert or sign certs with nebula-cert you can specify a -duration. Which I know doesn't help you after the fact, but it might help someone going forward.
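
For anyone working out what value to pass, here is a small Python sketch that turns a lifetime in years into an hours-based duration string. It assumes `-duration` accepts Go-style values such as `8760h` (check `nebula-cert`'s own help output to confirm); the `duration_value` helper is hypothetical and ignores leap days.

```python
def duration_value(years):
    """Convert whole years to an hours-based duration string.

    Assumes a Go-style duration format ("<hours>h") and 365-day years.
    """
    hours = years * 365 * 24
    return f"{hours}h"


# One year corresponds to the default lifetime discussed above.
one_year = duration_value(1)
ten_years = duration_value(10)
print(one_year, ten_years)
```

A longer lifetime means less frequent rotation, at the cost of a longer window in which a leaked key stays valid.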

replies(1): >>31266051 #

21. rhuber ◴[04 May 22 20:27 UTC] No.31265394{5}[source]▶

>>31264794 #

I mean... the title of the Tailscale blog post is "Tailscale raises $100M… to fix the Internet", and that's pretty massive scale. /s

I don't have 100k hosts on a large network to test deploying Tailscale, but if I did, I'd be benchmarking the cpu/network/storage overhead of telling 99,999 hosts about a new one that comes online, every time that happens, or every time its pubkey changes. You can optimize this away _if_ your "fan out" is not as large, but there are plenty of cases where every host on your network needs to talk to a particular host, so all of them need to know about its keys as soon as possible.
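
To put rough numbers on that fan-out, here is a back-of-the-envelope sketch in Python. It only counts notifications under the simplest model, where every key event is pushed to every interested peer; the `key_updates` helper and the event rate are hypothetical, and this is not a description of how Tailscale actually distributes keys.

```python
def key_updates(hosts, key_events, fan_out=None):
    """Count peer notifications caused by key_events joins or key changes.

    With fan_out=None, every other host must learn about each event
    (the n-1 case). Otherwise only fan_out peers need to be told.
    """
    recipients = hosts - 1 if fan_out is None else min(fan_out, hosts - 1)
    return recipients * key_events


# 100k hosts with a hypothetical 1,000 joins or key changes per day.
full_mesh = key_updates(100_000, 1_000)
limited = key_updates(100_000, 1_000, fan_out=50)
print(full_mesh, limited)
```

The gap between the two results is the optimization described above: it is only available when most hosts do not need to reach the host whose key changed.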

Again these aren't unsolvable problems, to a point, but we didn't want to solve a problem when we could avoid it entirely, so that's the path we chose. It removes complexity and is a good part of the reason the system we built has been resilient.

A complaint some people express about tailscale is the battery life on mobile (or at least iOS). This exists because there is coordination overhead on even idle tailscale nodes. Back when we ported Nebula to iOS, we sweated details like "how often it wakes the radios" and did a lot of profiling. I never turn Nebula "off" on my iPhone, and it just sits in there in the background not using any resources most of the time.

We worked hard to optimize this out of our architecture, so that Nebula avoids generating traffic that is unrelated to the actual communication between hosts or lookups to lighthouses. An idle nebula tunnel can truly be idle indefinitely, and that also matters as the set of hosts becomes larger.

I do not think the Nebula project and Tailscale are direct replacements for each other in any fashion, and afaik neither is trying to be. I'm just pointing out that different design goals led to unique advantages and disadvantages to each architecture.

22. api ◴[04 May 22 20:45 UTC] No.31265633{4}[source]▶

>>31264992 #

> Their self-hosted option is a joke. There is no webui.

There's a community developed one:

https://github.com/key-networks/ztncui

replies(1): >>31268518 #

23. SahAssar ◴[04 May 22 21:13 UTC] No.31265922{4}[source]▶

>>31262546 #

So "People sometimes ask me to describe the differences between Nebula and Tailscale" and the answer is "performance and scale", but you don't have clear comparisons for those numbers?

replies(1): >>31266049 #

24. rhuber ◴[04 May 22 21:24 UTC] No.31266049{5}[source]▶

>>31265922 #

We have an automated set of Ansible scripts that spin up large groups of hosts for Nebula performance regression testing, and a while back I added zerotier, tailscale, wireguard-userspace, wireguard, tinc, ipsec, and openvpn to that automation so I could get a sense of where things stand. I spent a lot of time optimizing each of the above options to make fair comparisons, but it was mostly for my own and the team's curiosity, and we weren't interested in playing benchmark-fight with similar software.

Publishing repeatable benchmarks is hard, and when doing open source work, it just hasn't been a priority. As I replied above, if I'm going to say it I should prove it, and I promised to do just that.

And a counterpoint: tailscale does mention in the "Tailscale vs Nebula" article on their website that performance is just about the same but similarly provides no proof. This is motivation enough for me to show proof of the opposite, I guess.

25. JeremyNT ◴[04 May 22 21:25 UTC] No.31266051{5}[source]▶

>>31265035 #

Very good to know! I did learn this and used 10 year certs/ca when my originals expired... as will presumably most of the other people who didn't fully grok the implications of the defaults :)

replies(1): >>31266113 #

26. rhuber ◴[04 May 22 21:32 UTC] No.31266113{6}[source]▶

>>31266051 #

We need to do a better job of this and I'm really sorry you had a not-great experience with expiration. Totally agree with your take.

replies(1): >>31273594 #

27. vgel ◴[04 May 22 23:03 UTC] No.31266998{4}[source]▶

>>31262546 #

That's fair, if you've been benchmarking but haven't made the benchmarks public / repeatable yet. Too used to software where the authors claim it's fast with no proof or based on heuristics like what language it's written in :-)

28. discardedrefuse ◴[05 May 22 02:05 UTC] No.31268518{5}[source]▶

>>31265633 #

I had looked at this. It doesn't seem like they've implemented anything to handle firewall rules. They may not even be able to, seeing as how that part of ZT is closed source. Also, this doesn't solve the problem with mobile apps, so the whole thing was a moot point for me.

replies(1): >>31270067 #

29. benoliver999 ◴[05 May 22 06:02 UTC] No.31270067{6}[source]▶

>>31268518 #

The mobile app does work with the self hosted option, we use it at work.

replies(1): >>31290153 #

30. JeremyNT ◴[05 May 22 14:13 UTC] No.31273594{7}[source]▶

>>31266113 #

I hope I don't come across as too negative! Sure I'd love to see some improvements here, and they would help adoption amongst hobbyists / home users, but I totally understand focusing on the features needed to make the business work first.

The existing open source functionality for the overlay network itself is (for me) what's really exciting, and it's all there. The management limitations just keep me from evangelizing more broadly (outside of places like HN).

31. discardedrefuse ◴[06 May 22 22:19 UTC] No.31290153{7}[source]▶

>>31270067 #

The official ZT docs* say, "The mobile apps don't support custom roots." And I don't see any setting in the Android app to point it to any server.

* https://docs.zerotier.com/self-hosting/introduction

replies(1): >>31298146 #

32. benoliver999 ◴[07 May 22 20:16 UTC] No.31298146{8}[source]▶

>>31290153 #

Ah, that's because we run a controller node not a root. So you just add an ID as normal.

The software linked in the parent works with the mobile apps.
