816 points tosh | 39 comments
1. geerlingguy ◴[] No.41276702[source]
I've used this for years when passing large files between systems in weird network environments; it's almost always flawless.

For some more exotic testing, I was able to run my own magic wormhole relay[1], which let me tweak some things for faster/more reliable huge file copies. I still hate how often Google Drive will fall over when you throw a 10s-of-GB file at it.

[1] https://www.jeffgeerling.com/blog/2023/my-own-magic-wormhole...

replies(4): >>41277198 #>>41277682 #>>41277698 #>>41278657 #
2. cl3misch ◴[] No.41277198[source]
> you need a machine that can handle whatever link speeds you need

I would have expected the relay server to be used only for the initial handshake to punch through NAT, after which the transfer is P2P. Only under certain network restrictions would the data really flow through the relay. How could they afford to run the free relay otherwise?

replies(2): >>41277253 #>>41277819 #
3. lotharrr ◴[] No.41277253[source]
There are two servers. The "mailbox server" helps with handshakes and metadata transfers, and is super-low bandwidth, a few hundred bytes per connection. The "transit relay helper" is the one that handles the bulk data transfer iff the two sides were unable to establish a direct connection.

I've been meaning to find the time to add NAT-hole-punching for years, but haven't managed it yet. We'd use the mailbox server messages to help the two sides learn about the IP addresses to use. That would increase the percentage of transfers that avoid the relay, but the last I read, something like 20% of peer-pairs would still need the relay, because their NATs are too restrictive.

The relay usage hasn't been expensive enough to worry about, but if it gets more popular, that might change.

replies(1): >>41277764 #
4. dangoodmanUT ◴[] No.41277682[source]
I just read this out in your voice
replies(1): >>41285073 #
5. bscphil ◴[] No.41277698[source]
> For some more exotic testing, I was able to run my own magic wormhole relay[1], which let me tweak some things for faster/more reliable huge file copies.

The lack of improvement in these tools is pretty devastating. There was a flurry of activity around PAKEs like 6 years ago now, but we're still missing:

* reliable hole punching so you don't need a slow relay server

* multiple simultaneous TCP streams (or a carefully designed UDP protocol) to get large amounts of data through long fat pipes quickly

Last time I tried using Wormhole to transmit a large amount of data, I was limited to 20 MB/sec thanks to the bandwidth-delay product. I ended up using plain old HTTP; with aria2c and multiple streams I maxed out a 1 Gbps line.

IMO there's no reason why PAKE tools shouldn't have completely displaced over-complicated stuff like Globus (proprietary) for long distance transfer of huge data, but here we are stuck in the past.

replies(4): >>41278538 #>>41279150 #>>41279898 #>>41284508 #
6. bscphil ◴[] No.41277764{3}[source]
The folks on the wormhole-rs fork (who appear to share your GitHub organization? [1]) already have NAT punching working 95+% of the time in my testing, so maybe what they're doing could be ported over to the Python implementation.

[1] https://github.com/magic-wormhole

replies(2): >>41282062 #>>41358694 #
7. from-nibly ◴[] No.41277819[source]
You can't make a p2p connection over a NAT without exposing a port on the public side of the NAT.
replies(3): >>41277878 #>>41278549 #>>41278577 #
8. t0mas88 ◴[] No.41277878{3}[source]
You can: https://en.wikipedia.org/wiki/Hole_punching_(networking)
9. themoonisachees ◴[] No.41278538[source]
I overall agree, but "reliable hole punching" is an oxymoron. Hole punching is by definition an exploit of undefined behavior, and I don't see the specs getting updated to support it. UPnP IGD was supposed to be that, but well...
replies(1): >>41279167 #
10. kccqzy ◴[] No.41278549{3}[source]
Go check out STUN and ICE.

The best article I've found about NAT traversal is this article from Tailscale: https://tailscale.com/blog/how-nat-traversal-works

replies(2): >>41280333 #>>41282082 #
11. voxic11 ◴[] No.41278577{3}[source]
You aren't guaranteed to be able to do that but in practice most times you can.
12. bsharper ◴[] No.41278657[source]
I end up using a combination of scp, LocalSend, magic wormhole and sharedrop.io. Occasionally `python -m http.server` in a pinch for local downloads. It's unfortunate that this xkcd comic is still as relevant as it was in 2011: https://xkcd.com/949/
13. croemer ◴[] No.41279150[source]
20MB/sec is 160Mbps, so wormhole wasn't that far off the 1Gbps. Sure not maxing out but within a factor of 6.
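
For anyone who wants to check the arithmetic, here is a tiny sketch of the conversion. The function name is made up for illustration, and "1 Gbps" is taken as 1000 Mbps.

```python
def megabytes_to_megabits(mb_per_sec):
    """Convert a rate in megabytes/sec to megabits/sec (8 bits per byte)."""
    return mb_per_sec * 8

wormhole_mbps = megabytes_to_megabits(20)  # the 20 MB/sec figure quoted above
line_mbps = 1000                           # assumption: 1 Gbps = 1000 Mbps

print(wormhole_mbps)              # 160
print(line_mbps / wormhole_mbps)  # 6.25
```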
replies(3): >>41279692 #>>41279911 #>>41282474 #
14. namibj ◴[] No.41279167{3}[source]
Well, with v6 you're down from NAT-hole-punching to Firewall-hole-punching, which in principle should be as simple as arranging the IP:Port pairs of both ends via the setup channel, and then sending a "SYN" packet in both directions at once.

Then, trying to use e.g. TCP Prague (or, I guess, its congestion control with UDP-native QUIC) as a scalable congestion controller, to take care of the throughput restrictions caused by the high bandwidth-delay product.

replies(1): >>41312091 #
15. ghusbands ◴[] No.41279692{3}[source]
A factor of six is a very long way off, pretty much universally.
replies(2): >>41280240 #>>41282224 #
16. Uptrenda ◴[] No.41279898[source]
I've been working on this problem for a few years now and have made considerable progress. https://p2pd.readthedocs.io/en/latest/python/index.html

I'm working on a branch that considerably improves the current code, and hole punching in it works like a Swiss watch. If you're interested you should check out some of the features that work well already.

17. dgoldstein0 ◴[] No.41279911{3}[source]
^ found the astronomer
replies(1): >>41280526 #
18. mrinfinitiesx ◴[] No.41280240{4}[source]
Yeah, but with magic wormholes, you see, there could be other universes where that's not the case and 160 Mbps is close to 1024 Mbps or 1000 Mbps, whatever the cool kids call a gigabit nowadays.
19. jvansc ◴[] No.41280333{4}[source]
https://sendfiles.dev/
replies(2): >>41280397 #>>41283384 #
20. yownie ◴[] No.41280397{5}[source]
is there a filesize limit for this?
21. anonymousiam ◴[] No.41280526{4}[source]
I (Electrical + Software Engineer) once worked for a physicist who believed that anything less than an order of magnitude was merely an engineering problem. He was usually correct.
replies(4): >>41280572 #>>41281479 #>>41282602 #>>41285264 #
22. vasco ◴[] No.41280572{5}[source]
I was taught the same. To not care a lot about things under an order of magnitude. Over the years when planning large software projects or assessing incidents and so on, the 1 order of magnitude threshold helped me often.
replies(1): >>41281098 #
23. croemer ◴[] No.41281098{6}[source]
Bingo, I studied Physics!
24. elashri ◴[] No.41281479{5}[source]
As a physicist, I think this is correct too :). You don't start to see problems with things under that, unless they are deviations from Standard Model predictions.
25. Fnoord ◴[] No.41282062{4}[source]
The Rust implementation on Tailscale worked well for me. Except that on a layer 7 firewall you have to be quick to permit the connection, or else it tries the fallback.
26. Fnoord ◴[] No.41282082{4}[source]
Not bad, though you don't even need STUN or ICE;

https://github.com/samyk/pwnat

https://github.com/samyk/slipstream

27. croemer ◴[] No.41282224{4}[source]
Not as far off as the casual reader might think: 20MB vs 1Gb sounds like way more than the actual 160Mb vs 1Gb. One shouldn't use bytes and bits together in a direct comparison. One or the other, otherwise it's misleading/confusing.
replies(1): >>41285306 #
28. CyberDildonics ◴[] No.41282474{3}[source]
If you sit down and need to wait one hour for one and 6 hours for the other to do the same thing, I doubt you would say they are 'not that far off'.
29. sva_ ◴[] No.41282602{5}[source]
Variance in accuracy of this statement also safely within one order of magnitude
30. kccqzy ◴[] No.41283384{5}[source]
That uses WebRTC, which uses the same NAT traversal tricks.
31. sleepydog ◴[] No.41284508[source]
As a protocol, TCP should be able to utilize a long fat pipe given a large enough receive window. You might want to check which window scaling factor is in use and look for a tunable. I accept that some implementations may have limits beyond the protocol level. And even low levels of packet loss can severely affect the throughput of a single stream.
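
To put a number on "large enough", here is a rough sketch that computes the receive window needed to fill a link and the smallest TCP window-scale shift that can advertise it. The function names and the 100 ms round-trip time are assumptions picked for illustration. The 16-bit window field and the maximum shift of 14 come from the TCP window scale option (RFC 7323).

```python
import math

MAX_UNSCALED_WINDOW = 65535  # largest value the 16-bit TCP window field can hold
MAX_WINDOW_SHIFT = 14        # largest shift the window scale option allows

def required_window_bytes(bandwidth_bps, rtt_ms):
    """Bytes that must be in flight to keep the link full (bandwidth x delay)."""
    return math.ceil(bandwidth_bps * rtt_ms / 8000)

def min_window_shift(window_bytes):
    """Smallest window-scale shift whose maximum window covers window_bytes."""
    for shift in range(MAX_WINDOW_SHIFT + 1):
        if (MAX_UNSCALED_WINDOW << shift) >= window_bytes:
            return shift
    raise ValueError("window exceeds what TCP window scaling can advertise")

# Hypothetical path: 1 Gbps with a 100 ms round trip.
window = required_window_bytes(1_000_000_000, 100)
print(window, min_window_shift(window))  # 12500000 8
```

Whether a host actually grants a window that large depends on its socket buffer limits, which is where the tunables come in.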

A bigger reason you want multiple streams is because most network providers use a stream identifier like the 5-tuple hash to spread traffic, and support single-stream bandwidth much lower than whatever aggregate they may advertise.
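
A toy model of that kind of flow hashing, as a sketch only: real routers use their own vendor-specific hash functions, so the SHA-256 here is just a stand-in, and the function name, addresses, and link count are all made up.

```python
import hashlib

def pick_link(src_ip, src_port, dst_ip, dst_port, protocol, link_count):
    """Map a flow's 5-tuple to one of link_count parallel links."""
    key = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}|{protocol}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % link_count

# One stream: the 5-tuple never changes, so every packet takes the same link.
single = {pick_link("10.0.0.1", 50000, "192.0.2.7", 443, "tcp", 4)
          for _ in range(8)}

# Sixteen streams: each has its own source port, so the flows can spread out.
multi = {pick_link("10.0.0.1", 50000 + i, "192.0.2.7", 443, "tcp", 4)
         for i in range(16)}

print(len(single), len(multi))
```

A single stream is capped by whichever link its hash lands on, while several streams have a chance of using several links.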

replies(1): >>41285258 #
32. geerlingguy ◴[] No.41285073[source]
Heh and I was able to do some of that work in service of the dumb but fun test of Internet vs Pigeon data transfer speeds.
33. bscphil ◴[] No.41285258{3}[source]
> with a large enough receive window

Yeah, that's the issue. I didn't have root permissions on either side. Moreover, a transfer tool should just work without requiring its users to have expert knowledge like this.

In this case, I checked the roundtrip ping time and divided the buffer size by it, and the result agreed with the speeds I was seeing within ~5%, so it was not an issue with throttling. Actually, if I were a network provider interested in doing this, I would throttle on the 2-tuple as well.
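
The check above boils down to one division: a single TCP stream cannot move more than one window of data per round trip. A minimal sketch follows. The 4 MiB buffer and 200 ms round trip are hypothetical values, not the actual numbers from that transfer, and the function names are invented.

```python
def throughput_ceiling(window_bytes, rtt_ms):
    """Upper bound on single-stream throughput, in bytes per second."""
    return window_bytes * 1000 / rtt_ms

def within_tolerance(observed, predicted, tolerance):
    """True if observed differs from predicted by at most the given fraction."""
    return abs(observed - predicted) / predicted <= tolerance

# Hypothetical values: a 4 MiB buffer and a 200 ms round trip.
predicted = throughput_ceiling(4 * 1024 * 1024, 200)  # roughly 21 MB/sec

# An observed 20 MB/sec would sit within 5% of that ceiling, which points
# at the window rather than at throttling.
print(within_tolerance(20_000_000, predicted, 0.05))
```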

34. bscphil ◴[] No.41285264{5}[source]
An order of magnitude isn't a defined quantity, it depends on what base you're working in.
replies(1): >>41292709 #
35. bscphil ◴[] No.41285306{5}[source]
In this case transferring the data at the slow rate would have taken more than a week, so it's no small difference. Actually one side had a 10 Gbps line, so if the other side had had faster networking I could easily have exceeded the limit and gotten the transfer done more than 6x faster.

I used the term "1 Gbps line" just because it's a well known quantity - the limitation of Gigabit Ethernet. The point wasn't that multiplexing TCP can get you 6x better speeds, it's that it improved the speed so much that the TCP bandwidth-delay product was no longer the limiting factor in the transfer.

36. newaccount74 ◴[] No.41292709{6}[source]
The difference between log 2, ln, and log 10 is less than an order of magnitude, so to a physicist it's all the same :)
37. a_subsystem ◴[] No.41312091{4}[source]
>> sending a "SYN" packet in both directions at once.

Might be a totally dumb question but how does this work? Wouldn’t you already have to have communication to set a time?

replies(1): >>41343031 #
38. bscphil ◴[] No.41343031{5}[source]
That's why protocols like this have what Wormhole calls a "mailbox server", which allows two ends separated by firewalls to do secure key exchange and agree upon a method for punching through directly. See also STUN: https://en.wikipedia.org/wiki/STUN
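
A rough sketch of the "both sides send at once" step, assuming the two ends have already learned each other's address through some rendezvous channel. This is the UDP analogue, not a TCP simultaneous open, and it is not how Wormhole itself is implemented. The function names are invented, and the demo runs on loopback where no NAT or firewall is involved.

```python
import socket

def open_socket(host="127.0.0.1"):
    """Bind a UDP socket to an OS-chosen port and return it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, 0))
    return sock

def punch(sock, peer_addr, probes=5, timeout=0.2):
    """Send probes to peer_addr; succeed once a datagram from that peer arrives."""
    sock.settimeout(timeout)
    for _ in range(probes):
        # The outbound packet is what opens our own NAT/firewall mapping.
        sock.sendto(b"punch", peer_addr)
        try:
            _, addr = sock.recvfrom(64)
        except socket.timeout:
            continue  # nothing yet: the peer may not have sent, so retry
        if addr == peer_addr:
            return True
    return False

# Loopback demo: both "peers" live in one process.
a, b = open_socket(), open_socket()
b.sendto(b"punch", a.getsockname())  # b's first probe is already in flight
print(punch(a, b.getsockname()))     # a probes b and finds b's datagram
print(punch(b, a.getsockname()))     # b now finds a's probe waiting
a.close()
b.close()
```

No shared clock is needed because each side simply keeps retrying until a packet from the other side gets through.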
39. meejah ◴[] No.41358694{4}[source]
That is just using normal STUN/TURN via another server that one of those developers is running.