816 points tosh | 22 comments
geerlingguy ◴[] No.41276702[source]
I've used this for years when passing large files between systems in weird network environments; it's almost always flawless.

For some more exotic testing, I was able to run my own magic wormhole relay[1], which let me tweak some things for faster/more reliable huge file copies. I still hate how often Google Drive will fall over when you throw a 10s-of-GB file at it.

[1] https://www.jeffgeerling.com/blog/2023/my-own-magic-wormhole...

replies(4): >>41277198 #>>41277682 #>>41277698 #>>41278657 #
1. bscphil ◴[] No.41277698[source]
> For some more exotic testing, I was able to run my own magic wormhole relay[1], which let me tweak some things for faster/more reliable huge file copies.

The lack of improvement in these tools is pretty devastating. There was a flurry of activity around PAKEs like 6 years ago now, but we're still missing:

* reliable hole punching so you don't need a slow relay server

* multiple simultaneous TCP streams (or a carefully designed UDP protocol) to get large amounts of data through long fat pipes quickly

Last time I tried using Wormhole to transmit a large amount of data, I was limited to 20 MB/sec thanks to the bandwidth-delay product. I ended up using plain old HTTP; with aria2c and multiple streams I maxed out a 1 Gbps line.
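
To make the bandwidth-delay limit concrete, here is a minimal sketch of the arithmetic. The 100 ms round-trip time and the 2 MiB window are illustrative assumptions, not figures from this thread, and the function names are made up for the example.

```python
import math


def max_throughput_bytes_per_s(window_bytes: int, rtt_s: float) -> float:
    """Ceiling for one TCP stream: at most one window in flight per round trip."""
    return window_bytes / rtt_s


def required_window_bytes(link_bits_per_s: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return link_bits_per_s / 8 * rtt_s


# Illustrative assumptions only (not measured values from the thread).
RTT_S = 0.100                  # assumed 100 ms round trip
WINDOW_BYTES = 2 * 1024 ** 2   # assumed 2 MiB effective window
LINK_BPS = 1e9                 # the 1 Gbps line mentioned above

single_stream = max_throughput_bytes_per_s(WINDOW_BYTES, RTT_S)
needed = required_window_bytes(LINK_BPS, RTT_S)
streams = math.ceil(needed / WINDOW_BYTES)

print(f"single stream ceiling: {single_stream / 1e6:.1f} MB/s")
print(f"window needed for 1 Gbps: {needed / 1e6:.1f} MB")
print(f"parallel streams needed: {streams}")
```

With these assumed numbers a single stream tops out at about 21 MB/sec, filling the line needs roughly 12.5 MB in flight, and six parallel streams would cover the gap. That is why splitting the transfer across streams helps when the per-stream window can't be raised.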

IMO there's no reason why PAKE tools shouldn't have completely displaced over-complicated stuff like Globus (proprietary) for long distance transfer of huge data, but here we are stuck in the past.

replies(4): >>41278538 #>>41279150 #>>41279898 #>>41284508 #
2. themoonisachees ◴[] No.41278538[source]
I overall agree, but "reliable hole punching" is an oxymoron. Hole punching is by definition an exploit of undefined behavior, and I don't see the specs getting updated to support it. UPnP IGD was supposed to be that, but well...
replies(1): >>41279167 #
3. croemer ◴[] No.41279150[source]
20MB/sec is 160Mbps, so wormhole wasn't that far off the 1Gbps. Sure not maxing out but within a factor of 6.
replies(3): >>41279692 #>>41279911 #>>41282474 #
4. namibj ◴[] No.41279167[source]
Well, with v6 you're down from NAT-hole-punching to Firewall-hole-punching, which in principle should be as simple as arranging the IP:Port pairs of both ends via the setup channel, and then sending a "SYN" packet in both directions at once.

Then try using e.g. TCP Prague (or, I guess, its congestion control with UDP-native QUIC) as a scalable congestion controller, to take care of the throughput restrictions caused by a high bandwidth-delay product.

replies(1): >>41312091 #
5. ghusbands ◴[] No.41279692[source]
A factor of six is a very long way off, pretty much universally.
replies(2): >>41280240 #>>41282224 #
6. Uptrenda ◴[] No.41279898[source]
I've been working on this problem for a few years now and have made considerable progress. https://p2pd.readthedocs.io/en/latest/python/index.html

I'm working on a branch that considerably improves the current code, and hole punching in it works like a Swiss watch. If you're interested you should check out some of the features that work well already.

7. dgoldstein0 ◴[] No.41279911[source]
^ found the astronomer
replies(1): >>41280526 #
8. mrinfinitiesx ◴[] No.41280240{3}[source]
Yeah, but with magic wormholes, you see, there could be other universes where that's not the case and 160mbps is close to 1024mbps or 1000mbps, whatever the cool kids call a gigabit nowadays.
9. anonymousiam ◴[] No.41280526{3}[source]
I (Electrical + Software Engineer) once worked for a physicist who believed that anything less than an order of magnitude was merely an engineering problem. He was usually correct.
replies(4): >>41280572 #>>41281479 #>>41282602 #>>41285264 #
10. vasco ◴[] No.41280572{4}[source]
I was taught the same. To not care a lot about things under an order of magnitude. Over the years when planning large software projects or assessing incidents and so on, the 1 order of magnitude threshold helped me often.
replies(1): >>41281098 #
11. croemer ◴[] No.41281098{5}[source]
Bingo, I studied Physics!
12. elashri ◴[] No.41281479{4}[source]
As a physicist, I think this is correct too :). You don't start to see problems with things under that, unless they are deviations from Standard Model predictions.
13. croemer ◴[] No.41282224{3}[source]
Not as far off as the casual reader might think: 20MB vs 1Gb sounds like way more than the actual 160Mb vs 1Gb. One shouldn't use bytes and bits together in a direct comparison. Use one or the other, otherwise it's misleading/confusing.
replies(1): >>41285306 #
14. CyberDildonics ◴[] No.41282474[source]
If you sit down and need to wait one hour for one and 6 hours for the other to do the same thing, I doubt you would say they are 'not that far off'.
15. sva_ ◴[] No.41282602{4}[source]
Variance in accuracy of this statement also safely within one order of magnitude
16. sleepydog ◴[] No.41284508[source]
As a protocol, TCP should be able to utilize a long fat pipe with a large enough receive window. You might want to check what window scaling factor is used and look for a tunable. I accept that some implementations may have limits beyond the protocol level. And even low levels of packet loss can severely affect throughput of a single stream.
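
As a rough illustration of the window scaling point: the TCP header's window field is 16 bits, and the window scale option (RFC 7323) allows a left shift of at most 14. The sketch below computes the smallest shift that can advertise a given window. The function name and the 12.5 MB target (1 Gbps at an assumed 100 ms round trip) are illustrative, not taken from any particular stack.

```python
MAX_UNSCALED_WINDOW = 0xFFFF   # 16-bit window field in the TCP header
MAX_WINDOW_SHIFT = 14          # largest shift permitted by RFC 7323


def min_window_shift(target_window_bytes: int) -> int:
    """Smallest window-scale shift whose maximum window covers the target."""
    for shift in range(MAX_WINDOW_SHIFT + 1):
        if (MAX_UNSCALED_WINDOW << shift) >= target_window_bytes:
            return shift
    raise ValueError("target exceeds the largest window TCP can advertise")


# Assumed target: bandwidth-delay product of 1 Gbps at a 100 ms round trip.
target = 12_500_000
shift = min_window_shift(target)
print(f"shift {shift} allows windows up to {MAX_UNSCALED_WINDOW << shift} bytes")
print(f"protocol ceiling: {MAX_UNSCALED_WINDOW << MAX_WINDOW_SHIFT} bytes")
```

So the protocol itself allows windows of about 1 GiB; whether a given host actually negotiates a large enough shift and allocates the buffer is an implementation and configuration matter.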

A bigger reason you want multiple streams is because most network providers use a stream identifier like the 5-tuple hash to spread traffic, and support single-stream bandwidth much lower than whatever aggregate they may advertise.
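
A toy sketch of that per-flow spreading, under loud assumptions: real equipment uses its own hardware hash functions, and SHA-256 is only a stand-in here. The addresses are documentation ranges and `pick_link` is a hypothetical helper.

```python
import hashlib
from collections import Counter


def pick_link(src_ip, src_port, dst_ip, dst_port, proto, n_links):
    """Map a flow's 5-tuple to one of n_links parallel paths."""
    key = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % n_links


N_LINKS = 4

# One flow: every packet carries the same 5-tuple, so it always lands on
# the same link and can never use more than that link's capacity.
one_flow = {
    pick_link("192.0.2.10", 50000, "198.51.100.20", 443, "tcp", N_LINKS)
    for _ in range(100)
}

# Many flows: only the source port differs, which is enough to spread them.
many_flows = Counter(
    pick_link("192.0.2.10", 50000 + i, "198.51.100.20", 443, "tcp", N_LINKS)
    for i in range(64)
)

print("links used by one flow:", sorted(one_flow))
print("flows per link with 64 streams:", dict(many_flows))
```

The single flow is pinned to one path, while streams that differ only in source port are distributed across paths. A tool that opens several connections can therefore get closer to the aggregate bandwidth.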

replies(1): >>41285258 #
17. bscphil ◴[] No.41285258[source]
> with a large enough receive window

Yeah, that's the issue. I didn't have root permissions on either side. Moreover, a transfer tool should just work without requiring its users to have expert knowledge like this.

In this case, I checked the round-trip ping time and divided the buffer size by it, and the result agreed with the speeds I was seeing within ~5%, so it was not an issue with throttling. Actually, if I were a network provider interested in doing this, I would throttle on the 2-tuple as well.

18. bscphil ◴[] No.41285264{4}[source]
An order of magnitude isn't a defined quantity, it depends on what base you're working in.
replies(1): >>41292709 #
19. bscphil ◴[] No.41285306{4}[source]
In this case transferring the data at the slow rate would have taken more than a week, so it's no small difference. Actually one side had a 10 Gbps line, so if the other side had had faster networking I could easily have exceeded the limit and gotten the transfer done more than 6x faster.

I used the term "1 Gbps line" just because it's a well known quantity - the limitation of Gigabit Ethernet. The point wasn't that multiplexing TCP can get you 6x better speeds, it's that it improved the speed so much that the TCP bandwidth-delay product was no longer the limiting factor in the transfer.

20. newaccount74 ◴[] No.41292709{5}[source]
The difference between log 2, ln, and log 10 is less than an order of magnitude, so to a physicist it's all the same :)
21. a_subsystem ◴[] No.41312091{3}[source]
>> sending a "SYN" packet in both directions at once.

Might be a totally dumb question but how does this work? Wouldn’t you already have to have communication to set a time?

replies(1): >>41343031 #
22. bscphil ◴[] No.41343031{4}[source]
That's why protocols like this have what Wormhole calls a "mailbox server", which allows two ends separated by firewalls to do secure key exchange and agree upon a method for punching through directly. See also STUN: https://en.wikipedia.org/wiki/STUN
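
A minimal sketch of the rendezvous idea, with several swapped-in simplifications: it uses UDP on loopback rather than TCP simultaneous open, there is no NAT or firewall in the way, and no key exchange happens. The `Mailbox` class is a hypothetical in-memory stand-in, not Magic Wormhole's actual API. It also hints at the timing question: no shared clock is needed, since each side simply starts sending once it learns the other's address and retries until something arrives.

```python
import socket
import threading


class Mailbox:
    """In-memory stand-in for a rendezvous ("mailbox") server."""

    def __init__(self):
        self._addrs = {}
        self._cond = threading.Condition()

    def post(self, name, addr):
        with self._cond:
            self._addrs[name] = addr
            self._cond.notify_all()

    def wait_for(self, name, timeout=5.0):
        with self._cond:
            if not self._cond.wait_for(lambda: name in self._addrs, timeout):
                raise TimeoutError(f"no address posted for {name}")
            return self._addrs[name]


def peer(me, other, mailbox, results):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))          # OS picks a free port
    sock.settimeout(0.2)
    try:
        mailbox.post(me, sock.getsockname())   # publish our address
        remote = mailbox.wait_for(other)       # learn the other side's
        for _ in range(25):
            # Outbound packet first: on a real stateful firewall this is
            # what opens the path for the peer's inbound traffic.
            sock.sendto(f"hello from {me}".encode(), remote)
            try:
                data, addr = sock.recvfrom(1024)
            except (socket.timeout, ConnectionError):
                continue                       # nothing yet, retry
            if addr == remote:
                results[me] = data.decode()
                break
    finally:
        sock.close()


def run_demo():
    mailbox, results = Mailbox(), {}
    threads = [
        threading.Thread(target=peer, args=(a, b, mailbox, results))
        for a, b in (("alice", "bob"), ("bob", "alice"))
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_demo())
```

Each peer sends before it listens, so both outbound packets exist before either side gives up. In a real deployment the addresses exchanged would be the externally visible ones (which is what STUN helps discover), and the mailbox traffic would be protected by the key exchange.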