492 points storf45 | 8 comments
dylan604 ◴[] No.42157048[source]
People just do not appreciate how many gotchas can pop up doing anything live. Sure, Netflix might have a great CDN that works great for their canned content and I could see how they might have assumed that's the hardest part.

Live has changed over the years: from large satellite dishes beaming to a geosat and back down to the broadcast center ($$$$$), to microwave links to a more local broadcast center ($$$$), to dedicated long-haul fiber back to a broadcast center ($$$), to a kit with multiple cell providers pushing a signal back to a broadcast center ($$), to a direct internet connection to a server accepting a live HTTP stream ($).

I'd be curious to know what their live plan was and what their redundant plan was.

replies(6): >>42157110 #>>42157117 #>>42157164 #>>42159101 #>>42159285 #>>42159954 #
colesantiago ◴[] No.42157164[source]
This is the whole point of chaos engineering, which was invented at Netflix and tests the resiliency of these systems.

I guess we now know the limits of what "at scale" is for Netflix's live-streaming solution. They shouldn't be failing at scale on a huge stage like this.

I look forward to reading the post mortem about this.

replies(1): >>42157426 #
1. dylan604 ◴[] No.42157426[source]
Everyone keeps mentioning "at scale." I seriously doubt this was an "at scale" problem. I have a strong suspicion this was a failure at the origination point to push a stable signal. That is not an "at scale" issue, but the hubris of "we can do better/cheaper than standard broadcasting practices."
replies(6): >>42157737 #>>42158523 #>>42159296 #>>42159379 #>>42159456 #>>42160379 #
2. zinodaur ◴[] No.42157737[source]
If it was a problem at origin, why did it get better/worse as viewership fell/rose?
3. kristjansson ◴[] No.42158523[source]
As a counterpoint, I observed 2-3 drops in bitrate, but an otherwise fine experience. So the problem seems to have been in dissemination, not at the origin.
replies(1): >>42162220 #
4. kortilla ◴[] No.42159296[source]
I highly doubt this. Netflix has a system of OCAs that are loaded with hard disks, are installed in ISP’s networks, and serve the majority of those ISP’s customers.

Given that many people had no problems with the stream, it is unlikely to have been an origin problem; more likely it was the mechanism that fans out quickly to OCAs. Normally latency to an OCA doesn’t matter when you’re replicating new catalogs in advance, but live streaming makes a bunch of code that previously “didn’t need to be fast” get promoted to the hot path.

5. woobar ◴[] No.42159379[source]
I tried to watch an old Seinfeld episode during this event. It was freezing every few minutes even at a downgraded bitrate, and that is a video that should be on my local CDN node.
6. mmcgaha ◴[] No.42159456[source]
I am not sure that it is an issue with the origination point. In fact, I just thought it was my ISP, because my daughter's boyfriend was watching while on FaceTime with her, and my video was dropping but his was not. I have 2gb fiber and we regularly stream to five TVs without any issue, so it should not have been a bandwidth issue.
7. ssl-3 ◴[] No.42160379[source]
Perhaps it was, or perhaps it was not.

I was watching a pirated, live retransmission of the event on Twitch (in Portuguese), and there was zero buffering on my end.

8. collinrapp ◴[] No.42162220[source]
Yeah, I was switching between my phone and desktop to watch the stream and I had a seamless experience on both devices the entire time. I’m not sure why so many people are assuming this was a universal experience.