
261 points tosh | 10 comments
1. cogman10 ◴[] No.42068537[source]
This is such a weird way to do things.

Here they have a nicely compressed stream of video data, so they take that stream and... decode it. But they aren't processing the decoded data where it's decoded; instead they forward it, uncompressed(!!), to a different location for processing. Surprisingly, they find out that moving uncompressed video data from one location to another is expensive. So they compress it later (don't worry, using a GPU!).

At so many levels this is just WTF. Why not forward the compressed video stream? Why not decompress it where you're processing it instead of in the browser? Why write it out without any attempt at compression? Even if you want lossless compression, there are well-known, fast codecs like FFV1 for that purpose.

Just weird.
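To put rough numbers on the gap the comment is pointing at (a back-of-the-envelope sketch; the 1080p30 resolution and ~3 Mbit/s bitrate are illustrative assumptions, not figures from the article):

```python
# Rough per-stream data rates: raw RGB frames vs. a typical compressed stream.
# All parameters are illustrative assumptions (1080p30 RGB, ~3 Mbit/s),
# not figures taken from the article.

WIDTH, HEIGHT, FPS = 1920, 1080, 30
BYTES_PER_PIXEL = 3  # packed RGB; YUV 4:2:0 would be 1.5

raw_bytes_per_sec = WIDTH * HEIGHT * BYTES_PER_PIXEL * FPS
compressed_bits_per_sec = 3_000_000  # a plausible conferencing bitrate
compressed_bytes_per_sec = compressed_bits_per_sec // 8

ratio = raw_bytes_per_sec / compressed_bytes_per_sec

print(f"raw:        {raw_bytes_per_sec / 1e6:.1f} MB/s per stream")
print(f"compressed: {compressed_bytes_per_sec / 1e6:.2f} MB/s per stream")
print(f"raw is roughly {ratio:.0f}x larger")
```

Even switching the raw frames to YUV 4:2:0 only halves the raw figure; the two-orders-of-magnitude gap is why shipping decoded frames over IPC dominates the cost.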

replies(4): >>42068622 #>>42068813 #>>42069256 #>>42070259 #
2. isoprophlex ◴[] No.42068622[source]
Article title should have been "our weird design cost us $1M".

As it turns out, doing something in Rust does not absolve you of the obligation to actually think about what you are doing.

replies(1): >>42069227 #
3. rozap ◴[] No.42068813[source]
Really strange. I wonder why they omitted this. Usually you'd leave it compressed until the last possible moment.
replies(1): >>42069220 #
4. dylan604 ◴[] No.42069220[source]
> Usually you'd leave it compressed until the last possible moment.

Context matters? As someone working in production/post, we want to keep it uncompressed until the last possible moment. At least in the sense of applying no more compression than it was acquired with.

replies(1): >>42069663 #
5. dylan604 ◴[] No.42069227[source]
TFA's opening paragraph: "But it turns out that if you IPC 1TB of video per second on AWS it can result in enormous bills when done inefficiently."
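For scale, 1 TB/s of raw frames works out to thousands of concurrent streams (a rough estimate assuming 1080p30 RGB per stream; the article doesn't give per-stream parameters):

```python
# How many concurrent uncompressed 1080p30 RGB streams add up to 1 TB/s
# of IPC traffic? The per-stream parameters are assumptions for scale,
# not figures from the article.

raw_bytes_per_sec = 1920 * 1080 * 3 * 30   # ~186.6 MB/s per raw stream
total_bytes_per_sec = 1_000_000_000_000    # 1 TB/s, as quoted from TFA

streams = total_bytes_per_sec / raw_bytes_per_sec
print(f"~{streams:.0f} concurrent raw streams")
```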
6. tbarbugli ◴[] No.42069256[source]
Possibly because they capture the video from Xvfb or similar (they run a headless browser to capture the video), so at that point the decoding has already happened (WebRTC?).
7. DrammBA ◴[] No.42069663{3}[source]
> Context matters?

It does, but you just removed all context from their comment and introduced a completely different context (video production/post) for seemingly no reason.

Going back to the original context, which is grabbing a compressed video stream from a headless browser, the correct approach to handle that compressed stream is to leave it compressed until the last possible moment.

replies(1): >>42070110 #
8. pavlov ◴[] No.42070110{4}[source]
Since they aim to support every meeting platform, they don’t necessarily even have the codecs. Platforms like Zoom can and do use custom video formats within their web clients.

With that constraint, letting a full browser engine decode and composite the participant streams is the only option. And it definitely is an expensive way to do it.

9. bri3d ◴[] No.42070259[source]
I think the issue with compression is that they're scraping the online meeting services rather than actually reverse engineering them, so the compressed video stream is hidden inside some kind of black box.

I'm pretty sure that feeding the browser an emulated hardware decoder (i.e. writing a VAAPI module that just copies out the compressed frame data for you) would be a good semi-universal solution, since I don't think most video chat solutions use DRM like Widevine, but it's not as universal as dumping the framebuffer output from a browser session.

They could also, of course, reverse engineer each meeting service one by one to get at the backing stream.

What's odd to me is that even with this frame buffer approach, why would you not just recompress the video at the edge? You could even do it in Javascript with WebCodecs if that was the layer you were living at. Even semi-expensive compression on a modern CPU is going to be way cheaper than copying raw video frames, even just in terms of CPU instruction throughput vs memory bandwidth with shared memory.
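The shape of "compress before you ship it over IPC" can be shown with a toy: here zlib stands in for a real video codec, run on a synthetic all-black frame (a real encoder like VP8 via WebCodecs, x264, or FFV1 would do far better on real content by exploiting temporal redundancy; none of this is from the article):

```python
import zlib

# Toy demo of compressing at the edge before an IPC hand-off.
# zlib is a stand-in for a real video codec, and the all-black 1080p RGB
# frame is synthetic; both are assumptions made for illustration.
frame = bytes(1920 * 1080 * 3)          # one raw 1080p RGB frame, all black
packed = zlib.compress(frame, level=1)  # fast compression before the copy

print(f"raw: {len(frame)} bytes, packed: {len(packed)} bytes")
```

Even this cheap, generic compressor collapses the frame to a tiny fraction of its raw size, which is the asymmetry the comment is pointing at: the CPU spent compressing is small next to the memory and IPC bandwidth saved.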

It's easy to cast stones, but this is a weird architecture and making this blog post about the "solution" is even stranger to me.

replies(1): >>42071222 #
10. cogman10 ◴[] No.42071222[source]
> I think the issue with compression is that they're scraping the online meeting services rather than actually reverse engineering them, so the compressed video stream is hidden inside some kind of black box.

I mean, I would presume that the entire reason they forked chrome was to crowbar open the black box to get at the goodies. Maybe they only did it to get a framebuffer output stream that they could redirect? Seems a bit much.

Their current approach is what I'd think would be a temporary solution while they reverse engineer the streams (or even get partnerships with the likes of MS and others. MS in particular would likely jump at an opportunity to AI something).

> What's odd to me is that even with this frame buffer approach, why would you not just recompress the video at the edge? You could even do it in Javascript with WebCodecs if that was the layer you were living at. Even semi-expensive compression on a modern CPU is going to be way cheaper than copying raw video frames, even just in terms of CPU instruction throughput vs memory bandwidth with shared memory.

Yeah, that was my final comment. Even if I grant that this really is the best way to do things, I can't for the life of me understand why they'd not immediately recompress. Video takes such a huge amount of bandwidth that it's just silly to send around bitmaps.

> It's easy to cast stones, but this is a weird architecture and making this blog post about the "solution" is even stranger to me.

Agreed. Sounds like a company that likely has multiple million dollar savings just lying around.