
261 points tosh | 10 comments
1. cogman10 ◴[] No.42068537[source]
This is such a weird way to do things.

Here they have a nicely compressed stream of video data, so they take that stream and... decode it. But they aren't processing the decoded data where it's decoded; instead they forward it, uncompressed(!!), to a different location for processing. Surprisingly, they find out that moving uncompressed video data from one location to another is expensive. So they compress it later (don't worry, using a GPU!).

At so many levels this is just WTF. Why not forward the compressed video stream? Why not decompress it where you're processing it instead of in the browser? Why write it out without any attempt at compression? Even if you want lossless compression, there are well-known, fast codecs like FFV1 for that purpose.

Just weird.
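To put rough numbers on the gap the comment is pointing at (a back-of-the-envelope sketch; the 1080p30 resolution and ~3 Mbit/s bitrate are illustrative assumptions, not figures from the article):

```python
# Rough per-stream data rates: raw RGB frames vs. a typical compressed stream.
# All parameters are illustrative assumptions (1080p30 RGB, ~3 Mbit/s),
# not figures taken from the article.

WIDTH, HEIGHT, FPS = 1920, 1080, 30
BYTES_PER_PIXEL = 3  # packed RGB; YUV 4:2:0 would be 1.5

raw_bytes_per_sec = WIDTH * HEIGHT * BYTES_PER_PIXEL * FPS
compressed_bits_per_sec = 3_000_000  # a plausible conferencing bitrate
compressed_bytes_per_sec = compressed_bits_per_sec // 8

ratio = raw_bytes_per_sec / compressed_bytes_per_sec

print(f"raw:        {raw_bytes_per_sec / 1e6:.1f} MB/s per stream")
print(f"compressed: {compressed_bytes_per_sec / 1e6:.2f} MB/s per stream")
print(f"raw is roughly {ratio:.0f}x larger")
```

Even switching the raw frames to YUV 4:2:0 only halves the raw figure; the two-orders-of-magnitude gap is why shipping decoded frames over IPC dominates the cost.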

replies(4): >>42068622 #>>42068813 #>>42069256 #>>42070259 #
2. isoprophlex ◴[] No.42068622[source]
Article title should have been "our weird design cost us $1M".

As it turns out, doing something in Rust does not absolve you of the obligation to actually think about what you are doing.

replies(1): >>42069227 #
3. rozap ◴[] No.42068813[source]
Really strange. I wonder why they omitted this. Usually you'd leave it compressed until the last possible moment.
replies(1): >>42069220 #
4. dylan604 ◴[] No.42069220[source]
> Usually you'd leave it compressed until the last possible moment.

Context matters? As someone working in production/post, we want to keep it uncompressed until the last possible moment. At least in the sense of applying no more compression than it was acquired with.

replies(1): >>42069663 #
5. dylan604 ◴[] No.42069227[source]
TFA's opening paragraph: "But it turns out that if you IPC 1TB of video per second on AWS it can result in enormous bills when done inefficiently."
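For scale, 1 TB/s of raw frames works out to thousands of concurrent streams (a rough estimate assuming 1080p30 RGB per stream; the article doesn't give per-stream parameters):

```python
# How many concurrent uncompressed 1080p30 RGB streams add up to 1 TB/s
# of IPC traffic? The per-stream parameters are assumptions for scale,
# not figures from the article.

raw_bytes_per_sec = 1920 * 1080 * 3 * 30   # ~186.6 MB/s per raw stream
total_bytes_per_sec = 1_000_000_000_000    # 1 TB/s, as quoted from TFA

streams = total_bytes_per_sec / raw_bytes_per_sec
print(f"~{streams:.0f} concurrent raw streams")
```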
6. tbarbugli ◴[] No.42069256[source]
Possibly because they capture the video from Xvfb or similar (they run a headless browser to capture the video), so at that point the decoding has already happened (WebRTC?).
7. DrammBA ◴[] No.42069663{3}[source]
> Context matters?

It does, but you just removed all context from their comment and introduced a completely different context (video production/post) for seemingly no reason.

Going back to the original context, which is grabbing a compressed video stream from a headless browser, the correct approach to handle that compressed stream is to leave it compressed until the last possible moment.

replies(1): >>42070110 #
8. pavlov ◴[] No.42070110{4}[source]
Since they aim to support every meeting platform, they don’t necessarily even have the codecs. Platforms like Zoom can and do use custom video formats within their web clients.

With that constraint, letting a full browser engine decode and composite the participant streams is the only option. And it definitely is an expensive way to do it.

9. bri3d ◴[] No.42070259[source]
I think the issue with compression is that they're scraping the online meeting services rather than actually reverse engineering them, so the compressed video stream is hidden inside some kind of black box.

I'm pretty sure that feeding the browser an emulated hardware decoder (i.e. writing a VAAPI module that just copies out the compressed frame data for you) would be a good semi-universal solution, since I don't think most video chat solutions use DRM like Widevine, but it's not as universal as dumping the framebuffer output from a browser session.

They could also, of course, reverse engineer each meeting service one by one to get at the backing stream.

What's odd to me is that even with this frame buffer approach, why would you not just recompress the video at the edge? You could even do it in Javascript with WebCodecs if that was the layer you were living at. Even semi-expensive compression on a modern CPU is going to be way cheaper than copying raw video frames, even just in terms of CPU instruction throughput vs memory bandwidth with shared memory.
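The shape of "compress before you ship it over IPC" can be shown with a toy: here zlib stands in for a real video codec, run on a synthetic all-black frame (a real encoder like VP8 via WebCodecs, x264, or FFV1 would do far better on real content by exploiting temporal redundancy; none of this is from the article):

```python
import zlib

# Toy demo of compressing at the edge before an IPC hand-off.
# zlib is a stand-in for a real video codec, and the all-black 1080p RGB
# frame is synthetic; both are assumptions made for illustration.
frame = bytes(1920 * 1080 * 3)          # one raw 1080p RGB frame, all black
packed = zlib.compress(frame, level=1)  # fast compression before the copy

print(f"raw: {len(frame)} bytes, packed: {len(packed)} bytes")
```

Even this cheap, generic compressor collapses the frame to a tiny fraction of its raw size, which is the asymmetry the comment is pointing at: the CPU spent compressing is small next to the memory and IPC bandwidth saved.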

It's easy to cast stones, but this is a weird architecture and making this blog post about the "solution" is even stranger to me.

replies(1): >>42071222 #
10. cogman10 ◴[] No.42071222[source]
> I think the issue with compression is that they're scraping the online meeting services rather than actually reverse engineering them, so the compressed video stream is hidden inside some kind of black box.

I mean, I would presume that the entire reason they forked chrome was to crowbar open the black box to get at the goodies. Maybe they only did it to get a framebuffer output stream that they could redirect? Seems a bit much.

Their current approach is what I'd think would be a temporary solution while they reverse engineer the streams (or even get partnerships with the likes of MS and others. MS in particular would likely jump at an opportunity to AI something).

> What's odd to me is that even with this frame buffer approach, why would you not just recompress the video at the edge? You could even do it in Javascript with WebCodecs if that was the layer you were living at. Even semi-expensive compression on a modern CPU is going to be way cheaper than copying raw video frames, even just in terms of CPU instruction throughput vs memory bandwidth with shared memory.

Yeah, that was my final comment. Even if I grant that this really is the best way to do things, I can't for the life of me understand why they'd not immediately recompress. Video takes such a huge amount of bandwidth that it's just silly to send around bitmaps.

> It's easy to cast stones, but this is a weird architecture and making this blog post about the "solution" is even stranger to me.

Agreed. Sounds like a company that likely has multiple million dollar savings just lying around.