261 points tosh | 5 comments
cosmotic ◴[] No.42067844[source]
Why decode only to turn around and re-encode?
replies(3): >>42068029 #>>42068118 #>>42068185 #
1. pavlov ◴[] No.42068118[source]
Reading their product page, it seems like Recall captures meetings on whatever platform their customers are using: Zoom, Teams, Google Meet, etc.

Since they don't have API access to all these platforms, the best they can do to capture the A/V streams is simply to join the meeting in a headless browser on a server, then capture the browser's output and re-encode it.

replies(1): >>42068689 #
2. MrBuddyCasino ◴[] No.42068689[source]
They’re already hacking Chromium. If the compressed video data is unavailable in JS, they could change that instead.
replies(2): >>42068828 #>>42069540 #
3. moogly ◴[] No.42068828[source]
They did what every other startup does: put the PoC in production.
4. pavlov ◴[] No.42069540[source]
If you want to support every meeting platform, you can’t really make any assumptions about the data format.

To my knowledge, Zoom’s web client uses a custom codec delivered inside a WASM blob. How would you capture that video data to forward it to your recording system? How do you decode it later?

Even if the incoming streams are in a standard format, compositing the meeting as a post-processing operation from raw recorded tracks isn’t simple. Video call participants have gaps, network issues, and layer changes, so you can’t assume much of anything about the samples as you would with typical video files. (Coincidentally, this is exactly what I’m working on right now at my job.)
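
To make the gap problem concrete, here is a minimal Python sketch of one way to map an irregular recorded track onto a fixed-rate timeline before compositing. It is only an illustration, not how any particular recorder does it: the `Sample` type, the `resample_track` name, and the `frame_interval_ms` and `max_hold_ms` parameters are all assumptions made up for this example.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Hypothetical sample type: (presentation timestamp in ms, frame label).
Sample = Tuple[int, str]


def resample_track(
    samples: List[Sample],
    start_ms: int,
    end_ms: int,
    frame_interval_ms: int = 100,
    max_hold_ms: int = 500,
) -> List[Optional[str]]:
    """Map irregular samples onto a fixed-rate timeline.

    Each output slot holds the most recent frame at or before the slot time.
    A slot is None if no frame has arrived yet, or if the latest frame is
    older than max_hold_ms (treated here as a gap in the track).
    """
    ordered = sorted(samples)  # recorded samples may arrive out of order
    times = [t for t, _ in ordered]
    timeline: List[Optional[str]] = []
    for slot in range(start_ms, end_ms, frame_interval_ms):
        i = bisect_right(times, slot) - 1  # index of last sample <= slot
        if i < 0 or slot - times[i] > max_hold_ms:
            timeline.append(None)  # gap: the compositor must decide what to show
        else:
            timeline.append(ordered[i][1])  # hold the last frame
    return timeline


if __name__ == "__main__":
    track = [(0, "a"), (100, "b"), (900, "c")]  # participant drops out after 100 ms
    print(resample_track(track, 0, 1000, frame_interval_ms=100, max_hold_ms=300))
```

Holding the last frame for a bounded time and then marking a gap is just one possible policy. A real compositor would also have to handle resolution (layer) changes and audio sync, which this sketch ignores.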

replies(1): >>42072925 #
5. cosmotic ◴[] No.42072925{3}[source]
At some point, I'd hope the result of Zoom's code quickly becomes something that can be hardware decoded. Otherwise CPU usage, battery consumption, and energy usage are going to be through the roof.