
72 points indulona | 1 comments

I am working on a website that has video hosting capability. Users can upload video files, and I generate multiple versions at different qualities, plus audio-only versions, thumbnails and things like that.

I have chosen the MP4 container because of how widely supported it is. So that users don't have to fetch the whole file before playback can start, I use the fast start option, which writes the container's metadata at the beginning of the file instead of at the end.
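One way to confirm that fast start took effect is to check that the `moov` box (the container metadata) appears before the `mdat` box (the media data) at the top level of the file. The sketch below is only an illustration and not something from the original post: the function names are made up for this example, and it takes the file contents as bytes for simplicity, whereas a real checker would seek through the file rather than load it into memory.

```python
import struct


def top_level_boxes(data: bytes) -> list[str]:
    """Return the four-character types of the top-level MP4 boxes, in file order."""
    kinds = []
    offset = 0
    while offset + 8 <= len(data):
        # Each box starts with a 32-bit big-endian size and a 4-byte type.
        size, kind = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # Size 1 means a 64-bit size follows the type field.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # Size 0 means the box runs to the end of the file.
            size = len(data) - offset
        if size < header:
            break  # Malformed box; stop rather than loop forever.
        kinds.append(kind.decode("latin-1"))
        offset += size
    return kinds


def is_fast_start(data: bytes) -> bool:
    """True if both boxes exist and moov comes before mdat."""
    kinds = top_level_boxes(data)
    if "moov" not in kinds or "mdat" not in kinds:
        return False
    return kinds.index("moov") < kinds.index("mdat")


def make_box(kind: bytes, payload: bytes = b"") -> bytes:
    """Build a synthetic box, used only to exercise the checker."""
    return struct.pack(">I4s", 8 + len(payload), kind) + payload
```

Calling `is_fast_start(open("output.mp4", "rb").read())` on a small test file is enough to see whether the metadata ended up at the front.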

Next, I have picked the H.264 codec, again because of how widely supported it is. VP8/VP9/AV1/H.265/H.266 are certainly better, but H.264 software encoding often beats hardware encoding thanks to highly optimized, time-proven code, and the codec is supported by nearly all hardware. Also, the uploaded videos are already compressed; users won't be uploading raw 8K videos, where the more advanced codecs would be most useful for preserving "quality".

For audio, I have picked the Opus codec. It seems like good value compared to the others. Not much else to add.

I run ffmpeg to convert the video with a command like this:

ffmpeg -hide_banner -loglevel error -i input.mp4 -g 52 -c:v h264 -maxrate:v vbr -bufsize vbr -s WxH -c:a libopus -af "aformat=channel_layouts=7.1|5.1|stereo" -b:a abr -ar 48000 -ac 2 -f mp4 -movflags faststart -map 0:v:0 -map 0:a:0 output.mp4

where vbr is the video bitrate cap, e.g. 1024k (about 1 Mbps), abr is the audio bitrate, e.g. 190k, and WxH is the output size as width x height when resizing. The filter argument is quoted so that the shell does not treat | as a pipe.
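When the command is launched from application code, passing the arguments as a list sidesteps shell quoting altogether, which matters here because the `aformat` filter string contains `|`. The following is a rough sketch under a few assumptions that are not in the original post: the function names are hypothetical, `libx264` is named explicitly as the encoder, the audio bitrate is set with `-b:a`, and the size is given as width x height, which is the order ffmpeg's `-s` option expects.

```python
import subprocess


def build_ffmpeg_args(src, dst, video_kbps, audio_kbps, width=None, height=None):
    """Assemble an ffmpeg argument list; no shell is involved, so '|' is safe."""
    args = [
        "ffmpeg", "-hide_banner", "-loglevel", "error",
        "-i", src,
        "-map", "0:v:0", "-map", "0:a:0",
        # Assumption: libx264 is the H.264 encoder available in this build.
        "-c:v", "libx264", "-g", "52",
        "-maxrate:v", f"{video_kbps}k", "-bufsize:v", f"{video_kbps}k",
    ]
    if width and height:
        # ffmpeg expects WIDTHxHEIGHT here.
        args += ["-s", f"{width}x{height}"]
    args += [
        "-c:a", "libopus",
        "-af", "aformat=channel_layouts=7.1|5.1|stereo",
        "-b:a", f"{audio_kbps}k", "-ar", "48000", "-ac", "2",
        "-movflags", "+faststart", "-f", "mp4",
        dst,
    ]
    return args


def transcode(src, dst, **settings):
    """Run ffmpeg and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_ffmpeg_args(src, dst, **settings), check=True)
```

For example, `transcode("input.mp4", "output.mp4", video_kbps=1024, audio_kbps=190, width=1280, height=720)` would produce one rendition; the specific numbers are placeholders.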

I wonder how others who handle video encoding process and generate their videos.

How did you pick your settings, and what issues have you encountered? Any tips you can share are certainly appreciated.

It is quite a niche area once you are operating the service rather than merely being a consumer/customer.

visualblind ◴[] No.41056046[source]
Video codec transcoding is very CPU-intensive. If you do a lot of it, you should look into hardware-accelerated transcoding. https://trac.ffmpeg.org/wiki/HWAccelIntro

My ffmpeg how-to/examples/scratchfile can be viewed here: https://paste.travisflix.com/?ce12a91f222cc3d7#BQPKtw6sEs9cE...

izacus ◴[] No.41056080[source]
Hardware video encoders all - even in 2024 - produce significantly worse quality at the same filesize.

They're made to be realtime, but for any kind of delayed playback where there's time to encode, software encoders win without any kind of effort. For web delivery especially, hw encoders have no business being used because quality per expended bandwidth is paramount and costs money.

rahimnathwani ◴[] No.41056142[source]
IIRC both x264 and nvenc have multiple profiles for the tradeoff between quality and computing power.

For your comparison, are you assuming that the objective is best quality, e.g. that you'd accept 10x the computation even if it gave only a 2% quality improvement?

(I can see how this could make sense, if you're encoding a file once and it will be viewed many times. But I could imagine other situations, e.g. where most files are viewed once or never, and only a few files are very popular.)

1. izacus ◴[] No.41056215[source]
Having profiles (presets, in x264's terminology) doesn't really change the fact that even an Ampere-generation encoding block at its slowest setting won't come close, at the same output bitrate, to the visual quality of x264's slow and slower presets (and we're not even touching on H.265/AV1 here).

> For your comparison, are you assuming that the objective is best quality, e.g. that you'd accept 10x the computation even if it gave only a 2% quality improvement?

The difference is more like 150% of the encoding time for half the filesize at the same SSIM, depending on configuration and video type of course. And that ignores the fact that a server machine with a 64-core Threadripper or equivalent can handle parallel encoding of many more videos at a massively lower dollar cost than using nvenc, especially at current GPU prices and GPU power consumption.
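For anyone who wants to check a claim like "same SSIM" on their own files, ffmpeg ships an `ssim` filter that compares an encoded file against its reference. The sketch below is a minimal illustration, not the commenter's method: the helper names are hypothetical, both inputs are assumed to have the same resolution and frame rate, and the parser assumes the summary line ffmpeg prints on stderr contains an `All:<value>` field.

```python
import re


def ssim_args(encoded, reference):
    """Argument list that compares two files with ffmpeg's ssim filter."""
    return [
        "ffmpeg", "-hide_banner",
        "-i", encoded, "-i", reference,
        "-lavfi", "ssim",
        # Discard the decoded output; only the printed statistics matter.
        "-f", "null", "-",
    ]


def parse_ssim_all(stderr_text):
    """Extract the overall SSIM value, or None if no summary line is found."""
    match = re.search(r"All:([0-9.]+)", stderr_text)
    return float(match.group(1)) if match else None
```

Running the arguments with `subprocess.run(..., capture_output=True, text=True)` and feeding `stderr` to the parser gives one number per encode, which makes it easier to compare a software and a hardware rendition at the same bitrate.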

There's a reason why all online services encode in software (usually with x264 & co.) for their mainstream, most-used profiles (that is, SD/HD, and for many also 4K).

Hardware encoding just doesn't make sense from a product quality, user experience or financial perspective. It only makes sense if you never check the results of your production.
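The trade-off debated in this subthread (spend more compute once, or spend more bandwidth on every view) can be framed as a break-even calculation. This is a back-of-the-envelope sketch, not anyone's actual cost model: the function name and all inputs are hypothetical, and real pricing would also include storage and CDN tiers.

```python
def break_even_views(extra_encode_cost, size_saved_gb, price_per_gb):
    """Number of views after which a slower, smaller encode pays for itself.

    extra_encode_cost: additional compute cost of the slower encode
    size_saved_gb:     how much smaller the resulting file is, in GB
    price_per_gb:      bandwidth price per GB delivered
    """
    saving_per_view = size_saved_gb * price_per_gb
    if saving_per_view <= 0:
        # Nothing is saved per view, so the extra compute never pays off.
        return float("inf")
    return extra_encode_cost / saving_per_view
```

With placeholder numbers, `break_even_views(0.5, 0.25, 0.5)` gives 4 views: below that, the cheaper encode wins on cost, and above it the smaller file does. Videos that are viewed once or never sit on the other side of that line.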