
your idea also doesn't work with live streaming, and may also not work with inter-frame filters (depending on implementation). nonetheless, this exists already with those limitations: av1an and I believe vapoursynth work more or less the way you describe, except you don't actually need to load every chunk into memory, only the current frames. as I understand, this isn't a major priority for mainstream encoding pipelines because gop/chunk threading isn't massively better than intra-frame threading.


It can work with live streaming, you just need to add N keyframes of latency. With low-latency livestreaming keyframes are often close together anyways so adding say 4s of latency to get 4x encoding speed may be a good tradeoff.
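The "add N keyframes of latency" scheme could be sketched roughly like this: buffer one GOP per worker, encode the batch concurrently, and emit the results in order. `encode_gop` is a hypothetical stand-in for a real per-GOP encoder call; in practice that would be a C encoder library that releases the GIL, which is why a thread pool parallelizes here.

```python
from concurrent.futures import ThreadPoolExecutor

def encode_gop(frames):
    # Hypothetical stand-in for a real per-GOP encoder invocation
    # (one closed GOP, starting at a keyframe, per call). A real
    # encoder is C code that releases the GIL, so threads parallelize.
    return b"".join(frames)

def parallel_encode(gop_stream, workers=4):
    """Buffer `workers` GOPs, encode them concurrently, emit in order.
    Added latency is roughly `workers` keyframe intervals."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        batch = []
        for gop in gop_stream:
            batch.append(gop)
            if len(batch) == workers:
                # map() returns results in input order, so the output
                # bitstream stays in sequence.
                yield from pool.map(encode_gop, batch)
                batch = []
        if batch:  # flush the final partial batch at end of stream
            yield from pool.map(encode_gop, batch)
```

This only works because each chunk starts at a keyframe, so no chunk references frames owned by another worker.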


Well, you don't add 4s of latency for 4x encoding speed though. You add 4s of latency for very marginal quality/efficiency improvement and significant encoder simplification, because the baseline is current frame-parallel encoders, not sequential encoders.

Plus, computers aren't quad cores any more; people with powerful streaming rigs probably have 8 or 16 cores, and key frames aren't every second. Suddenly you're in this hellish world where you have to balance latency, CPU utilization and encoding efficiency. 16 cores at a not-so-great 8 seconds of extra latency means terrible efficiency, with a key frame every 0.5 seconds. 16 cores at good efficiency (say, 4 seconds between key frames) means a terrible 64 seconds of extra latency.
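The tradeoff above is simple arithmetic: each worker core needs its own GOP buffered before encoding can start, so added latency is roughly cores times keyframe interval. A quick sanity check of the numbers in this thread:

```python
def added_latency_s(cores: int, keyframe_interval_s: float) -> float:
    # One GOP (one keyframe interval) buffered per worker core.
    return cores * keyframe_interval_s

print(added_latency_s(4, 1.0))   # 4.0  -> the "4s for 4x" quad-core case
print(added_latency_s(16, 0.5))  # 8.0  -> lower latency, but tiny GOPs hurt efficiency
print(added_latency_s(16, 4.0))  # 64.0 -> efficient GOPs, but over a minute behind
```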


You can pry vp8 out of my cold dead hands. I'm sorry, but if it takes more than 200ms including network latency it is too slow, and video encoding is extremely CPU intensive, so exploding your cloud bill is easy.


4s of latency is not acceptable for applications like live chat


As I said, "may be". "Live" varies hugely with different use cases. Sporting events are often broadcast live with 10s of seconds of latency. But yes, if you are talking to a chat in real-time a few seconds can make a huge difference.


Actually, not only does it work with live streaming, it's not an uncommon approach in a number of live streaming implementations*. To be clear, I'm not talking about low latency stuff like interactive chat, but e.g. live sports.

It's one of several reasons why live streams of this type are often 10-30 seconds behind live.

* Of course it also depends on where in the pipeline they hook in - some take the feed directly, in which case every frame is essentially a key frame.


> except you don't actually need to load every chunk into memory, only the current frames.

That's a good point. In the general case of reading from a pipe you need to buffer it somewhere. But for file-based inputs the buffering concerns aren't relevant, just the working memory.



