H.264 is magic

Thom Holwerda 2016-11-05 Multimedia, AV 31 Comments

H.264 is a video compression codec standard. It is ubiquitous – internet video, Blu-ray, phones, security cameras, drones, everything. Everything uses H.264 now.

H.264 is a remarkable piece of technology. It is the result of 30+ years of work with one single goal: To reduce the bandwidth required for transmission of full-motion video.

Technically, it is very interesting. This post will give insight into some of the details at a high level – I hope to not bore you too much with the intricacies. Also note that many of the concepts explained here apply to video compression in general, and not just H.264.

About The Author

Thom Holwerda

Follow me on Mastodon @[email protected]

31 Comments

2016-11-06 1:08 am

number9
I had a fellow Ph.D. student working on a hardware codec that got rid of one of the “pay to play” portions of the H.264 codec. Google was highly interested in it (back in 2013). Odd that it is so ubiquitous, even though they offer free use to end users… the provider has to pay, as it is patented. From Wiki:

https://en.wikipedia.org/wiki/H.264/MPEG-4_AVC

“H.264 is protected by patents owned by various parties. A license covering most (but not all) patents essential to H.264 is administered by patent pool MPEG LA.[2] Commercial use of patented H.264 technologies requires the payment of royalties to MPEG LA and other patent owners. MPEG LA has allowed the free use of H.264 technologies for streaming internet video that is free to end users, and Cisco Systems pays royalties to MPEG LA on behalf of the users of binaries for its open source H.264 encoder.”

2016-11-06 9:17 pm

kurkosdr
I had a fellow Ph.D. student working on a hardware codec that got rid of one of the “pay to play” portions of the H.264 codec. Google was highly interested in it (back in 2013). Odd that it is so ubiquitous, even though they offer free use to end users… the provider has to pay, as it is patented. From Wiki:

https://en.wikipedia.org/wiki/H.264/MPEG-4_AVC

People have to pay for the patented codec because nobody has managed to make a true rival. On2 made some codecs which were essentially H.264 while avoiding the H.264 patents, and the results when compared with x264 (considered the best H.264 encoder) for the same amount of compression time were ridiculous. Of course, Google rigged the tests and used a mode in vpxenc which will take hours to encode a video (cpu-used=0) and made some good-looking numbers, but they were just that: good-looking numbers.

On2 has also aggressively optimised their default encoder for PSNR while completely ignoring SSIM in order to help in the “good-looking numbers” department.

Back in the real world, users will just download encoders and decoders from countries without patent restrictions (hello Handbrake and VLC), for everything else the price of the encoder and/or the decoder is hidden in the product price (be it smartphones or proprietary video editing software or OSes), usage for streaming is free, so nobody will use an inferior codec and wait forever for his videos to be compressed or get an inferior quality/size ratio just because some FOSS advocates think it is so much of a burden to download from a French website or because they like to make a political statement out of everything.

Edited 2016-11-06 21:18 UTC

2016-11-07 2:06 pm

Quikee
People have to pay for the patented codec because nobody has managed to make a true rival. On2 made some codecs which were essentially H.264 while avoiding the H.264 patents, and the results when compared with x264 (considered the best H.264 encoder) for the same amount of compression time were ridiculous. Of course, Google rigged the tests and used a mode in vpxenc which will take hours to encode a video (cpu-used=0) and made some good-looking numbers, but they were just that: good-looking numbers.

On2 has also aggressively optimised their default encoder for PSNR while completely ignoring SSIM in order to help in the “good-looking numbers” department. [/q]

PSNR is what the industry used to measure quality of the coding tools used. This is also true in the process when the H.264 was developed (and I think this didn’t change with H.265 either). By that you could say H.264 is also tuned towards PSNR, so I don’t see how VP8 was rigged – they just tuned to the metric everyone was tuning to (also note that x264 used –tune PSNR in their tests).

SSIM is not better than PSNR either – it is just a different metric which also has its advantages and flaws and doesn’t correspond to human visual perception (for example both don’t even take chroma values into account).

x264 is such a good encoder just because it was carefully optimized against subjective visual quality not a particular metric.

Anyway it is 2016 now (soon to be 2017) and we have VP9 available for quite some time, which is better than H.264 by quite a nice margin. It is gaining more support recently (more encoders besides libvpx are being produced, more chips support it in hardware) and still is free to use by anyone for any use-case they have.

[q]Back in the real world, users will just download encoders and decoders from countries without patent restrictions (hello Handbrake and VLC), for everything else the price of the encoder and/or the decoder is hidden in the product price (be it smartphones or proprietary video editing software or OSes), usage for streaming is free, so nobody will use an inferior codec and wait forever for his videos to be compressed or get an inferior quality/size ratio just because some FOSS advocates think it is so much of a burden to download from a French website or because they like to make a political statement out of everything.

If you download the encoder from another country doesn’t mean that you don’t break the law in your country and don’t need to license it. And even in France (or many other countries in Europe) they would need to license it as many H.264 patents are filed there too.

It is not just about FOSS advocates at all, it is also about small business and users. How crazy the patent situation can become you can see with H.265 where even the big software, hardware and streaming companies rather join Google’s efforts to create a patent unencumbered codec, than deal the licensing mess of H.265.

2016-11-07 3:36 pm

Alfman verbose=1
Quikee,

If you download the encoder from another country doesn’t mean that you don’t break the law in your country and don’t need to license it. And even in France (or many other countries in Europe) they would need to license it as many H.264 patents are filed there too. [/q]

Legally, that’s a given. However kurkosdr’s point was that people don’t bother to license it like they’re supposed to, they just live under the radar.

[q]It is not just about FOSS advocates at all, it is also about small business and users. How crazy the patent situation can become you can see with H.265 where even the big software, hardware and streaming companies rather join Google’s efforts to create a patent unencumbered codec, than deal the licensing mess of H.265.

Yea, the statement about FOSS guys was a bit ironic. Who is more free: the FOSS guys who want everyone to have & use legally unrestricted codecs, or the crowds who outright ignore the legal restrictions on software because they don’t believe in the restrictions and simply use whatever codec they want? I don’t know, that’s actually a tough call. On the one hand it’s important to be legally free, but on the other hand those who ignore laws might actually have more freedom in practice.
2016-11-07 4:33 pm

kurkosdr
It is not just about FOSS advocates at all, it is also about small business and users. How crazy the patent situation can become you can see with H.265 where even the big software, hardware and streaming companies rather join Google’s efforts to create a patent unencumbered codec, than deal the licensing mess of H.265.

VP9 is patent unencumbered. American software companies like Mozilla use it without paying any royalties to anybody, while for other formats like H.264 they have to pay or arrange for someone else to pay.

So, just think for a moment, since both VP9 and H.265 break bitstream compatibility with H.264, and VP9 is royalty-free while H.265 isn’t, there has to be something in H.265 in order for for-profit companies to pay to use it instead of going for the royalty-free VP9.

In fact, from various samples I ‘ve seen, the best VP9 encoders can’t even beat x264 when it comes to subjective visual quality (aka blind tests). Sure, if you sacrifice encoding time you will get a nice PSNR value from most VP9 encoders that will make you all happy as a benchmarker, but corporations who have to deliver the best quality using the minimum bandwidth will pay for H.264 (where compatibility with existing H.264 decoding circuits is desired) or pay for H.265 (where compatibility with existing H.264 decoding circuits doesn’t apply, such as 4K video) and won’t choose VP9 over H.265.

Look, I like royalty-free as much as the next guy, because it would reduce the cost of hardware like smartphones by a dollar or two, but I am also aware of a little thing called “generation loss” and “lossy compression”, so I do not want to mindlessly move my video collection from VP8 to VP9 to VP10 (yes, google is prepping a VP10) ’till they finally manage to beat the H.26x family of codecs. For me, its H.264 for anything 1080p and below, H.265 for 4K (unless the device records 4K in H.264).

Edited 2016-11-07 16:36 UTC

2016-11-07 5:49 pm

Alfman verbose=1
kurkosdr,

VP9 is patent unencumbered. American software companies like Mozilla use it without paying any royalties to anybody, while for other formats like H.264 they have to pay or arrange for someone else to pay. [/q]

I’m not sure if that’s actually true. Originally VP8 was said to be patent unencumbered, however it turns out that it did infringe and some companies like google were forced to pay royalties, and google’s license agreement explicitly includes VP9.

https://en.wikipedia.org/wiki/WebM

In March 2013, MPEG LA announced that it had reached an agreement with Google to license patents that “may be essential” for the implementation of the VP8 codec, and give Google the right to sub-license these patents to any third-party user of VP8 or VP9.[66][67]

http://appleinsider.com/articles/13/03/07/google-admits-its-vp8webm…

After years of legal maneuvering, Google has now agreed to license the H.264 patents that WebM infringes.

“This is a significant milestone in Googleâ€™s efforts to establish VP8 as a widely-deployed web video format,â€ said Allen Lo, Googleâ€™s deputy general counsel for patents in a press release. â€œWe appreciate MPEG LAâ€™s cooperation in making this happen.â€

The status of VP9 as an unencumbered codec is somewhat unclear on products that don’t have a license from the MPEG-LA. Google’s license covers both VP8 and VP9, but I personally can’t find information about whether google would have to license VP9 alone if it did not also license VP8.

https://www.cnet.com/news/google-urges-fast-adoption-of-vp9-video-co…

Some VP8 techniques are used in VP9, and the agreement covers those usages, Frost said. However, it’s not yet clear whether new patent infringement claims might surface with other techniques in VP9.

In general the reason software patents are so evil is that you can develop the code yourself and genuinely think that you don’t infringe, but patent infringement doesn’t require you to “copy” anything in order for you to be found guilty of it. There are so many patents being filed and issued it’s virtually impossible to develop anything of significance without infringing a patent somewhere.

BTW if you can find any authoritative sources covering the patentability of VP9 specifically, I’d love to see them because I am a proponent of unencumbered software.

[q]So, just think for a moment, since both VP9 and H.265 break bitstream compatibility with H.264, and VP9 is royalty-free while H.265 isn’t, there has to be something in H.265 in order for for-profit companies to pay to use it instead of going for the royalty-free VP9.

There’s a lot of corporate power plays going on. Apple and microsoft are significant patent stakeholders and actually stand to benefit from patent encumbered technology by using them against others. If you want your content to work for IOS users, apple doesn’t give you much of a say. It’s often said that microsoft made a lot more money from android than it did from windows phone because of patents.

That aside, back when it was H.264 versus VP8, my own tests concluded H.264 performed significantly better. It wasn’t even close, but it could have been that VP8 was too immature at that time. I don’t know if a gap still remains now. Does anyone here have experience with H.265 versus VP9?

Edited 2016-11-07 18:03 UTC
2016-11-07 6:15 pm

kurkosdr

I’m not sure if that’s actually true. Originally VP8 was said to be patent unencumbered, however it turns out that it did infringe and some companies like google were forced to pay royalties, and google’s license agreement explicitly includes VP9. [/q]

No, it didn’t, there is no proof VP8 or VP9 infringe on any patent not held by Google. “Infringe” is a technical term. It means that a patent has been found valid and the product makes use of the method described in the patent. A “patent claim” is also a technical term. It means that someone has stepped forth and said “your product infinges my patent No XXX” where XXX is some number.

So, the facts are that nobody has asserted a patent number against VP8 or VP9. Nope. Ziltch. Zero.

BTW, MPEG LA got some money from Google to end their “maybe infringes” fear campaign against the VP codecs, but they have not asserted a single patent number against VP8 and VP9 and nobody else has, so the formats are as unencumbered as PNG is.

Just to be clear: A piece of software is patent encumbered only if a specific patent number has been asserted against it. Period. Google’s backdeals with MPEG LA are irrelevant. As long as MPEG LA has no patent numbers against VP8 or VP9, they can’t get you. And they don’t have that.

BTW, there is always the chance someone else coming forth with some junk patent and asserting the patent against VP8, VP9 or even PNG, just like there is the chance someone asserting a patent against H.264 and not willing to join the MPEG LA pool.

In general the reason software patents are so evil is that you can develop the code yourself and genuinely think that you don’t infringe, but patent infringement doesn’t require you to “copy” anything in order for you to be found guilty of it. There are so many patents being filed and issued it’s virtually impossible to develop anything of significance without infringing a patent somewhere.

Software patents are evil, and the earth is round. Please no more text explaining either of the two. The thing is, I prefer using a superior format (H.265 or H.264) that is patent encumbered because I do not want to make a political statement out of everything. I care about the quality (and bitstream compatibility in the case of H.264) more than political statements. Bad person I am. Just like I prefer using Windows (because I love switchable graphics, the excellent Nvidia Windows drivers and the better power management) instead of GNU/Linux, despite Windows being an evil OS from an evil company. I am a selfish person.

[q] There’s a lot of corporate power plays going on. Apple and microsoft are significant patent stakeholders and actually stand to benefit from patent encumbered technology by using them against others. It’s often said that microsoft made more money from android than it did from windows phone.

Netflix also uses H.265 for their 4K streams, despite the fact they hold no MPEG LA patents and have nothing to gain from using H.265 over VP9, instead they have to pay a license fee. Netflix uses H.265 (for their 4K streams) to deliver video to Android devices too which have VP9 hardware decoding circuitry, so it is not a hardware decoding issue either. So, there is got to be something drawing Netflix (a for-profit company) to the patent-encumbered fee-requiring H.265 instead of the royalty-free VP9, right? (also note that Netflix doesn’t copy video bitstreams from Blu-rays but generate their own bitstreams, so they could have gone with VP9 instead of H.265 if they wanted) Visit tthe wikipedia entry of VP9 for the links where Netflix descibes why.

PS: Allow me not to comment on the silly AppleInsider article. No patent number presented, no discussion about patent-encumberness. I don’t comment on fear campaigns and speculation.

Edited 2016-11-07 18:35 UTC
2016-11-07 7:56 pm

Quikee
Netflix also uses H.265 for their 4K streams, despite the fact they hold no MPEG LA patents and have nothing to gain from using H.265 over VP9, instead they have to pay a license fee. Netflix uses H.265 (for their 4K streams) to deliver video to Android devices too which have VP9 hardware decoding circuitry, so it is not a hardware decoding issue either. So, there is got to be something drawing Netflix (a for-profit company) to the patent-encumbered fee-requiring H.265 instead of the royalty-free VP9, right? (also note that Netflix doesn’t copy video bitstreams from Blu-rays but generate their own bitstreams, so they could have gone with VP9 instead of H.265 if they wanted) Visit tthe wikipedia entry of VP9 for the links where Netflix descibes why.

PS: Allow me not to comment on the silly AppleInsider article. No patent number presented, no discussion about patent-encumberness. I don’t comment on fear campaigns and speculation.

That’s why Netflix is part of AOMedia – because they are happy by paying for H.265.

Regarding VP9 – http://www.streamingmedia.com/Articles/Editorial/Featured-Articles/…

“Streaming Media: Where are you distributing VP9-encoded files? Assume compatible browsers, what about Android or compatible OTT?

Ronca: The primary VP9 targets will be mobile/cellular and 4K.”

Edited 2016-11-07 20:00 UTC
2016-11-07 8:19 pm

Alfman verbose=1
kurkosdr,

No, it didn’t, there is no proof VP8 or VP9 infringe on any patent not held by Google [/q]

Seriously, google has no shortage of top notch lawyers at it’s disposal, yet despite their best efforts google was unable to avoid paying MPEG-LA’s patent royalties. If google can loose, then anybody can!

Or are google’s lawyers that stupid?

BTW, MPEG LA got some money from Google to end their “maybe infringes” fear campaign against the VP codecs, but they have not asserted a single patent number against VP8 and VP9 and nobody else has, so the formats are as unencumbered as PNG is.

The public might not be privy to the details, but I’m sure the lawyers knew. In 2011 MPEG-LA asked it’s members to push forward patents for it’s WebM pool, and in 2013 google became a licensee, that seems straightforward to me. Even from a statistical POV, there are so many damned patents in existence that the odds of not infringing at least some of them is virtually NIL.

Let’s assume you are right, and that VP8 doesn’t infringe on any patents, MPEG-LA’s assertions are baseless, and it was all about FUD. If this were completely true, then the best thing google could do to dispel the FUD is to lead by example, not pay any royalties and tell everyone else to do the same: “We’re not going to pay these trolls and neither should you.”

Conversely the worst thing that google could to clear the FUD is to accept MPEG-LA’s terms and pay royalties on VP8/9. This leads people to “wrongly” believe that the MPEG-LA’s patent pool is in fact valid against VP8/9.

In other words, if it’s true that MPEG-LA’s patent case against VP8 has zero merit, then it makes absolutely no sense to me that google itself is actively playing a role in the conspiracy to make it seem like the MPEG-LA case has merit. It just flies completely against google’s interests for WebM.

To me, it seems far less fishy and more plausible that VP8 was found to infringe some patents and google was forced to pay. There’s no conspiracy needed for these events to make sense. It sucks, it’s inconvenient for the FOSS supporters (including me BTW) who want an unencumbered codec, but that’s a biproduct of our terrible legal system. That’s what we need to change.

[q]Software patents are evil, and the earth is round. Please no more text explaining either of the two. The thing is, I prefer using a superior format (H.265 or H.264) that is patent encumbered because I do not want to make a political statement out of everything. I care about the quality (and bitstream compatibility in the case of H.264) more than political statements. Bad person I am. Just like I prefer using Windows (because I love switchable graphics, the excellent Nvidia Windows drivers and the better power management) instead of GNU/Linux, despite Windows being an evil OS from an evil company. I am a selfish person.

I’m not judging you.
2016-11-07 8:39 pm

TasnuArakun
In other words, if it’s true that MPEG-LA’s patent case against VP8 has zero merit, then it makes absolutely no sense to me that google itself is actively playing a role in the conspiracy to make it seem like the MPEG-LA case has merit. It just flies completely against google’s interests for WebM.

As far as I understand it, no one else would have dared to use VP8 as long as there was a clear possibility of patent infringement since they themselves might get sued. Google paid some money to make the problem go away quickly rather than have to go through a multi-year process.[1]

I remember the story being very different depending on where I was hearing it from. It was quite jarring. On one end you had those who claimed that Google willfullyÂ infringed on the patents for several years and also tried to outmanÅ“uvre an already established standard in favour of its own.[2][3] On the other end you had sites like OSnews that described it as “Google called the MPEG-LA’s bluff, and won”.[4]

[1] http://xiphmont.livejournal.com/59893.html?thread=310261#t310261

[2] http://www.svt.se/nyheter/utrikes/google-brot-mot-upphovs-och-paten… (Swedish)

[3] https://marco.org/2013/03/09/google-webm-infringement

[4] http://www.osnews.com/story/26849/Google_called_the_MPEG-LA_s_bluff…
2016-11-07 9:02 pm

Alfman verbose=1
TasnuArakun,

I remember the story being very different depending on where I was hearing it from. It was quite jarring. On one end you had those who claimed that Google willfully infringed on the patents for several years and also tried to outmanÅ“uvre an already established standard in favour of its own.[2][3] On the other end you had sites like OSnews that described it as “Google called the MPEG-LA’s bluff, and won”.[4]

Yep, you are right. There are lots of mixed signals.

I would have much preferred for google not to negotiate with the MPEG-LA. That way we could have seen the MPEG-LA case play out in court. I just hate that the agreement implicitly puts webm in a state of “encumbrance limbo” for parties who want to integrate webm into technologies that don’t fit under google’s agreements.

Edited 2016-11-07 21:06 UTC
2016-11-07 10:07 pm

TasnuArakun
I went through some old bookmarks and found this gem: http://arstechnica.com/tech-policy/2011/03/report-doj-looking-into-…

I wonder what came of it. Was it dropped after Google agreed to pay a licensing fee?

I also recall someone saying that MPEG-LA’s stance was basically that it was impossible to create a modern video codec that didn’t infringe on their patents â€“ which is why they’d set up a patent pool before even looking.
2016-11-07 10:41 pm

Alfman verbose=1
TasnuArakun,

I went through some old bookmarks and found this gem: http://arstechnica.com/tech-policy/2011/03/report-doj-looking-into-…..

I wonder what came of it. Was it dropped after Google agreed to pay a licensing fee?

I don’t know, but I find it kind of ironic that the federal government would investigate companies for harm caused it’s own system of granting patent monopolies.

I’m not having luck finding details of the VP8/9 agreement other than the generic press release indicating that google would pay for the royalties. I don’t suppose there’s anyone here who has a definitive answer to this but what’s the scope they agreed on? Have any details ever been made public? It’s really unclear to me what projects the VP8 patent sub-license applies to.

VP8 is open source, but if we were to use VP8/9 algorithms in a new codec or a different kind of software, how does that derived work fit into the google / MPEG-LA patent licensing agreement?

Edited 2016-11-07 22:56 UTC
2016-11-07 11:15 pm

Quikee
VP8 is open source, but if we were to use VP8/9 algorithms in a new codec or a different kind of software, how does that derived work fit into the google / MPEG-LA patent licensing agreement?

AFAIK, VP8 and VP9 are covered. VP10 wouldn’t be but they shifted to AV1 anyway so it is not that important. They know the patents and with a larger patent portfolio it shouldn’t be a problem to find a solution.

see also:

http://www.webmproject.org/cross-license/vp8/faq/
2016-11-07 11:43 pm

Alfman verbose=1
Quikee,

Thanks for that link! It specifically answers my questions and is more informative than anything else I’ve read. And it’s from an authoritative source as well… Why didn’t I find this earlier? Bah.

http://www.webmproject.org/cross-license/vp8/faq/

What activities are licensed?

Generally, the license permits the exploitation of the VP8 format in any device, system, software program, method or process, but the licensed field is limited solely to the functions of encoding, decoding, transcoding and playing VP8-format video. More specifically, the license covers only the making, using, selling (including licensing software), offering for sale (including offering software licenses) and importing of the portions of a product, system or computer program, and the steps of any method or process, to implement a VP8-compliant codec, where such activities may be covered by one or more of the licensed patents.

Iâ€™m interested in implementing techniques that are not part of the IETF VP8 Data Format and Decoding Guide RFC 6386. Does this license extend to those techniques?

No. The license does not extend to any portion of a product, system or computer program or to any steps of any method or process that are not necessary to implement a VP8-compliant codec.

Iâ€™m interested in developing a product that can encode and/or decode VP8 Video as well as other formatted video. Does this license extend to encoding and/or decoding the other formatted video as well?

No. The license does not extend to any portion of a product, system or computer program or to any steps of any method or process that are not necessary to implement a VP8-compliant codec, even if the portion, method step or process forms a part of or is integrated with a VP8 codec and even if the VP8 codec also implements those process or method steps. [/q]

So that answers that…

I’m guessing VP9 will be similar, but apparently the patent cross licensing for VP9 isn’t actually completed yet.

[q]Will there be a VP9 cross-license?

Yes, we are in the process of drafting a VP9 cross-license.
2016-11-08 1:46 am

kurkosdr
Will there be a VP9 cross-license?

Yes, we are in the process of drafting a VP9 cross-license.

And yet everybody is using VP9 and selling hardware and software implementations in the USA, and MPEG LA hasn’t sued anybody and hasn’t asserted a single patent number.

Google paid MPEG LA some money to drop their FUD campaign to avoid a a death of the format from lack of adoption from software and hardware vendors due to fear of some potential lawsuit.

I agree that Google negotiating with the MPEG LA gave MPEG LA’s claims a small amount of cred, but the fact everybody is using VP9 even without an official “cross licensing” speaks a lot. MPEG LA has no patents against VP8 or VP9 and it was all a FUD campaign to kill the format in its infancy. Good thing the DOJ took a look at this whole “we have patents but we won’t tell you which” that the MPEG LA was pulling when VP8 was announced.

Edited 2016-11-08 01:50 UTC
2016-11-08 2:29 am

Alfman verbose=1
kurkosdr,

And yet everybody is using VP9 and selling hardware and software implementations in the USA, and MPEG LA hasn’t sued anybody and hasn’t asserted a single patent number.

…

I agree that Google negotiating with the MPEG LA gave MPEG LA’s claims a small amount of cred, but the fact everybody is using VP9 even without an official “cross licensing” speaks a lot. MPEG LA has no patents against VP8 or VP9 and it was all a FUD campaign to kill the format in its infancy. Good thing the DOJ took a look at this whole “we have patents but we won’t tell you which” that the MPEG LA was pulling when VP8 was announced.

Obviously that’s what the agreement was: google will cover the royalties, and MPEG-LA won’t go after VP8/9. That agreement is already in effect. The patent cross licensing agreement being drafted is google’s agreement with us, not the MPEG-LA.

You know, this could be a case of the ends justifying the means. As long as WebM is royalty free for us, then who cares that google’s paying the patent royalties for it? It’s an interesting question.
2016-11-07 9:32 pm

Quikee
“Google called the MPEG-LA’s bluff, and won”

But that’s true – isn’t it.

Google did the best thing possible – they settled and got a license which allowed to sub-license the patents to everyone that used the codec. If they would drag it to the court it would take a long time and cost a big amount of money to disprove that the VP8 doesn’t infringe those patents. The case would also bury VP8 and VP9 in the mean time because of patent uncertainty.
2016-11-07 7:43 pm

Quikee
VP9 is patent unencumbered. American software companies like Mozilla use it without paying any royalties to anybody, while for other formats like H.264 they have to pay or arrange for someone else to pay. [/q]

And I didn’t say otherwise.

So, just think for a moment, since both VP9 and H.265 break bitstream compatibility with H.264, and VP9 is royalty-free while H.265 isn’t, there has to be something in H.265 in order for for-profit companies to pay to use it instead of going for the royalty-free VP9.

Sure – H.265 is a better codec than VP9. But large for-profit companies like Netflix, Microsoft and Cisco recognized the bad licensing situation around H.265. Not to mention that various multimedia companies suddenly started making their own VP9 encoders and hardware companies added VP9 support to their chips. So it is clearly a good sign there is something wrong.

In fact, from various samples I ‘ve seen, the best VP9 encoders can’t even beat x264 when it comes to subjective visual quality (aka blind tests).

Bullshit – read last Netflix codec comparison where they did objective (using their new machine-learning based VMAF metric) and subjective tests.

Sure, if you sacrifice encoding time you will get a nice PSNR value from most VP9 encoders that will make you all happy as a benchmarker, but corporations who have to deliver the best quality using the minimum bandwidth will pay for H.264 (where compatibility with existing H.264 decoding circuits is desired) or pay for H.265 (where compatibility with existing H.264 decoding circuits doesn’t apply, such as 4K video) and won’t choose VP9 over H.265.

Sure, I would agree – but the situation currently is not like this. H.265 licensing situation is a mess. For a streaming company, even after paying 50M$+ per year, which cuts largely into company’s profits, is still not safe that a company like Technicolor won’t come and demand some more on top (Technicolor withdrew from HEVC Advance patent pool after they changed the licensing).

With this kind of uncertainty VP9 actually looks like a good alternative – at least you can calculate the cost (or sacrifice the quality for bandwidth). It has some support in hardware, quite good support in browsers, single-threaded encoding is fast so if you paralellize per video or chunk up encoding for DASH streaming the speed is acceptable (YouTube for example).

If a company like Apple doesn’t want to advertise H.265 support this should make you think.

[q]Look, I like royalty-free as much as the next guy, because it would reduce the cost of hardware like smartphones by a dollar or two, but I am also aware of a little thing called “generation loss” and “lossy compression”, so I do not want to mindlessly move my video collection from VP8 to VP9 to VP10 (yes, google is prepping a VP10) ’till they finally manage to beat the H.26x family of codecs. For me, its H.264 for anything 1080p and below, H.265 for 4K (unless the device records 4K in H.264).

1. VP10 won’t happen – I’m quite surprised you didn’t don’t know about AOMedia and AV1 codec. That explains a lot.

2. Sure, re-encoding doesn’t make sense unless you have the original available.

3. For a backup x264 at a higher bitrate is probably the best choice. At lower bitrates it quickly falls apart (compared to x265 and VP9).

4. I’m not really interested and don’t look from a standpoint of a home user, but from a professional user or company and streaming companies viewpoint and potentially real-time video (teleconference) companies viewpoint.

2016-11-06 1:24 am

galvanash
Not that the rest of the article isn’t an interesting overview of compression, but the png is only 568kb, not 1015kb. Maybe Apple changed it since the article was written?

Also, the video is literally just 3 still images with a quick slide animation between them – there is very little motion in it to begin with, and it is also much lower resolution than the png. About 80% of it is just playing the same 3 frames over and over again…

Any old codec, even DiVX, would be able to compress it pretty well, maybe not quite as well – but close.

He would be better served calling this article “Video Compression is Magic”, because very little of what he describes is specific to h.264, and most of it predates it by decades…
2016-11-06 3:36 am

Alexey Technologov
Well written, and nice, but I would like to read more about quantization. I read several articles about it, and I still don’t fully understand it.

I do know it is the basis for ALL modern codecs, image, audio and video, from JPEG to MPEG to HEVC, VP9, MP3, Ogg Vorbis and more…

But a nice read. And yes, this article is generic about any video, so I agre with another guy saying to rename it to “Video codecs are magic”.

Edited 2016-11-06 03:39 UTC

2016-11-06 10:28 am

evert
Fourier transformations. Unfortunately I never learnt how to use them and I’m a bit too busy for it now.
2016-11-06 1:21 pm

TasnuArakun
I feel the article fails to properly explain the quantization step. Quantization is not about throwing away high frequencies. It’s where yo take your sampled value, in this case the amplitudes of the frequency components, and round them of to fit within your limited range of values. For 8-bit that’s one of 256 possible values. By reducing the number of possible values further you get even more compression. I don’t know how it works in H.264 (is the “frequency domain mask” really a thing?) so I’ll describe how JPEG does it. During the quantization step it uses a quantization matrix. Each frequency component is divided by the corresponding value in the quantization matrix. The divisor is larger for higher frequencies. This will reduce the size of the values and many of them will be rounded to zero. The zeroes can then be efficiently compressed using run-length encoding (like in the heads and tails example). It’s the values in the quantization matrix you control when you slide the quality slider in your image editor.

Wikipedia has a very good description of the steps that are involved in generating a JPEG image. https://en.wikipedia.org/wiki/JPEG#Encoding

2016-11-06 9:04 pm

kurkosdr
I feel the article fails to properly explain the quantization step. Quantization is not about throwing away high frequencies. It’s where yo take your sampled value, in this case the amplitudes of the frequency components, and round them of to fit within your limited range of values. For 8-bit that’s one of 256 possible values. By reducing the number of possible values further you get even more compression. […] [/q]

Excellent info, I will add some more info so that “Alexey Technology” can understand it for sure. I will use ONLY layman’s terms, in order to explain the concept of quantization once and for all. All you need to understand my post is the idea of moving to the frequency domain (“doing DCT” as it is called). I cannot explain the concept of the frequency domain/DCT in a comment box, sorry. But everything else is layman’s terms.

[q]Well written, and nice, but I would like to read more about quantization. I read several articles about it, and I still don’t fully understand it.

General concept:

First, we move from the spatial domain (aka brightness values) to the frequency domain. Then, during lossy compression, the higher-frequency values are not discarded, they are just recorded with lower accuracy. How do you “record with lower accuracy”? I ‘ll explain:

Method:

The move to the frequency domain happens in blocks and NOT for the whole image as the dangerously misleading article Thom posted suggests (aka, the image is broken into blocks). Let’s assume a block size of 8×8. Let’s assume we are processing a single block and we have already moved to the frequency domain.. So… JPEG (and H.264) divide (with integer division) the resulting 8×8 block element-by-element with a “quantization matrix”. Here is how a “quantization matrix” looks like btw[1] By doing an integer division of the block by that matrix, you drop accuracy. For example, let’s say that the four elements (frequency components) at the bottom right corner of the 8×8 block are 113, 115, 118, 116. Now look at the bottom right corner of our quantization matrix[1]. Those four values will be divided by 56, 69, 69, 83 accordingly. So, the result will be 2, 1, 1, 1. Of course, during decoding you have to multiply with the quantization matrix (element-by-element) to “restore” the values. The “restored” values will be 112, 69, 69, 83. Notice how accuracy was lost. This happened because of the integer division.

Also, notice how, after quantization, we now have three same values 1,1,1 Hello run-length encoding! After quantization, we scan the table with the infamous zig-zag pattern[2] to put the elements of the (quantized) table into a series. Notice how our three similar values (1,1,1) will be grouped together by that pattern.

(that moment when you realise the horrible article didn’t even mention the zig-zag pattern).

Also, after run-length happens, all compression standards use a special scheme to encode those integers (value and length) where low integers (0,1,2,3) use the minimum number of bits. Variable-length coding we call it.

The savings quickly add up.

—

It is also worth noting that most pictures don’t have much in the high-frequency components anyways, so most high-frequency components (sometimes even half the block) will become zero after division.

—-

As another guy said, most encoders allow you to choose different quantization tables (which more often than not is a single quantization table which has all of its elements multipled by a “QP” factor to generate many tables). This allows encoders to have a “quality setting”.

—

Article also doesn’t mention that after “motion estimation” has been done, an “error” is calculated. Since motion estimation is nothing more than a dumb copy paste from another position in another frame (but the same size) some error will exist, which is essentially values that follow the same compression principles as outlined above (but with a different quantization matrix).

—

[1] http://images.slideplayer.com/25/8083681/slides/slide_6.jpg

[2] http://lh3.googleusercontent.com/-T_6oixAuBjs/Vd8ojQIbjLI/AAAAAAAB3…

Edited 2016-11-06 21:24 UTC

2016-11-06 9:59 am

nicubunu
Sorry, but I couldn’t read this article, is so full of bull. I managed to get somehow past the overly-exaggerated intro up to the image comparison.

Seriously, a perfect looking PNG versus a horrible video with hideous compression artifacts and poor image quality (protip: save the PNG as JPEG with 60-70 compression ration, file sze in much smaller, image quality much better compared with H.264). After this, I just can’t take the author seriously.

2016-11-06 11:16 am

Quikee
I agree.

His analogies are bad, leaves important things out (intra prediction), no high level encoding diagram or other diagrams and most things are actually JPEG (the real magical format that is still relevant today – 25 years after it was released) with a bit of MPEG2 at the end. There is nothing about H.264 or what it makes this format magical.

He should’ve describe it in a better order: start with simple general purpose coding, then still image coding and human perception with describing techniques of JPEG in the order they are encoded, then go to video and describe techniques that are unique for video (with examples from MPEG2) and lastly describe what H.264 does on top.

Now it just feels really sloppy and deceives about H.264.

2016-11-06 1:41 pm

TasnuArakun
I feel the same thing: I’m not terribly impressed by this article. It’s all quite basic stuff and it could just as well have been describing MPEG-1. I’m pretty sure H.264 has a lot more interesting things in it’s toolbox. I’d like to know about all the stuff that’s been added over the years that make H.264 more efficient than MPEG-1.

I also don’t like how it handwaves the more complicated stuff by calling it “mindfuck”. By the way, I didn’t think run-length encoding counted as entropy coding and I think he is mixing talk about the frequency domain and the sampling theorem in a confusing way.

For a better introduction to digital media I urge people watch Christopher “Monty” Montgomery’s videos: https://www.xiph.org/video/ . I really wish he could do one on DCT and MDCT though. MDCT still seems like some kind of evil dark magic to me.

2016-11-06 7:28 pm

Alfman verbose=1
I agree with all the criticisms of the article, however that aside I’d just like to be on record saying that I do like when these kinds of articles are posted. It’s a refreshing change away from IOS/android/MS/etc that dominate the headlines. So, more articles from left field please
2016-11-06 8:50 pm

dpJudas
By the way, I didn’t think run-length encoding counted as entropy coding and I think he is mixing talk about the frequency domain and the sampling theorem in a confusing way.

In JPEG, entropy coding is huffman + significant bits of 16 bit integers. Strictly speaking not RLE, but I would put both in the same family of compression techniques, with RLE being the inferior method.

I do agree with the general critique of this article not having anything directly to do with H264 though. More like a primer on the basic ideas used in lossy image and video compression. It also leaves out half the important stuff, like predictors and other transforms.

2016-11-07 1:44 am

j-beda
Captain Disillusion had a bit of this in his “Reptilian Bieber-mosh” explination of how video artifacts give rise to people looking like lizards. His imagery of how the p-frame “wears” the i-frame like a skin is sort of cute:

https://www.youtube.com/watch?v=flBfxNTUIns
2016-11-07 3:36 pm

Milo_Hoffman
https://www.reddit.com/r/programming/comments/5b31gt/h264_is_magic/