Heath Firestone
Issue: October 1, 2007

THE TRUTH ABOUT CODECS - PART 1

It’s a bit of a challenge staying on top of all the new formats and codecs used in our industry. In the past, there have been a few mainstay codecs, such as MPEG-2, DV, HDV and DVCPRO HD, but now several new codecs are emerging, including variants on MPEG-4, MJPEG2000, RedCode, DNxHD, ProRes and others. Here’s what you need to know about these codecs, especially the ones used in high definition.

INTRAFRAME VS. LONG GOP

Intraframe (spatial) compression is a form of compression in which an entire image can be reconstructed from the information contained within that image alone (single-image compression). Frames compressed this way are generally referred to as I-frames.

Long GOP (temporal) compression has I-frames as well, but also uses partial frames that store only the differences between themselves and their neighboring frames. These are the P-frames, which are predicted from an earlier frame, and the B-frames, which can reference frames both before and after them. Long GOP compression is used on DVDs and in HDV. DVDs generally use a 15-frame GOP (Group of Pictures), which means that there is one I-frame every 15 frames, or roughly every half second at NTSC frame rates. The other frames in the GOP structure have less bandwidth allocated and rely on their nearest I- or P-frames to reconstruct the full picture.
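To make the 15-frame GOP concrete, here is a minimal Python sketch that lays out the frame types in one GOP. The spacing of one P-frame every third frame is an assumption chosen for illustration (it is a commonly cited arrangement, not something specified above), and the function name and parameters are hypothetical.

```python
def gop_frame_types(gop_length=15, p_spacing=3):
    """Return the frame types of one GOP in display order.

    gop_length: frames from one I-frame up to (not including) the next.
    p_spacing:  assumed distance between anchor (I or P) frames.
    """
    types = []
    for i in range(gop_length):
        if i == 0:
            types.append("I")      # the only self-contained frame in the GOP
        elif i % p_spacing == 0:
            types.append("P")      # predicted from an earlier anchor frame
        else:
            types.append("B")      # predicted from anchors on both sides
    return "".join(types)


pattern = gop_frame_types()
print(pattern)  # IBBPBBPBBPBBPBB

# Time between I-frames at the NTSC rate of 29.97 frames per second.
seconds_between_i_frames = 15 / 29.97
print(round(seconds_between_i_frames, 3))  # 0.501
```

Only one frame in fifteen is stored whole in this layout, which is where the bandwidth savings come from.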

While Long GOP offers up to three times more efficient compression, it comes at the cost of requiring more processing power. It also cannot be edited without first being converted to full frames. This can be done through conforming or conversion to an intraframe codec. In rare instances, an editing application may do the conversion in realtime while storing the Long GOP files in their native format.

The other potential drawback is blocky compression artifacts, which appear when too much of the image changes from one frame to the next for the limited bandwidth allocated to the B- and P-frames to reconstruct the frame accurately. This, however, is less of a problem with higher data rates, shorter GOP structures and variable bit rate (VBR) encoding, which allows more bandwidth to be allocated when needed to counter these types of problems.
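The link between how much of the image changes and how much data a partial frame needs can be illustrated with a deliberately simplified Python sketch. It stores only the pixels that differ from the previous frame. Real Long GOP codecs use motion-compensated prediction rather than this naive per-pixel difference, so treat the functions and the sample data as hypothetical.

```python
def frame_delta(prev, curr):
    """List (index, new_value) for every pixel that changed between two frames."""
    return [(i, c) for i, (p, c) in enumerate(zip(prev, curr)) if p != c]


def apply_delta(prev, delta):
    """Rebuild the current frame from the previous frame plus the stored changes."""
    frame = list(prev)
    for index, value in delta:
        frame[index] = value
    return frame


previous = [10, 10, 10, 10, 20, 20, 20, 20]
quiet_frame = [10, 10, 10, 10, 20, 20, 21, 20]   # almost nothing moved
busy_frame = [11, 9, 12, 8, 25, 18, 30, 2]       # everything changed

small = frame_delta(previous, quiet_frame)
large = frame_delta(previous, busy_frame)

print(len(small))  # 1
print(len(large))  # 8
```

When the delta grows to the size of the whole frame, a fixed bandwidth budget can no longer hold it accurately, which is the situation that produces visible artifacts.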

DCT & WAVELET

There are basically two types of compression used in intraframe compression: wavelet and Discrete Cosine Transform (DCT). Most of the codecs employed for professional video use DCT compression. There is no simple explanation for how this works. DCT generally operates on 16x16 or 8x8 pixel macroblocks, which are processed by a matrix-based compression algorithm. Wavelet compression, by comparison, uses a wavelet-based algorithm that converts the pixels of the image into coefficients at several scales, which are then quantized and coded, making it an ideal solution for scalability.
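For readers who want to see the block transform itself, the following Python sketch computes a two-dimensional DCT on a single 8x8 block using only the standard library. It is the textbook orthonormal form of the transform, not the implementation of any particular codec, and it omits the quantization step that actually discards data.

```python
import math

N = 8  # block size in pixels


def dct_2d(block):
    """Textbook orthonormal type-II 2D DCT of an N x N block."""
    def alpha(k):
        # Scaling factors that make the transform orthonormal.
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


# A perfectly flat gray block: all of its energy lands in one coefficient.
flat = [[128] * N for _ in range(N)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 3))  # 1024.0
```

A flat block collapses to a single non-zero value, while the other 63 coefficients come out as (numerically) zero. Smooth image areas behave similarly, which is why the transform makes them cheap to store.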

The biggest difference in compression artifacts using wavelet compression versus DCT-based compression is that instead of seeing blocky artifacts, there is a softening of edges.
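The softening effect can be seen in a small Python sketch using the Haar wavelet, the simplest wavelet there is. This is an assumption made for illustration only: production wavelet codecs use longer filters and several decomposition levels. The sketch splits a row of pixels into pair averages and pair differences, then rebuilds the row with the differences thrown away.

```python
def haar_forward(signal):
    """One level of a Haar transform: pair averages and pair differences."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details


def haar_inverse(averages, details):
    """Rebuild the row from averages and differences."""
    out = []
    for avg, det in zip(averages, details):
        out.extend([avg + det, avg - det])
    return out


row = [10, 10, 10, 90, 90, 90, 90, 90]   # a hard edge between dark and bright
averages, details = haar_forward(row)

exact = haar_inverse(averages, details)
softened = haar_inverse(averages, [0.0] * len(details))  # detail discarded

print(exact)     # [10.0, 10.0, 10.0, 90.0, 90.0, 90.0, 90.0, 90.0]
print(softened)  # [10.0, 10.0, 50.0, 50.0, 90.0, 90.0, 90.0, 90.0]
```

With the detail coefficients dropped, the hard jump from 10 to 90 becomes a step through 50: the edge is blurred rather than broken into blocks.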

OTHER TECHNIQUES

Many of these codecs also employ chroma subsampling, run-length encoding (RLE) and entropy encoding such as Huffman coding.

Chroma subsampling means that the luma (brightness) information is stored for every pixel, but the image has half (4:2:2) or one-quarter (4:1:1 and 4:2:0) chroma (color) resolution. For the pixels without a color sample, the image assumes the same color information as the last color-sampled pixel, with the stored luminance value applied. This is one of the reasons 4:1:1 sampling creates difficulties for composites: it causes stair-step artifacts along the edges between the subject and the greenscreen.
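The effect on an edge can be shown with a short Python sketch that subsamples one row of chroma values horizontally and then rebuilds it by repeating the last stored sample, as described above. The function names and sample values are hypothetical, and the sketch covers only horizontal subsampling (4:2:0 also reduces vertical chroma resolution, which is not modeled here).

```python
def subsample_chroma(chroma_row, factor):
    """Keep one chroma sample out of every `factor` pixels."""
    return chroma_row[::factor]


def rebuild_chroma(samples, factor, width):
    """Repeat each stored sample across the pixels it covers."""
    row = []
    for sample in samples:
        row.extend([sample] * factor)
    return row[:width]


# One row of chroma: five "subject" pixels, then seven "greenscreen" pixels.
chroma = [200] * 5 + [50] * 7

row_422 = rebuild_chroma(subsample_chroma(chroma, 2), 2, len(chroma))
row_411 = rebuild_chroma(subsample_chroma(chroma, 4), 4, len(chroma))

print(chroma.index(50))   # 5  (true edge position)
print(row_422.index(50))  # 6  (edge off by one pixel)
print(row_411.index(50))  # 8  (edge off by three pixels)
```

The luma edge stays where it was, but the color edge drifts further from it as the sampling gets coarser, which is what a keyer sees as a ragged matte.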

RLE encoding is a quick, lossless compression, which works by turning areas where several pixels in a row have the same value into a compressed version. To visualize this, assume 15 pixels in a row have the value of A, then the next five have a value of B, and the next 10 have a value of C. Rather than have 30 individual values, it would reduce the info to 15A5B10C. In many situations this has little effect on reducing the size. However, if you are exporting an image with alpha information, a great deal of that image might have the same value, being completely transparent. In these situations, RLE compression can have a dramatic effect on file size.
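The 15A5B10C example translates almost directly into Python. This is a minimal sketch that assumes pixel values are single non-digit characters; real file formats pack the counts and values into bytes instead of text.

```python
import re
from itertools import groupby


def rle_encode(values):
    """Collapse each run of identical values into '<count><value>'."""
    return "".join(f"{len(list(run))}{value}" for value, run in groupby(values))


def rle_decode(encoded):
    """Expand '<count><value>' pairs back into the original sequence."""
    return "".join(value * int(count)
                   for count, value in re.findall(r"(\d+)(\D)", encoded))


pixels = "A" * 15 + "B" * 5 + "C" * 10
encoded = rle_encode(pixels)
print(encoded)  # 15A5B10C

# A fully transparent row of 1,000 pixels (T standing in for "transparent").
transparent_row = "T" * 1000
print(rle_encode(transparent_row))  # 1000T
```

Thirty values shrink to eight characters, and the transparent row shrinks from 1,000 values to five characters. A row where every pixel differs from its neighbor would actually grow, since each value would gain a count of 1.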

Entropy encoding is a lossless step that assigns shorter codes to the values that occur most often and longer codes to rare ones; Huffman coding is a common example. A related but separate technique, palettization, reduces image size by reducing the number of colors in the image. This is done by creating a customized palette, which represents the most commonly used colors in the image. An example of this would be if a 24-bit image was reduced to an 18-bit image by creating a palette of the 262,144 most commonly used colors. This works because most frames don’t really need every color to create an image. For example, a mostly red image probably doesn’t need to reference too many shades of blue, though it may need every shade of red to avoid posterizing artifacts, in other words, to maintain a smooth red gradient.
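The palette reduction described above can be sketched in a few lines of Python. This is one simple way to do it, assuming the palette is just the most frequent colors and every other pixel snaps to its closest palette entry; the function names and the tiny sample image are hypothetical, and real encoders may choose palettes differently.

```python
from collections import Counter


def build_palette(pixels, size):
    """Pick the `size` most commonly used colors in the image."""
    return [color for color, _ in Counter(pixels).most_common(size)]


def nearest(color, palette):
    """Palette entry with the smallest squared RGB distance to `color`."""
    return min(palette,
               key=lambda entry: sum((a - b) ** 2 for a, b in zip(color, entry)))


def palettize(pixels, size):
    """Return the palette and, for each pixel, an index into that palette."""
    palette = build_palette(pixels, size)
    indices = [palette.index(nearest(pixel, palette)) for pixel in pixels]
    return palette, indices


# A mostly red "image": two common reds and one rare near-red.
image = [(255, 0, 0)] * 6 + [(200, 0, 0)] * 3 + [(250, 5, 5)]
palette, indices = palettize(image, 2)

print(palette)  # [(255, 0, 0), (200, 0, 0)]
print(indices)  # [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
```

Each pixel now needs only a small index instead of a full color value, at the cost of the rare near-red being replaced by its closest common neighbor. With too few palette entries, that substitution is what shows up as posterizing in a gradient.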

MORE TO COME

Next month, Post will discuss which common codecs are DCT- and wavelet-based, and what the advantages and disadvantages of each codec are. We’ll also explain what this means when it comes to production and post quality and workflow.

Heath Firestone is a Producer/Director with Firestone Studios in Denver, CO. He can be reached at: heath@firestonestudios.com.