DV.com - Inspiring and Empowering Creativity
Ben Waggoner
Codecs and more...
This article provides both a qualitative and quantitative look at the current generation of codecs designed to work with the three leading Web video architectures: Apple's QuickTime, Microsoft Windows Media, and RealNetworks RealVideo. I examine the codecs one would actually consider using today, and ignore several alternate, pending, and older codecs. DV will examine MPEG-4 codecs in a future issue, when more ISO-compliant MPEG-4 codecs are available. The inexplicably popular DivX codec is covered in the More Codecs sidebar on page XX.
In this evaluation I compare each codec's ability to handle different real-world content at several delivery data rates. I measured a wide variety of parameters including encode speed, platform requirements, quality at a given bit rate, and color accuracy. Given the variety of uses for any codec, I didn't assign final overall star ratings. However, it's clear not all codecs are created equal. An important part of the evaluation is on DV.com. There you can view all the compressed test files yourself.
Methodology
I had each codec compress four different source clips, each at two starting resolutions: 640x480 pixels and 320x240 pixels. For QuickTime, I started with lossless QuickTime Photo-JPEG files. For Real and Windows Media, I used Huffyuv-encoded AVI files. Both the Photo-JPEG and Huffyuv files are encoded at 4:2:0. They use the native color space of the various codecs, and have similar decode speeds. But creating different files for different architectures was a hassle (see The Pain of Intermediate Files for detail).
Each source was precisely 60 seconds long and contained no audio. Since these tests were meant to solely test codec quality, I preprocessed the files myself to exclude any variable effects from built-in preprocessing by the various vendors' own compression tools.
I encoded to three different delivery data rates: 30kbps at 320x240 pixels, 200kbps at 320x240 pixels, and 800kbps at 640x480 pixels.
Codecs ship with very different default settings. To achieve a level playing field, I tested all codecs in their highest quality mode in order to determine their native quality as well as possible. Where appropriate, I provide metrics and sample files for different modes.
How I compressed
Once I had the source files in the formats and sizes I needed, I was ready to start compressing. I performed all final encoding, including QuickTime encoding, on a dual-processor 1GHz Pentium III, giving me a common base to compare compression speeds. Since Apple and the QuickTime developers did such a good job creating a cross-platform architecture and cross-platform codecs, it didn't compromise the image quality of the QuickTime codecs when compared to files compressed on Macs. Yes, I checked.
For RealProducer Plus and Windows Media Encoding Utility I wrote DOS batch files to automate encoding and insertion of time stamps. The Windows version of QuickTime Player Pro doesn't support command-line or DCOM scripting, although the Mac version supports AppleScript in a much more robust way than either Real or Microsoft do on Windows. The API for Apple's QuickTime for Windows appears to provide enough tools to make writing a command-line encoding application a relatively straightforward project for an experienced programmer; hopefully someone will create one soon.
Instead, I used Media 100 Cleaner 5.1 batch files to automate QuickTime encoding on Windows. I ran Cleaner in Minimize Preview mode, with all the graphical displays in the Output window closed. As an alternative, I could have encoded from QuickTime Player Pro and timed with a stopwatch, but the encode times with Cleaner and QuickTime Player Pro were the same for long encodes, and many files were short enough that hand timing made it difficult to get consistent results. You can download my batch files, settings files, and logs from DV.com.
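For readers who want to reproduce this kind of scripted, timed encoding without DOS batch files, here is a minimal Python sketch of the idea. It is not the batch setup used for this article: the job labels are hypothetical, and the command in each job is only a placeholder that starts the Python interpreter, to be replaced with whatever command line your encoder actually accepts.

```python
import subprocess
import sys
import time


def timed_run(command):
    """Run one command to completion and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(command, check=False)
    return completed.returncode, time.perf_counter() - start


def run_batch(jobs):
    """Run each (label, command) pair in order and collect timing results."""
    results = []
    for label, command in jobs:
        code, seconds = timed_run(command)
        results.append({"label": label, "exit_code": code, "seconds": seconds})
    return results


if __name__ == "__main__":
    # Placeholder jobs: each one just starts the Python interpreter and exits.
    # Substitute the real encoder command line for each clip and data rate.
    placeholder = [sys.executable, "-c", "pass"]
    jobs = [("film_200kbps", placeholder), ("talking_head_200kbps", placeholder)]
    for row in run_batch(jobs):
        print(f"{row['label']}: exit {row['exit_code']}, {row['seconds']:.2f}s")
```

Timing the whole process this way includes launch overhead, which matters little for long encodes but can skew very short ones.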
Frame quality was given the maximum emphasis over frame rate. If a codec allowed frame dropping in order to maintain data rate, I enabled that option.
The two delivery modes I targeted were progressive download, and RTSP (Realtime Streaming Protocol) or MMS (Microsoft Media Server) streaming. For formats that supported both modes, I evaluated both. For those that targeted only one (e.g., RealVideo for RTSP and VP3 for progressive), I targeted just that one.
The progressive download settings used full-file 2-pass VBR (variable bit-rate) encoding if the codec supported it.
The different formats make different assumptions about what's best for realtime streaming. For example, Windows Media's realtime buffer defaults to five seconds of media, but Real defaults to seven seconds. So for RTSP codecs that support buffering, I set the buffers to six seconds. For all codecs, I set one keyframe every 10 seconds.
I evaluated each codec's output on several computers running a variety of processors and operating systems. The quality evaluations were based on viewing the moving file, not on staring at still frames.
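To put those settings in concrete terms, the short Python sketch below works out how much data a six-second buffer holds and how many frames separate keyframes spaced 10 seconds apart. It assumes 1kbps means 1,000 bits per second and treats video as an even 30fps; the function names are illustrative only.

```python
def buffer_size_bytes(data_rate_kbps, buffer_seconds):
    """Bytes a streaming buffer holds at a given average data rate."""
    return data_rate_kbps * 1000 * buffer_seconds // 8


def keyframe_interval_frames(frame_rate_fps, seconds_between_keyframes):
    """Number of frames from one keyframe to the next."""
    return round(frame_rate_fps * seconds_between_keyframes)


for rate in (200, 800):
    print(f"{rate}kbps, 6-second buffer: {buffer_size_bytes(rate, 6)} bytes")

for fps in (24, 30):
    print(f"{fps}fps, keyframe every 10s: {keyframe_interval_frames(fps, 10)} frames")
```

At 200kbps the buffer works out to 150,000 bytes, and a 10-second keyframe spacing means 240 frames for 24fps film and 300 frames for 30fps video.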
The source files
I used four source files for this evaluation, one with film-originated footage, one of computer-generated motion graphics, one of a DV-sourced talking head, and one of a found-footage art project.
The film-originated sequence attempts to mimic the spatial and temporal characteristics of a typical feature film with careful lighting and smooth camera moves. The sequence was assembled from digital files provided by ArtBeats (www.artbeats.com). All source was shot on film, and the intermediates were inverse-telecined via After Effects to their native 24fps. I added in a one-second cross dissolve and some rolling credits at the end.
The film sequence was one of the easiest to encode. The lower 24fps frame rate means 25 percent more bits can be applied per pixel when compared to 30fps source at the same data rate. The content itself is quite clean even though it has significant motion. However, that motion has the substantial motion blur of film, another reason film is so easy to encode.
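The frame-rate arithmetic is easy to check. At a fixed data rate, each frame of 24fps material gets 30/24 = 1.25 times the bits of a 30fps frame. The Python sketch below shows the calculation for a 200kbps, 320x240 encode; it assumes 1kbps means 1,000 bits per second and spreads the bits evenly over every pixel of every frame, which real codecs do not do, so treat the figures as averages only.

```python
def bits_per_pixel(data_rate_kbps, width, height, fps):
    """Average bits available per pixel per frame at a given data rate."""
    return data_rate_kbps * 1000 / (width * height * fps)


film = bits_per_pixel(200, 320, 240, 24)
video = bits_per_pixel(200, 320, 240, 30)

print(f"24fps film:  {film:.4f} bits per pixel")
print(f"30fps video: {video:.4f} bits per pixel")
print(f"film advantage: {film / video:.2f}x")
```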
The motion graphics sequence was quite hard to compress because of the clip's sharp edges, saturated colors, and large amount of overall motion. All the clips except the color bars were rendered from projects in Chris and Trish Meyer's excellent book, Creating Motion Graphics with After Effects (CMP Books, 2000). I changed the original project rendering settings to 640x480 pixels and progressive scanning.
The color bars give an easy way to calibrate color shifts or distortions caused by the codecs. I placed them at the start of the sequence to ensure that the codecs wouldn't be busy encoding high-motion images and would have the maximum data rate available for encoding.
The talking head clip is typical of its kind: just a guy talking, without any complex edits. The footage was shot by DV contributor Bruce Johnson for a documentary by Senior Editor Jim Feeley. This is the easiest kind of content to compress. However, the general lack of motion tests a codec's ability to make new keyframes that match the delta frames.
The fast-motion sequence was created as an art project, not a codec buster. The video was created by Carrie Hazelwood and entered in DV's 2000 MediaMasters contest. However, this artistic combination of found video shot off a TV screen serves as an excellent example of content not optimized for Web video bandwidth. The only thing that would make it more complex would be if it were in color, not black and white.
Windows Media Video 8
Microsoft's Windows Media (www.microsoft.com/windowsmedia) offers a limited number of codecs and no third-party codecs. But what it does offer is excellent. Microsoft currently offers these Windows Media video codecs: MS MPEG-4v3, Windows Media Video V7, Windows Media Video V8, ISO MPEG-4, and Windows Media Screen. I focused on Windows Media Video V8, the codec Microsoft says looks the best.
The Windows Media V8 codec (WMV8) is included in the current Windows Media Player (at this writing, version 7.1 for Windows and 7.01 for Mac). Versions of the Windows Media Player back to version 6.4 can automatically download the new codec when they encounter V8 encoded material, as long as the user has administrative access and isn't behind an aggressive firewall.
WMV8 is an all-around excellent codec. The latest version of the Windows Media codec supports some new features, including both one-pass CBR (constant bitrate) and VBR (variable bitrate) modes. In Microsoft parlance, CBR is the streaming optimized mode, and VBR the progressive/LAN optimized mode.
You can specify a maximum buffer size in the CBR mode, but not in VBR, since VBR distributes bandwidth over the whole file. You can access the one-pass modes in Microsoft's free GUI encoder, Windows Media Encoder 7.1.
However, Microsoft didn't add two-pass VBR support to the current SDK. I presume some architectural decisions early in the Windows Media development painted them into a corner. Instead, they produced a free command-line encoder, Windows Media 8 Encoding Utility. Yes, that's command line, as in C:\> DOS prompts. The good news is that many different GUI utilities have been created to automatically generate the command line code. I like WM8GUI.EXE, which you can find at DV.com.
During playback WMV8 can apply one of four different levels of post-filtering processing, depending on how much excess CPU power is available in the client machine. As extra CPU power increases, WMV8 applies an increasing amount of deblocking. At the two highest levels WMV8 adds a de-ringing filter, making the edges of text look better.
While WMV8 is the preferred Windows Media codec, some folks still use MS MPEG-4v3, since it is the only codec available in the current Windows Media Encoders that works with the standard version of Windows Media Player 6.4. WMP 6.4 is the last version that's compatible with Windows 95 and NT. While this doesn't significantly limit a clip's reach in the consumer space, a large number of corporate PCs are still on Windows NT or sit behind a firewall and can't download the new codec. By the time you read this, Microsoft plans to have released a new WMP 6.4 package that includes the V8 codec.
To summarize, Windows Media 8 offers the following encoding modes:
One-pass CBR. The traditional encode mode.
Two-pass CBR. A buffered VBR, and the ideal choice for MMS streaming.
One-pass VBR. This mode controls for image quality, not data rate. The data rate changes, sometimes radically, as needed to maintain image quality. I didn't evaluate WMV8's one-pass VBR mode since I evaluated image quality for given data rates.
Two-pass VBR. This full-file VBR is great for progressive download and local file playback (e.g., with the file on your computer or on a CD-ROM). Note that WMV8 two-pass VBR files don't play reliably in WMP 6.4, but play fine in WMP7+.
WMV8 results
Data Rate control. WMV8's data rates were a little unpredictable. The CBR modes were generally closer to the requested rates with clips targeted at 30kbps all having an actual data rate of 32kbps, the 200kbps-target clips running at 207kbps, and the 800kbps clips at 807kbps. Since those results are so consistent, you could easily apply a fudge factor to hit the actual target data rate.
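One simple way to apply such a fudge factor is to lower the request by the amount of the measured overshoot. The Python sketch below uses the CBR figures reported above. It rests on an untested assumption: that the overshoot stays about the same when the requested rate is lowered by a few kbps. The function names are illustrative, not part of any Microsoft tool.

```python
# Target data rate (kbps) -> measured data rate (kbps) for the CBR encodes.
CBR_RESULTS = {30: 32, 200: 207, 800: 807}


def overshoot_percent(target_kbps, actual_kbps):
    """How far the measured rate landed above the target, in percent."""
    return (actual_kbps - target_kbps) / target_kbps * 100


def adjusted_request(target_kbps, actual_kbps):
    """Requested rate that should land on target if the overshoot holds steady."""
    return target_kbps - (actual_kbps - target_kbps)


for target, actual in CBR_RESULTS.items():
    print(
        f"target {target}kbps: measured {actual}kbps "
        f"({overshoot_percent(target, actual):.1f}% over), "
        f"try requesting {adjusted_request(target, actual)}kbps"
    )
```

A second test encode at the adjusted rate is the only way to confirm the correction for a given clip.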
For VBR, the results were more variable. Clips targeted at 30kbps delivery came out between 30 and 34kbps, those targeted at 200kbps actually ran between 192 and 212kbps, and those targeted at 800kbps ran between 768 and 855kbps. Generally, the talking head or motion graphics clip had the lowest data rate, and the film-sourced clip had the highest.
Frame rate. Frame rate was rock-solid for every file. For every segment with motion, the compressed files consistently maintained the source's original frame rate. The static color bars at the beginning of the motion graphics played back at 1 fps, but this is appropriate since that section has no motion.
Performance. While WMP 7.1 will run on a Pentium 166MHz machine with 32MB RAM, Microsoft recommends a Pentium or AMD Athlon K6 266MHz processor and 64MB RAM. I found only a couple of minor playback issues. On the dual-processor 1GHz PIII, the deblocking filter didn't kick in at 800kbps, but the data rate was high enough for decent quality. Files running at lower data rates had significant post-processing. No dropped frames.
Quality. The files compressed with Windows Media V8 all looked quite good. Even the text at the end of the 30kbps film file was legible. In general, files compressed in the two-pass modes, especially two-pass VBR, show a significant, but not tremendous, improvement in image quality over clips compressed with the one-pass modes. However, some elements looked slightly worse with two-pass compression as the codec reduced bandwidth for one section in order to raise it for another. Overall, Windows Media 8 delivers very high-quality images.
RealVideo 8
With the latest version of RealVideo (www.real.com/devzone) RealNetworks, in a wonderful act of simplification, provides just a single codec that one would want to use: RealVideo 8 (RV8). While the older RealVideo G2 and G2+SVT codecs are still included, there isn't a compelling reason to use them. RealPlayer, now at version 8.5, has supported the RV8 codec since 8.0. Real's robust auto-update feature means that the vast majority of active RealVideo users already have, or will soon painlessly add, the RV8 codec.
RV8 is a solid, modern codec. It offers a number of configurable parameters such as letting you choose to emphasize image quality over frame rate. It also supports two-pass VBR encoding with a buffer size of up to 60 seconds. RealVideo isn't aimed at the CD-ROM market, and so doesn't have a progressive/local mode beyond that buffering.
The RV8 decoder offers a widely scalable post-processor that delivers increasingly better playback quality on increasingly powerful client computers. Very slow machines (under 200MHz) show just the default frames. On faster machines, a deblocking filter is applied. And on machines well above a given clip's baseline needs, RV8 can actually play the clip back at a higher frame rate than it was encoded at.
How's that? RV8 evaluates the motion vectors of each clip and creates interpolated frames. This works a lot better than you might suspect. It also makes RV8 more difficult to review since viewer experience is even more dependent than usual on the performance of their particular computer and on the content of a particular clip.
RV8 results
Data rate control. RealVideo's data rate control is extremely accurate and doesn't vary at all from the setting you request. However, the file sizes for one-pass RV8 encoded material vary somewhat. For example, the motion-graphics and high-motion samples exceed the size suggested by the reported data rate. File sizes for two-pass encoding were all quite accurate.
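A quick way to spot this kind of mismatch is to compare a file's size with what the requested data rate implies for a 60-second clip. The Python sketch below assumes 1kbps means 1,000 bits per second and ignores container overhead, so a real file will always run slightly larger than the estimate. The function names are illustrative and not part of any vendor tool.

```python
CLIP_SECONDS = 60  # every test source is precisely 60 seconds long


def expected_file_bytes(data_rate_kbps, seconds=CLIP_SECONDS):
    """File size implied by an average data rate, ignoring container overhead."""
    return data_rate_kbps * 1000 * seconds // 8


def effective_rate_kbps(file_bytes, seconds=CLIP_SECONDS):
    """Average data rate implied by a file's size on disk."""
    return file_bytes * 8 / seconds / 1000


for rate in (200, 800):
    print(f"{rate}kbps for {CLIP_SECONDS}s: about {expected_file_bytes(rate)} bytes")
```

Running effective_rate_kbps on the size of an encoded file gives the rate the clip actually averaged, which can then be compared with the rate the encoder reported.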
Frame rate. Predicted frame rates were generally quite accurate. The high-motion sample dropped its frame rate below the target in some sections, but not in an annoying way. During playback of the 30kbps files the reported frame rate frequently exceeded the encoding frame rate due to post-processing interpolation.
Performance. All files played fine on their target machines.
Quality. Overall, RealVideo 8-encoded files had good quality, but not as good as files encoded with Windows Media 8. In particular, RV8 files tended to have an overall softness, and exhibited some blockiness around circles and diagonal lines, as seen in the film-source sequence with the moiré patterns around the plane's engines.
RV8's frame rate interpolation caused trouble with some 30kbps clips. For example, in the film-source sequence's credit roll, blocks of moving text ran out of sequence. The variable frame rate also produced distracting results in sources that had constant motion. RealNetworks should offer an option for users to disable this feature.
As expected, RV8's two-pass mode provided better overall quality than its one-pass encoding, although a few areas looked worse. As buffer sizes increase, the image-quality advantage of two-pass encoded files increases.
QuickTime 5
QuickTime has always provided the broadest selection of codecs for the broadest range of tasks. I won't even try to give a complete list of the codecs currently available. Since QuickTime, unlike Windows Media and RealVideo, is used in content production as well as content distribution, many of those codecs don't concern us here. There's another reason why there are so many QuickTime codecs.
QuickTime makes it very easy for third-party developers to create and distribute codecs that automatically work with existing applications. Apple has been helping with the distribution effort by opening its codec auto-update functionality to certain third-party codecs. If a viewer plays a file that requires a new codec, QuickTime queries a server at Apple to determine if a downloadable codec (or just decompressor) is available. If so, the user is asked for permission to download. If they click OK, the codec is installed and the file plays without having to reboot the computer or even restart QuickTime Player. It's slick.
Thus, a healthy codec marketplace has developed around QuickTime.
The first three third-party codecs included in Apple's auto-update system are Sorenson Video 3, On2's VP3, and Media Metastasis's ZyGoVideo, with more in the works.
Conventional wisdom states that QuickTime is an excellent technology for progressive download and local files, but not for RTSP streamed video. With QuickTime 5 and QuickTime/Darwin Streaming Server 3 (QTSS3), this is much less true today. QTSS3 includes skip protection technology that dramatically improves the end user experience.
Skip protection lets the client-side buffer get extremely long, so if a user has a connection speed higher than a clip's data rate, QuickTime will buffer as much of the file as possible. This means you can even unplug the computer from the network for a while, and the video will keep playing until it hits the last frame in the buffer. QTSS3 also will retransmit lost packets, which makes the prevalence of glitches in the video much lower. Lastly, QTSS3 can dynamically drop B-frames to reduce frame and data rate when streaming MPEG-1 and Sorenson Video 3 (more on that later).
Some QuickTime codecs include a native packetizer for RTSP streaming. A native packetizer lets a codec deal with corrupt or missing data more intelligently than just displaying a white screen. This formerly critical feature for RTSP streaming is less important with QTSS3's skip protection, but it is still a nice feature to have. The only QuickTime codecs with native packetizers right now are Sorenson Video 2 and 3, and H.263. Native packetizers don't matter at all for progressive or local movies. Now on to the codecs.
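As a rough illustration of how much media skip protection could bank, the Python sketch below estimates the seconds of unplayed video sitting in the client buffer. It is a simplified model, not Apple's actual algorithm: it assumes playback starts immediately, the server sends as fast as the connection allows, and there is no packet overhead or loss.

```python
def buffered_seconds(connection_kbps, clip_kbps, elapsed_seconds, clip_seconds):
    """Seconds of received but not yet played media after elapsed_seconds."""
    # Media received so far, measured in seconds of playback, capped at clip length.
    received = min(clip_seconds, elapsed_seconds * connection_kbps / clip_kbps)
    played = min(elapsed_seconds, clip_seconds)
    return max(0.0, received - played)


# A 200kbps clip, 60 seconds long, watched over a 300kbps connection.
for elapsed in (10, 20, 40):
    ahead = buffered_seconds(300, 200, elapsed, 60)
    print(f"after {elapsed}s of playback: {ahead:.0f}s of video in reserve")
```

In this model, 20 seconds into the clip the player holds 10 seconds in reserve, which is how long playback could survive an unplugged network cable.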
Sorenson Video 3
Long-hyped and long-delayed, but finally shipping, Sorenson Video 3 (SV3) is Apple's default codec for QuickTime 5 Web video. Based on a technology completely different than the previous versions of Sorenson Video, SV3 still offers many of the same options and features as its predecessors. However, the underlying architecture is superior, bringing SV3 into the top ranks of modern codecs. The $499 Professional Edition is easily the most fully featured codec available for any architecture.
SV3 playback is built into QuickTime 5.0.2, and any version of QuickTime 5 will auto-update with the SV3 decompressor when needed. QuickTime 5.0.2 also includes a Standard version of the SV3 compressor. SV3 Standard provides a good suite of default options and is much better for CD-ROM and progressive content than the old SV2 Basic version. However, anyone doing professional QuickTime compression should buy the Professional Edition, especially if they are doing RTSP streaming.
SV3 Standard offers the normal QuickTime options for setting data, frame, and keyframe rate. The options available in the Professional version are preset and not user definable. But unlike previous free versions of Sorenson, this version isn't dog slow at compressing, and it defaults to image smoothing on. SV3 Standard is a viable option for consumers authoring CD-ROMs and progressive Web content.
The Sorenson Video 3 Professional Edition options are legion. The most important for compression is Bidirectional Prediction, enabling B-frames that reduce data rates around 15 to 25 percent at a given quality. Also, QTSS3 can dynamically drop a few B-frames during streaming if there isn't sufficient bandwidth for all of them. This feature will drop the frame rate by half, instead of dropping all the way down to just keyframes.
One drawback to Bidirectional Prediction is that it moves video playback one frame back in time. This can cause audio sync to be off, and the lower the frame rate, the greater the lag. So with SV3, if the target frame rate is below 10, this option is automatically deactivated during encoding.
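The size of that lag follows directly from the frame rate, since one frame lasts 1/fps seconds. The short Python sketch below shows the delay at a few frame rates; the function name is illustrative.

```python
def one_frame_delay_ms(fps):
    """Duration of a single frame, in milliseconds."""
    return 1000 / fps


for fps in (30, 24, 15, 10):
    print(f"{fps}fps: one-frame delay is {one_frame_delay_ms(fps):.1f} ms")
```

At 30fps the shift is about 33 milliseconds, but at 10fps it reaches a tenth of a second.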
The Temporal Scalability option lets B-frames get dropped when a CPU is getting overwhelmed during playback. This feature requires Bidirectional Prediction, so if you're using that, you might as well turn Temporal Scalability on. Unlike its implementation in SV2, Temporal Scalability imparts no quality or performance hit.
Force Block Refresh lets you determine a maximum length of time between refreshes of each 16x16-pixel block of video. This option enables the frame of video to regenerate within a set number of seconds if a packet was dropped or a stream is tuned in just after a keyframe already played. The option uses a little extra bandwidth for low-motion content, and has no effect on high-motion content where every block is likely to change frequently. I suggest using the option's defaults with media destined for RTSP delivery, and turning the option off for everything else. I made our files with this option off.
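To get a feel for what a refresh deadline costs, the Python sketch below counts the 16x16 blocks in a frame and estimates how many must be refreshed per frame to cover the whole picture in time. The even, round-robin schedule is purely an assumption for illustration; how Sorenson actually schedules refreshes is not documented here.

```python
import math


def macroblock_count(width, height, block=16):
    """Number of block-by-block tiles needed to cover the frame."""
    return math.ceil(width / block) * math.ceil(height / block)


def blocks_to_refresh_per_frame(width, height, fps, max_seconds, block=16):
    """Blocks to refresh each frame so every block is refreshed within max_seconds."""
    frames_available = max(1, math.floor(fps * max_seconds))
    return math.ceil(macroblock_count(width, height, block) / frames_available)


print(macroblock_count(320, 240))                        # blocks in a 320x240 frame
print(blocks_to_refresh_per_frame(320, 240, 15, 5))      # 15fps, 5-second deadline
print(blocks_to_refresh_per_frame(640, 480, 30, 10))     # 30fps, 10-second deadline
```

A 320x240 frame holds 300 blocks, so under this model a 5-second deadline at 15fps means refreshing 4 blocks in every frame.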
Other SV3 Professional encoding features include a Quick Compression mode, configurable Automatic Keyframes, a Minimum Quality frame dropping feature, the ability to disable the Image Smoothing deblocking filter, Media Key passwords, and real-time alpha channeling and watermarking. Unlike SV2, SV3 supports color watermarks.
Another important feature of the Professional edition is two-pass VBR encoding. This option is currently only available when SV3 Professional is used in conjunction with Media 100 Cleaner 5.1 (let me disclose that I used to work for Media 100). Currently, Cleaner 5.1 doesn't let you set a buffer size; instead, it distributes the data rate over the entire file. It also lets you set a peak data rate to keep complex sections of a clip from overloading the processor during playback. With Cleaner 5.1, SV3's VBR mode is optimal for progressive and local content, but should not be used for RTSP content. This is a limitation of Cleaner, not of the codec or QuickTime. I hope Media 100 and other companies developing tools to compress SV3 will soon add support for a configurable buffer size.
While SV3 playback requires more from a CPU than SV2 did, SV3 does offer a couple of ways to reduce its load. First, in both the Standard and Professional versions, image smoothing gets automatically turned off during playback on slower machines at certain resolutions. It's never applied on pre-G3 Macs and pre-Pentium II PCs, is turned off below 320x240 pixels on G3 and PII machines, and so on.
Second, for high data-rate content, the Temporal Scalability option can cut CPU requirements roughly in half by dropping half the frames on slower machines.
With all these options, I had to make some choices when compressing our source files with Sorenson Video 3. I turned off Force Block Refresh, mirroring having Error Correction off in RealVideo 8. I set the two-pass VBR peak limits to 300kbps for 30kbps delivery, 800kbps for 200kbps, and 1600kbps for 800kbps. Without limits, peaks in the 800kbps high-motion file reached above 3000kbps.
I set the Minimum Quality option at 0 and allowed drop frames to maintain data rate, in order to better hit data rate targets.
SV3 results
Data rate control. For the more difficult clips, SV3 had trouble hitting the 800kbps target data rate even with frame dropping activated. The one-pass clips produced much more data rate variability than those processed with two-pass encoding. Oddly, clips at lower data rates were within a few percent of the target data rate. Two-pass encoding produced clips with on-target data rates, except with the high-motion source where the encoded clips tended towards a too-high data rate.
Frame rate. With one-pass encoding, the frame rates of the encoded high-motion and motion-graphics clips dropped by 10 to 30 percent compared to the source. Frame rate in two-pass mode was always perfect.
Performance. All machines were able to play all clips, if a peak data rate was set. Without a set Peak value, even the dual-processor Pentium III system had trouble playing the 800kbps 640x480 SV3 video clips at full frame rate. Setting the Peak to 1600kbps for those clips eliminated the problem. The moral of this story: always set a peak data rate when encoding high data rate content with SV3.
Quality. SV3 automatically turns off its Image Smoothing filter for content delivered at 640x480 pixels, so the high data-rate content can look blocky in very difficult shots.
SV3's two-pass VBR mode significantly improves overall quality. Hopefully a future version of either Cleaner or Sorenson's own upcoming Squeeze compression tool will implement a buffered VBR mode that will enhance the value of SV3 for RTSP streaming.
Overall, Sorenson Video 3 provides excellent quality, better than RealVideo 8 and on par with Windows Media 8.
H.263
H.263 is an established and widely-used videoconferencing standard. The MPEG-4 video codec is essentially an enhanced H.263. Compared to SV2 it handled fast motion much better, and encoded much faster, making it the default QuickTime codec for live broadband streaming. But SV3 encodes quite a bit faster and offers higher quality than H.263.
However, since H.263 is a standard, several non-QuickTime players can view streams encoded with it. H.263 is internally limited to 352x288-pixel resolution, so encoding at higher resolutions causes it to scale up internally and inelegantly, rather limiting its quality.
One tip: If you use H.263 for RTSP streaming, turn on the Cycle Intra Macroblocks option. This option achieves the same effect as Force Block Refresh, helping frames replace missing macroblocks without having to wait for the next keyframe.
Since H.263 has a maximum resolution of 352x288 pixels, I didn't test it at 800kbps.
H.263 results
Data rate control. Data rate control in H.263 is always dead-on, a legacy of the tight data rates needed in videoconferencing, H.263's heritage.
Frame rate. While H.263 reports that its output frame rate is equal to the input, visually it clearly drops substantial numbers of frames in difficult sections.
Performance. H.263 decodes fine on all target machines, though note that I didn't create any 800kbps H.263 files.
Quality. While H.263 was a contender against SV2, especially SV2 Basic, SV3 is clearly superior across the board. H.263's soft images and dropped frames simply aren't competitive. Even the simple talking head file looks rather terrible at 30kbps.
VP3
VP3 from On2 Technologies, formerly The Duck Corporation (www.on2.com), is a close second to Sorenson Video 3 in the QuickTime buzz contest. While not part of QuickTime's standard 5.0.2 install, VP3 was the first codec available through the auto-update feature. If you have yet to come across any VP3-encoded material and trigger the codec's automatic download, you can also manually start the auto-update with the QuickTime Updater application, which fetches both the encoder and the decoder.
On2 also offers a plug-in that lets VP3-encoded content play back in RealPlayer, and plans to offer by the time you read this a version of VP3 that will work with Video for Windows-compliant applications. At our deadline, On2 announced plans to release an Open Source version of VP3.2. They also announced a licensing deal with RealNetworks for VP4 (see More Codecs on page XX).
Note that just having the encoder doesn't give you redistribution rights; those cost $29.95 for personal use and $1995 for commercial use.
VP3 has a deblocking filter that improves image quality when running with sufficient power on a Mac G4 or Windows PIII, but not on a G3, PII or earlier CPUs. Note that while On2's browser plug-in version of the VP3 technology only drew every other line during scaled-up playback, the QuickTime version handles scaling properly without any black lines.
Encoding with VP3 is similar to most QuickTime codecs, but with a few special options. It defaults to the Quick Compress mode, which entails less of a quality hit than most other Quick Compress modes, and speeds up encoding quite a bit. You can specify whether or not the codec drops frames during playback to maintain image quality and data rate. And VP3 has a pretty powerful auto-keyframe mode.
For most content, VP3's default options work well, though I tend to set the Maximum Keyframe Distance setting to ten times the frame rate. If you use this (and I recommend you do), set QuickTime's Keyframe Every value in the main QuickTime dialog to 9999 so that QuickTime itself doesn't try to insert additional keyframes.
VP3 simply doesn't encode to 30kbps, so that delivery data rate was excluded from our tests. I used VP3's built-in keyframe detection system, and disabled QuickTime's.
VP3 results
Data rate control. At 200kbps and 800kbps, final data rates were close to the targets, with only the high-motion clip being significantly higher than the target data rate. In some cases, final data rates were lower than requested. However, if I didn't enable frame dropping, the high-motion file size doubled, and others increased significantly.
Frame rate. At 200 and 800kbps, frame rates were accurate for the talking head and film clips, down by 10 percent for motion graphics, and down by 25 percent for the high-motion clip.
Performance. Performance is VP3's biggest strength. Content encoded with VP3 at 640x480 pixels can play back on machines of middling speed.
Quality. While VP3's quality isn't great at 200kbps, it comes into its own at 800kbps, even if you keep the deblocking filter turned off. Compared to SV3, VP3 offers better quality with some content, and a much more efficient decoder, allowing bigger movies to play on slower machines. However, VP3 lacks a two-pass VBR mode and a native packetizer. In all, VP3 is a good alternative to SV3 for high-resolution broadband content.
ZyGoVideo
The ZyGoVideo codec from Media Metastasis (www.zygovideo.com) is based on wavelets, the technology behind the Indeo 5 and JPEG2000 codecs. Wavelets offer a lot of promising features for Web video, such as arbitrary scaling and dynamic thinning of quality bands to reduce bandwidth or CPU requirements. The current builds of ZyGoVideo don't support these features yet, although the company is looking into adding the functionality in a future release. ZyGoVideo is also developing a playback codec for handheld computers based on the PalmOS.
ZyGoVideo currently offers a free basic and a $49 (introductory price) Pro version. The Pro version adds a few encoding options and several behind-the-scenes quality improvements. The extra options are configurable keyframe sensitivity, configurable motion detection, and image smoothing. Increasing the motion detection value improves quality somewhat, but dramatically increases encode time and increases decode requirements about 10 percent.
The codec keeps tight control of its data rate. Because of this, keyframes are often starved for bandwidth, causing the video to get blocky at keyframes, and then regain quality over the next few delta frames. This is an odd design decision since the codec doesn't have a native packetizer, making the current version of ZyGoVideo unsuitable for tasks other than progressive download and local playback, where a loose data rate would be fine. The quality slider in the main QuickTime dialog has an effect on ZyGoVideo keyframe size, so turning it up when you have trouble with blocky keyframes substantially helps overall quality.
For the two lower data rates, I turned on the smoothing filter and set Motion Quality to maximum. For the 800kbps files, I set smoothing off and high motion on, since most CPUs couldn't support both options at that data rate. I raised spatial quality from the default of 50 to 100 in order to get keyframes large enough to avoid a strobing effect.
ZyGoVideo results
Data rate control. ZyGoVideo had poor data rate control. The motion graphics files were anywhere from 30 to 100 percent above the target rate. The high motion files were also above target, but not by as much. The talking head files often fell below target data rate, even though their quality was still limited. These results suggest that ZyGoVideo would benefit from a frame-dropping mode to help it hit target data rates.
Frame rate. The codec delivered encoded files with exactly the frame rate that was assigned at compression.
Performance. Even the dual Pentium III computer had trouble playing back all frames evenly on the 800kbps files encoded with ZyGoVideo. Alas, the lack of a frame-dropping mode inhibited smooth performance. QuickTime Player also crashed several times when trying to randomly access various points inside the 800kbps files.
Quality. In its current iteration ZyGoVideo is not competitive. Even with smoothing turned on, an underlying block structure was still quite visible. Some streaks were left behind when playing back the high-motion files.
Well, that's what I found. But a big part of choosing the right codec for you is deciding which one you think looks best. All of the test files created for this article can be viewed on DV.com.
Ben Waggoner
Copyright 2002, CMP Media LLC