The Annoyance of Video Compression
This is probably going to be a bit of a rant. I’ve spent the last few hours trying to get TVersity 1.0 RC1 to properly read and transcode video I captured using a Plextor ConvertX and InterVideo WinDVD Creator 2. It’s pretty straightforward video: it captures at 720×480 and compresses using DivX 6 (6.8 is installed on the machine), but annoyingly enough it doesn’t use MP3 for the audio, but rather MP2. Which, I suppose, is understandable. Unfortunately, it’s not the easiest format to work with. There are no free VfW decoders for MP2 audio, due to licensing restrictions, which means tools like VirtualDub can’t convert the audio to something more friendly.
Where the fighting comes in is that the digital media player here, the D-Link DSM-320, which is otherwise pretty nice, does not support MP2 audio in an AVI, even though the profile TVersity has for the device thinks it does. (The profile specifies what can be played natively and what needs to be transcoded first, which is what enables a much wider range of videos to be played.) The result: audio that’s nothing but skipping static, and lots of it. Eek. Luckily, a simple comment in the profile.xml file for the device, telling the software that MP2 in an AVI doesn’t work, is enough, and the transcoder now kicks in to handle the conversion to a format the player can actually handle.
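If you want to know ahead of time whether an AVI is going to hit this problem, you can check the audio format tag in the file’s header yourself. What follows is only a minimal sketch, not anything TVersity does: the function names are made up for illustration, it assumes a plain RIFF AVI whose header list sits at the start of the file, and the tag values are the standard WAVEFORMATEX ones (0x0050 for MPEG Layer I/II, 0x0055 for MP3).

```python
import struct

# Standard WAVEFORMATEX wFormatTag values; anything else is reported as a raw number.
AUDIO_TAGS = {0x0001: "PCM", 0x0050: "MPEG Layer I/II (MP2)", 0x0055: "MP3"}


def iter_chunks(data, start, end):
    """Yield (chunk_id, payload_start, payload_end) for RIFF chunks in data[start:end]."""
    pos = start
    while pos + 8 <= end:
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        body = pos + 8
        yield chunk_id, body, min(body + size, end)
        pos = body + size + (size & 1)  # chunk payloads are padded to an even length


def avi_audio_tags(data):
    """Return the wFormatTag of each audio stream declared in an AVI header."""
    if data[:4] != b"RIFF" or data[8:12] != b"AVI ":
        raise ValueError("not an AVI file")
    tags = []

    def walk(start, end):
        stream_type = None
        for chunk_id, s, e in iter_chunks(data, start, end):
            if chunk_id == b"LIST":
                # Only descend into the header lists; the 'movi' data list is skipped.
                if data[s:s + 4] in (b"hdrl", b"strl"):
                    walk(s + 4, e)
            elif chunk_id == b"strh":
                stream_type = data[s:s + 4]  # b"vids" or b"auds"
            elif chunk_id == b"strf" and stream_type == b"auds":
                tags.append(struct.unpack_from("<H", data, s)[0])

    walk(12, len(data))
    return tags


def describe(tag):
    """Human-readable name for a format tag."""
    return AUDIO_TAGS.get(tag, "unknown (0x%04x)" % tag)
```

Reading the first 64 KB or so of the file and passing it to avi_audio_tags should be enough for typical captures, since the stream headers come before the actual video data.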
The transcoding feature, while very nice when it works, can be an absolute headache to get working properly with a variety of formats. Anybody who’s saved videos from the web or captured stuff over a long enough time has no doubt collected quite a mess of different formats, even if there only appear to be a few types. (The container types are nice and misleading that way; they make you think there are only a few.) The common ones nowadays are FLV, WMV, MOV, and AVI, with RM (and Ogg) in there somewhere, and that leaves out MPEG, MP4, M4V, and the mountain of extensions those use. On top of that, if somebody hands you a .mov, is it H.264? MJPEG? Or something else? If you’re just using QuickTime, it’s not a big deal, since it will play any type the file can be, but then you end up with a pile of players installed on your system fighting over file types, since they all want to be the one true player, yet each fails abysmally in its own subtle way.
As you’d expect, though, not everything can just go through the one true player for playback, and TVersity’s transcoder is one of those things. On Windows, which is the only platform it supports for now, it uses DirectShow to read the videos, so any video you can play in Windows Media Player (or any DirectShow-capable player) can be read and played back to your device.
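To make the container-versus-codec point concrete, here’s a rough sketch that guesses the container from the first few bytes of a file rather than trusting the extension. The function name is my own invention and the signature list is far from complete; it only covers the well-known magic bytes for the containers mentioned above. Note that even a correct answer here tells you nothing about which compressor is inside.

```python
def sniff_container(header: bytes) -> str:
    """Guess the container format from the first dozen or so bytes of a file."""
    if header[:4] == b"RIFF" and header[8:12] == b"AVI ":
        return "AVI"
    if header[:3] == b"FLV":
        return "FLV"
    if header[:4] == b"OggS":
        return "Ogg"
    if header[:4] == b".RMF":
        return "RealMedia"
    if header[:4] == bytes.fromhex("3026b275"):
        # First four bytes of the ASF header GUID, used by WMV and WMA files.
        return "ASF (WMV/WMA)"
    if header[4:8] == b"ftyp":
        # The brand after 'ftyp' separates QuickTime from the MP4 family.
        return "QuickTime MOV" if header[8:12] == b"qt  " else "MP4 family"
    if header[4:8] in (b"moov", b"mdat", b"wide", b"free"):
        # Older QuickTime files have no 'ftyp' box at all.
        return "QuickTime MOV"
    if header[:4] == b"\x00\x00\x01\xba":
        return "MPEG program stream"
    return "unknown"
```

Feeding it the first 16 bytes of a file is enough for every check above.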
So this brings up the obvious problem: how do you get DirectShow to handle the popular formats? Well, let’s start with a (hopefully) simple explanation of how video is played back…
The image above is the graph of a YouTube Flash video as DirectShow would open it. First, the container is opened by the appropriate filter, in this case the “FLV Splitter”, flvsplitter.ax, which as its name implies “splits” the video into its component parts. Those are then read to determine what compressor was used (for YouTube, this is currently H.263 for video with MP3 audio), the appropriate decompressors are selected, and the decompressed data is passed to your sound card (“Default DirectSound Device”) and player screen (“Video Renderer”). In my case, both H.263 video and MP3 audio are handled by ffdshow, but a variety of DirectShow filters exist.
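If it helps, the same flow can be written down as a toy model. To be clear, this is not the DirectShow API, just plain Python dictionaries standing in for “which filters are installed”; the table contents and the build_graph name are assumptions for illustration. It does show where things go wrong, though: a missing splitter or a missing decoder each breaks the chain at a different step.

```python
# Toy model only: these tables stand in for the filters installed on a machine.
SPLITTERS = {".flv": "FLV Splitter", ".mp4": "MP4 Splitter", ".mov": "MP4 Splitter"}
DECODERS = {"H.263": "ffdshow video decoder", "MP3": "ffdshow audio decoder"}
RENDERERS = {"video": "Video Renderer", "audio": "Default DirectSound Device"}


def build_graph(extension, streams):
    """Return one (splitter, decoder, renderer) chain per stream.

    streams maps "video" / "audio" to the codec name the splitter reports.
    """
    splitter = SPLITTERS.get(extension)
    if splitter is None:
        raise LookupError("no splitter installed for %s" % extension)
    graph = []
    for kind, codec in streams.items():
        decoder = DECODERS.get(codec)
        if decoder is None:
            raise LookupError("no decoder installed for %s (%s)" % (codec, kind))
        graph.append((splitter, decoder, RENDERERS[kind]))
    return graph
```

Calling build_graph(".flv", {"video": "H.263", "audio": "MP3"}) gives the two chains from the picture, while asking for a codec that isn’t in the table raises an error that at least names the missing piece.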
So, if everything goes well, the FLV you saved will now play back in all its glory for you. For QuickTime, using mp4splitter.ax (“MP4 Splitter”) will usually work, since the MP4 container format is derived from QuickTime’s and the two are closely related. Some QuickTime files seem to need more QuickTime components available, though, so installing QuickTime is sometimes the only solution. The Media Player Classic SourceForge page includes most of these filters, which are really handy. After you get and install the filters you need, you’ll want to install ffdshow, which is a very nice set of decompressors/compressors for a variety of media types (including all the FLV types: H.263, VP6, and H.264).
Now for the rant… Why is all this so hard? Why, to make use of a fairly simple device, do I need to install a crapload of filters and decompressors to transcode video that the ‘media server’ can’t even handle? I know I have an above-average variety of videos locally to deal with: some compressors like Indeo and Cinepak and all the MPEG variants, as well as the average QuickTime, Flash, Real, and DivX videos. I don’t expect the older videos or the more obscure compressors to work, but it’s not too much to ask for a real error message when they fail, perhaps with some indication of what’s wrong. Video codecs have long confused users of all skill levels. People just want their computer to work, not to spend hours trying to figure out which combination of codec and container in the matrix they missed installing software for.
If you send Grandma a video of your cat playing with a ball of yarn, you shouldn’t have to think about whether she has QuickTime (which your digital camera was so nice to preselect for you) or not. That’s one of the reasons YouTube and similar sites have caught on: yes, they use Flash, but most systems have it today, it’s easy, and you know it’ll play on the other end. And surprise, now that it’s become easy to share video without the headaches, millions of people have chosen to do so. It’s really a great thing. It does come with its own problems, though: a lot of people have difficulty saving or playing back web video later, or don’t know how, and content authors probably often think their video will only stay on sites like YouTube. (That’s speculation, but it makes sense to me anyway.)
It’s already been proven that the various organizations and companies that develop these compressors, decompressors, and players have no intention of making that world any easier on the rest of us. Otherwise the various platforms would have standard ways (like DirectShow), and players would actually provide the filters to enable their formats to play, rather than trying to lock users in to their own players…
Now, here comes more fun, though it’s only tangentially related to the other paragraphs in this lengthy and wordy post.
Now that video is the web’s next big thing (thanks to YouTube and Flash Player), the slow but steady standardization people have decided that what we really need in the next version of HTML is a <video> tag, because we really need to be rescued from the evils of Flash-based video. These tags would enable video to be played directly in the browser, which, I suppose, is useful. Unfortunately, there’s no guarantee with these tags that the video format you want to use is available in the user’s browser. For obvious reasons, standards people want to use Ogg Theora and Vorbis (to which most of the rest of the web goes “what’s that?”, too) as a baseline of support, and if nobody uses them, you’re stuck with the bloat of having the code to support them.
There is a reason all that mess above, with hundreds of annoying formats and compressors, exists. Technology advances, and there are plenty of patents on this type of technology, since the patent fees paid for licensing these compressors in media players and the like cover the R&D costs. Of course, the mainstream web is likely heading towards H.264 (which is being used for HD set-top boxes, Flash, Blu-ray discs, mobile phones like the iPhone, and a lot more) and MP3 (which at this point is basically a household name, and probably one of the only formats to achieve such ubiquity in society). Those are much more common and make much more sense from a pure-logic standpoint, but have licensing concerns for open/free specifications. The result of this back-and-forth between the parties is a specification that currently, afaik, guarantees the author nothing, as there’s no baseline. It’s unfinished, though, so hope remains. But even then, each browser that implements it has the ability to implement additional types, via DirectShow or whatever, so it just brings the incompatible video world right back into the user’s face.
Oh, and that’s assuming that all the major browsers decide to implement support… Now, I don’t think that just leaving it to Adobe, with its Flash Player installed on 97% of browsers, to decide which video types should succeed and fail is a good thing, but what exists now does work, as far as web video is concerned. (Where “work”, to me, means that the average user can post a silly video from their cell phone or whatever and share it without the technology getting in the way.) What gets standardized *must* be that easy, or it’s not worth doing. 10+ years of a completely painful video experience was enough.
The next versions of Opera, Safari, and Firefox will have the open-format, license-free, patent-unrestricted Ogg/Theora built into the browser, to support the HTML5 <video> tag. This should resolve many of your headaches, and help promote an open specification standard. You can find long lists of web sites using Ogg/Vorbis and Ogg/Theora on the Xiph.org wiki.