Digital Video: Overcoming Poor Quality With Compression
BY NANCY MIRACLE
Let's face it: digital video conferencing isn't going to become the
norm in the business world until the image on the computer monitor is as good as what
people see on their home TV. In a world where television and video games abound, we have
become accustomed to seeing a clear, sharp picture. Visual information is attractive and
is often the most effective means of conveying extremely complex information and exact
shades of meaning. Why, then, hasn't the digital video conferencing business been
exploding?
Research shows that people assimilate visual information more quickly than audio or
written data. The human brain is capable of transforming complex visual images into
meaningful information at very high speeds. This ability may originally have had a clear
evolutionary benefit (when applied, for instance, to such problems as the recognition and
evasion of fast-moving predators).
Information professionals are increasingly being forced to deal with a new problem: How
to deliver even more information to users who are already overwhelmed with data. Most
system implementers have had the unpleasant experience of installing a system or service
that "should have been" extremely useful, only to have it go unused, often for
largely inexplicable reasons.
The use of digital video in business computer applications has had surprisingly little
success, at least until driven by the explosion of graphics on the Internet. Although
widely discussed, the deployment of multimedia data in the business environment has been
significantly hampered by problems with the attractiveness of the video. This is a result
of the high cost of delivering large data streams.
DEALING WITH DATA SIZE
A textual data screen of 80 columns by 24 lines, refreshed on demand approximately once
per minute, generates a data stream of 256 bits per second (0.000256 Mbps). In comparison,
uncompressed audio data requires approximately 64 Kbps (0.064 Mbps).
Both of these streams pale next to video data, which is extremely bulky. Simple NTSC
(the U.S. commercial television standard) or PAL (the European commercial television
standard) images, shown at standard display rates, generate data streams on the order of
100 Mbps. A raw combined audio/video stream therefore requires about 400,000 times the
bandwidth of the text data stream.
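The arithmetic behind these figures can be checked with a short Python sketch. It assumes 8 bits per character, which the article does not state, and it takes the audio and video rates as the approximate values quoted above.

```python
# Back-of-the-envelope data rates from the figures quoted in the text.
TEXT_COLUMNS = 80
TEXT_LINES = 24
BITS_PER_CHAR = 8          # assumption: one 8-bit byte per character
REFRESH_INTERVAL_S = 60    # one full-screen refresh per minute

# Bits in one screen, spread over the refresh interval.
text_bps = TEXT_COLUMNS * TEXT_LINES * BITS_PER_CHAR / REFRESH_INTERVAL_S

audio_bps = 64_000         # uncompressed audio, as quoted
video_bps = 100_000_000    # raw NTSC/PAL video, order of magnitude as quoted

# How many text streams fit in one raw audio/video stream.
ratio = (audio_bps + video_bps) / text_bps

print(f"text stream: {text_bps:.0f} bit/s")
print(f"audio+video is about {ratio:,.0f} times the text stream")
```

The result is roughly 390,000, which rounds to the "about 400,000 times" figure.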
REDUCING BANDWIDTH REQUIRED
The cost of transmission increases directly with the amount of bandwidth required. Thus,
one barrier to the deployment of video systems has been the amount of bandwidth that is
needed to transmit real-time video.
The designer of video transmission systems is helped by the fact that video streams,
although very bulky, often contain large amounts of redundant data. In the business
context, much visual data is "quiet," with monochrome backgrounds. Furthermore,
there is often little difference between images. In an image of a person talking, the
amount of change between frames is typically limited to the area surrounding the head.
Similarities both within and between images can be exploited to reduce the amount of
bandwidth required for transmission. This process is called data compression (see sidebar
entitled "Six Steps of Data Compression").
In addition to having different results on the final decoded image, each step requires
a certain amount of computational resources. As a general rule, the more computer power
that can be applied to the tasks above, the smaller the resulting data stream. It is the
job of the designer of the video transmission system to select the combination of data
compression steps to provide an image of the desired quality. The amount of computing
power required to create the image is a trade-off against the amount of bandwidth that is
available during the transmission.
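The similarity between successive frames can be illustrated with a minimal Python sketch that sends only the pixels that changed. This is a simplified illustration, not the method of any particular standard; the function names and the tiny sample frames are hypothetical.

```python
def changed_pixels(prev_frame, next_frame, threshold=0):
    """Return (row, col, value) for each pixel that differs by more than threshold."""
    changes = []
    for r, (prev_row, next_row) in enumerate(zip(prev_frame, next_frame)):
        for c, (p, n) in enumerate(zip(prev_row, next_row)):
            if abs(n - p) > threshold:
                changes.append((r, c, n))
    return changes


def apply_changes(prev_frame, changes):
    """Rebuild the next frame from the previous frame plus the change list."""
    frame = [row[:] for row in prev_frame]
    for r, c, value in changes:
        frame[r][c] = value
    return frame


# Hypothetical 4x6 grayscale frames: a flat background with a small moving region.
frame_a = [[200] * 6 for _ in range(4)]
frame_b = [row[:] for row in frame_a]
frame_b[1][2] = 90
frame_b[1][3] = 95
frame_b[2][2] = 92

delta = changed_pixels(frame_a, frame_b)   # what the sender would transmit
rebuilt = apply_changes(frame_a, delta)    # what the receiver would reconstruct

print(len(delta), "of", 4 * 6, "pixels changed")
print("reconstruction matches:", rebuilt == frame_b)
```

Raising `threshold` above zero discards small changes, which shrinks the data sent at the cost of an inexact reconstruction. This is one small instance of the computation-versus-bandwidth trade-off described above.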
ADVANCES IN COMPRESSION
Initially driven by military requirements for the efficient transmission of extremely
detailed data from satellite sources, research in signal processing has focused
extensively on the mathematical algorithms used for sampling and compression. One of the
most promising developments of the 1980s was the application to video compression of the
wavelet transform and its discrete-time cousin, filter banks, which provide sub-band coding.
Wavelet transforms operate on a continuous pixel stream, rather than on macro blocks.
This results in a tangible difference between the algorithms that utilize discrete cosine
transforms with macro blocks and those that use wavelet transforms. Because of the human
ability to recognize patterns, viewers are very sensitive to defects that involve
repetitive horizontal or vertical elements. Defects from algorithms that use macro blocks
are strongly geometrical and are easily perceived by the user. At best, viewers find them
distracting. At worst, they make the image almost impossible to view. In contrast, the
continuous pixel compression technique used during wavelet compression causes these images
to look soft and muted at higher compression rates, but image detail is not lost and
transmission defects are less noticeable to the viewer.
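The averaging-and-differencing idea underlying wavelet transforms can be sketched with one level of the Haar transform, the simplest wavelet, applied to a single row of pixels. This is an illustrative sketch only: practical wavelet codecs use longer filters, several decomposition levels, and two dimensions, and the sample row and threshold here are hypothetical.

```python
def haar_step(signal):
    """One level of the unnormalized Haar transform on an even-length list.

    Returns (averages, details): pairwise means and pairwise half-differences.
    """
    pairs = list(zip(signal[0::2], signal[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details


def haar_inverse(averages, details):
    """Invert haar_step: each (average, detail) pair yields two samples."""
    signal = []
    for avg, det in zip(averages, details):
        signal.extend([avg + det, avg - det])
    return signal


# Hypothetical row of pixel intensities: a gentle ramp with one sharp edge.
row = [10, 12, 14, 16, 18, 80, 82, 84]

averages, details = haar_step(row)

# Crude lossy step: zero out detail coefficients with small magnitude.
kept = [d if abs(d) > 2 else 0 for d in details]
approx = haar_inverse(averages, kept)

print("averages:", averages)
print("details: ", details)
print("approx:  ", approx)
```

In this sample only one detail coefficient is large, the one at the sharp edge. Dropping the small ones flattens the gentle ramp slightly while the edge survives intact.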
Even more significantly, wavelet transforms achieve compression ratios that are
impossible with other types of compression. Data can be compressed at 100:1 rates with
virtually no visible artifacting, and at rates of up to 300:1 while still providing an acceptable
QCIF picture. QCIF (Quarter Common Intermediate Format) is part of the ITU-T's H.261
standard for video conferencing.
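To give a sense of scale, the following Python sketch estimates what these ratios would mean for a QCIF stream. The frame size, bits per pixel, and frame rate are assumptions that the article does not give, and applying the 100:1 ratio to QCIF is likewise an assumption.

```python
# Assumed QCIF parameters (not stated in the article):
WIDTH, HEIGHT = 176, 144   # QCIF luminance resolution
BITS_PER_PIXEL = 12        # assumption: 8-bit samples with 4:2:0 chroma subsampling
FRAMES_PER_SECOND = 30     # assumption: nominal full frame rate

# Uncompressed bit rate for the assumed format.
raw_bps = WIDTH * HEIGHT * BITS_PER_PIXEL * FRAMES_PER_SECOND
print(f"raw QCIF: {raw_bps / 1_000_000:.1f} Mbit/s")

# Compression ratios quoted in the text.
for ratio in (100, 300):
    print(f"{ratio}:1 -> {raw_bps / ratio / 1000:.1f} kbit/s")
```

Under these assumptions the raw stream is about 9 Mbps. At 300:1 it falls to roughly 30 Kbps, less than the 64 Kbps quoted earlier for uncompressed audio.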
COMPRESSION APPLICATIONS
The compression and reconstruction of satellite image data was one of the first practical
applications of wavelet transforms to image processing. One characteristic of the wavelet
transform is that image detail is not lost as quickly as it is lost with algorithms that
operate on macro blocks. This meant that wavelet transforms were ideal for the compression
of medical image data. In medical imaging, the loss of small image details is
unacceptable. In 1997, the first commercial video communication system that used wavelet
compression became available. Today wavelet compression is being proposed as an integral
part of many of the new compression standards. The JPEG 2000 compliance committee is
already examining wavelet image compression and it is expected that the MPEG committee
will do likewise.
Only a few algorithms are used in video compression for commercial applications. Each
has been designed for a specific purpose, each uses different elements for compression, and
each has characteristic performance and defects.
CHOOSING A COMPRESSION ALGORITHM
The designer's choice of compression algorithm is significant to the information
professional for a number of reasons. Although the deployment of video in business has
been hampered by cost, it has been even more significantly hampered by poor video quality.
Historically, businesses have deployed low-bandwidth systems that use the H.261 algorithm.
Of all the commercially used compression algorithms, this delivers the poorest-quality
image. Although there are some exceptions, it has been all too common for businesses to
purchase expensive video conferencing systems that, after installation, went almost wholly
unused. Although system complexity is often cited as an issue in user acceptance, a more
common complaint is that the video quality is so poor that people simply will not use the
system.
To put this into perspective, the average person in the U.S. watches approximately
three hours of broadcast television per day. Thus, the user of business video systems is
likely to have sophisticated expectations of the image quality and be averse to watching
poor-quality visual imaging. In most cases, people will (and apparently do) use audio
conferencing in preference to suffering with poor-quality visual data. The real benefit of
wavelet compression is, therefore, that it can be used to deliver images of broadcast
quality to the end user.
Virtually all the compression algorithms require hardware assistance to operate at any
reasonable speed and leave any resources at all in the user's computer. Costs of
these products have been dropping rapidly. Systems that formerly cost over $50,000 to
install can be purchased now for approximately $1,500. LAN-based systems are typically
less expensive to install and maintain than ISDN-based systems (because they use the
existing network infrastructure), and IP-based systems are generally very inexpensive to
operate.
The barriers to video at the desktop (poor quality and high cost) are
falling rapidly. Thanks to changes in the technology that underlies video compression,
the promises of video communications (attractiveness and efficiency) are
finally on their way to being realized.
Nancy Miracle is vice president of operations at Intelect
Visual Communications. Intelect Visual Communications Corp. (IVC) designs and markets
state-of-the-art LAN and WAN video conferencing solutions. IVC offers multipoint,
television-quality video conferencing products for desktop and executive systems. IVC
products operate on any network that transports IP and run on Windows 95 and Windows NT.
IVC is based in New York and can be reached at 800-922-3433.