MMX - The Impact Of On-Board Digital Signal Processing
BY BROUGH TURNER
Over the years, we've witnessed a steady, and enormous, increase in the power of
Intel processors. And, in the past five years, we've seen several attempts by Intel
to take on traditional digital signal processing tasks. A few years ago it was
Native Signal Processing. Now it's MMX. Will Intel succeed? And how might
Intel's success impact the computer-telephony industry?
THE POWER IS THERE
Intel has maintained dominance in the CPU market, not only by leveraging Moore's Law
(that is, by putting more transistors into each evolution of silicon), but also by adding
newer and more powerful functions, all the while maintaining compatibility with
earlier software. With the MMX instruction set for the Pentium processor, Intel has taken a
significant step toward supporting digital signal processing. But the question is, will this
functionality replace that provided by discrete DSPs from companies like Texas
Instruments (TI), Motorola, Lucent Microelectronics, and Analog Devices?
The answer is yes, to a limited extent. When we compare the number of
instructions executed per second for the inner loops of specific DSP algorithms, we see
that the Intel Pentium with MMX is often faster than existing, widely used DSPs. For
example, to compare a 266 MHz Intel Pentium II with MMX against a 100 MHz TI 320C549 DSP
executing a filter algorithm, you could count the number of filter taps per second (ftps)
that each processor can execute. The TI DSP will support 100 million ftps, and the Intel
Pentium with the 16-bit MMX instructions will support 320 million ftps. Using this method
of calculation, it would appear that the Pentium is much more powerful. But there's
more to useful signal processing than such single-function comparisons reveal.
A DEEPER LOOK
There are several issues to consider. First, if we need a multi-channel digital signal
processing system (as in a computer-telephony server), we need to consider total power
requirements. In other words, it's not just the number of filter taps per second that
concerns us, but rather the number of filter taps per second per watt. The TI 320C549 DSP
mentioned above uses a quarter of a watt, for a power-performance rating of 400 million
ftps/watt. The Intel Pentium with MMX, once we account for the power dissipated by
memory and portions of the rest of the motherboard, comes in at less than 4 million
ftps/watt. On this count, modern DSPs are better by a factor of 100!
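
The power-performance arithmetic above can be reproduced in a few lines. This is only a rough sketch: the ftps and DSP wattage figures come from the text, but the 80-watt system figure for the Pentium is an assumption, chosen because it is the power level at which 320 million ftps works out to the quoted ceiling of 4 million ftps/watt. The article does not state a wattage for the Pentium system.

```python
def ftps_per_watt(ftps: float, watts: float) -> float:
    """Filter taps per second delivered for each watt consumed."""
    return ftps / watts


# Figures quoted in the article.
DSP_FTPS = 100e6       # TI 320C549 at 100 MHz
DSP_WATTS = 0.25       # "a quarter of a watt"
PENTIUM_FTPS = 320e6   # Pentium II using the 16-bit MMX instructions

# ASSUMPTION (not in the article): CPU plus memory plus a share of the
# motherboard draws about 80 W. This is simply the value that yields the
# article's "less than 4 million ftps/watt" upper bound.
PENTIUM_SYSTEM_WATTS = 80.0

dsp_rating = ftps_per_watt(DSP_FTPS, DSP_WATTS)
pentium_rating = ftps_per_watt(PENTIUM_FTPS, PENTIUM_SYSTEM_WATTS)
advantage = dsp_rating / pentium_rating

print(f"DSP:     {dsp_rating / 1e6:.0f} million ftps/watt")
print(f"Pentium: {pentium_rating / 1e6:.0f} million ftps/watt")
print(f"DSP advantage: about {advantage:.0f} to 1")
```

Changing `PENTIUM_SYSTEM_WATTS` shows how sensitive the ratio is to how much of the motherboard's power is charged to signal processing.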
A second issue to consider is actual performance on a complete and useful signal
processing task. Filter taps per second is interesting, but what we really need is more
likely to be the combination of an echo canceller, a DTMF detector, a voice coder, and
perhaps a modem. The 320 MIPS of the Intel MMX now has to be discounted for several
factors. There is a substantial overhead incurred going into and out of MMX mode. In
addition, there is a penalty for converting data between the 16-bit format used by
MMX and the floating-point format for which the regular Intel instruction set
is optimized. As a result, while the MMX instruction set can be a hundred times more
efficient than the original Intel instruction set for specific signal processing
benchmarks, typical improvements for complete and useful algorithms (MPEG audio, for
example) are more like a factor of two to three. In short, there are MIPS and there are
MIPS.
To get a handle on what computer telephony tasks are actually achievable, it is useful
to consider a complete 33.6 kbps (V.34) modem implemented in software using MMX. As I
write, I am not aware of such a modem as a shipping product for the mass market, but
several are nearly ready. And, while it's a close call, it appears it will be
possible to fit such a modem under an arbitrary 20 percent threshold; that is, the modem
will use no more than 20 percent of the power of a currently available Pentium II with
MMX. Assuming one of these products makes it to market, we'll have a solid basis for
comparison. When we scale from this likely benchmark, we see that a single-processor 300
MHz MMX motherboard comes out just slightly better at computer-telephony signal processing
tasks than a single 100 MIPS 320C549 DSP.
Where did the arbitrary 20 percent threshold come from? For a real CT application, the
first assumption we have to make is that the CPU can support the operating system, the
application, and the desired signal processing. Microsoft seems to make the operating
system larger and slower with each release. Similarly, application software is called upon
to provide more functions in every release. So we have to restrict our digital signal
processing to some percentage of the total motherboard processing power.
For consumer use, a good target is to keep the signal processing load below 20 percent
of the total processor power. This way the system will perform acceptably, even while
doing native signal processing.
For a computer-telephony server, the CTI developer has more flexibility in tailoring
the system. However, the same trade-off remains between processor cycles to run the CTI
application and cycles used to do low-level signal processing.
What does this mean in terms of real-world computer telephony signal processing tasks?
What signal processing power is needed per port to support typical CT applications? And
can this be accomplished on the motherboard? To get a feel for how we might answer these
questions, we can take a look at the DSP MIPS typically required per port for some common
CT applications. Clearly, if a single-channel, software-only modem shows up as expected in
1998, then, in the same time frame, a single-port computer-telephony board can be built
without any special signal processing chips, that is, using the Intel host for all signal
processing.
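
One rough way to frame the per-port question is to treat the host as a pool of DSP-equivalent MIPS and see how many ports fit inside the signal processing budget. The sketch below takes the article's scaling (a 300 MHz MMX motherboard is roughly comparable to a 100 MIPS DSP) and its 20 percent budget. The function name and the per-port loads of 10, 20, and 30 MIPS are illustrative assumptions, not measured figures.

```python
def ports_supported(host_dsp_mips: int, budget_percent: int, mips_per_port: int) -> int:
    """Whole ports that fit in the share of the host reserved for signal processing.

    Integer arithmetic is used so the result is an exact floor.
    """
    if mips_per_port <= 0:
        raise ValueError("mips_per_port must be positive")
    return (host_dsp_mips * budget_percent) // (100 * mips_per_port)


# From the article: a 300 MHz MMX motherboard is "just slightly better" than
# a single 100 MIPS DSP, so treat the host as about 100 DSP-equivalent MIPS.
HOST_DSP_EQUIV_MIPS = 100
CONSUMER_BUDGET_PERCENT = 20  # the article's arbitrary 20 percent threshold

# ASSUMPTION: hypothetical per-port loads chosen only to show the trend.
for mips_per_port in (10, 20, 30):
    at_budget = ports_supported(HOST_DSP_EQUIV_MIPS, CONSUMER_BUDGET_PERCENT, mips_per_port)
    whole_cpu = ports_supported(HOST_DSP_EQUIV_MIPS, 100, mips_per_port)
    print(f"{mips_per_port} MIPS/port: {at_budget} port(s) at 20%, {whole_cpu} with the whole CPU")
```

Under these assumptions, only the lightest per-port load leaves room for more than one port within the 20 percent budget, which is consistent with the single-port conclusion drawn above.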
THE TELEPHONE LINE INTERFACE
Of course, we can't perform all the functions of a computer-telephony board in
software. The interface to the telephone network remains an issue. Today's telephone
interfaces include specialized hardware and software to address diverse regulatory
requirements around the world. And there is a minimum electrical interface to the
telephone line that must remain in hardware. But, if everything else is moved to software,
this minimum telephone line interface can be extremely simple: a few components plus
a telephone connector on one side and a Universal Serial Bus (USB) connector on the other.
Such an adapter will be the same for a consumer modem as for computer-telephony
applications. So single-line computer-telephony applications should move to low-cost,
mass-market technology in 1998-99.
THE FUTURE OF DSPs
If, by 1999, a single-line CT interface is just an extremely simple adapter plus some
host-based software, then we can make some predictions about the four-port voice board
business. Based on Moore's Law alone, the traditional four-port CT interfaces will do
without DSPs by the year 2002.
But will the Intel processor completely replace the discrete DSPs for all CT
applications? No. Host-based signal processing (DSP on the Intel CPU) will only replace
discrete DSPs in low-end, PC-based CT applications. There are several reasons for this.
First, the work a CT interface board is called upon to do increases every few years.
The first DSP-based voice boards had only 2 to 5 MIPS per port. Combined voice/fax
boards now use 10 to 15 MIPS per port. And CT boards used in IP gateway applications
typically have at least 30 MIPS per port. This trend is likely to continue as more complex
voice coders are developed and as multimedia communications increase.
Second, the Intel-based approach doesn't scale beyond the Intel chips on a single
motherboard. Using the host CPU for digital signal processing makes sense to the extent
the system's primary motherboard has leftover power. But it doesn't make
economic sense to add more Intel CPUs just to run more signal processing channels. As we
saw above, the Intel CPU is a relative power hog as a dedicated signal processor. That
is, for high-end, multi-channel applications, an Intel-based signal processing approach
would use more power than the fans in today's servers can handle. For high-capacity
systems, standard DSPs still have a factor of 100 performance per watt advantage over MMX
signal processing.
And don't forget, the same Moore's Law that allows Intel's designers to
create ever-more powerful chips for PC motherboards gives the DSP designers at TI,
Motorola, Lucent, Analog Devices, and elsewhere, the raw technology to design ever-more
innovative DSPs. So competitive technical advances will continue for years to come.
CONCLUSIONS
For high-capacity systems (greater than T1) that require multiple processors, DSPs will be
the preferred solution for at least another decade, providing roughly a 100-to-1 advantage
in terms of the number of channels per slot, or per chassis. But for low-end CT
applications, software on the Intel host will begin to replace DSPs for single-port
solutions later this year, and begin to overtake traditional four-port CT boards by 2001.
Computer telephony will remain a high-growth market for DSP chip vendors. But the DSP
chip applications will be at the high end, and at the very low end. The low end will
consist of dedicated applications that are less than a PC (a telephone answering
machine, for example). The high end will encompass open telecommunications servers for
sophisticated applications at eight-port, T1, E1, and higher capacities.
Brough Turner is senior vice president of technology at Natural MicroSystems, a leading
provider of hardware and software technologies for developers of high-value
telecommunications solutions. For more information, call Natural MicroSystems at
508-620-9300 or visit the company's Web site at www.nmss.com.