The nature of services
and content within the next-generation network (NGN)
and the way in which they are delivered to end users
are changing. Content used to be text and some
graphics. In the future, it will not only be text and
graphics but it will also be voice, audio, and video.
Entirely new classes of devices will proliferate in
the home and enterprise that will have varying display
and input capabilities. Getting services and content
to these devices quickly, reliably, and with low
delay will place new strains on the storage,
transport, and processing capabilities of the network.
THE PROBLEM
Ethernet clients will soon be making a transition from
100 Mbps to 1 Gbps bandwidth. Related transitions will
be occurring in the server, storage, metropolitan area
network (MAN), and wide area network (WAN) to 10 Gbps
and then 40 Gbps. The future unit of measurement in
the network will be 10 Gbps. At 10 Gbps, a packet
can arrive roughly every 35 ns. This doesn't give traditional
processors operating at 1 GHz much time to do things,
and it only gets worse at 40 Gbps, where a packet
can arrive roughly every 8 ns. A 1 GHz processor can
execute about 35 cycles in 35 ns, and 8 cycles in 8
ns. Bandwidth (Gilder's law) is increasing more
quickly than processing capacity (Moore's law). The
implication is that for the foreseeable future
the number of processor cycles available
per bit of traffic will continue to shrink even as
applications continue to make greater demands on
processing cycles.
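The arithmetic behind these figures can be sketched in a few lines. The sketch below is only illustrative: it assumes a 40-byte packet (a size the article does not state) and ignores link framing overhead, which gives 32 ns at 10 Gbps and 8 ns at 40 Gbps, close to the rounded figures quoted above. The function names are hypothetical.

```python
def interarrival_ns(packet_bytes: int, link_gbps: float) -> float:
    """Time between back-to-back packets on a saturated link, in nanoseconds."""
    # Bits divided by Gbit/s yields nanoseconds directly.
    return packet_bytes * 8 / link_gbps


def cycle_budget(packet_bytes: int, link_gbps: float, clock_ghz: float) -> float:
    """Processor cycles available per packet at the given clock rate."""
    # Nanoseconds multiplied by cycles-per-nanosecond (GHz) yields cycles.
    return interarrival_ns(packet_bytes, link_gbps) * clock_ghz


if __name__ == "__main__":
    PACKET_BYTES = 40  # assumption: small packet size, not given in the article
    for gbps in (1, 10, 40):
        ns = interarrival_ns(PACKET_BYTES, gbps)
        cycles = cycle_budget(PACKET_BYTES, gbps, clock_ghz=1.0)
        print(f"{gbps:>2} Gbps: {ns:6.1f} ns between packets, "
              f"{cycles:6.1f} cycles at 1 GHz")
```

Because the budget scales inversely with link speed, every fourfold increase in bandwidth cuts the per-packet cycle budget to a quarter unless the clock rate rises by the same factor.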
The actual number of instructions/processor cycles
needed to process a packet will vary widely depending
on the application (e.g., communications,
switching/routing, security, media content processing,
transcoding, and filtering). Each application will go
through different phases during its "lifetime." For
security, these phases may consist of authentication,
key generation/exchange, compression/decompression,
and then encryption/decryption. The authentication and
key generation/exchange phases require a great deal of
state information as a security association between
communicating principals is established. The
compression/decompression and encryption/decryption
phases apply repetitive logical and mathematical
functions to each bit and do not need to maintain much
state information or make complicated branching
decisions.
Having 35 cycles in 35 ns may simply not be enough
for many applications. By way of an example, it is
estimated that a 10 Gbps TCP/IP connection would
require multiple Intel Pentium 4 processors simply to
perform the protocol processing. Additional processors
would be required to provide the headroom to run other
applications such as security, media, and content
processing. Sustainable packet processing at 10 Gbps
can be offered if 1) processing for every packet is
completed in less than 35 ns, 2) packets are
aggregated together (folded) and processed together in
less than 35 ns per packet on average, or 3) processing
is broken up into multiple sequential stages (pipelined)
and each individual stage completes in less than 35
ns. Headroom can be increased by increasing the clock
frequency, adding more processors in the context of a
multiprocessing model, adding coprocessors, or
developing new hardware and software processing
models.
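The three conditions for sustainable processing can be written as simple checks. This is a minimal sketch, not a model of any real device: the function names are hypothetical, the 35 ns budget is the figure quoted above, and the folded case is read as an amortized budget (a batch of N packets must finish within N times 35 ns), which is one interpretation of the text.

```python
from typing import Sequence

BUDGET_NS = 35.0  # per-packet arrival interval at 10 Gbps, as quoted in the article


def per_packet_ok(processing_ns: float, budget_ns: float = BUDGET_NS) -> bool:
    """Option 1: every packet is fully processed before the next one arrives."""
    return processing_ns < budget_ns


def folded_ok(batch_ns: float, batch_size: int,
              budget_ns: float = BUDGET_NS) -> bool:
    """Option 2: a batch of packets is processed together.

    Assumption: the batch must finish before the next batch has arrived,
    so the budget is batch_size * budget_ns.
    """
    return batch_ns < batch_size * budget_ns


def pipelined_ok(stage_ns: Sequence[float],
                 budget_ns: float = BUDGET_NS) -> bool:
    """Option 3: throughput is limited by the slowest stage, not the total."""
    return max(stage_ns) < budget_ns


if __name__ == "__main__":
    print(per_packet_ok(90.0))               # 90 ns of work in one step
    print(folded_ok(200.0, batch_size=8))    # 200 ns for 8 packets
    print(pipelined_ok([30.0, 30.0, 30.0]))  # same 90 ns split into 3 stages
```

Note that pipelining improves throughput but not latency: the three-stage example still takes 90 ns end to end for any single packet.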
THE SOLUTION
The future is about transporting, cracking, and
stuffing packets. Traditional processing models are
reaching their limits as computing and communications
converge and bandwidth increases. These models will
feel further pressure as the need to support non-text
and rich media including voice, data, graphics, audio,
and video increases. New hardware and software
processing models will extend beyond the traditional
client/server processing model to provide the
additional headroom needed to support wire and
fiber-speed processing in the NGN.
A packet-centric processor and logic-based
architecture can provide a continuum of optimized
processing for the different phases of an application
or service running at wire or fiber speed. Some phases
of an application may require heavy algorithmic
processing (e.g., bit math for encryption/decryption)
and other phases may require heavy decision processing
(e.g., sparse but large branching trees for intrusion
detection).
A wide range of processors and
configurable/programmable logic elements is available
today, consisting of application processors (e.g.,
Intel Pentium, Intel Itanium, and PowerPC processors),
control processors (e.g., Intel XScale and MIPS
processors), packet processors (e.g., microengines),
signal processors (e.g., DSPs), neural processors,
reprogrammable logic (e.g., FPGAs), and
application-specific logic (e.g., ASICs). Network processors
typically consist of a control processor with one or
more packet processors on the same piece of silicon,
and in some recent incarnations they consist of a
control processor with reprogrammable logic. Neural
processors provide hardware-based deterministic and
non-deterministic processing capabilities using neural
networks. Neural processors can potentially provide
capabilities to deal with data and content in its
native form and not as pure text (e.g., find all
images containing a rose in a visual-object database).
At first glance, these processor and logic elements
may appear to be distinct and separate, but their
different strengths can be positioned in a conceptual
framework that provides a powerful and
scalable processing model. As described earlier, the
traditional measure of performance is raw processor
throughput, counted in cycles or instructions (e.g., MIPS). New processing models will be
measured on their ability to perform along three
different axes: statefulness, decision complexity, and
algorithmic complexity. The three different axes
define the space in which the particular capabilities
of processor or logic elements may be mapped.
Statefulness is that aspect of the application that
must remember something about the past in order to
make a decision in the present. Decision complexity
represents the number of possible branches that may be
taken on each bit of the packet.
A typical application that may require
high decision complexity would be intrusion detection
or virus filtering. Algorithmic complexity, or
algorithmic intensity as some call it, represents the
number of arithmetic or logical operations that must
be performed on a bit. A typical application that may
require high algorithmic complexity would be media
transcoding between MPEG-2 and MPEG-4 video streams.
Applications and their "lifetime" can be plotted in
the same space as the processor and logic elements to
identify which elements are needed to efficiently
support an application.
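One way to picture this mapping is to place application phases and processing elements as points in the same three-dimensional space and pick the nearest element for each phase. The sketch below does exactly that. Every number in it is invented for illustration on a 0-to-1 scale; the article gives only qualitative guidance (for example, that key exchange is state-heavy and bulk encryption is arithmetic-heavy), and the positions assigned to the elements are assumptions, not measurements.

```python
import math
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class Profile:
    """A point on the three axes, each scored from 0 (low) to 1 (high)."""
    statefulness: float
    decision_complexity: float
    algorithmic_complexity: float

    def distance(self, other: "Profile") -> float:
        # Euclidean distance between two points in the three-axis space.
        return math.dist(astuple(self), astuple(other))


# Hypothetical positions of processing elements (illustrative values only).
ELEMENTS = {
    "control processor": Profile(0.9, 0.6, 0.2),
    "packet processor": Profile(0.3, 0.8, 0.3),
    "reprogrammable logic": Profile(0.1, 0.2, 0.9),
}

# Hypothetical positions of application phases (illustrative values only).
PHASES = {
    "authentication and key exchange": Profile(0.9, 0.5, 0.3),
    "bulk encryption": Profile(0.1, 0.1, 0.9),
    "intrusion detection": Profile(0.5, 0.9, 0.2),
}


def best_element(phase: Profile) -> str:
    """Return the name of the element closest to the phase's requirements."""
    return min(ELEMENTS, key=lambda name: ELEMENTS[name].distance(phase))


if __name__ == "__main__":
    for phase_name, profile in PHASES.items():
        print(f"{phase_name}: {best_element(profile)}")
```

Nearest-point matching is only one possible rule. A real design would also weigh cost, power, and the overhead of moving a packet between elements as an application passes through its phases.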
The processor and logic elements will be tied
together by a common interconnect technology and
software architecture that spans the elements. The
new programming interfaces will slide underneath existing
programming interfaces to allow transparent, scalable
processing that doesn't affect existing applications
but can also draw on the logic elements to deal efficiently
with the different phases of the application.
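The idea of an interface that slides underneath an existing one can be illustrated with a small dispatch layer. This is a toy sketch under stated assumptions: `checksum`, `register_backend`, and the "offload" backend are hypothetical names, and the offload function is ordinary software standing in for a logic element. The point is only that the application-facing call does not change when a new backend is installed.

```python
from typing import Callable, List, Tuple

Backend = Callable[[bytes], int]


def _software_checksum(data: bytes) -> int:
    """Default implementation running on the general-purpose processor."""
    return sum(data) & 0xFFFF


# Registered backends; the most recently registered one is used.
_backends: List[Tuple[str, Backend]] = [("software", _software_checksum)]


def register_backend(name: str, fn: Backend) -> None:
    """Install a new backend underneath the existing interface."""
    _backends.append((name, fn))


def active_backend() -> str:
    """Name of the backend that will serve the next call."""
    return _backends[-1][0]


def checksum(data: bytes) -> int:
    """Application-facing interface; callers never see which backend runs."""
    _, fn = _backends[-1]
    return fn(data)


if __name__ == "__main__":
    before = checksum(b"packet payload")

    # Stand-in for a hardware-assisted implementation (hypothetical).
    register_backend("offload", lambda data: sum(data) & 0xFFFF)

    after = checksum(b"packet payload")
    print(active_backend(), before == after)
```

The application keeps calling `checksum` exactly as before; only the layer beneath it decides where the work is done.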
CONCLUSION
Existing processing models will prove to be
insufficient for the NGN. The move from a text-based
world to a signals-based world and the continually
increasing gap between processing cycles and bandwidth
will drive the demand for new processing models. New
models will provide a continuum of processors and
logic that are easily interconnected and designed to
support transparent scalable processing of text and
non-text-based signals at wire and fiber speed. The
shift to these new models will have a profound impact
on the capabilities of the NGN and the types of
content processing, content filtering, and other
services that can be provided to users from servers,
gateways, and service platforms.
Jeff Lawrence is Chief Technology Officer of Intel's
Network Communications Group. Jeff was formerly
President & CEO of Trillium Digital Systems, a
leading supplier of communications software solutions.
[ Return To The August 2001 Table Of Contents ]