April 2003

Rethinking Campus Architecture
BY TONY RYBCZYNSKI & PHIL EDHOLM
Critical elements in the design and operation of campus networks are the
number of switching tiers required and the location and distribution of
routing and intelligence. For our purposes, intelligence is functionality
above and beyond Layer 2 switching and Layer 3 routing, encompassing
security (including user authentication and VLAN management), traffic
management (including QoS and multicast support), and content-aware
services for server optimization.
While there are exceptions, most enterprises do not really care whether the
network is two- or three-tier, where routing is deployed, and how
intelligence is deployed. What they care about is meeting performance
requirements for converged networking at the lowest possible total cost of
ownership. Here we look at rethinking campus architectures with the
objective of overall simplification, leveraging technology advances.
What do we mean by performance in the context of campus networks?
Generally, performance implies consistently, reliably, and securely
providing the connectivity, bandwidth and delay required by applications.
Low-delay, high-bandwidth connectivity can be very cost effectively provided
across any campus network. While QoS is arguably not required for
communications within the campus, QoS handling is definitely required as
part of an end-to-end architecture and within a design principle of
minimizing consumption of the end-to-end delay budget (e.g., 150 msec for IP
telephony). Security mechanisms are critically important to protect IT
resources and control application accessibility. The most demanding
requirement is for very high reliability to support mission-critical
applications including real-time voice and multimedia. The traditional
benchmark is five 9s reliability, though enterprises recognize that a
cost-benefit analysis is required to determine how close to this benchmark
they can come. While traditional data packet reliability is achieved through
dynamic routing around failures and TCP retransmissions, these do not work
for real-time applications such as voice. A key requirement, therefore, in
moving towards this reliability objective is sub-second recovery from
failures.
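To put that delay budget in perspective, the short calculation below uses illustrative, assumed figures (not measurements) for each component of an IP telephony path and shows how little of the 150 msec budget a well-designed campus LAN needs to consume:

```python
# Illustrative one-way delay budget for IP telephony; all figures are
# assumptions for the sake of the example, not measured values.
BUDGET_MS = 150.0                  # end-to-end one-way delay target

codec_and_packetization = 25.0     # e.g., codec delay plus 20 ms packetization
jitter_buffer = 40.0               # de-jitter buffer at the receiver
wan_propagation_and_queuing = 40.0 # WAN component of the path

# Campus LAN: a few gigabit switch hops at tens of microseconds each.
campus_hops = 4
per_hop_latency_ms = 0.05          # assumed 50 microseconds per switch
campus_lan = campus_hops * per_hop_latency_ms

total = (codec_and_packetization + jitter_buffer +
         wan_propagation_and_queuing + campus_lan)
print(f"Campus LAN share: {campus_lan:.2f} ms of the {BUDGET_MS:.0f} ms budget")
print(f"Total consumed:   {total:.1f} ms, headroom {BUDGET_MS - total:.1f} ms")
```

The point is simply that the campus contribution is negligible; the budget is consumed by codecs, jitter buffers and the WAN.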
The Lay of the Campus LAN
A major influencing factor is the physical wiring plant distribution. All
LANs are based on wiring closets located within 100 meters of desktops, a
limit imposed by Ethernet over copper wiring standards. Some buildings have
three-tier wiring schemes with wiring closets, Intermediate Distribution
Frames (e.g., per floor) and centralized Main Distribution Frames (MDFs).
The use of limited-distance multimode fiber (100Base-FX, with a 2 km maximum
reach over multimode fiber) and/or highly distributed campuses such as
military bases can also impose a three-tier networking architecture. That
said, the general case is that wiring closets are fiber-connected to a main
distribution frame, forming a physical two-tier architecture.
At the Layer 2 and 3 networking level, a common approach for wired campus
networks is a 3-tier architecture consisting of a wiring closet edge tier,
an aggregation (a.k.a. distribution) tier and a core tier, even if the
physical wiring is two-tier. In the latter case the aggregation tier is
co-located in the wiring closet. While in some cases a two-tier
architecture is targeted, a three-tier architecture often emerges over time
as vendors try to meet customer requirements. The edge tier consists of
Ethernet switches and some hubs, but it is universally agreed that switched
Ethernet with standards-based Power over Ethernet is a firm requirement for
IP telephony.
Where to put routing and service intelligence is likewise varied. Some
sophisticated customers believe that IP routing belongs at all points,
arguing that they require a higher degree of control and have the expertise
to manage this environment. Others want to contain IP routing to the core
and are therefore looking to Layer 2 functionality elsewhere to address
performance requirements. In some cases, given vendor biases and/or product
limitations, these same customers end up with routing in the wiring closet
as a "simplification" compared to per-VLAN spanning tree and as a way of
utilizing resources that are otherwise only used under failure conditions.
At the same time, intelligence is being added to the edge tier to meet
security needs and IP telephony and other application performance needs.
So what's the ideal architecture for most enterprises? One that meets
performance requirements in the simplest way, thus exhibiting the lowest
Total Cost of Ownership (TCO). There are two key parameters: the number of
tiers and the distribution of routing and intelligence.
Campus Simplification #1: Two Tiers If You Can, Three If You Must
The three-tier networking architecture developed in the early 1990s,
providing a common solution for both two- and three-tier physical wiring
infrastructures. This architecture met the prevalent traffic patterns at the
time, ones dominated by workgroup and departmental communities of interest.
It also developed because of limitations of both the capacity of wiring
closet devices (12 or 24 ports per device without resilient stacking) and
the capacity of uplinks (often 100Mbps). Simply, the aggregation tier was
required to provide needed fan-out and port concentration between the edge
tier and the core. IP routing was constrained to the core, while the edge
and aggregation tiers operated predominantly at Layer 2.
What has changed? Firstly, the vast majority of the traffic now goes to the
core of the network, across the WAN and into the Internet. For example,
printing to a local printer typically means going through a centralized
print server. Even the advent of peer-to-peer traffic, including IP
telephony, does not change this, as it is generally not within the community
of interest of a single wiring closet. Secondly, modular and resilient
stackable architectures with capacities in the hundreds of Gbps, multiple
Gbps of uplink capacity (evolving to 10 Gbps), and extremely high-capacity
core switches now represent the state-of-the-art in campus technologies.
Thirdly, the requirements have evolved for QoS-based networking, increased
reliability, and tighter security.
These changes, together with the ongoing pressures to do more with less,
contribute to an opportunity to evolve the network to meet new requirements,
while simplifying it by eliminating the aggregation tier completely, unless
required because of physical wiring constraints. For large, multiple
building networks, multiple core switches can be mesh interconnected via a
simple and fast transport network; for example, one using 1/10 Gbps links
potentially over Coarse or Dense Wavelength Division Multiplexing (CWDM/DWDM). In very
large campus networks, Ethernet switching could be introduced as a way of
providing more effective core switch interconnection, creating a third fast
transport tier, but this is an exception.
In financial terms, the typical network equipment cost per user is currently
$150 at the edge level, $60 at the aggregation level, and $40 at the core
level. Eliminating the aggregation tier removes approximately a quarter of
the up-front equipment cost and significantly decreases the TCO, given the
need to engineer, configure, and operate multiple devices at this level of
the network. Eliminating the aggregation tier not only results in
simplification, but also provides a higher reliability network at no
additional cost.
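The arithmetic behind that estimate is straightforward; the sketch below simply applies the per-user figures quoted above and considers equipment cost only, ignoring the additional operational savings:

```python
# Per-user equipment cost by tier, using the figures quoted above (USD).
edge, aggregation, core = 150, 60, 40

three_tier = edge + aggregation + core   # wiring closet + aggregation + core
two_tier = edge + core                   # aggregation tier eliminated

savings = three_tier - two_tier
print(f"Three-tier cost per user: ${three_tier}")
print(f"Two-tier cost per user:   ${two_tier}")
print(f"Up-front savings: ${savings} ({savings / three_tier:.0%} of initial cost)")
```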
Since there is a perception that the aggregation tier increases network
reliability, how is reliability enhanced in a two-tier network? It starts
with reliable node
design at both the edge and core levels: redundant power and cooling, hot
swappability, failsafe stackability, and load shared dual active core
switching fabrics. Interconnection between the edge and the core and among
core switches uses MultiLink Trunking (MLT, standardized as IEEE 802.3ad
Ethernet link aggregation), whereby traffic is distributed across
multiple links; when a link failure is detected, the traffic is redistributed
at sub-second speeds across the remaining links following the established
traffic management policy. Key advantages of MLT are that failures are
handled without impacting TCP/IP operation and voice calls, and that it
makes it easy to add additional links as required to meet traffic demands.
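Conceptually, link aggregation maps each flow to one member link using a stable hash of its addresses, so packets within a flow stay in order, and when a member fails the same hash simply redistributes flows across the survivors. The sketch below illustrates the idea only; it is not any vendor's MLT implementation, and the link names are made up:

```python
import zlib

def pick_link(src_mac, dst_mac, links):
    """Map a flow to one member link of the trunk using a stable hash."""
    key = f"{src_mac}-{dst_mac}".encode()
    return links[zlib.crc32(key) % len(links)]

trunk = ["uplink-1", "uplink-2", "uplink-3", "uplink-4"]
flow = ("00:11:22:33:44:55", "66:77:88:99:aa:bb")

print("Normal operation:", pick_link(*flow, trunk))

# When a member link fails, it is removed from the trunk and the same
# hash spreads flows over the remaining links in well under a second.
trunk.remove("uplink-2")
print("After failure:   ", pick_link(*flow, trunk))
```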
Two important vendor-specific enhancements are available at Layer 2.
Distributed MLT allows the links to be terminated on different blades in a
switch. Split MLT is a further extension and allows the uplinks to home on
two core switches, while supporting full load balancing across these links.
Split MLT eliminates the need for spanning tree, with its notorious slow
recovery after failures. Rapid spanning tree, defined in IEEE 802.1w, can
provide sub-second recovery from failure, but has the characteristic that
the backup path (i.e., links and switches) is idle except after failures,
resulting in very poor utilization of network resources. To partially
address this problem, some vendors suggest running multiple spanning trees,
with users manually assigned to each (e.g., using a feature called Per VLAN
Spanning Tree). This is effectively manual load balancing and is very
management intensive.
Now, let's look at how routing and intelligence should be distributed to
meet performance, functionality and reliability requirements, while
providing lowest TCO.
Campus Simplification #2: Centralize Routing and Distribute Service
Intelligence
For campus infrastructures, the relationship between devices, the location
of routing and the amount of intelligence is critical to building networks
that minimize TCO and meet performance objectives. Devices in the core of
the network tend to be more expensive, but the cost is amortized over all or
many users. Devices at the edge of the network are less expensive, but cost
there is associated directly with each user. To reduce the TCO of the
network, edge switches should include only the intelligence that needs to be
close to the user, targeting the lowest cost per user. Complexity and
intelligence not critical to the edge should be in the core switch.
The functions in the edge switch should be restricted to two major areas:
user security, and QoS and traffic management. Security defines the ability
to authenticate and control user access to network resources, including
standards such as IEEE 802.1X for user authentication and IEEE 802.1Q for
user segmentation via Virtual LANs, or VLANs (e.g., separating wired and
wireless users, telephony and PC users, and user traffic from network
management traffic).
QoS provides flow policing to assure that the flows are properly classified
and controlled, and includes flow classification, required for legacy
applications that do not provide DiffServ marking, and for traffic flows
received over untrusted ports. Traffic management includes multicast
snooping to participate in multicast networking without having to support
multicast routing protocols. These functions include Layer 2, 3, and 4
services (i.e., processing packets based on information in the L2-4 packet
headers). Layer 3 services are not the same as Layer 3 dynamic routing,
since technologies such as Split MLT provide high reliability, high
bandwidth and high switch utilization without routing. Scott Bradner of
Harvard University noted that there are 10,258 lines in the index alone of
the Cisco basic router manual, and that no new release comes without 100
new commands. Why would one deploy routing in the edge tier if all
requirements can be met through Layer 2 switching and embedded L3-4 service
intelligence?
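As a simple illustration of this kind of edge-tier L3-4 service, the sketch below assigns a DiffServ code point to traffic arriving on untrusted ports based on its UDP destination port. The port-to-DSCP mapping is an assumption chosen for illustration, not a recommended policy:

```python
# Illustrative DiffServ classification at an edge switch port.
# DSCP values: 46 (EF) for voice bearer, 26 (AF31) for signaling, 0 for best effort.
DSCP_BY_UDP_PORT = {
    5060: 26,                       # SIP signaling (assumed mapping)
}
RTP_RANGE = range(16384, 32768)     # commonly used RTP port range (assumed)

def classify(udp_dst_port: int, trusted_port: bool, existing_dscp: int) -> int:
    """Return the DSCP value to apply to a packet arriving at the edge."""
    if trusted_port:
        return existing_dscp        # honor the marking from a trusted device
    if udp_dst_port in RTP_RANGE:
        return 46                   # voice bearer traffic marked EF
    return DSCP_BY_UDP_PORT.get(udp_dst_port, 0)

print(classify(udp_dst_port=16500, trusted_port=False, existing_dscp=0))   # 46
print(classify(udp_dst_port=5060,  trusted_port=False, existing_dscp=46))  # 26
print(classify(udp_dst_port=8080,  trusted_port=False, existing_dscp=46))  # 0
```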
The core switch tier provides Layer 3 switching (including IP and
potentially IPX, SNA and other legacy protocols), dynamic routing (e.g.,
RIP, OSPF, BGP), multicast networking (e.g., DVMRP, PIM-SSM, PGM, IGMP),
policy-based security, core QoS and bandwidth management, and interfacing
into policy servers and network management systems. It also provides a range
of high-speed MAN and WAN interfaces (e.g., ATM, Frame Relay, PPP, and
Packet over SONET). If servers are connected
directly to the core switches, content-aware functionality is provided to
include application switching, content caching, load balancing and SSL
acceleration. The number of acronyms in the above list illustrates the depth
of knowledge required to plan, engineer, and operate the core switching tier
consisting of a handful of high-performance switches located in a handful of
controlled telecom and computing facilities. It also illustrates the TCO
value of minimizing the functions in the edge tier distributed across
potentially hundreds of wiring closets in a large campus.
Conclusion
A two-tier campus architecture, with security and traffic management
services at the edge and L3 routing, networking and service capabilities in
the core, meets the performance requirements of campus networks, including
the reliability requirements of even the most stringent enterprise. It is
not for everyone, but in general it is simple, minimizes the number of
devices, and centralizes intelligence, and therefore significantly reduces
the TCO of most of today's networks. This campus architecture provides a
high-performance converged network that can reliably support existing and
new data, voice, video and multimedia applications.
Tony Rybczynski is Director of Strategic Enterprise Technologies at
Nortel Networks. He has over 30 years of experience in the application of
packet network technology. Phil Edholm is Chief Technologist and VP of
Enterprise Network Architecture at Nortel Networks. For more information,
please visit the company online at www.nortelnetworks.com.