A Systematic Approach To Service Quality
BY RALPH PARKER
Networks around the globe are on a collision course. New network
architectures are being rapidly deployed and are increasingly interconnected
with other networks. Yet there is growing concern about whether all these new
networks will deliver their promised economic and social benefits or be
plagued by interoperability problems. What is going to happen as a host of
network protocols collide in the interconnected digital sea of worldwide
communications and information transport?
A NEW DEVELOPMENT AND DEPLOYMENT MODEL
The telecommunications industry has had more than 100 years to perfect
circuit-switched networks. Users now enjoy high-quality service even when
their calls traverse multiple interconnected circuit-switched networks.
Quality was achieved through an orderly and lengthy service development and
deployment process for circuit-switched networks. For example, in the U.S., AT&T
turned up the first digital T1 circuits in the mid-1960s, deployed digital
tandem switches in the 1970s, digital end offices in the 1980s, and
announced a fully digital network in the early 1990s.
Now, a more chaotic development and deployment model has emerged. The
sheer number of carriers, vendors, and available technologies has changed
the way in which network architectures are designed and deployed, and how
services are offered to the consumer.
While large numbers of dedicated professionals work through the various
standards bodies to create technical standards, development of network
architectures by vendors and deployment by carriers is becoming less
standardized with the imminent extinction of telephone monopolies around the
Given the shift from orderly deployments under the regulated
monopoly/oligopoly structure to the more chaotic deployment model of the new
telecommunications world, how can the industry assure successful deployment
of new architectures while maintaining service excellence? What steps might
the industry take to make sure that emerging network architectures function
as reliably as the Public Switched Telephone Network (PSTN)? Much of the
answer lies in how well service providers and their vendors plan, implement,
and manage network turn-up and internetwork interconnection.
One proven approach is to replicate new architectures in a laboratory
environment and conduct end-to-end tests in advance of deployment. At Sprint,
where deployment of the Sprint Integrated, On-Demand Network (ION) is
underway, ION services are being rigorously tested in Sprint's test lab in
"To ensure quality in ION deployment, Sprint performs two-stage
testing starting with extensive interoperability testing in Sprint's ION
labs and then moving to field alpha and beta testing. Sprint's ION labs are
designed to replicate the production of telephony and data networks from end
to end including switching, signaling, and transport facilities of both
local and long-distance networks," reports Tom Moore, director of ION
Governmental action can drive the approach, in some cases for an entire
country. A recent example is Argentina: the Wall Street Journal reported
on March 1, 2000, that "Argentina will attempt to secure $5 billion in
fresh telecommunications investments over the next 18 months through tough
deregulation policies that are part of an overall strategy to orient its
industrial base toward highly skilled labor."
Regulators also get involved. The FCC chartered its fifth Network
Reliability and Interoperability Council in March to "evaluate and
report on the reliability of public telecommunications network services in
the United States, including the reliability of packet-switched
Additionally, a multitude of industry fora and standards bodies continue
to work on a range of standards and interoperability efforts. This includes
Committee T1, ETSI, IMTC/iNOW, the Packet Cable Group, IN Forum, ATIS's
Network Testing Committee, and others. ETSI for example, conducts
"bake-offs" where engineers get together to test implementations
against each other for the purpose of debugging standards and
implementations at an early stage. Bake-offs typically occur when a standard
or product is being developed. When standards are firmed up, efforts tend to
shift to the vendors as they concentrate on bringing products to market. As
decisions are made by carriers to deploy new architectures, they typically
work exclusively with their own vendors and integrators.
A METHODICAL DEPLOYMENT APPROACH
Given the enormity of the deployment task, how is the industry going to
deliver futuristic new services, work at Internet speed instead of telephone
time, satisfy the regulators, and delight the shareholders -- all while
maintaining service quality and reliability? It is often useful to break
such a large complex undertaking into more digestible parts to attempt a
solution. Reliable deployment of next-generation network architectures,
including integration with the PSTN, might be approached through a
three-step process: deployment planning, analytical
examination of issues, and integration testing.
The first suggested step is deployment planning. A good plan lays out a
comprehensive strategy with detailed steps for deployment. Besides program
management, the plan should address capacity, performance, security,
interoperability with other environments (such as operational and customer
care systems and other networks), and reliability. A deployment plan with a
complete and comprehensive reliability strategy is likely to produce a
better deployment than one in which the service provider gives reliability
less weight. A plan with a strong emphasis on reliability should
exhaustively address capacity, performance, and interoperability. Business
considerations may impose constraints on reliability planning, or could even
result in going too far in the opposite direction. Engineering and
operational groups should work closely with the strategic and marketing
departments to make sure that the deployment plan is consistent with company
and marketing strategies. Planners should also seek input from customers.
The second step is an analytical examination of deployment issues.
Integration of elements within a network needs to be well understood, along
with interconnections to other networks. This work sets the stage for
conducting tests during the third step. Again, the business philosophy and
resources of the carrier are a factor in how exhaustive the analysis will
be. Many carriers will contract with a systems integrator/consultant to
perform some or all of the pre-deployment analytical work. These firms
usually have established processes for such engagements, and a select few
offer powerful network modeling and analysis tools.
The bulk of the analysis step will consist of assessing capability,
capacity, performance, security, and reliability. The engineering and
operational analysts should work closely with strategic and market planners
to match the technical assessments with the planned footprint, scope, and
marketing of the network being deployed. Modeling capacity and traffic
growth over an appropriate planning horizon is an essential part of
assessing reliability, because reliability should be evaluated over time as
the network expands. Experience has shown that network elements do not
necessarily scale linearly, so some elements need upgrading before
others as the network expands.
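The growth modeling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the capacities, the 50 percent growth rate, and the load_exponent parameter (a crude stand-in for non-linear scaling) are all hypothetical and do not come from any carrier's planning tool.

```python
# Hypothetical capacity-planning sketch; every name and number is illustrative.

def project_traffic(initial, growth_rate, periods):
    """Compound growth: traffic at the start and after each planning period."""
    return [initial * (1 + growth_rate) ** n for n in range(periods + 1)]


def first_overload(traffic, capacity, load_exponent=1.0):
    """Return the first period in which an element's load exceeds its capacity.

    load_exponent is an assumed stand-in for non-linear scaling: 1.0 means the
    element's load tracks traffic linearly, values above 1.0 mean it grows
    faster than traffic. Returns None if the element lasts the whole horizon.
    """
    for period, volume in enumerate(traffic):
        if volume ** load_exponent > capacity:
            return period
    return None


def upgrade_order(traffic, elements):
    """Map each element name to the period in which it first needs an upgrade."""
    return {
        name: first_overload(traffic, capacity, exponent)
        for name, (capacity, exponent) in elements.items()
    }


if __name__ == "__main__":
    traffic = project_traffic(initial=100, growth_rate=0.5, periods=4)
    elements = {
        "transport link": (300, 1.0),     # load tracks traffic linearly
        "signaling gateway": (300, 1.2),  # load grows faster than traffic
    }
    print(upgrade_order(traffic, elements))
```

With these made-up figures, the element whose load grows faster than traffic runs out of headroom two periods before the linear one, which is the kind of staggered upgrade schedule the analysis step is meant to expose.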
Analysis will, by necessity, address intra-network interoperability and
reliability, and should also cover inter-network interoperability and
reliability. During the analysis step, requirements for each should be
identified and investigated. Each carrier will have to decide when to
contact other carriers to negotiate interconnection. Technical,
operational, business, and regulatory considerations all weigh on when
to initiate contact, with whom, and for what types of interconnection.
The third step is integration testing. Comprehensive testing of network
elements and related systems prior to deployment is essential in ensuring
reliability. Where tests identify problems, corrective actions must be taken
and tests repeated to confirm fixes. Integration testing is most efficiently
accomplished through a methodical approach, which should include:
- Establishment of objectives and plans with corresponding budgets and
resources. Planning is very important in achieving maximum test coverage
with the fewest test cases. Include contingencies for Murphy's law --
things will go wrong during testing!
- Element conformance testing. Each element should be tested against
applicable standards or specifications to assure that the element
functions as intended.
- Element-to-element interoperability testing. By replicating the
network architecture to be deployed in a laboratory environment,
interoperability among the different elements can be tested. For
example, interoperability testing between a user gateway and an
- End-to-end testing. The network should be tested "end to
end" from the user's perspective. Drawing from the planning and
analysis steps, all relevant permutations of user services and network
configurations should be tested. Variables may be introduced to simulate
real-world operating conditions such as high traffic loads, element and
link failures, delay, etc. Security also needs to be tested.
- Systems integration testing. Operations, network management, customer
care, and billing systems will need to operate within the architecture
being deployed. Reliability is as dependent on operating processes and
systems as the network elements themselves. Support systems should be
part of the lab testbed to enable testing in a simulated production
- Interconnection conformance testing. If the network to be deployed
will be interconnecting with other networks and architectures,
requirements of the interconnecting carriers (plus any regulatory or
government requirements) will determine what testing is required.
Interconnection is likely to include, at a minimum, providing other
carriers with interface conformance documentation. In the U.S., most
incumbent carriers have specific requirements that must be met in order
to interconnect. Typically, carriers or vendors conduct or contract for
conformance testing and go through a certification process with the
- End-to-end interconnection testing. Reliability is often assumed on
the basis that if everything works as designed within a network, and the
internetwork interface conforms with applicable standards and
specifications, end-to-end reliability across interconnection is
assured. While this has often been the case for circuit-switched
networks, it's still a major question mark for packet and hybrid
packet/circuit-switched networks. Views often vary among vendors and
carriers depending on their business philosophies and experiences and
the economic, political, and regulatory climate.
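The planning goal mentioned above, maximum test coverage with the fewest test cases, can be approximated with a simple greedy selection. The sketch below is one possible heuristic, not a method the article prescribes; the test names and requirement IDs are hypothetical, and a greedy pass is not guaranteed to find the smallest possible set.

```python
# Greedy test-case selection (a set-cover heuristic). All names are hypothetical.

def select_tests(test_cases, requirements):
    """Pick test cases until no remaining case covers a new requirement.

    test_cases maps a test name to the set of requirement IDs it exercises.
    Returns (chosen test names in pick order, requirements left uncovered).
    """
    uncovered = set(requirements)
    chosen = []
    while uncovered:
        # Sorting the names makes tie-breaking deterministic.
        best = max(sorted(test_cases), key=lambda name: len(test_cases[name] & uncovered))
        gained = test_cases[best] & uncovered
        if not gained:
            break  # nothing left in the catalog covers the remaining requirements
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered


if __name__ == "__main__":
    catalog = {
        "gateway-conformance": {"call-setup", "codec-negotiation"},
        "end-to-end-call": {"codec-negotiation", "billing-record"},
        "billing-only": {"billing-record"},
    }
    needed = {"call-setup", "codec-negotiation", "billing-record", "link-failover"}
    chosen, gaps = select_tests(catalog, needed)
    print("run:", chosen)
    print("no test covers:", sorted(gaps))
```

The list of uncovered requirements is as useful as the selected tests, since it shows where new test cases have to be written before deployment.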
A CALL FOR ACTION
Service quality and reliability are mission critical for the user
community. From fueling commerce to assuring priority services for national
security and emergency response organizations, telecommunications service is
a vital resource for users around the globe.
"The continued reliability of public networks, which are already
becoming a hybrid of packet and circuit-switched technologies, is being
closely watched by those of us responsible for providing reliable
communications for federal, state, and local agencies," says John Graves,
program director for Government Emergency Telecommunications Service (GETS).
"We're looking for confirmation from the industry that reliability will
be maintained while at the same time next-generation networks will provide
powerful enhanced capabilities for the National Security and Emergency
Users, carriers, vendors, shareholders, and regulators each have a stake
in how effectively the global telecommunications industry deploys and
interconnects emerging new network architectures. The three-step methodical
approach -- planning, analysis, and integration testing -- offers
carriers a way to step up and meet the challenges of deploying new network
architectures. Vigorous and vigilant intra-network testing by carriers and
their vendors is key to assuring intra-network reliability.
Carriers may also want to look at industry fora as viable venues to
address inter-network performance and reliability. Bona fide, multi-network,
multi-architecture, interconnection tests among emerging packet network
providers and PSTN carriers would provide the industry a means to
objectively assess reliability of interconnected networks. It may make
sense, within the applicable laws and regulations, for carriers and vendors
tackling similar interconnection and reliability challenges to work
together. There have been numerous instances where competitors have found
common ground on which they can work cooperatively, and later compete
vigorously for customers.
Ralph Parker is a senior consultant in Telcordia's Network and
Operations Integration Practice. Telcordia's Network and Operations
Integration Practice provides integration testing and consulting services to
carriers, vendors, industry groups, enterprises, and the public sector
worldwide. Ralph can be reached at [email protected].
For more information, visit Telcordia's Web site at www.telcordia.com.
Survival Tips In The Service Provider
BY FRED ENGEL
Because IT departments are overworked and understaffed, today's corporations
are increasingly relying on service providers to maintain their networks and
applications. These service providers include traditional carriers, managed
network service providers, Internet service providers, Web hosters, and
application service providers (ASPs). In each case, it is imperative that
both parties (customer and service provider) be able to measure service
delivery against objective goals for availability, responsiveness, and
overall quality of services. This is typically accomplished by defining a
Service Level Agreement (SLA). From the service provider perspective,
customer satisfaction and loyalty can hinge on meeting these expectations --
and financial penalties can result from missing them. This article describes
how SLAs should be defined, how they can be managed, and how they can be
used as a valuable planning and marketing tool.
SERVICE LEVEL MANAGEMENT
The key benefit of an SLA is that it sets realistic expectations for
both the service provider and the customer. While in some cases the SLA is a
formal agreement with financial penalties attached to non-compliance, in
other cases it may be an informal understanding. In either case, the SLA
creates a need for effective service level management. Service delivery
problems must be quickly detected and remedial actions taken. Downtime or
delay in service delivery can mean millions of dollars in lost productivity
and revenue -- for the service provider and its customers. Typical service
level management involves a continuous process of evaluating and
reevaluating business requirements, translating them into performance
metrics, monitoring the results for compliance, and taking remedial actions
COMMON PERFORMANCE METRICS
To successfully manage an SLA, objective, verifiable performance metrics
must be used. At the same time, a monitoring and reporting system must be in
place to ensure that the metrics can be measured accurately.
For example, a service provider may offer VoIP and Web hosting services
to its customers. In VoIP, the quality of the transmission is highly
dependent on frame loss, latency, and latency variation (jitter). On the
other hand, in Web hosting, users are primarily concerned about response
time, that is, the time required to download a page. To ensure a
consistently high level of service, the provider must monitor and measure
these metrics closely. The first chart shows that availability was generally
good during the report period. However, the service response was a cause for
concern. Service response is represented as a percentage of a user-defined
limit. The Response/Limit chart shows that many of the services (service to
Denver, NYC, and San Francisco in particular) had response times which
exceeded the limit. To further monitor the quality of VoIP, the jitter
between L.A. and Denver was measured. The chart shows that jitter was
minimal at less than 300 ms.
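The metrics named above can be computed from raw measurements along the following lines. This is a minimal sketch, not the reporting product's actual calculation: the function names are hypothetical, and jitter is taken here as the mean absolute difference between consecutive latency samples, which is one simple definition among several in use.

```python
# Sketch of the metrics discussed above; names and sample values are hypothetical.
from statistics import mean


def frame_loss_pct(frames_sent, frames_received):
    """Percentage of frames that never arrived."""
    if frames_sent <= 0:
        raise ValueError("frames_sent must be positive")
    return 100.0 * (frames_sent - frames_received) / frames_sent


def jitter_ms(latencies_ms):
    """Mean absolute change between consecutive latency samples."""
    if len(latencies_ms) < 2:
        return 0.0
    return mean([abs(b - a) for a, b in zip(latencies_ms, latencies_ms[1:])])


def response_pct_of_limit(response_s, limit_s):
    """Service response as a percentage of the user-defined limit.

    Values above 100 mean the service exceeded its limit.
    """
    if limit_s <= 0:
        raise ValueError("limit_s must be positive")
    return 100.0 * response_s / limit_s


if __name__ == "__main__":
    print(frame_loss_pct(1000, 990))        # loss on a VoIP stream
    print(jitter_ms([20, 30, 25]))          # latency samples in milliseconds
    print(response_pct_of_limit(3.0, 2.0))  # page download vs. its limit
```

Expressing response as a percentage of the limit, as the Response/Limit chart does, lets services with different limits share one scale.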
Since the application responses fail to meet the service level
objectives, remedial action must be taken immediately by determining the
root cause of the slowdown. This can be done by breaking down the response
time into client, server, and network. In other words, where is the
bottleneck -- is it the client, the server, or the network? If, say, the
network is the major source of delay, then the next step is to identify the
segment or the internetwork device (router or switch) which is slow. This is
usually done by examining the latency of each hop. In this example, the
link between Denver and Chicago was determined to be the bottleneck.
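The two-stage diagnosis described above, first by component and then by hop, might look like this in code. The sketch assumes cumulative round-trip times measured to each successive node along the path; the function names, the path, and every latency figure are invented for illustration.

```python
# Bottleneck isolation sketch; the path and all timings are hypothetical.

def largest_contributor(breakdown_ms):
    """Return the component (client, server, or network) with the most delay."""
    return max(breakdown_ms, key=breakdown_ms.get)


def per_hop_latency(path, cumulative_ms):
    """Turn cumulative latencies to each node into per-hop latencies.

    path lists node names in order; cumulative_ms holds the latency measured
    from the first node to each later node, so it has one entry fewer.
    """
    if len(cumulative_ms) != len(path) - 1:
        raise ValueError("need one cumulative measurement per hop")
    hops = []
    previous = 0.0
    for link, total in zip(zip(path, path[1:]), cumulative_ms):
        hops.append((link, total - previous))
        previous = total
    return hops


def slowest_hop(hops):
    """Return the (link, latency) pair with the highest latency."""
    return max(hops, key=lambda hop: hop[1])


if __name__ == "__main__":
    breakdown = {"client": 40.0, "server": 120.0, "network": 310.0}
    if largest_contributor(breakdown) == "network":
        path = ["L.A.", "Denver", "Chicago", "NYC"]
        hops = per_hop_latency(path, [12.0, 95.0, 110.0])
        print(slowest_hop(hops))
```

With these sample numbers the network dominates the response time and the Denver-to-Chicago link stands out, mirroring the example in the text.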
THE POWER OF TREND ANALYSIS
To provide service level monitoring, a database of the vital historical
performance statistics must be kept. This information is also an invaluable
planning and marketing tool which may be used to answer questions like:
- What is the growth rate of traffic in the customer's network?
- What are the time-of-day or day-of-week traffic variations?
- How consistent is the traffic volume?
- Which circuits generate the highest volume?
- Which are the overutilized and underutilized circuits?
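Several of the questions above reduce to short calculations over the historical database. The sketch below assumes the statistics are already available as plain Python lists and dictionaries; the function names and the 80 percent and 20 percent utilization thresholds are hypothetical choices, not values from the article.

```python
# Trend-analysis sketch over historical statistics; names and thresholds are assumed.
from collections import defaultdict
from statistics import mean, pstdev


def growth_rate_per_period(volumes):
    """Compound growth rate per period between the first and last sample."""
    if len(volumes) < 2 or volumes[0] <= 0:
        raise ValueError("need at least two samples and a positive starting volume")
    periods = len(volumes) - 1
    return (volumes[-1] / volumes[0]) ** (1 / periods) - 1


def mean_by_hour(samples):
    """Average volume for each hour of day from (hour, volume) pairs."""
    buckets = defaultdict(list)
    for hour, volume in samples:
        buckets[hour].append(volume)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}


def coefficient_of_variation(volumes):
    """Consistency measure: standard deviation relative to the mean."""
    return pstdev(volumes) / mean(volumes)


def classify_circuits(utilization, high=0.8, low=0.2):
    """Split circuits into over- and underutilized by assumed thresholds."""
    over = sorted(name for name, u in utilization.items() if u > high)
    under = sorted(name for name, u in utilization.items() if u < low)
    return over, under


if __name__ == "__main__":
    print(growth_rate_per_period([100, 110, 121]))
    print(mean_by_hour([(9, 10), (9, 20), (14, 5)]))
    print(coefficient_of_variation([10, 10, 10]))
    print(classify_circuits({"LA-DEN": 0.9, "DEN-CHI": 0.5, "CHI-NYC": 0.1}))
```

Ranking circuits by total volume, the remaining question in the list, is a one-line sort over the same data.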
From the service provider's perspective, the information can be used to
optimize the core network. For example, successful service level management
can help providers identify bottlenecks and effectively allocate resources
based on service trends and utilization patterns -- as well as rapidly
detect and correct changes in service quality. From the customer's
perspective, resources (e.g., bandwidth) can be deployed where they are most
needed. Thus the information originally collected for service level
monitoring can now be used by both parties to fine-tune the network to
better meet the business requirements of the customer.
An effective service level management solution is an essential tool for
today's service provider. By providing accurate performance statistics, you
can ensure that the customer's expectations can be met -- gaining critical
customer loyalty. In addition, the historical and trend analysis can be used
for proactive network management that can benefit both the service provider
and the customer. For example, understanding how and when to introduce new
managed offerings, identifying the impact of new users and technologies on
the delivery platforms, and demonstrating the highest quality of service are
just a few of the challenges a service level management solution can help
Fred Engel is senior vice president of engineering for Concord
Communications. Concord develops next-generation management software that
enables effective e-business. Concord's products ensure fast, dependable
performance and 24x7 availability of Web sites and e-business services. For
more information, visit the company's Web site at www.concord.com.