When you buy a car, you are covered in most states under a consumer
protection statute affectionately referred to as the "Lemon Law." In
essence, the premise of this law is pretty straightforward: When you buy a
new car, you are entitled to safely assume that the car is operable for a
period of time immediately after the sale has been completed.
But in the world of networking, things aren't so "cut and dried." The
inner workings of a WAN, and the service level agreement (SLA) associated
with it, tend to be a little more complicated than that sport-utility
vehicle you thought would be a fun drive. Poor performance is less defined
in networking. How slow is too slow? Is any downtime acceptable? And what is
the dollar value of one minute's worth of failed service?
The fact is, network outages happen and they can be significant. One
major carrier, for example, recently had seven percent of its ATM network
crash when one WAN switch overloaded, leaving a swath of its customer base
with slow or no service for four consecutive hours. So, with millions of
operational dollars at stake, it's not difficult to appreciate the
importance of enforceable service level agreements.
However, a service level agreement isn't worth the paper it's printed on
if the enterprise and the provider aren't on the proverbial same page in
terms of the conditions and components the agreement covers. Nothing is
enforceable unless you're speaking the same language. Ultimately, it comes
down to a matter of shared perception. The network "metrics" that will drive
the SLA must not only be defined, but available to the customer --
preferably in real time.
That Was Then, This Is Now
At one time, SLAs were practically form-letter formalities, usually put on
the table only for larger "major account" customers who purchased large
amounts of telephony services from a provider. Less significant than they
now are, early SLAs were often vague and spelled out service commitments
that were fairly obvious and applicable to the generic "network." But today's
competitive business environments have changed everything. Networking has
become more creative, and both the enterprise and the provider rely on SLAs
as tools to stay competitive.
Ultimately, the language involved in an SLA becomes important since it
defines responsibilities. Sophisticated, real-time network monitoring
relative to performance becomes meaningless if the semantics of the
agreement are not agreed upon from the very beginning. If crafted clearly,
the SLA becomes an invaluable tool for the provider, a vehicle that allows
customer expectations to be managed at the onset of the relationship.
Bringing uniformity to the terminology becomes particularly critical when
SLAs are used to drive provisioning schedules in large-scale network
implementation projects. A typical WAN installation, for example, often will
encompass network elements provided by the local access provider, the CPE
vendors, and perhaps even onsite installation services. As is often the case
with complicated build-outs, SLAs will also be signed between the various
providers, detailing project responsibilities and the various provisioning
expectations associated with each leg of the initiative.
Rules And Regulations
Whether between the provider and the customer, or the provider and its
vendors, every SLA should stick to some basic rules. For starters, the
agreement should be very specific about all entities involved. If multiple
service providers are used to deliver a solution, but are not mentioned
explicitly within the agreement, the contributing vendor relationships
should, at the very least, be made clear; and agreement on the basic
terminology is also valuable.
Service providers that rely heavily on interconnection with other
providers should be held to this rule. Many carriers that do not provide an
end-to-end service offer SLAs that only encompass their portion of the
network. If problems arise in a part of the network not technically
controlled by the provider -- and the third-party carrier is not included in
the SLA -- then service problems encountered may not be covered. Providers
that can deliver the entire solution themselves usually have
greater control of service levels, given that they need only concern
themselves with their own network elements.
Addressing Service Levels
Aside from including some contractual givens, like how long the SLA contract
will be in effect, particular attention should be paid to the differing
service levels associated with various parts of the network. This is very
important, since clarity over this point directly impacts penalties
associated with non-performance.
For example, in frame relay networks, each permanent virtual circuit
(PVC) can be evaluated for adherence to SLA metrics. These metrics can
include network availability and frame delivery performance. These granular
views of the network's underpinnings are important. An SLA that tries to
oversimplify the network's complexity without offering more granular views
of the network's components often becomes a crediting guidebook, rather than
a tool used to assess and adjust a network's performance. A sound SLA,
therefore, is one that concerns itself as much with the "health"
of a network's various parts as with the overall, comprehensive assessment
of service.
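As a rough illustration of per-PVC evaluation, the sketch below computes availability and frame delivery for each PVC and compares them with contracted targets. The field names, the sample figures, and the 99.9 and 99.5 percent targets are assumptions made for the example, not values taken from any particular carrier's SLA.

```python
from dataclasses import dataclass


@dataclass
class PvcSample:
    """Measurements for one PVC over a reporting period (hypothetical fields)."""
    name: str
    minutes_in_period: int
    minutes_down: int
    frames_offered: int
    frames_delivered: int


def availability_pct(s: PvcSample) -> float:
    """Percentage of the reporting period during which the PVC was up."""
    return 100.0 * (s.minutes_in_period - s.minutes_down) / s.minutes_in_period


def delivery_pct(s: PvcSample) -> float:
    """Percentage of offered frames that reached the far end."""
    return 100.0 * s.frames_delivered / s.frames_offered


def evaluate(samples, min_availability=99.9, min_delivery=99.5):
    """Return {pvc_name: [metrics that missed target]}; targets are illustrative."""
    report = {}
    for s in samples:
        missed = []
        if availability_pct(s) < min_availability:
            missed.append("availability")
        if delivery_pct(s) < min_delivery:
            missed.append("frame delivery")
        report[s.name] = missed
    return report


# Made-up data for a 30-day month (43,200 minutes).
samples = [
    PvcSample("NYC-CHI", 43_200, 10, 1_000_000, 999_000),
    PvcSample("NYC-LAX", 43_200, 240, 1_000_000, 990_000),
]
pvc_report = evaluate(samples)
for name, missed in pvc_report.items():
    print(name, "OK" if not missed else "missed: " + ", ".join(missed))
```

Reporting per PVC rather than as one network-wide average is the point here: a single troubled circuit stays visible instead of being averaged away.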
Once the basic language is agreed upon, and the terms of the SLA have
been defined, the SLA must be enforced through proactive performance
monitoring. At one time, carriers concerned themselves solely with
monitoring the network infrastructure. If the various components of a
network were enjoying acceptable levels of throughput and packets were
not being dropped, then the network was performing up to snuff. Today,
however, the user's experience must also be factored in.
The Proof Is In The Monitoring
Just as agreeing to the language used in the SLA is important, so, too, is a
shared understanding of how the performance of various network elements will
be monitored. By giving customers access to how their network is performing
in real time, carriers can offer their customers verifiable proof that the
conditions of an SLA are being satisfied.
When the elements of an SLA are "mechanized," the SLA becomes a living
document, one that drives network optimization. The relationship between
provider and customer is suddenly quantifiable. This is important,
particularly since customers will no longer be placated with credits for
downtime. They want to see exactly where the problem occurred, and are
concerned more with preventing a recurrence than with restitution for a breach of
the SLA.
As the focus shifts from reactive crediting to proactive adherence to the
SLA, it behooves carriers to invest in infrastructure products that cull the
statistical data from their networks. Several equipment vendors manufacture
server and software products that can capture this data. However, it's the
product suites that also analyze this data and present it via a graphical
user interface (GUI) to the provider and to their enterprise customer that
make the most sense relative to SLA enforcement.
Some software applications on the market today allow providers and their
customers to input metrics associated with a negotiated SLA at the onset of
network monitoring. Once the software knows which service levels were
contracted, it generates reports that visually chart positive or
negative deviations from those performance levels.
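The deviation reporting described above might look something like the following sketch. It is not modeled on any specific product; the metric names, contracted values, and measurements are invented for illustration. Deviations are signed so that a positive number always means "better than contracted," whether the metric is one where higher is better (availability) or lower is better (delay).

```python
# Contracted levels: metric -> (target, higher_is_better). Illustrative values only.
CONTRACT = {
    "availability_pct": (99.9, True),
    "frame_delivery_pct": (99.5, True),
    "round_trip_delay_ms": (80.0, False),
}


def deviations(contract, measured):
    """Return signed deviation per metric; positive means better than contracted."""
    result = {}
    for metric, (target, higher_is_better) in contract.items():
        delta = measured[metric] - target
        result[metric] = round(delta if higher_is_better else -delta, 3)
    return result


# Hypothetical measurements for one reporting period.
measured = {
    "availability_pct": 99.95,
    "frame_delivery_pct": 99.2,
    "round_trip_delay_ms": 72.0,
}

deviation_report = deviations(CONTRACT, measured)
for metric, delta in deviation_report.items():
    flag = "meets" if delta >= 0 else "MISSES"
    print(f"{metric:22s} {delta:+8.3f}  {flag}")
```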
Of course, the benefits of using such monitoring systems are twofold. On
the one hand there is SLA compliance, but there is also the insight
carriers and their customers gain into how the overall network is
performing. A shared understanding of what is and isn't working allows for
mutually agreed upon reconfiguration of the network to increase traffic
efficiency and control over bandwidth that, without such monitoring, would
never be discussed.
Share And Share Alike
Many of the Web-based GUIs used also allow data to be exported to any
standards-based third-party reporting application. This is significant in
that authorized partners and other vendors that would also benefit from a
real-time understanding of the network's performance can also tap into the
data, in many cases using applications that are already familiar to
them. In essence, these systems allow network performance to be interpreted
by other departments and affected outside parties, rather than by a select
group of network operators.
Sharing throughput and delay statistics with customers on a granular
level has immeasurable benefits in terms of not only the customer's
satisfaction, but also the provider's allocation of repair resources.
Problem PVCs are presented in a tabular format that is accessed via a
browser. Gaining a common visual perspective of the network, therefore, is
not reliant on deploying proprietary software at the customer's end. And
most Web-based GUIs offer high-end encryption, which satisfies customers'
security concerns.
Service level agreements that are not backed up by monitoring technology
will only invite disagreements and customer dissatisfaction. It is no longer
enough to promise a contract committed to better service. The details should
be supported by data. Educating customers on service level agreement metrics
is part of good customer service. Showing them you can deliver on what you
promise is just good business.
Martin T. McCue is senior vice president and general counsel for Global
Crossing North America. Global Crossing is a retail-level service
provider that has platform capability in Frame Relay, ATM, and IP-based
services, which the company has already extended internationally.
You thought you had it all figured out. The calculations, the contractual
"word" problems: you aced them all. There was no way that this relationship was going
to fail. However, much to your surprise, the customer's existing resources
proved quite different. The service level agreement (SLA) you arrived at
didn't take into consideration these variables. Now, there's an irate IT
manager shaking a finger in your face and calling for penalty payments.
Does it feel as though a prenuptial agreement was really what you should have
drawn up?
If you're in some way responsible for a customer network -- as an
outsourcer, service provider, or Internet telephony service provider -- this
scenario sounds familiar, no doubt. It can be disheartening, but setting
a fair network management policy is not impossible. In fact, taking
this theme a bit further, there are vendors offering software technology
that can offer an accurate up-front assessment of what you're getting into --
a network fitness assessment. By using this, you can gain insight into the
mysteries of a potential customer's network, know where you stand when
establishing an SLA or pre-nup-type agreement, all while providing the
customer and its staff with peace of mind.
Basically, this type of network assessment should provide a "1,000 foot"
view of the network, particularly addressing fault, performance, and
inventory management. It protects your organization and supports senior
managers with budgetary sign-off responsibilities. Furthermore, as an
interesting byproduct, it offers the potential for establishing a
consultancy service dealing with network audits. But the real beauty of it
all is lost if it is not boiled down to a single page -- a thumbnail sketch
easily presentable by you and digestible by the customer and its hierarchy.
In the area of inventory, such a network fitness assessment should address
the number of active and spare devices and related percentages, along with
their corresponding ports. This should cover switches, routers, and active
hosts (PCs and servers). For the performance category, the report should
provide information that addresses the average utilization of switch and
router ports.
The fault segment should minimally contain information derived from the
previous seven days' performance. This must include:
Brownout SLA graph exhibiting history for LAN and WAN ports; and
Average percentage of LAN and WAN ports experiencing problems.
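To make the figures above concrete, here is a minimal sketch of how the inventory, performance, and fault numbers could be rolled up from per-port records. The `Port` record, its fields, and the sample data are hypothetical; a real assessment tool would gather this information from the network itself.

```python
from dataclasses import dataclass


@dataclass
class Port:
    """One switch or router port (hypothetical record layout)."""
    device: str
    kind: str               # "LAN" or "WAN"
    active: bool            # False means the port is a spare
    utilization_pct: float  # average over the previous seven days
    had_problem: bool       # any fault logged in the previous seven days


def summarize(ports):
    """Roll per-port records up into single-page summary figures."""
    active = [p for p in ports if p.active]
    summary = {
        "total_ports": len(ports),
        "active_pct": 100.0 * len(active) / len(ports),
        "spare_pct": 100.0 * (len(ports) - len(active)) / len(ports),
        "avg_utilization_pct": sum(p.utilization_pct for p in active) / len(active),
    }
    for kind in ("LAN", "WAN"):
        of_kind = [p for p in active if p.kind == kind]
        bad = sum(1 for p in of_kind if p.had_problem)
        key = f"{kind.lower()}_problem_pct"
        summary[key] = 100.0 * bad / len(of_kind) if of_kind else 0.0
    return summary


# Made-up inventory: three switch ports (one spare) and one router WAN port.
ports = [
    Port("sw1", "LAN", True, 40.0, False),
    Port("sw1", "LAN", True, 20.0, True),
    Port("sw1", "LAN", False, 0.0, False),
    Port("rtr1", "WAN", True, 60.0, True),
]
fitness = summarize(ports)
for name, value in fitness.items():
    print(f"{name}: {value}")
```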
A report containing this data will provide a comprehensive view into a
customer's network, addressing all the key areas necessary for establishing
an effective SLA and more. Even so, all the information in the world would
not be useful if it weren't presented in an understandable manner. That means
effective "packaging" that not only informs and simplifies, but also sells
your services.
To start, that means the resulting report should be no more than a single
sheet of paper, employing colors and a graphical user interface suitable for
senior managers. Remember, this will also help your contact to make a case
for use of your services to their "higher-ups."
Finally, understand that implementation and use of this genus of software
tool can be quite easy. The application to generate an assessment should run
as a scheduled process during non-peak hours, such as over a weekend.
With the results presented as a report, accessed via Web server, or
distributed through FTP, all interested parties will then be able to gain a
new understanding of resources by Monday morning.
As for you -- the ITSP, outsourcer, or service provider -- you'll start
the week with a newfound ability to make sure you make the grade when it
comes to meeting the stipulations agreed upon in the SLA. And if you wish,
you've also just discovered a new profit center in the form of a potentially
lucrative network assessment consultancy.
Jeremy Tracey is president and co-founder of Entuity
Inc., a technology pioneer committed to creating solutions that manage
risk and control cost to meet the business needs of service oriented network
providers.
The Internet is an extension of today's business network. It facilitates the
simultaneous delivery of both mission-critical and unsanctioned applications.
The application-neutral nature of IP networks, the copious traffic generated by
many applications, and extreme access-speed disparity combine to form a variety
of performance problems.
Just as network infrastructure enables efficient data transfer, a new
infrastructure -- application performance infrastructure -- enables efficient
application performance. It's a distinct, systematic software/hardware framework
that overlays the network-plumbing layer and moves organizations from
connectivity to productivity.
An application performance infrastructure provides a structure for
application service-level agreements, provisioning, quality of service,
performance analysis, performance enforcement, application-based billing, and
differentiated services -- all of which are essential functions for inspiring
subscriber confidence and forging long-term outsourcing relationships.
Application Service-Level Agreements
Application Service Level Agreements (ASLAs) are precise, per-application,
measurable agreements specifying the nature and quality of deliverables. They
form the foundation of the contract between application providers and
subscribers. For example, an ASLA might state that at least 96 percent of the
transactions for a Great Plains application delivered with Citrix MetaFrame will
complete within 1.5 seconds.
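An ASLA phrased this way is straightforward to check mechanically. The sketch below, with invented sample data and a hypothetical function name, tests whether a set of measured transaction times satisfies the "96 percent within 1.5 seconds" example.

```python
def asla_compliance(response_times_s, threshold_s=1.5, required_pct=96.0):
    """Return (percent of transactions within threshold, whether the ASLA is met)."""
    if not response_times_s:
        raise ValueError("no transactions measured")
    within = sum(1 for t in response_times_s if t <= threshold_s)
    pct = 100.0 * within / len(response_times_s)
    return pct, pct >= required_pct


# Made-up samples: 100 transactions per day, times in seconds.
good_day = [0.8] * 97 + [2.4] * 3
bad_day = [0.8] * 95 + [2.0] * 5

print(asla_compliance(good_day))  # (97.0, True)
print(asla_compliance(bad_day))   # (95.0, False)
```

Note that a percentile-style target like this says nothing about how slow the remaining 4 percent may be; an agreement that cares about worst-case delay would need a separate clause.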
Performance Analysis And Validation
It is in the provider's best interests to offer the subscriber confirmation and
validation of deliverables to build trust and avoid conflict. Both the provider
and the subscriber need to validate deliverables, although in different ways.
The subscriber wants intuitive confirmation that they received what they paid
for. The provider needs more comprehensive analysis including detailed
per-application response-time metrics describing delay portions attributed to
the server, the provider's network, and the subscriber's network.
Enforcement
Providers need to make sure they can and do meet their performance obligations
in ASLAs. The key to enforcing performance is to differentiate each application
needing special treatment and then to appropriately and precisely assign
resources, such as bandwidth, that are in contention. Flexible allocation should
allow per-application and per-session minimums and maximums. These two tasks
should be done automatically using a rule-based system that is configured once
in advance and maintained only as needed to reflect business policies.
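One way to picture rule-based allocation with minimums and maximums is the sketch below. It is an assumed scheme, not a description of any vendor's algorithm: each application first receives its guaranteed minimum, and leftover capacity is then shared equally among applications that still want more, never exceeding their maximum or their demand. The application names, rates, and link size are invented, the example works per application only (per-session rules would follow the same pattern), and it assumes the minimums together fit within the link.

```python
def allocate(capacity_kbps, rules, demand):
    """rules: app -> (min_kbps, max_kbps); demand: app -> kbps currently wanted.

    Assumes the sum of all minimums does not exceed capacity_kbps.
    """
    # Step 1: guarantee each application its minimum (or its demand, if lower).
    alloc = {app: min(demand.get(app, 0), rules[app][0]) for app in rules}
    remaining = capacity_kbps - sum(alloc.values())

    # Step 2: share what is left equally among applications still below
    # their ceiling, where ceiling = min(demand, configured maximum).
    while remaining > 1e-9:
        hungry = [
            app for app in rules
            if alloc[app] < min(demand.get(app, 0), rules[app][1])
        ]
        if not hungry:
            break
        share = remaining / len(hungry)
        for app in hungry:
            ceiling = min(demand.get(app, 0), rules[app][1])
            grant = min(share, ceiling - alloc[app])
            alloc[app] += grant
            remaining -= grant
    return alloc


# Hypothetical policy for a 1,536 kbps access link.
rules = {"erp": (512, 1024), "voip": (256, 384), "web": (0, 1536)}
demand = {"erp": 900, "voip": 300, "web": 2000}

allocation = allocate(1536, rules, demand)
print(allocation)
```

With these figures the ERP and voice traffic are protected by their minimums, voice is satisfied in full, and unsanctioned Web traffic absorbs only what the business applications leave over.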
Acceleration
As an increasing portion of corporate revenue is driven by online performance,
providers of managed services have a compelling opportunity to offer
acceleration as a value-added service. Techniques to reduce Web waiting
times include content compression, adapting content to suit real-time connection
speeds, reordering Web objects to suit the rendering habits of each user's
browser, and caching.
Application-Based Billing
Providers needn't be locked into flat or inflexible rate structures. Providing
application-based billing allows providers to scale charges to fit costs, match
charges to the value of a service to the subscriber, and link response-time
guarantees to charges. Per-subscriber and per-application metrics about usage,
availability, and performance can feed billing software. Subscribers' concerns
about getting what they paid for are alleviated when they know they'll pay for
what they got.
The Application Performance Infrastructure Advantage
A sophisticated application performance infrastructure defines, analyzes, and
enforces application service levels. It provides the functions described above
and unifies previously standalone components into an integrated whole. The
result? An all-in-one solution that enables providers to rapidly and
cost-effectively deliver an expansive portfolio of application services.
Todd Krautkremer is vice president of worldwide marketing at Packeteer,
Inc. Packeteer's application performance infrastructure systems give
enterprises and service providers a new layer of control for applications and
services delivered across intranets, extranets, and the Internet.
In this new world of accountability that is spurring the widespread adoption
of network Service Level Agreements (SLAs), participants simply must have the
facts. Making the SLA process work effectively requires providers and users
alike to have access to the statistics that back up these agreements. Once the
providers and the customers have the details, both parties can do analyses to
make sure the service packages that they have negotiated are being delivered
accurately, and are right for their businesses. But how do they collect this
information to analyze?
The problem with gathering these statistics is that no one can afford to slow
down the flow of traffic to collect them, especially if the SLAs specify support
for applications requiring Quality of Service (QoS). As fate would have it,
customers are now demanding SLAs with application-level QoS, and robust security
to boot. The need for these capabilities is most pronounced in VoIP, where end
users are acutely aware of any degradation in service, and they expect their
conversations to always be kept completely confidential.
The foundation for enforceable SLAs was laid in the first generation of edge
switches, which collect some basic, but very important statistics about traffic.
They can track the number of packets in a flow at the beginning of the journey
through the network, and they can effectively keep track of the source and
destination of the interaction. Unfortunately, that is where most of their value
ends.
With customers now demanding fine-grained QoS, it is necessary to specify
some very precise tolerances for key applications like video conferencing and
VoIP, and to predefine criteria for handling out-of-conformance packets. But, as
difficult as it may sound, detailed statistics on all this can be collected at
optical speed, even with encryption protection, by the new generation of IP
service edge switch coming to market. Beyond offering flexible network-based
virtual private network (VPN) services with QoS and security, this new class
of devices provides a dramatic improvement in detailed reporting capability.
The next-generation edge switch has been designed from the start with
detailed statistics collection in mind. This is a difficult prospect, and to do
it right, the switch must be developed with the instrumentation to include this
capability. This functionality can't be added later. Custom ASICs that perform
at optical speeds are a proven way to collect information at high rates, without
impacting performance, and therefore an excellent solution to this problem.
With the designed-in ability to collect the detailed level of statistical
information needed to support finely drawn SLAs, all parties are provided with a
great deal of flexibility in defining those SLAs.
Statistics are required on all criteria of the service definition, including
each of the various ways in which the traffic may be classified. The following
traffic classification methods must all be tracked for effective management of
advanced SLAs:
Source/Destination IP Address
Previous/Next Hop Router
Physical Port
Logical Interface (VC, Tunnel, etc.)
DiffServ Code Point
TCP/UDP Port (Layer 4)
Stateful Analysis (Layer 7)
Additionally, it is necessary to track the QoS parameters defined in these
SLAs.
The switch should be able to determine the number of packets in conformance
with the SLA as well as the number of packets out of conformance. However, it is
not enough to track packet conformance; it is also necessary to enforce a series
of options for what to do with the traffic in either case.
For example, how should the system handle non-conforming traffic? The formula
for non-conforming traffic needs to be defined in the SLA, and must be enforced
in the switching device for both ingress and egress services. The options for a
device that enables truly enforceable SLAs should include: queuing the traffic,
decreasing the QoS associated with that traffic, dropping the traffic
altogether, marking it for later action, or simply billing that traffic at a
higher rate. The latest style of service edge switch can even keep track of
encryption, network address translation (NAT) and managed firewall services, for
monitoring and billing purposes.
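The conformance counting and out-of-conformance handling described above can be sketched in software, although a real edge switch would do this in hardware. The example below uses a token bucket as the conformance test, which is an assumption on our part since the article leaves the formula to the SLA. The class names, the rate and burst figures, and the packet trace are likewise invented for illustration.

```python
from collections import Counter
from enum import Enum


class Action(Enum):
    """Options for non-conforming traffic, as listed in the text."""
    QUEUE = "queue"
    DOWNGRADE = "downgrade QoS"
    DROP = "drop"
    MARK = "mark for later action"
    BILL_PREMIUM = "bill at higher rate"


class Policer:
    """Token-bucket conformance check for one traffic class (illustrative)."""

    def __init__(self, rate_bytes_per_s, burst_bytes, on_violation):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = float(burst_bytes)  # bucket starts full
        self.last_s = 0.0
        self.on_violation = on_violation
        self.stats = Counter()

    def handle(self, now_s, size_bytes):
        """Classify one packet and return what should happen to it."""
        # Refill the bucket for the time elapsed, capped at the burst size.
        elapsed = now_s - self.last_s
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last_s = now_s

        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            self.stats["conforming"] += 1
            return "forward"
        self.stats["non_conforming"] += 1
        return self.on_violation.value


# Hypothetical class: 1,000 bytes/s committed rate, 1,500-byte burst, mark on violation.
policer = Policer(1000, 1500, Action.MARK)
trace = [(0.0, 1000), (0.1, 1000), (0.2, 500), (1.0, 800)]  # (time in s, bytes)
results = [policer.handle(t, size) for t, size in trace]

print(results)
print(dict(policer.stats))
```

Keeping one such counter set per traffic class (for example, per DiffServ code point or per logical interface) would yield the per-classification statistics the SLA calls for.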
In the end, this level of detail should help service providers better meet
the objectives set forth in the SLAs, and give the customer peace of mind that
they are getting what they bargained for.
Tim Hale is director of product marketing for Quarry
Technologies, Inc. Quarry is an innovative, privately held developer of
service-enabling systems for the telecommunications market.