
Feature Article
June 2001


Meaningful SLAs: Network Monitoring Replaces Fluff With Facts



When you buy a car, you are covered in most states under a consumer protection statute affectionately referred to as the "Lemon Law." In essence, the premise of this law is pretty straightforward: When you buy a new car, you are entitled to safely assume that the car is operable for a period of time immediately after the sale has been completed.

But in the world of networking, things aren't so "cut and dried." The inner workings of a WAN, and the service level agreement (SLA) associated with it, tend to be a little more complicated than that sport-utility vehicle you thought would be a fun drive. Poor performance is less defined in networking. How slow is too slow? Is any downtime acceptable? And what is the dollar value of one minute's worth of failed service?

The fact is, network outages happen and they can be significant. One major carrier, for example, recently had seven percent of its ATM network crash when one WAN switch overloaded, leaving a swath of its customer base with slow or no service for four consecutive hours. So, with millions of operational dollars at stake, it's not difficult to appreciate the importance of enforceable service level agreements.

However, a service level agreement isn't worth the paper it's printed on if the enterprise and the provider aren't on the proverbial same page in terms of the conditions and components the agreement covers. Nothing is enforceable unless you're speaking the same language. Ultimately, it comes down to a matter of shared perception. The network "metrics" that will drive the SLA must not only be defined, but available to the customer -- preferably in real time.

That Was Then, This Is Now
At one time, SLAs were practically form-letter formalities, usually put on the table only for larger "major account" customers who purchased large amounts of telephony services from a provider. Less significant than they now are, early SLAs were often vague and spelled out service commitments that were fairly obvious and applicable to the generic "network." But today's competitive business environments have changed everything. Networking has become more creative, and both the enterprise and the provider rely on SLAs as tools to stay competitive.

Ultimately, the language of an SLA matters because it defines responsibilities. Sophisticated, real-time performance monitoring becomes meaningless if the semantics of the agreement are not agreed upon from the very beginning. If crafted clearly, the SLA becomes an invaluable tool for the provider, a vehicle that allows customer expectations to be managed at the outset of the relationship.

Bringing uniformity to the terminology becomes particularly critical when SLAs are used to drive provisioning schedules in large-scale network implementation projects. A typical WAN installation, for example, often will encompass network elements provided by the local access provider, the CPE vendors, and perhaps even onsite installation services. As is often the case with complicated build-outs, SLAs will also be signed between the various providers, detailing project responsibilities and the various provisioning expectations associated with each leg of the initiative.

Rules And Regulations
Whether between the provider and the customer, or the provider and utilized vendors, every SLA should stick to some basic rules. For starters, the agreement should be very specific about all entities involved. If multiple service providers are used to deliver a solution, but are not mentioned explicitly within the agreement, the contributing vendor relationships should, at the very least, be made clear; and agreement on the basic terminology is also valuable.

Service providers that rely heavily on interconnection with other providers should be held to this rule. Many carriers that do not provide an end-to-end service offer SLAs that only encompass their portion of the network. If problems arise in a part of the network not technically controlled by the provider -- and the third-party carrier is not included in the SLA -- then service problems encountered may not be covered. Providers that can deliver the entire solution themselves usually have greater control of service levels, given that they need only concern themselves with their own network elements.

Addressing Service Levels
Aside from including some contractual givens, like how long the SLA contract will be in effect, particular attention should be paid to the differing service levels associated with various parts of the network. This is very important, since clarity on this point directly affects the penalties associated with non-performance.

For example, in frame relay networks, each permanent virtual circuit (PVC) can be evaluated for adherence to SLA metrics. These metrics can include network availability and frame delivery performance. These granular views of the network's underpinnings are important. An SLA that tries to oversimplify the network's complexity without offering more granular views of the network's components often becomes a crediting guidebook, rather than a tool used to assess and adjust a network's performance. A sound SLA, therefore, concerns itself as much with the "health" of a network's various parts as with the overall, comprehensive assessment of service.
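As a rough illustration of per-PVC evaluation, the sketch below computes availability and frame delivery for each circuit and lists the ones that miss their targets. The record layout, the PVC names, and the 99.9 and 99.5 percent targets are assumptions made for the example; an actual SLA would supply its own figures.

```python
from dataclasses import dataclass


@dataclass
class PvcStats:
    """Counters for one PVC over a reporting period (hypothetical layout)."""
    name: str
    minutes_in_period: int
    minutes_down: int
    frames_offered: int
    frames_delivered: int

    def availability_pct(self) -> float:
        up = self.minutes_in_period - self.minutes_down
        return 100.0 * up / self.minutes_in_period

    def frame_delivery_pct(self) -> float:
        return 100.0 * self.frames_delivered / self.frames_offered


def pvcs_out_of_sla(pvcs, min_availability=99.9, min_delivery=99.5):
    """Return the names of PVCs that miss either (assumed) target."""
    return [p.name for p in pvcs
            if p.availability_pct() < min_availability
            or p.frame_delivery_pct() < min_delivery]


# Illustrative 30-day period (43,200 minutes); all numbers are made up.
pvcs = [
    PvcStats("NYC-CHI", 43200, 10, 1_000_000, 999_000),
    PvcStats("NYC-LAX", 43200, 120, 1_000_000, 999_900),
    PvcStats("CHI-LAX", 43200, 0, 1_000_000, 990_000),
]
print(pvcs_out_of_sla(pvcs))  # ['NYC-LAX', 'CHI-LAX']
```

Here NYC-LAX fails on availability and CHI-LAX on frame delivery, which is the kind of per-circuit detail a single network-wide average would hide.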

Once the basic language is agreed upon, and the terms of the SLA have been defined, the SLA must be enforced through proactive performance monitoring. At one time, carriers concerned themselves solely with monitoring the network infrastructure. If the various components of a network were enjoying acceptable levels of throughput and packets were not being dropped, then the network was performing up to snuff. Today, however, the user's experience must also be factored in.

The Proof Is In The Monitoring
Just as agreeing on the language used in the SLA is important, so, too, is a shared understanding of how the performance of various network elements will be monitored. By giving customers a real-time view of how their network is performing, carriers can offer verifiable proof that the conditions of an SLA are being satisfied.

When the elements of an SLA are "mechanized," the SLA becomes a living document, one that drives network optimization. The relationship between provider and customer is suddenly quantifiable. This is important, particularly since customers will no longer be placated with credits for downtime. They want to see exactly where the problem occurred, concerned more with preventing a recurrence than with restitution for a breach of the SLA.

As the focus shifts from reactive crediting to proactive adherence to the SLA, it behooves carriers to invest in infrastructure products that cull the statistical data from their networks. Several equipment vendors manufacture server and software products that can capture this data. However, it's the product suites that also analyze this data and present it via a graphical user interface (GUI) to the provider and to their enterprise customer that make the most sense relative to SLA enforcement.

Some software applications on the market today allow providers and their customers to input metrics associated with a negotiated SLA at the outset of network monitoring. Once the software knows which service levels are the adherence standard, it generates reports that visually chart positive or negative deviations from the contracted performance levels.
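The deviation reporting described here can be pictured with a short sketch. It is not modeled on any particular product; the metric names and values are hypothetical. Each measured value is compared with its contracted level, and the sign is normalized so that a positive number always means "better than contract."

```python
def sla_deviations(contracted, measured, higher_is_better):
    """Return per-metric deviation; positive means better than contracted."""
    report = {}
    for metric, target in contracted.items():
        delta = measured[metric] - target
        if not higher_is_better[metric]:
            delta = -delta  # for delay-style metrics, lower is better
        report[metric] = round(delta, 3)
    return report


# Hypothetical contracted and measured levels for one reporting period.
contracted = {"availability_pct": 99.9, "round_trip_delay_ms": 80.0}
measured = {"availability_pct": 99.95, "round_trip_delay_ms": 95.0}
higher_is_better = {"availability_pct": True, "round_trip_delay_ms": False}

print(sla_deviations(contracted, measured, higher_is_better))
# {'availability_pct': 0.05, 'round_trip_delay_ms': -15.0}
```

A reporting tool would chart these numbers over time; the sketch stops at the arithmetic.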

Of course, the benefits of using such monitoring systems are twofold. On the one hand there is the SLA compliance, but there is also the insight carriers and their customers gain into how the overall network is performing. A shared understanding of what is and isn't working allows for a mutually agreed-upon reconfiguration of the network, increasing traffic efficiency and control over bandwidth, that would never be discussed without such monitoring.

Share And Share Alike
Many of the Web-based GUIs used also allow data to be exported to any standards-based third-party reporting application. This is significant in that authorized partners and other vendors that would also benefit from a real-time understanding of the network's performance can also tap into the data, and in many cases using applications that are already familiar to them. In essence, these systems allow network performance to be interpreted by other departments and affected outside parties, rather than by a select group of network operators.

Sharing throughput and delay statistics with customers on a granular level has immeasurable benefits in terms of not only the customer's satisfaction, but also the provider's allocation of repair resources. Problem PVCs are presented in a tabular format that is accessed via a browser. Gaining a common visual perspective of the network, therefore, is not reliant on deploying proprietary software at the customer's end. And most Web-based GUIs offer high-end encryption, which satisfies customers' security concerns.

Service level agreements that are not backed up by monitoring technology will only invite disagreements and customer dissatisfaction. It is no longer enough to promise a contract committed to better service. The details should be supported by data. Educating customers on service level agreement metrics is part of good customer service. Showing them you can deliver on what you promise is just good business. 

Martin T. McCue is senior vice president and general counsel for Global Crossing North America. Global Crossing is a retail-level service provider that has platform capability in Frame Relay, ATM, and IP-based services, which the company has already extended internationally.


Network Fitness Assessment: Making The SLA Grade


You thought you had it all figured out. The calculations, the contractual "word" problems: you aced them all, and there was no way that this relationship was going to fail. However, much to your surprise, the customer's existing resources proved quite different from what you had assumed. The service level agreement (SLA) you arrived at didn't take these variables into consideration. Now, there's an irate IT manager shaking a finger in your face and calling for penalty payments.

Does it feel as though a prenuptial agreement was really what you should have drawn up?

If you're in some way responsible for a customer network -- as an outsourcer, service provider, or Internet telephony service provider -- this scenario no doubt sounds familiar. It can be disheartening, but setting a fair network management policy is not impossible. In fact, taking this theme a bit further, there are vendors whose software can provide an accurate up-front assessment of what you're getting into -- a network fitness assessment. By using this, you can gain insight into the mysteries of a potential customer's network and know where you stand when establishing an SLA or pre-nup-type agreement, all while providing the customer and its staff with peace of mind.

Basically, this type of network assessment should provide a "1,000 foot" view of the network, particularly addressing fault, performance, and inventory management. It protects your organization and supports senior managers with budgetary sign-off responsibilities. Furthermore, as an interesting byproduct, it offers the potential for establishing a consultancy service dealing with network audits. But the real beauty of it all is lost if it is not boiled down to a single page -- a thumbnail sketch easily presentable by you and digestible by the customer and its hierarchy.

In the area of inventory, such a network fitness assessment should address the number of active and spare devices and related percentages, along with their corresponding ports. This should cover switches, routers, and active hosts (PCs and servers). For the performance category, the report should provide information that addresses the average utilization of switch and router ports.

The fault segment should minimally contain information derived from the previous seven days' performance. This must include:

  • Brownout SLA graph exhibiting history for LAN and WAN ports; and
  • Average percentage of LAN and WAN ports experiencing problems.
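The second figure in the list above is simple arithmetic once daily counts exist. The sketch below assumes the assessment tool records, for each of the previous seven days, how many LAN and WAN ports showed problems out of the total; the sample counts are invented for illustration.

```python
def average_problem_pct(daily_counts):
    """Average daily percentage of ports with problems.

    daily_counts: list of (ports_with_problems, total_ports), one per day.
    """
    daily_pcts = [100.0 * bad / total for bad, total in daily_counts]
    return sum(daily_pcts) / len(daily_pcts)


# Hypothetical counts for the previous seven days.
week = {
    "LAN": [(4, 200), (6, 200), (2, 200), (0, 200), (0, 200), (8, 200), (8, 200)],
    "WAN": [(1, 20), (0, 20), (0, 20), (2, 20), (1, 20), (0, 20), (3, 20)],
}

summary = {kind: round(average_problem_pct(days), 2) for kind, days in week.items()}
print(summary)  # {'LAN': 2.0, 'WAN': 5.0}
```

Two numbers like these fit comfortably on a one-page report for senior managers.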

A report containing this data will provide a comprehensive view into a customer's network, addressing all the key areas necessary for establishing an effective SLA and more. Even so, all the information in the world would not be useful if it wasn't presented in an understandable manner. That means effective "packaging" that not only informs and simplifies, but also sells your services.

To start, that means the resulting report should be no more than a single sheet of paper, employing colors and a graphical user interface suitable for senior managers. Remember, this will also help your contact to make a case for use of your services to their "higher-ups."

Finally, understand that implementation and use of this class of software tool can be quite easy. The application that generates an assessment should run as a scheduled process during non-peak hours, such as over a weekend. Whether the result is presented as a report, accessed via a Web server, or distributed through FTP, all interested parties will then be able to gain a new understanding of resources by Monday morning.

As for you -- the ITSP, outsourcer, or service provider -- you'll start the week with a newfound ability to make sure you make the grade when it comes to meeting the stipulations agreed upon in the SLA. And if you wish, you've also just discovered a new profit center in the form of a potentially lucrative network assessment consultancy.

Jeremy Tracey is president and co-founder of Entuity Inc., a technology pioneer committed to creating solutions that manage risk and control cost to meet the business needs of service oriented network providers.


From Connectivity To Productivity


The Internet is an extension of today's business network. It facilitates the simultaneous delivery of both mission-critical and unsanctioned applications. The application-neutral nature of IP networks, the copious traffic generated by many applications, and extreme access-speed disparity combine to form a variety of performance problems.

Just as network infrastructure enables efficient data transfer, a new infrastructure -- application performance infrastructure -- enables efficient application performance. It's a distinct, systematic software/hardware framework that overlays the network-plumbing layer and moves organizations from connectivity to productivity.

An application performance infrastructure provides a structure for application service-level agreements, provisioning, quality of service, performance analysis, performance enforcement, application-based billing, and differentiated services -- all of which are essential functions for inspiring subscriber confidence and forging long-term outsourcing relationships.

Application Service-Level Agreements
Application Service Level Agreements (ASLAs) are precise, per-application, measurable agreements specifying the nature and quality of deliverables. They form the foundation of the contract between application providers and subscribers. For example, an ASLA might state that at least 96 percent of the transactions for a Great Plains application delivered with Citrix MetaFrame will complete within 1.5 seconds.
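To make the example concrete, here is a minimal check of such a clause against a set of measured transaction times. The 1.5-second threshold and 96 percent requirement come from the example above; the sample data and function name are hypothetical.

```python
def asla_met(response_times_s, threshold_s=1.5, required_pct=96.0):
    """Return (percentage within threshold, whether the ASLA is met)."""
    within = sum(1 for t in response_times_s if t <= threshold_s)
    pct = 100.0 * within / len(response_times_s)
    return pct, pct >= required_pct


# 100 made-up transactions: 97 fast ones and 3 slow ones.
good_period = [0.8] * 97 + [2.4] * 3
# Another period with 5 slow transactions out of 100.
bad_period = [0.8] * 95 + [2.4] * 5

print(asla_met(good_period))  # (97.0, True)
print(asla_met(bad_period))   # (95.0, False)
```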

Performance Analysis And Validation
It is in the provider's best interests to offer the subscriber confirmation and validation of deliverables to build trust and avoid conflict. Both the provider and the subscriber need to validate deliverables, although in different ways. The subscriber wants intuitive confirmation that they received what they paid for. The provider needs more comprehensive analysis including detailed per-application response-time metrics describing delay portions attributed to the server, the provider's network, and the subscriber's network.

Providers need to make sure they can and do meet their performance obligations in ASLAs. The key to enforcing performance is to differentiate each application needing special treatment and then to appropriately and precisely assign resources, such as bandwidth, that are in contention. Flexible allocation should allow per-application and per-session minimums and maximums. These two tasks should be done automatically using a rule-based system that is configured once in advance and maintained only as needed to reflect business policies.
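One way to picture rule-based allocation with per-application minimums and maximums is the sketch below. It is an illustrative scheme, not a description of any vendor's algorithm: each application first receives its guaranteed minimum (or its demand, if lower), and leftover capacity is then handed out in priority order up to each application's maximum. The application names, rates, and link size are assumptions, and per-session limits are left out for brevity.

```python
def allocate(capacity_kbps, rules, demand_kbps):
    """Allocate bandwidth per application.

    rules: {app: (min_kbps, max_kbps)}, listed in priority order.
    demand_kbps: {app: bandwidth currently requested}.
    """
    # Step 1: guarantee each application the lesser of its demand and its minimum.
    alloc = {app: min(demand_kbps.get(app, 0), lo) for app, (lo, hi) in rules.items()}
    remaining = capacity_kbps - sum(alloc.values())
    if remaining < 0:
        raise ValueError("guaranteed minimums exceed link capacity")
    # Step 2: share what is left in priority order, capped by demand and maximum.
    for app, (lo, hi) in rules.items():
        want = min(demand_kbps.get(app, 0), hi) - alloc[app]
        extra = max(0, min(want, remaining))
        alloc[app] += extra
        remaining -= extra
    return alloc


# Hypothetical policy for a 1,536 kbps link.
rules = {"voip": (256, 512), "erp": (512, 1024), "web": (0, 1536)}
demand = {"voip": 300, "erp": 900, "web": 2000}

alloc = allocate(1536, rules, demand)
print(alloc)  # {'voip': 300, 'erp': 900, 'web': 336}
```

The rules are configured once; only the demand figures change from moment to moment, which matches the "configure in advance, maintain as needed" approach described above.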

As an increasing portion of corporate revenue is driven by online performance, providers of managed services have a compelling opportunity to offer acceleration as a value-added service. Techniques to reduce Web waiting times include content compression, adapting content to suit real-time connection speeds, re-ordering Web objects for each user's particular browser's rendering habits, and caching.

Application-Based Billing
Providers needn't be locked into flat or inflexible rate structures. Providing application-based billing allows providers to scale charges to fit costs, match charges to the value of a service to the subscriber, and link response-time guarantees to charges. Per-subscriber and per-application metrics about usage, availability, and performance can feed billing software. Subscribers' concerns about getting what they paid for are alleviated when they know they'll pay for what they got.
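A small sketch shows how per-subscriber, per-application metrics might feed a bill that links charges to response-time guarantees. The per-megabyte rates, the 10 percent credit for a missed guarantee, and the record layout are all invented for the example.

```python
from collections import defaultdict

# Hypothetical rate card (currency units per MB) and credit for a missed guarantee.
RATES_PER_MB = {"erp": 0.05, "web": 0.01}
CREDIT_PCT = 10.0


def invoice(records):
    """Sum charges per subscriber.

    records: iterable of (subscriber, application, usage_mb, guarantee_met).
    """
    totals = defaultdict(float)
    for subscriber, app, usage_mb, guarantee_met in records:
        charge = usage_mb * RATES_PER_MB[app]
        if not guarantee_met:
            charge *= 1 - CREDIT_PCT / 100.0  # discount when the guarantee was missed
        totals[subscriber] += charge
    return {s: round(t, 2) for s, t in totals.items()}


records = [
    ("acme", "erp", 1000, True),
    ("acme", "web", 5000, False),   # response-time guarantee missed this period
    ("globex", "erp", 200, True),
]
print(invoice(records))  # {'acme': 95.0, 'globex': 10.0}
```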

The Application Performance Infrastructure Advantage
A sophisticated application performance infrastructure defines, analyzes, and enforces application service levels. It provides the functions described above and unifies previously standalone components into an integrated whole. The result? An all-in-one solution that enables providers to rapidly and cost-effectively deliver an expansive portfolio of application services.

Todd Krautkremer is vice president of worldwide marketing at Packeteer, Inc. Packeteer's application performance infrastructure systems give enterprises and service providers a new layer of control for applications and services delivered across intranets, extranets, and the Internet.


Collecting The Stats To Track Today's SLAs


In this new world of accountability that is spurring the widespread adoption of network Service Level Agreements (SLAs), participants simply must have the facts. Making the SLA process work effectively requires providers and users alike to have access to the statistics that back up these agreements. Once the providers and the customers have the details, both parties can do analyses to make sure the service packages that they have negotiated are being delivered accurately, and are right for their businesses. But how do they collect this information to analyze?

The problem with gathering these statistics is that no one can afford to slow down the flow of traffic to collect them, especially if the SLAs specify support for applications requiring Quality of Service (QoS). As fate would have it, customers are now demanding SLAs with application-level QoS, and robust security to boot. The need for these capabilities is most pronounced in VoIP where end users are acutely aware of any degradation in service, and they expect their conversations to always be kept completely confidential.

The foundation for enforceable SLAs was laid in the first generation of edge switches, which collect some basic, but very important statistics about traffic. They can track the number of packets in a flow at the beginning of the journey through the network, and they can effectively keep track of the source and destination of the interaction. Unfortunately that is where most of their value ends.

With customers now demanding fine-grained QoS, it is necessary to specify some very precise tolerances for key applications like video conferencing and VoIP, and to predefine criteria for handling out-of-conformance packets. But, as difficult as it may sound, detailed statistics on all this can be collected at optical speed, even with encryption protection, by the new generation of IP service edge switches coming to market. In addition to flexible network-based virtual private network (VPN) services with QoS and security, this new class of devices provides dramatically more detailed reporting.

The next-generation edge switch has been designed from the start with detailed statistics collection in mind. This is a difficult prospect, and to do it right, the switch must be developed with the instrumentation to include this capability. This functionality can't be added later. Custom ASICs that perform at optical speeds are a proven way to collect information at high rates without impacting performance, and are therefore an excellent solution to this problem.

With the designed-in ability to collect the detailed level of statistical information needed to support finely drawn SLAs, all parties are provided with a great deal of flexibility in defining those SLAs.

Statistics are required on all criteria of the service definition, including each of the various ways in which the traffic may be classified. The following traffic classification methods must all be tracked for effective management of advanced SLAs:

  • Source/Destination IP Address
  • Previous/Next Hop Router
  • Physical Port
  • Logical Interface (VC, Tunnel, etc.)
  • DiffServ Code Point
  • TCP/UDP Port (Layer 4)
  • Stateful Analysis (Layer 7)

Additionally, it is necessary to track the QoS parameters defined in these SLAs, including:

  • Service Classes
  • Priority Levels
  • Peak/Committed/Minimum Information Rates
  • Probabilistic Drops/Queue Drops/Delay Bound Discards
  • Traffic Shaping Rates

The switch should be able to determine the number of packets in conformance with the SLA as well as the number of packets out of conformance. However, it is not enough to track packet conformance; it is also necessary to enforce a series of options for what to do with the traffic in either case.
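The article does not say how a switch decides conformance. One common technique for metering traffic against a committed information rate is a token bucket, sketched here purely as an assumption. The class name, the 64 kbps rate, and the 3,000-byte burst allowance are illustrative.

```python
class ConformanceMeter:
    """Single-rate token bucket: a packet conforms if enough byte credit remains."""

    def __init__(self, cir_bps, burst_bytes):
        self.rate_bytes_per_s = cir_bps / 8.0   # committed rate in bytes per second
        self.burst_bytes = burst_bytes          # bucket depth
        self.tokens = float(burst_bytes)        # start with a full bucket
        self.last_time = 0.0
        self.conforming = 0
        self.non_conforming = 0

    def observe(self, arrival_time_s, size_bytes):
        """Record one packet and return True if it conforms."""
        elapsed = arrival_time_s - self.last_time
        self.last_time = arrival_time_s
        # Refill credit for the time elapsed, never beyond the bucket depth.
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            self.conforming += 1
            return True
        self.non_conforming += 1
        return False


meter = ConformanceMeter(cir_bps=64_000, burst_bytes=3_000)
# Three back-to-back 1,500-byte packets, then one more half a second later.
for t in (0.0, 0.0, 0.0, 0.5):
    meter.observe(t, 1500)
print(meter.conforming, meter.non_conforming)  # 3 1
```

The third packet arrives before any credit has been replenished and is counted as out of conformance; by the fourth, the bucket has refilled.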

For example, how should the system handle non-conforming traffic? The formula for non-conforming traffic needs to be defined in the SLA, and must be enforced in the switching device for both ingress and egress services. The options for a device that enables truly enforceable SLAs should include: queuing the traffic, decreasing the QoS associated with that traffic, dropping the traffic altogether, marking it for later action, or simply billing that traffic at a higher rate. The latest-style service edge switch can even keep track of encryption, network address translation (NAT), and managed firewall services for monitoring and billing purposes.
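The five options listed above can be expressed as a simple policy table. The sketch below is only a model of the decision, with hypothetical service-class names and packet fields; a real switch would make this choice in hardware.

```python
from enum import Enum


class Action(Enum):
    QUEUE = "queue"
    DOWNGRADE = "downgrade"
    DROP = "drop"
    MARK = "mark"
    BILL_PREMIUM = "bill_premium"


# Hypothetical per-service-class policy, as it might be agreed in the SLA.
POLICY = {
    "voice": Action.DROP,
    "video": Action.DOWNGRADE,
    "bulk": Action.BILL_PREMIUM,
}


def handle_non_conforming(packet, policy=POLICY, default=Action.MARK):
    """Return (action, updated packet or None) for an out-of-conformance packet."""
    action = policy.get(packet["service_class"], default)
    if action is Action.DROP:
        return action, None
    updated = dict(packet)  # leave the caller's record untouched
    if action is Action.DOWNGRADE:
        updated["dscp"] = 0                  # fall back to best effort
    elif action is Action.MARK:
        updated["marked"] = True             # flag for later action
    elif action is Action.BILL_PREMIUM:
        updated["billing_rate"] = "premium"  # carry it, but at a higher rate
    elif action is Action.QUEUE:
        updated["queued"] = True
    return action, updated


action, pkt = handle_non_conforming({"service_class": "video", "dscp": 34})
print(action.value, pkt)  # downgrade {'service_class': 'video', 'dscp': 0}
```

Classes without an explicit entry fall through to the default of marking, so every packet gets some defined treatment.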

In the end, this level of detail should help service providers better meet the objectives set forth in the SLAs, and give the customer peace of mind that they are getting what they bargained for.

Tim Hale is director of product marketing for Quarry Technologies, Inc. Quarry is an innovative, privately held developer of service-enabling systems for the telecommunications market. 

