June 2008 | Volume 11/ Number 6
Service Providers Take On Network Monitoring
By: Richard "Zippy" Grigonis
Although we’ve long championed LAN and WAN monitoring for business, Service Providers (SPs) also need to constantly monitor networks for traffic congestion and failing systems, be it their own or the ones they take care of in managed services scenarios. Some systems can automatically detect and respond to threats and performance issues in real time, as well as predict upcoming problems, though the fanciful idea of a completely self-healing network still resides in the world of science fiction.
Network monitoring involves various combinations of active (“intrusive”) and passive (“non-intrusive”) probes operating at various points in the network, with data being analyzed and presented at a particular administrative location. For example, GL Communications has long offered a variety of solutions for network-wide monitoring and surveillance consisting of both intrusive and non-intrusive probes for TDM, VoIP, and wireless networks. The probes, deployed at strategic locations in a network, transmit and collect voice, data, protocol, statistics, and performance information, and relay this information to a central or distributed Network Management System (NMS). This NMS may be client/server-based or a web-based system and consists of a database and applications for controlling, collecting, and analyzing the information provided by the various probes.
Whereas GL’s current NMS solutions for digital T1/E1 line monitoring, testing and diagnostics use both intrusive and non-intrusive probes, its SS7, ISDN, wireless protocol monitoring and surveillance system, as well as its packet and VoIP Voice Quality and VoIP monitoring and surveillance systems can work with just non-intrusive probes.
Dan Teichman, Senior Product Marketing Manager at Empirix, says, “We’re involved in five important areas. First, we give SPs solutions so they can know the quality of their VoIP service before their customers do, which means that if everything is good, it’s good, and if it’s not good, you don’t find out when your customers call to complain. Second, you can’t always be proactive, so we’re building solutions to make our customers run as quickly and efficiently as possible when they have to react to network congestion or equipment issues. Certainly accuracy of information has to do with monitoring the actual media and not just the signaling associated with calls, so what we provide involves media measurements, producing metrics by looking at every RTP packet in an RTP stream and providing an assessment of voice quality based on that.”
“The third item is carrier class scalability,” says Teichman. “We know that what customers encounter when they enter the market is not what they encounter when their networks are scaled up. Workarounds and processes when you have 1,000 customers don’t necessarily work when you have 100,000 or a million customers. You want to know whether the investments you need to make can be made today and ‘grown’ tomorrow. So, having the right information and the proper solutions for scalability are critical for success in the large carrier space.”
“The fourth item in which we’ve seen a lot of traction is what I call ‘avoid the blame game’,” says Teichman. “Every one of the VoIP SPs today is interconnected to one or more other carriers and probably the PSTN. So, obviously, you can’t cross your fingers and hope that your interconnected SP is going to provide good quality and likewise they can’t just hope that you as a SP are giving them good quality service. So there really needs to be both a way to measure voice quality on both sides of an interconnect, and a way to share that information, so you don’t find yourself in the blame game.”
“Fifth and finally, accuracy of information is important,” says Teichman. “We have found that many people provide VoIP quality monitoring solutions, but the question arises over whether or not a SP can trust the accuracy of that information and ensure that they can take action on that information as a competitive differentiator. You can have service level agreements with customers’ interconnected parties, in which case you must be sure that those metrics being measured are accurate so that you can commit to service level agreements without penalties.”
“Certainly there’s going to be tremendous value in integrating the reporting of quality metrics from the customer premise,” says Teichman, “involving solutions such as IP media loopback, being able to run a loopback test from an SP’s network to a customer’s network to measure service quality, or having the customer premise equipment actually report quality measurements that they take through some standard mechanism such as RTCP XR. So being able to provide good quality measurements comes down to truly knowing the quality that the customer is experiencing on the actual call itself, or the aggregate of all of the calls.”
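The per-packet RTP measurement Teichman describes can be illustrated with a minimal Python sketch. It is not Empirix's method; it simply assumes a list of already-captured packets and derives two common inputs to a voice quality assessment: the loss fraction and the interarrival jitter estimate defined in RFC 3550. The `RtpPacket` class, the `stream_metrics` function, and the 8 kHz clock rate are illustrative assumptions, and sequence-number wraparound is ignored for brevity.

```python
from dataclasses import dataclass


@dataclass
class RtpPacket:
    seq: int         # RTP sequence number (16-bit wraparound ignored in this sketch)
    timestamp: int   # RTP media timestamp, in clock-rate units
    arrival: float   # local arrival time, in seconds


def stream_metrics(packets, clock_rate=8000):
    """Return (loss_fraction, jitter_ms) for packets listed in arrival order."""
    if not packets:
        return 0.0, 0.0

    jitter = 0.0  # running estimate, in timestamp units
    prev = packets[0]
    for pkt in packets[1:]:
        # D: arrival spacing minus sender spacing, both in timestamp units
        d = (pkt.arrival - prev.arrival) * clock_rate - (pkt.timestamp - prev.timestamp)
        # RFC 3550 smoothing: move 1/16 of the way toward the latest |D|
        jitter += (abs(d) - jitter) / 16.0
        prev = pkt

    seqs = [p.seq for p in packets]
    expected = max(seqs) - min(seqs) + 1   # packets the sender should have sent
    received = len(set(seqs))              # duplicates counted once
    loss = (expected - received) / expected
    return loss, jitter / clock_rate * 1000.0


# Hypothetical capture: 20 ms voice packets, with sequence number 3 missing
capture = [
    RtpPacket(seq=1, timestamp=0, arrival=0.00),
    RtpPacket(seq=2, timestamp=160, arrival=0.02),
    RtpPacket(seq=4, timestamp=480, arrival=0.06),
    RtpPacket(seq=5, timestamp=640, arrival=0.08),
]
loss, jitter_ms = stream_metrics(capture)
```

A production probe would also have to handle reordering, sequence wraparound and codec-specific scoring; the sketch only shows why looking at every packet, rather than at signaling alone, yields media-level metrics.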
Seeing is Believing
Symmetricom designs, manufactures and markets atomic clocks, oscillators and network synchronization and timing solutions used in wireline and wireless telecom networks, space, defense and avionics systems, and enterprise IT networks.
Gary Croke, Symmetricom’s Director of Marketing, says, “We recently created a new line of business, which is Quality of Experience [QoE] monitoring. We did this in 2007 by acquiring a couple of different companies in the video technology space and combining and packaging together the technologies into a QoE monitoring solution.”
“Video in particular can bring about quality challenges on the network,” says Croke. “Video is real-time traffic. It has very strict delay and delay variation requirements. Not only does video ‘stretch the envelope’ in terms of network traffic, but what’s worse is that end users are more sensitive to video errors than voice errors. This is the major point fueling the drive behind all of the monitoring efforts. The end users get frustrated and complain and end up moving to another service.”
In fact, Symmetricom commissioned a study to gain insight into the recognition of quality issues, the use of video quality monitoring, and where the video quality issues are coming from.
“The study reveals that 77 percent of people said that video quality, or lack thereof, was the main cause of customer churn,” says Croke. “Also, 78 percent said it was the main cause of customer calls in IPTV. You really get the sense that video quality is important to these services. Video also acts differently than other services to errors. For example, a 10 millisecond (ms) loss of service is more noticeable on a video service. TCP/IP just retransmits and there’s no impact. Voice service failure is tolerable up to 50 ms. 10 ms of voice is about 80 bytes of data. But 10 ms of HD video is 10,000 bytes of data. So you’re talking about a great deal of traffic that’s impacted by a network failure. Networks that appear okay don’t always behave well when you send video over them. This further drives the need for network monitoring.”
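Croke's byte counts follow from simple arithmetic once a bit rate is chosen. The short Python sketch below reproduces them under two assumptions that are not stated in the article: uncompressed 64 kbps voice (the G.711 rate) and an HD video stream of roughly 8 Mbps. The function name `bytes_in_window` is illustrative.

```python
def bytes_in_window(bitrate_bps: int, window_ms: int) -> int:
    """Bytes carried by a constant-rate stream during window_ms milliseconds."""
    # bits = bitrate * (window_ms / 1000); bytes = bits / 8
    return bitrate_bps * window_ms // 8000


VOICE_BPS = 64_000         # assumption: G.711-style 64 kbps voice
HD_VIDEO_BPS = 8_000_000   # assumption: roughly 8 Mbps HD video stream

print(bytes_in_window(VOICE_BPS, 10))     # 80
print(bytes_in_window(HD_VIDEO_BPS, 10))  # 10000
```

Under those assumed rates, the same 10 ms outage wipes out about 125 times more video data than voice data.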
“Monitoring approaches used today basically monitor QoS,” says Croke, “of which there are four metrics: packet delay, jitter, loss and bandwidth. Overwhelmingly, we see that this is not sufficient for two reasons. First, impairments come from areas other than the network. So if there’s an impairment in the content itself, such as a camera introducing blur, or artifacts introduced by the electronics, then just monitoring the network from a QoS perspective is not sufficient. Furthermore, even if you look at the QoS metrics and the network, that doesn’t tell you exactly what’s happening in terms of how the end users see the video. In video some packets are more ‘noticeable’ than others.”
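The point that some packets are more noticeable than others can be made concrete with a toy Python comparison. This is not Symmetricom's model; it assumes an MPEG-style stream in which each packet is tagged with a frame type, and it uses made-up weights to reflect that a lost I-frame packet tends to damage every frame predicted from it, while a lost B-frame packet usually affects only one frame. The names `FRAME_WEIGHTS`, `raw_loss` and `weighted_loss` are hypothetical.

```python
# Hypothetical visibility weights per frame type (illustrative values only)
FRAME_WEIGHTS = {"I": 1.0, "P": 0.5, "B": 0.1}


def raw_loss(packets):
    """Plain QoS-style loss ratio: every lost packet counts the same."""
    if not packets:
        return 0.0
    return sum(1 for _, lost in packets if lost) / len(packets)


def weighted_loss(packets, weights=FRAME_WEIGHTS):
    """Loss ratio in which each packet counts according to its frame type."""
    total = sum(weights[frame] for frame, _ in packets)
    if total == 0:
        return 0.0
    return sum(weights[frame] for frame, lost in packets if lost) / total


# Two streams of (frame_type, was_lost) with identical raw packet loss
stream_a = [("I", True), ("P", False), ("B", False), ("B", False)]
stream_b = [("I", False), ("P", False), ("B", False), ("B", True)]

print(raw_loss(stream_a) == raw_loss(stream_b))            # True
print(weighted_loss(stream_a) > weighted_loss(stream_b))   # True
```

Both streams lose one packet in four, so a pure QoS view rates them the same, yet the weighted score separates them. A real perceptual model would go much further than frame type.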
“Fortunately, Symmetricom offers a technology that goes beyond just looking at QoS, and instead examines video QoE,” says Croke. “We monitor the video from an end user perspective. We catch all of the different impairments from all of the different places, and look at how the impairments are impacting the end user’s video by modeling the human vision system to determine whether or not the problem will actually be seen by the user. A packet disturbance right after a scene change probably isn’t going to be noticed as much, for example. Our V-Factor QoE Platform can handle QoE management for triple play services. Perceptual video quality is measured via deep content analysis, and network impairments are correlated to content. The system understands what’s really going on since it’s based on a human vision system model. It’s a highly adaptable platform and has been deployed by leading cable companies worldwide.”
Have You Checked Your FCAPS Today?
Virtela is a “Super” Virtual Network Operator (VNO) that designs, builds and manages complete customized network solutions that fully integrate into existing network architecture. Virtela is a combination of a traditional VNO, a large network integrator and managed service provider (MSP). Virtela’s Regional Policy Centers (RPCs) are situated worldwide. They understand that network monitoring is critical to the widespread adoption of such things as global unified communications.
Mark Hansard, Vice President of Systems and Security for Virtela, says, “In terms of network monitoring and security, we started out, like everyone else did, with the FCAPS [Fault, Configuration, Accounting, Performance, Security] network management model. Within each area we did certain things to cover the base. For example, on the ‘Fault’ side we employed HP OpenView in the early days, using it as our fault detection tool for all of the network services that we were providing and built those into the apps and so forth. We found it had some severe limitations at that time, and we had to create our own stopgaps. For example, one limitation would be the provisioning of the devices within OpenView. It was very ‘manual’ to administrate, and we needed to automate those functions, but it was difficult to do that within HP’s user interfaces of the time. When managing devices of multiple customers, some overlapping of network device IP addresses would occur, since there could exist shared IP address schemes, from one customer’s network to another customer’s network. OpenView had difficulties dealing with that overlapping IP space, so we had to deviate to some tools developed on some open source technology and then we took it in-house. So for about the past five years, we’ve used custom plug-ins and algorithms on top of the open source Nagios host, service and network monitoring program, to create a custom fault monitoring environment.”
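The overlapping address problem Hansard mentions arises whenever a monitoring tool treats an IP address as a globally unique device key. One common way around it, sketched below in Python, is to key each device on the pair of customer and address. This is only an illustration under that assumption, not Virtela's in-house tooling or any Nagios interface; the `DeviceRegistry` class and the customer and host names are hypothetical.

```python
import ipaddress


class DeviceRegistry:
    """Tracks managed devices keyed by (customer, IP) instead of IP alone."""

    def __init__(self):
        self._devices = {}

    def add(self, customer_id, ip, hostname):
        # Parsing validates the address and normalizes its text form
        key = (customer_id, ipaddress.ip_address(ip))
        if key in self._devices:
            raise ValueError(f"{ip} is already registered for {customer_id}")
        self._devices[key] = hostname

    def lookup(self, customer_id, ip):
        """Return the hostname for this customer's address, or None."""
        return self._devices.get((customer_id, ipaddress.ip_address(ip)))


registry = DeviceRegistry()
# Two customers reuse the same private address without colliding
registry.add("customer-a", "10.0.0.1", "a-firewall")
registry.add("customer-b", "10.0.0.1", "b-router")

print(registry.lookup("customer-a", "10.0.0.1"))  # a-firewall
print(registry.lookup("customer-b", "10.0.0.1"))  # b-router
```

Because the customer is part of the key, alarms raised against 10.0.0.1 can always be attributed to the right network.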
“We have about 10,000 devices under management,” says Hansard, “and about 15 different vendors represented in that management set. Some analysis we wanted to perform on the collected data was beyond the capabilities of Nagios and that’s when we developed some custom analysis tools. We had in the early days some off-the-shelf products to do those things. They were great for an enterprise but their pricing model made them way too expensive for a carrier. So we built our own.”
“We then kept increasing and integrating security by using security information management platform tools such as ArcSight and netForensics,” says Hansard, “that allow us to do security behavior analysis on top of the network ‘basics’, if you will, and then look for security correlations. A good example of that is when we have network devices, firewalls, IPSs, remote access devices and even information from internal virus or vulnerability scanning packages, and we needed to bring all of that into a single analysis engine and then you see quite a different picture when you’re able to pull various things together than if each item is analyzed independently in its own ‘silo’.” IT
The following companies were mentioned in this article:
GL Communications (www.gl.com)