November 2008 | Volume 11/ Number 11
The Vast World of Network Management
By: Richard “Zippy” Grigonis
Network Management continues to evolve, increasing in sophistication and capabilities. Over 20 years ago, the Simple Network Management Protocol (SNMP) appeared to help manage the first distributed networks by polling network devices every few minutes. Although SNMP is still used, it’s showing its age, particularly since its leisurely polling process can miss the kind of transient, 150+ millisecond congestion events that disrupt VoIP and IP video communications. SNMP is now joined by a boatload of newer technologies and tools capable of real-time packet flow analysis, comprehensive reporting, and many other functions. At the service provider level, vast hybrid TDM/NGN/IMS networks multiply the challenges of network management.
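To see why a slow polling cycle can hide a short congestion event, consider the rough sketch below. It is only an illustration under stated assumptions: the 5-minute polling interval, the 30 percent baseline utilization and the timing of the burst are invented for the example, and only the 150 millisecond event length comes from the text above.

```python
# Hypothetical numbers: only the 150 ms burst length comes from the article.
POLL_INTERVAL_MS = 5 * 60 * 1000   # assumed 5-minute polling cycle
BASELINE_UTIL = 0.30               # assumed steady link utilization (30%)
BURST_UTIL = 1.00                  # link fully saturated during the burst
BURST_START_MS = 120_000           # assumed moment the burst begins
BURST_LENGTH_MS = 150              # transient congestion event


def utilization_at(ms: int) -> float:
    """Link utilization at a given millisecond within the polling interval."""
    in_burst = BURST_START_MS <= ms < BURST_START_MS + BURST_LENGTH_MS
    return BURST_UTIL if in_burst else BASELINE_UTIL


def polled_average() -> float:
    """What a counter-based poll would see: the mean over the whole interval."""
    total = sum(utilization_at(ms) for ms in range(POLL_INTERVAL_MS))
    return total / POLL_INTERVAL_MS


def peak_utilization() -> float:
    """What a millisecond-resolution monitor would see at the worst moment."""
    return max(utilization_at(ms) for ms in range(POLL_INTERVAL_MS))


if __name__ == "__main__":
    print(f"polled average: {polled_average():.5f}")
    print(f"true peak:      {peak_utilization():.2f}")
```

Under these assumptions the polled figure barely moves from the baseline, while the link was in fact saturated for long enough to disrupt a voice call.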
ILECs, CLECs, and companies providing broadband services all face the challenges of quickly identifying network performance and availability issues to maintain customer satisfaction.
Xangati’s David Messina, Vice President of Marketing, says, “Our focus is on a solution area that we call Rapid Problem Identification or RPI. The idea is that, given the increasing complexity of the service provider infrastructure out there — which includes a whole new set of dynamic applications that they’re delivering, not to mention a whole new set of different technologies residing in the digital home — the increasing volume of subscribers who are leveraging broadband and the complexity of the problems that exist in their infrastructure and among their subscriber base, are growing exponentially. So we devised Rapid Problem Identification, which creates a new model for those who are managing the infrastructure in a service provider network and for subscribers, to really rapidly isolate specific problem sources; in other words, unearthing very quickly where the problem might lie. Specifically, we’ve built a technology with that explicit focus.”
The Xangati RPI system is said to accelerate problem identification efforts by at least 20 percent, which results in major productivity gains for service provider technical teams and increased subscriber satisfaction. Xangati RPI can quickly and proactively identify spammers to avoid being email-blacklisted by ISPs, achieve visibility into subscriber identity and activity, improve customer responsiveness by quickly identifying problems, determine if the problem is on the subscriber side (or related to your network or service), lay the foundation for supporting IPTV and VoIP, and avoid expensive backbone capacity upgrades by curtailing “breach of contract” Internet use, for example by cutting off spammers or subscribers who act as hosting sites.
“Our solution is delivered as an appliance that leverages data being generated by the switches and routers that are in the infrastructure,” says Messina. “There’s a technology called ‘flow technology’ or ‘flow data’ that works with all of the various routers that are out there — be they Cisco, Juniper or Nortel — that can generate summary information about the traffic that’s running across the infrastructure. That fuels our system. The benefit of leveraging flow data, let’s say, instead of other technologies in the market, is that it allows us to be a nonintrusive appliance, just another IP device on the network. We don’t sit in the packet path and we’re not inspecting packets. We don’t require a redesign to get our management solutions into the infrastructure. As a result, customers are able to install our RPI appliance in about an hour and immediately achieve visibility into their network and subscribers. So they can get some real value very quickly as they move forward.”
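As a rough illustration of what can be done with flow summaries alone, without touching packets, the sketch below totals bytes per source address and ranks the heaviest senders. The `FlowRecord` fields and the sample addresses are assumptions made for this example; they are not Xangati's data model or any particular router's export format.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Tuple


class FlowRecord(NamedTuple):
    """Hypothetical, simplified flow summary as a router might export it."""
    src_ip: str
    dst_ip: str
    dst_port: int
    byte_count: int


def bytes_by_source(flows: Iterable[FlowRecord]) -> Counter:
    """Sum the bytes sent by each source address across all flow records."""
    totals: Counter = Counter()
    for flow in flows:
        totals[flow.src_ip] += flow.byte_count
    return totals


def top_talkers(flows: Iterable[FlowRecord], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n sources that sent the most bytes, heaviest first."""
    return bytes_by_source(flows).most_common(n)


if __name__ == "__main__":
    sample = [
        FlowRecord("10.0.0.5", "198.51.100.7", 6881, 900_000),
        FlowRecord("10.0.0.5", "203.0.113.9", 6881, 700_000),
        FlowRecord("10.0.0.8", "192.0.2.25", 25, 40_000),
        FlowRecord("10.0.0.9", "192.0.2.80", 80, 120_000),
    ]
    for ip, total in top_talkers(sample, n=2):
        print(ip, total)
```

A ranking like this is one simple way to tell whether a handful of subscribers account for most of the load on an upstream link.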
“Although the term ‘middleware’ isn’t officially quite the right terminology, RPI does in a sense ‘sit in the middle’ and is non-intrusive,” says Messina. “It also allows us to gather information on all of the key elements that are in that service provider infrastructure. Traditional management solutions are optimized for different elements. For example, there’s a management system for the Cisco routers, and there’s one for the access devices. There are some tools used by customer support desks, and there are overall network performance management solutions that exist in the organization too. But the truth is that those traditional tools are very focused on specific areas of expertise. A tool may be great at monitoring an interface, but you need another tool that’s terrific at monitoring Cisco routers. In the case of the Xangati solution however, it starts out by discovering and identifying everything on the infrastructure, including subscribers, hosted web servers, or any key network connections for the service provider, such as an upstream connection to a backbone provider. Amongst other kinds of views, it gives you live visibility into the different interrelationships among the different infrastructure elements.”
“That’s important,” says Messina, “because we’ve seen time and time again that if you talk to a service provider, their orientation is really kind of schematic — green, yellow and red lights relating to all of the management systems. There are countless times we’ve seen service providers that have ‘green lights’ on all of their management systems, yet they’re busy firefighting a problem. That can happen because the challenges and problems out there actually transcend the different elements — it’s about the relationships between, say, the subscriber, the applications they’re using and the network. You can’t just discount the situation by saying, ‘Oh, it’s just the application.’ For example, if there’s a whole slew of subscribers that are starting to do peer-to-peer traffic, and that’s consuming the upstream connection, the network could easily get blamed, but again, you need to have visibility into the element relationships as to what caused an upstream connection to be blocked.”
“That’s why Xangati really differentiates itself with its ability to discover all of the different infrastructure elements,” says Messina, “and it tracks those relationships in the sense that it cuts across the different domains of expertise in the different ecosystem elements. Moreover, it gives you a profiled understanding of the normal behavior of every one of those elements. By that I mean that for every application — which could be email, VoIP, p-to-p or whatever — we establish a set of parameters around not only performance but element relationships, and have the system learn that. It can even learn the behaviors of a subscriber household or a hosted server. It ‘understands’ the high and low watermarks.”
“For example, if a server stops serving its community — i.e., if a mail server has dipped down to serving only two users instead of 100 or more, that’s a very noticeable piece of information,” says Messina. “Or if a subscriber starts to communicate with 100 different peers on the Internet with email, that indicates the subscriber has become a victim of malware, specifically a spambot. So the beauty of the Xangati solution is not only its ability to track relationships, which in itself is incredibly important in finding and solving problems, but it also supports the idea of profiling the behaviors and making that a system-learned capability, which allows us to set up a framework wherein the service provider organization can actually take a proactive stance in problem identification. That’s one of our solution’s major differentiators. Because we understand normal system behavior, when something strays from that pattern we can catch problems before they become problems, unlike traditional management products.”
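The idea of learned high and low watermarks can be sketched in a few lines. This is a deliberately naive illustration, not Xangati's algorithm: the `Profile` class, the min/max learning rule and the 50 percent tolerance are all assumptions chosen to keep the example short.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Profile:
    """Learned low and high watermarks for one metric of one element."""
    low: float
    high: float


def learn_profile(history: Sequence[float]) -> Profile:
    """Naive learning rule (an assumption): watermarks are the observed min and max."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return Profile(low=min(history), high=max(history))


def classify(profile: Profile, observed: float, tolerance: float = 0.5) -> str:
    """Flag values that stray more than `tolerance` (a fraction) past a watermark."""
    if observed < profile.low * (1 - tolerance):
        return "below_normal"
    if observed > profile.high * (1 + tolerance):
        return "above_normal"
    return "normal"


if __name__ == "__main__":
    # Hypothetical mail server: distinct users served per hour.
    mail_users = learn_profile([95, 110, 102, 120, 98])
    print(classify(mail_users, 2))      # server has all but stopped serving

    # Hypothetical subscriber: distinct email peers contacted per hour.
    email_peers = learn_profile([3, 5, 4, 6, 2])
    print(classify(email_peers, 100))   # possible spambot behavior
```

A production system would presumably use richer statistics and time-of-day baselines; the point here is only that a deviation from a learned range can be flagged before anyone files a trouble ticket.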
Of course, over the years network “testing” became “monitoring” and now everything is sloshed together under the general term “network management”. Some useful products are not even considered network management systems per se, such as Zeugma Systems’ ZSN, an inline subscriber management and broadband aggregation device that performs functions that simplify and streamline carrier network and service management. Basically, these functions operate in two broad areas: information extraction and policy enforcement.
The ZSN collects deep statistical information on all traffic flows traversing the network. This information is collected on a highly granular per-subscriber, per-service basis and can be used by network management personnel for troubleshooting and auditing. The ZSN also can be used to develop real-time Mean Opinion Scores (MOS) for all voice and video sessions, further enhancing the carrier’s ability to manage subscriber services. More important, however, is the application of policy enabled through the ZSN. As a layer 3 Service Delivery Router (SDR), the ZSN dynamically creates queues for all sessions traversing the network and applies unique Quality of Service (QoS), usually bandwidth-related, parameters to those sessions. Many times, dealing with such things as network outages or unforeseen congestion, this capability enables the network to automatically respond to failure conditions using pre-determined policies.
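The article does not say how the ZSN derives its Mean Opinion Scores. One widely used approach, shown here purely as an illustration, is to compute a transmission rating factor R (as in the ITU-T G.107 E-model) and convert it to an estimated MOS. The sketch below covers only that final conversion step and assumes the R value has already been obtained from measured delay, loss and codec characteristics.

```python
def r_factor_to_mos(r: float) -> float:
    """Convert an E-model R factor to an estimated MOS (ITU-T G.107 mapping)."""
    if r <= 0:
        return 1.0          # worst possible score
    if r >= 100:
        return 4.5          # the mapping saturates at 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6


if __name__ == "__main__":
    # 93.2 is the commonly quoted default R for an unimpaired narrowband call.
    for r in (93.2, 80.0, 50.0):
        print(f"R = {r:5.1f}  ->  MOS = {r_factor_to_mos(r):.2f}")
```

Because the conversion is a cheap closed-form calculation, a per-session score can be refreshed continuously as delay and loss measurements change.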
The ZSN is founded on a high-performance, fully federated compute grid supporting the Zeugma Open Application Sandbox (OAS). Just as in general-purpose operating systems, the Zeugma OAS allows service providers to embed applications within the ZSN that are developed by Zeugma, by the service providers themselves, or even by select third parties. Thus one can extend the system until it is a full-blown network management system.
How to Effectively Pinpoint Network Problems — A Guide to Addressing the Physical Needs of Your Virtual Network
By Kenneth Klapproth
1. Be Proactive, Rather Than Reactive! — Just as with high blood pressure or high cholesterol — the human silent killers — operating your network at capacity, or beyond its threshold with unaddressed issues, could be the “silent killer” of your enterprise. By implementing a proactive diagnostic capability, your business network will stay in good health, and you can prevent catastrophe by addressing what is lurking below the surface.
2. Find a Happy (Capacity) Medium — While excess capacity may be inefficient and reduce the return on hardware investment, running with little or no spare capacity can be downright dangerous and leads to sizable unplanned expenses. Deploying a capacity-planning tool can map existing network inventory in a matter of hours, enabling your company to allocate resources more efficiently than ever before.
3. Build in Redundancy — With advances in hardware and communications protocols, you are probably pressed to remember the last time you had a complete network outage. However, a false sense of security can cause you to miss the most pervasive and increasingly frequent instability issues. By increasing visibility in the framework, you can anticipate, without fearing, your network’s health.
4. Swim Upstream! — Over 80 percent of all IT problems are a result of change. As networks have grown in complexity and in strategic importance to businesses, visibility to the change alone is no longer sufficient since the downstream effect of the change can be far more detrimental. Swim upstream: look at topology and connectivity status to speed troubleshooting and repair. If you wait for the downstream effect, change under the radar could impair company productivity for days or weeks.
5. Work in Tandem — When it comes to managing your network, virtualization doesn’t always reign supreme. Sure, virtualization produces many benefits, including cost reduction, increased agility and responsiveness, reduced downtime and even faster software development cycles. However, virtualization also creates new challenges when its operation is considered independent of your physical network. Although applications are deployed in “virtual machines,” they execute real (read: mission-critical) business services on physical servers that users access across physical networks. You need to effectively operate a combined virtual and physical environment to address all of your business’ needs.
Kenneth Klapproth is Vice President of Marketing with Entuity (www.entuity.com), a leading provider of network management and service delivery solutions.
Convergence and More Convergence
Juma Technology is a top telecom and IT systems integrator specializing in converged voice and data network deployments. Their expertise spans contact centers, mobility, network infrastructure and security.
Juma’s CTO, Joe Fuccillo, says, “At a high level, we see that customers are not happy with the many ‘point’ products they’ve purchased, which don’t ‘stitch things together’ well. They give you a narrow view. People have purchased some type of VoIP management, such as a voice quality tool, which turns out to be not very ‘end-to-end’ in nature, and they need things like probes. Many of my customers tell me they don’t want to position probes all over their network, and they’re not happy with some current ways of adding network management for voice.”
“Another trend we see is that the ITIL specifications are starting to drive people toward using dashboards, services and service views, as opposed to alarms and alerts,” says Fuccillo. “They want things that work end-to-end, and they want an application delivered as a service or a series of applications integrated as a business process, and linking them end-to-end, measuring service level agreements against the end-to-end service or process as opposed to an individual component. Customers like to use the dashboards and metrics to provide information both to the business and to the IT staff.”
“Years ago, a company initially dealt with many ‘silos’,” says Fuccillo, “and the network management team used one set of tools, the server management group used a different set, and the PBX guys had almost no tools — other than configuration tools — and they had almost no monitoring capability other than some alarms concerning facility issues. Now different technologies, even wireless, are converging into something that enables people to look into the overall applications as a total service view. Storage had its own tools — the database groups had their own tools for monitoring certain database systems. Not that the tools aren’t important, because they all add very interesting information. But an overall view of end-to-end performance availability, services and how that’s functioning for the business is really what we see changing, particularly as groups on the management side become more unified. At the director level, more and more groups report to a consolidated management team.”
Faster and Faster
Aspera has devised software that runs at each end of a link, capable of overcoming network throughput bottlenecks in a way that differs from conventional file transfer protocols and “WAN acceleration” approaches. As more and more content becomes digitized and moves through the infrastructure, Aspera can provide maximum bandwidth utilization, the fastest possible end-to-end transfers, and guaranteed delivery times, regardless of the distance of the endpoints. The efficiency holds across the dynamic conditions of the network and for even the most difficult satellite, wireless, and unreliable international links. Moreover, the transfers are completely secure. Aspera works with telco customers such as Verizon, Comcast and EchoStar.
Michelle Munson, President and Co-Founder of Aspera, says, “We play a role in the world of file-based workflows that are replacing the more traditional ways of moving tape-based content to media distribution channels. Our crossover with Internet telephony has to do with the convergence of entertainment content with more traditional service providers. Specifically, we’ve created a core technology that solves the underlying performance problems associated with traditional file transfer protocols, in particular TCP-based protocols. Our software is a new generation of file transfer technology that provides ideal bandwidth efficiency, independent of network conditions. For organizations that move or desire to move large amounts of digital content as files over IP networks, the Number One benefit our software provides them is the most efficient use of the bandwidth and therefore dramatically faster file transfers — so much so that customers have experienced transfers that are orders of magnitude faster, depending on how poor the performance of the traditional protocols is, based on network distance and such fundamental factors as packet loss and delay associated with the network paths.”
“Aspera has grown up as a company and a technology providing high-speed transfer to these companies,” says Munson. “As they move from shipping physical media to setting up networks and transferring digital media files through this supply chain — for example between movie studios and the TV producing broadcasters and the VOD [Video on Demand] providers such as Comcast and now the IPTV people such as AT&T — there’s now a huge context in which that transport gets applied. Factors include everything from how you set up and do the transfers that scale between companies, to how you track, monitor and manage what’s going on, to how you fit into the workflows. One of the most interesting examples concerns a traditional system of gathering content for VOD from the providers of that TV and movie content, called Pitcher/Catcher, and it goes over satellite. Each recipient has a satellite receiver to grab the content ‘pitched’ to them over that satellite link. That’s one fundamental place in the supply chain that’s switching to file-based transfer. In those cases Aspera is a substitute for that traditional Pitcher/Catcher system, and certainly a substitute for FTP. So that’s the basic context in which we operate.”
“We are deployed at many of the content suppliers as well as the cable and MSOs that receive content to distribute as high-speed file transfers,” says Munson. “We’re also deployed at many of the on-line video aggregators that are new distribution channels, of which iTunes was a sort of prototype, and now there are things such as Hulu.”
The Security Angle
It’s difficult these days to separate security from overall network management. It sometimes appears that the whole world is attempting to hack networks, and so what was once an afterthought is now a major consideration. Take Fortinet, founded in 2000 by the legendary Ken Xie of NetScreen (sold to Juniper for $3.5 billion), a major provider of Unified Threat Management (UTM) security systems for business communications. Its security systems and subscription services protect the networks of more than 20,000 customers worldwide, including carriers, service providers and enterprises of various sizes.
Fortinet’s Director of Product Strategy, Chris Simmons, says, “Our basic idea is to provide a consolidated security platform. As it pertains to telcos and your larger carriers and provider entities, we offer a series of chassis UTM systems, which we call the FortiGate 5000 Series. It gives them a platform onto which services can be integrated. It’s really good for operationalizing security. Also, in terms of network management, we do have features such as priority queuing, quality of service, traffic shaping of applications, and all of those great things that can help to shape the network and ensure that the critical traffic gets through, and that applications don’t consume more bandwidth than whatever is desired.
“With that said, we just announced 10 gigabit Ethernet support throughout our whole 5000 Series chassis,” says Simmons. “That really opens up the bandwidth whenever we’re talking about providing security on the network, and we’re now able to do all kinds of load balancing configurations at 10 gig speeds. Hence, we’ve made an order of magnitude jump in terms of what we’re capable of pushing through one of our FortiGate 5000 Series devices. We have different sized models for the FortiGate Series so we can hit just about every market. We offer everything from a 5-user model all the way up to carrier network-class devices that support huge numbers of users. The great thing about the FortiGate Series is that a provider or carrier can provision customer services on a single hardware platform and create a managed firewall service, VPN service, intrusion prevention service, and they can filter their mail for spam — with all of this on a per-customer basis, which gives them a high level of granularity while still doing it all in one physical hardware platform. You can cram quite a bit of customer density into a single appliance by means of virtualization.”
“FortiGate also has great traffic shaping and monitoring capabilities, particularly as it pertains to voice traffic,” says Simmons. “We can fully decode SIP, H.323 and other protocols and we can monitor the voice network for the attacks that are known today, but we can also prioritize the traffic and give it a quality of service so that it will get through the device expediently.”
Ethernet Satisfies the Bandwidth Urge
Fujitsu Network Communications, headquartered in Richardson, Texas, provides IT and carrier-class telecom solutions for the North American service provider and Cable TV markets. Thanks to their tie-in to Fujitsu Labs, they can provide innovative and fully integrated IT/Telecom solutions that deliver traditional and next-gen services to many types of metropolitan transport networks, as well as regional, long-haul applications.
Denise Provencher, Director, Element Management Products at Fujitsu Network Communications, says, “Our history has been the development of transmission equipment for carriers and service providers. The biggest trend we see is network convergence. Migrating the core, inter-office network from TDM-based services to packet-based services, particularly the use of Ethernet as a transport, as opposed to SONET or SDH. Also, carrying that directly over WDM [Wavelength Division Multiplexing] fiber. Many vendors, ourselves included, have announced products that fall into what they call ‘packet optical networking’. The products themselves combine DWDM [Dense Wavelength Division Multiplexing] or ROADM [Reconfigurable Optical Add-Drop Multiplexing] capabilities, SONET and SDH, and now also Ethernet switching — that’s all in one platform.”
“What’s driving all this is the need to carry more video and wireless traffic, with everything being carried as packets,” says Provencher. “The explosion in bandwidth requirements forces the need to make efficient use of existing legacy networks, and the best way to do that is via Ethernet, which can be made to transport packets efficiently. From a network management perspective, the difficulty or challenge is, how do you manage all of those things when, historically, those have been managed in different domains? We’ve had network operations centers that managed SONET networks, and perhaps they also managed the DWDM networks, but they were completely independent of the enterprise and many IP router and switching networks, which were basically overlays.”
“Now you’ve got products that combine all of these things, along with three layers into one, so how do you help people manage that?” asks Provencher. “At Fujitsu, because we are obviously selling hardware that does these things, we also have a network management system, the NETSMART 1500, that manages all of the layers and enables our customers to manage everything from the WDM layer to the SONET layer to the Ethernet layer, to the VLAN on top of that, with Ethernet being a transport in this case. When you lose your optical network or you have a problem at the WDM layer, you want to know what that actually affects. Indeed, what are all the different layers that are affected and how? To me, that’s one challenge.”
“Getting back to the matter of having different operations centers, especially the large carriers, they do have pretty segregated operations,” says Provencher. “So while we can provide a single management system that allows them to look at the entire network, they may still want to maintain segregated operations centers. They can use our product as a sort of ‘building block’ in their operations architecture, and integrate it with things at a higher layer. So they may have two different surveillance systems, one for their IP network and one for their historically TDM network. They can use the NETSMART 1500 to, say, forward alarms for the TDM network to one center, and alarms that are at the Ethernet layers and above can queue the other center. That’s one of the ways we’re working with our carriers today to help them sort out the management of such a complicated and converged network.”
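The alarm-forwarding arrangement Provencher describes amounts to a routing rule keyed on network layer. The sketch below is a hypothetical illustration of such a rule and has nothing to do with the actual NETSMART 1500 interface. The layer names follow the quote, the center names are invented, and sending WDM alarms to the TDM center is an assumption, since the quote only says that Ethernet-and-above alarms go to the other center.

```python
from enum import IntEnum
from typing import Dict, List, NamedTuple


class Layer(IntEnum):
    """Network layers named in the quote, ordered from lowest to highest."""
    WDM = 1
    SONET = 2
    ETHERNET = 3
    VLAN = 4


class Alarm(NamedTuple):
    layer: Layer
    message: str


# Hypothetical operations-center names, invented for this example.
TDM_CENTER = "tdm-operations"
PACKET_CENTER = "ip-operations"


def destination_for(alarm: Alarm) -> str:
    """Ethernet-and-above alarms go to the packet center; the rest to the TDM center."""
    return PACKET_CENTER if alarm.layer >= Layer.ETHERNET else TDM_CENTER


def route_alarms(alarms: List[Alarm]) -> Dict[str, List[Alarm]]:
    """Group alarms by the operations center that should receive them."""
    routed: Dict[str, List[Alarm]] = {TDM_CENTER: [], PACKET_CENTER: []}
    for alarm in alarms:
        routed[destination_for(alarm)].append(alarm)
    return routed


if __name__ == "__main__":
    incoming = [
        Alarm(Layer.SONET, "loss of signal on a ring span"),
        Alarm(Layer.VLAN, "customer VLAN unreachable"),
        Alarm(Layer.WDM, "optical power below threshold"),
    ]
    for center, items in route_alarms(incoming).items():
        print(center, [a.message for a in items])
```

Keeping the rule in one place lets a carrier retain segregated operations centers while still collecting every alarm through a single management system.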
SevOne offers a scalable, flexible network and application performance management system that plugs seamlessly into your existing environment. With SevOne, the performance of even the most complex enterprise can be managed from a single web-based view. SevOne delivers fully customizable reports and graphs specific to today’s technologies such as VoIP and virtualized servers.
Mike Phelan, CEO of SevOne, says, “We’re entering places such as Comcast, HBO and Credit Suisse — some very large companies that had been using legacy products. Why would they consider using SevOne? Well, we have several unique differentiators. First, they already have a great deal of functionality, but not ‘from a single pane of glass’ so to speak. So our first differentiator is that we give them all of the flow technologies, all of the e-hit lists and SNMP technologies from a single pane. Next, we offer a very flexible environment. For example, I asked J.P. Morgan Chase why they bought our SevOne product, and they responded that, because of all of the consolidation in the banking industry, they had inherited a veritable Noah’s Ark of tools. They had two of everything. But they decided to establish some standards, and they picked TAKs. When they asked each of their providers, ‘How long and how much will it take to make it a standard across your product lines?’ they got quotes back ranging from $250,000 to $1.5 million, in a time frame of six months to a year-and-a-half. As it happens, we at SevOne were able to do it in about five days.”
“We’ve instituted a process to quickly add enhancements that our customers need,” says Phelan. “We have a well defined roadmap. However, we will alter that roadmap, particularly if it provides ubiquitous help to our product. For example, Comcast asked us to do something that’s appearing in our new release. We call it ‘deferred data’. They wanted to take other datasets that didn’t come from routers, switches or MIBs; instead, they wanted to take business-derived datasets and enter them into our system and correlate that against IT infrastructure performance. We did it for them, and there were some specific reasons they had for wanting to do that. But what we found is that has some interesting value related to other environments.”
SevOne’s CTO and Co-Founder, Vess Bakalov, says, “For example, one of our service provider customers told us that they can now go ahead and get a composite metric of all of their traffic going to a particular metro area. They can still look at device-by-device communications, using our technology, they can make calculations based on all the delays coming into the metro area, and figure out the composite load and quality of service, and that minimizes the number of things they need to look at, but they can still drill down quickly to figure out what the details of the problem may be.”
Adds Phelan, “We haven’t wrapped any marketing fluff around this capability yet, so we don’t have a catchy term for it, other than ‘deferred data’. Regardless of what you call it, it’s a terrific concept and we’ve seen whole companies being built on less. The fact that we can take business metrics and correlate them is great, but even here we’re just scratching the surface. I had a conversation with a CIO of a very large credit card company, and he called it ‘failed acquisition tracking’. He was referring to those little cards and online applications you fill out to get a credit card. Those people who fail to complete the whole process are tracked, but the company doesn’t correlate that against network performance. We give them an ability to take that dataset — when it happened, when they stopped doing it — and correlate that against performance issues. The company would wait for 20 seconds for an application to be filled properly, but for some reason the would-be customer doesn’t complete their application and that leads to delays. The CIO said that these events cost them tens of millions of dollars. Now, they don’t know if network or applications performance issues are to blame, but they will know if they use our product. So there are some really interesting business tie-ins relating to our product that I haven’t seen elsewhere.”
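Correlating a business dataset against infrastructure performance can be as simple as computing a correlation coefficient over matching time buckets. The sketch below uses a plain Pearson correlation as a stand-in technique; SevOne's actual method is not described in the article, and the hourly figures are made up for illustration.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length, at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical hourly buckets: application response time in milliseconds
    # and the number of credit card applications abandoned in the same hour.
    response_ms = [200, 250, 900, 300, 1500]
    abandoned = [4, 5, 19, 6, 31]
    print(f"correlation: {pearson(response_ms, abandoned):.3f}")
```

A coefficient near 1 would suggest that abandonment rises with response time. It would not prove that the network is to blame, but it tells the operations team where to look first.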
“We offer an appliance that’s grounded in a unique peer-to-peer architecture,” says Phelan. “All of these other monitoring and management systems work with collectors and a reporter. You take many collectors and put them anywhere you need them geographically, and they all funnel back into one reporter. On the other hand, the way our peer-to-peer architecture works is that all of our appliances are part of a sort of ‘hive’ if you will, and in that hive every single appliance is both a collector and a reporter. Each one has its responsibilities, but when a report is requested, or a function is requested that automatically requires the cooperation of the whole hive, they act in cooperation with each other. Let’s say that Credit Suisse has one system in London, two in Germany, two in Chicago, one in New York and one in Washington, D.C. When you make a request for a certain report, the appliances communicate among each other and very quickly pull back just the information they need to create the report from the particular area for which each appliance is responsible. Unlike our competitors’ products, which tend to slow down as they scale up owing to bottlenecks caused by centralization, ours actually increases in speed, because it leverages the ‘grid’ of processors.”
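The collector-and-reporter "hive" Phelan describes resembles a scatter-gather pattern: each node summarizes only its own data, and the summaries are merged into one report. The sketch below imitates that idea inside a single process, with threads standing in for appliances. The `Appliance` class, the site names and the latency samples are all hypothetical, and nothing here reflects SevOne's real implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Sequence, Tuple


class Appliance:
    """Hypothetical hive member that stores samples for its own region only."""

    def __init__(self, site: str, latency_ms: Sequence[float]) -> None:
        if not latency_ms:
            raise ValueError("each appliance needs at least one sample")
        self.site = site
        self._latency_ms = list(latency_ms)

    def partial_summary(self) -> Tuple[float, int, float]:
        """Return (sum, count, max) so that raw samples never leave the node."""
        return sum(self._latency_ms), len(self._latency_ms), max(self._latency_ms)


def hive_report(appliances: List[Appliance]) -> Dict[str, float]:
    """Ask every appliance for its partial result in parallel, then merge them."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(Appliance.partial_summary, appliances))
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    peak = max(p[2] for p in partials)
    return {"mean_ms": total / count, "peak_ms": peak, "samples": count}


if __name__ == "__main__":
    hive = [
        Appliance("London", [10, 20]),
        Appliance("Chicago", [30]),
        Appliance("New York", [40, 50]),
    ]
    print(hive_report(hive))
```

Because each node ships back only a small summary, the work of building a report is spread across the hive rather than funneled through one central reporter.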
Open Source is Ripe for Management
Sophisticated management techniques have even come to the world of open source telephony and Digium Asterisk IP PBXs. For example, recently Six L’s Packing Company, a big American tomato and vegetable grower, deployed Packet Island’s PacketSmart Enterprise Platform to manage its 25-site Asterisk VoIP deployment. The PacketSmart VoIP/data management solution enables QoS-based network assessment and 24x7 VoIP-data flow monitoring, providing distributed enterprise customers a way to manage heterogeneous voice/data networks at remote sites. PacketSmart can determine everything from whether a DSL line is adequate for running voice on, to remotely troubleshooting complex signaling issues.
Prior to deploying PacketSmart, Six L’s had many VoIP quality issues, and toyed with the idea of going back to using traditional telephony. Although their traditional SNMP-based network products did a good job of monitoring the health of their network devices, they weren’t very good at troubleshooting transient VoIP issues. Indeed, those products couldn’t even characterize the problem, let alone pinpoint the root cause.
After deploying the PacketSmart platform and installing Packet Island micro-appliances at each of its 25 sites, Six L’s could quickly characterize the nature of the transient problems reported by its branch sites. “A majority of the problems were isolated to QoS configuration issues and certain out-of-capacity network devices,” says Drew Middleton, Six L’s system and network administrator.
The following companies were mentioned in this article:
Juma Technology (www.jumatechnology.com)
Packet Island (www.packetisland.com)
Zeugma Systems (www.zeugma.com)