VoIP is the single biggest thing to happen to enterprise networks since the telephone. IP Telephony will make communications cheaper, easier, and more flexible. Convergence will fundamentally change how we work and how we communicate. In short, VoIP will save your life.
You’ve probably already heard all of this before, as well as dozens more hackneyed platitudes and sound bites. And you’re probably already quite sold on the importance of upgrading your network to VoIP, but you have questions and concerns. And while many of these can be addressed by the many VoIP vendors and system integrators vying for your business, there may be a few they’re not so able to answer.
One question often comes up toward the end of a deployment: “Once I upgrade my enterprise communications network to VoIP, how will I manage it?” Of course, a better question to ask is: “How do I keep management in mind while I plan and execute my VoIP network upgrade, so I don’t need to worry about disconnects, poor-sounding calls, application failures, and other problems?” Unfortunately, few vendors are very interested in detailed conversations about the need to adequately manage the “easy-to-use” equipment they’re trying to sell you.
The truth is that VoIP has already started a major transformation in how we communicate, and things are going to get bumpier before they get easier. Yes, one day you’ll plug into the wall and have dial tone, video, data, and more, in much the same way we enjoy electricity and water today, but despite the best efforts by many service providers around the world, that day is a long ways off.
The good news? Your organization can very likely achieve significant cost savings, tangible productivity increases, and move to an underlying technology with almost limitless room for growth and extension — all while accepting modest risk, modest cost, and (relatively) stress-free nights. How? A strong management solution, so that whatever vendor equipment you deploy, no matter its configuration, and no matter which SI you use, you’ll know the VoIP application is available, meeting customer needs, and secure. And, if there is a problem, you’ll know immediately, and have the tools to drill-down and find both the problem and solution quickly and efficiently — hopefully without end-user impact.
In order to understand what management system must be built, let’s first review the five major challenges of VoIP management. Keeping these five challenges in mind, as well as challenges unique to your organization, will help you build the right management solution, whether you build your own, buy commercial off the shelf products, or outsource to a managed service provider.
Challenge 1: Anything Can Break
Chances are quite good that your data network is complex — very complex. You probably have a headquarters, at least several distributed sites, and at least a few telecommuters. And if you’re anything like your competition, you’ve been adding more and more business critical features and functionality to your network, from things as simple as e-mail and instant messaging to sales force automation, workflow applications, and ERP systems. Now you’re going to add VoIP, which runs on the same network, but adds a whole new level of application fickleness.
Consider an example where Boris wants to communicate with Natasha. If he sends an e-mail, and due to transient network congestion it takes five minutes to arrive, there’s no issue (let’s assume for the moment that Natasha is not eagerly awaiting news on Bullwinkle’s whereabouts). But now if Boris picks up the IP phone to call Natasha, and a link is temporarily out of service or even just has a considerable delay, they can’t connect. Bottom line: the data network you’re running today may or may not be ready for VoIP. If you’re like most companies, you’re going to need to do some upgrading of both equipment and management systems. That’s healthy and natural: you didn’t need this level of resiliency before, but VoIP needs it now.
Challenge 2: Understanding the Data
Most data equipment can be rather chatty when there’s a problem (or even when there isn’t!), emitting everything from Simple Network Management Protocol (SNMP) traps to database records to Element Management System (EMS) reports. Sometimes this information can be hard to understand, let alone act upon. Do you really know how much of a problem it is if a certain router is running at 82 percent CPU usage? Adding VoIP-specific and VoIP-enabled equipment makes this more complex. Consider the simple case of an IP PBX. A pure IP PBX has probably only been in existence for a few years. Even an IP-enabled PBX has many features (certainly the IP ones) that are relatively young, even by technology standards. In the never-ending quest to get your business, vendors add new features and apply patches all the time, many of which change event formats and even meanings. You need a system that understands these events and adapts as they change. It’s not enough that it works today: you need to be comfortable that when something changes tomorrow, such as a new feature being added or an IP PBX being demoted from a production to a test server, you have a way to alter how you act upon its events.
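One way to picture "a way to alter how you act upon its events" is to keep the interpretation rules as data, separate from the code that applies them. The following is a minimal sketch under that assumption, not a description of any vendor's product; the device name `ippbx-1`, the message text, and the severity labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    source: str    # device the rule applies to (hypothetical name)
    contains: str  # substring to look for in the raw event text
    severity: str  # severity to assign when the rule matches


def classify(source, message, rules, default="info"):
    """Return the severity of the first rule that matches this event."""
    for rule in rules:
        if rule.source == source and rule.contains in message:
            return rule.severity
    return default


# While the IP PBX is in production, a trunk failure is critical.
rules = [Rule("ippbx-1", "trunk down", "critical")]
before = classify("ippbx-1", "trunk down on port 3", rules)

# The PBX is demoted to a test server: swap the rule, leave the code alone.
rules = [Rule("ippbx-1", "trunk down", "minor")]
after = classify("ippbx-1", "trunk down on port 3", rules)

print(before, after)
```

Because the rules live in a plain list, changing how an event is treated is an edit to data rather than a change to the management system itself.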
Challenge 3: Understanding the User Perspective
While managing the underlying network components and services is critical, it’s all moot if your users experience poor quality calls. The key word here is “experience.” If a user picks up the phone and doesn’t get dial tone, or makes a call and hears an echo, low volume, or choppy speech, she’s going to formulate a negative opinion about the phone service at the time. It doesn’t matter that there were no problems reported on the network, that twenty other users made calls at the same time with no issues, or that someone was downloading several new mp3s for their iPod and caused network congestion.
If you’re lucky, she’ll complain, so you can log a ticket and research the error. More likely, she’ll just form a negative opinion of the service, use it less (perhaps using her mobile phone instead), and not complain until she’s experienced a number of problems with a clear and demonstrable effect on productivity. Only then will you get a complaint, and it won’t usually have exact times and dates attached as much as it will have an understandable dissatisfaction and unhappiness with the service.
One way to handle this is to encourage users to report bad calls, perhaps even with a simple-to-use Web form. An even better way is to leverage technology that monitors actual user calls, or at least a representation of them; this is generally called “passive monitoring.” Related and also quite useful is “active monitoring,” which places synthetic calls on the network, both periodically and on demand. A number of vendors have passive and/or active monitoring tools available today, and most are evolving these technologies to work even better. Virtually all allow this information to be passed “northbound” so it can be viewed in the context of other network information.
Challenge 4: Correlating User Experience with Network Information
So you’re collecting data from all the equipment on the network, as well as actual service quality data from both your service quality management (SQM) tools and your actual users. Now, how do you get this to work together, so you can quickly resolve issues, preferably before they become major? The first step is to make sure all of your information is going to a central system, so that it can be analyzed in the context of all the available information. Next, you need to start prioritizing events so you know where to focus first. Even if the supply closet’s phone has failed three times over the last week, a single failure in the receptionist’s phone can be much more disruptive for business operations. Finally, you need to do the hard part: start looking for patterns of failures, especially between user-reported or SQM-reported events and actual network problems.
For example, you may have a user who likes to download mp3s over his lunch hour. A misconfiguration on your network may result in those mp3s receiving the same priority as VoIP calls, which can result in network congestion. This congestion will be reported in SNMP traps and Syslog messages, and can be used to help further the diagnosis.
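The simplest form of this kind of correlation is matching by time: for each user complaint, list the network events logged close to it. The sketch below assumes complaints and events have already been collected centrally as `(timestamp, ...)` tuples; the five-minute window, the device names, and the sample data are made up for illustration.

```python
from datetime import datetime, timedelta


def correlate(complaints, network_events, window=timedelta(minutes=5)):
    """Pair each complaint with the network events logged within `window` of it."""
    matches = {}
    for user, when in complaints:
        matches[(user, when)] = [
            text for ts, text in network_events if abs(ts - when) <= window
        ]
    return matches


# Hypothetical data: one lunchtime complaint and two logged events.
complaints = [("natasha", datetime(2024, 1, 8, 12, 14))]
network_events = [
    (datetime(2024, 1, 8, 12, 12), "link congestion on wan-1"),
    (datetime(2024, 1, 8, 9, 30), "config saved on core-sw"),
]

result = correlate(complaints, network_events)
for key, events in result.items():
    print(key[0], events)
```

A match in time is only a lead, not a diagnosis, but it narrows where to look first.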
While this can be done by hand, it’s a lot easier to use off-the-shelf software to help automate much of this work. Some of these packages automate correlation entirely on their own, others let you define arbitrary service correlation and modeling (you map out the service to be managed), and the best do both. Also look for strong visualization capability, so that the entire network and all important services can be viewed in multiple ways. For instance, sometimes you’ll want a layers 1–3 view, other times you’ll want a layer 7 application view, and other times you’ll want something else. Remember that no single paradigm will find every problem, so look for software that has a good variety of capabilities.
Challenge 5: Baselining and Performance Management
Once your VoIP network is running optimally, you need to baseline its performance, use trending tools for capacity planning, and compare past results to current results when problems occur. A strong performance management solution is critical here, whether you build your own or buy off the shelf. While your VoIP deployment may be complete, you’ll likely be deploying extra phone features, unified messaging capabilities, and more IP-based services that will increase both the bandwidth utilization and the complexity of your network. By tracking the network over time, you can identify needs before they become critical, such as a new IP PBX, dwindling disk space in a voice mail server, or an increasingly slow backup cycle in your customer database.
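As a rough illustration of comparing current results against a baseline, the sketch below flags a measurement that falls far outside the historical norm. It assumes a simple mean-and-standard-deviation baseline and a three-sigma threshold, both chosen here for illustration; real performance management tools typically use richer models, and the utilization figures are made up.

```python
from statistics import mean, stdev


def deviates(history, current, k=3.0):
    """True if `current` is more than k standard deviations from the historical mean."""
    baseline = mean(history)
    spread = stdev(history)
    return abs(current - baseline) > k * spread


# Made-up link utilization percentages recorded while the network ran well.
history = [40.0, 42.0, 41.0, 39.0, 43.0]

print(deviates(history, 44.0))  # close to the baseline
print(deviates(history, 60.0))  # well outside it
```

The same comparison works for any tracked metric, such as free disk space on a voice mail server or the duration of a backup cycle.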
With all the complexity of managing a VoIP network, many enterprises are looking to outsource their operations. This is especially true of companies that are already comfortable with outsourcing, or are small enough that they simply can’t afford to have a dedicated technical team. This approach can be very beneficial, as it allows both the enterprise and the managed service provider (MSP) to focus on their core competencies. Clearly, there are a lot of factors to consider when picking an MSP. Many enterprises look for the safety and security of a larger MSP, while others like the personal and customized services of a small MSP, perhaps even leveraging the systems integrator they may have used to help with the original installation. Whoever you choose, make sure both that you’re working with someone you trust and that you establish clear Service Level Agreements, so that if problems occur the MSP will feel the pain almost as much as you do. If problems occur, you want a rapid fix, not finger pointing and denial. Also, don’t be shy about asking for a hybrid environment, where perhaps you’ll manage some of the equipment but they’re responsible for the remainder as well as the total service. Most MSPs are quite eager for your business, and are willing to work within your constraints to get you what you need.
For those that choose to manage their own VoIP deployments, either exclusively or with an MSP assisting, the right management tools are critical. As is often the case in technology, no single vendor will have every solution you need, but the best will have most of the right components and have no trouble working with your other best-of-breed selections. You need to build the right network management solution both for today and the future — you certainly don’t want to train your team on one solution and replace it in one year. As is always the case with vendor selection, you’ll need to find the right blend of several factors, including innovation and stability. Sometimes you’ll want the safety of a large vendor, other times you’ll require the innovative approach of a smaller one. Whatever direction you choose, look for good technology and a strong customer base as prerequisites.
Finally, remember that your network will almost certainly grow quickly and dramatically, even if your core business is only experiencing modest growth. Metcalfe’s Law states that the value of a network grows roughly with the square of the number of connected users. Every time you add a user, a new application, or a new service, you’re adding complexity. And, unfortunately, every time you add network complexity, management complexity also increases. With the right planning, people, and tools, you can minimize that complexity and focus your efforts on building the right applications and services for your user base, rather than on trying to determine what failed and why.
Jeremy Bloom is Director of Product Management at Micromuse, a provider of scalable, real-time business and service assurance software solutions. For more information, please visit www.micromuse.com.