It’s the call that no network manager wants to get – your CEO is trying to make a VoIP call and the connection is choppy, making it very difficult to conduct business.
Internet telephony needs steady, real-time bandwidth, yet the Internet was not designed to deliver real-time traffic. Anything else that consumes a large amount of bandwidth, such as images or video, can disrupt VoIP calls.
While Netflix and YouTube can buffer a few minutes of video ahead of playback, VoIP calls happen in real time, and network managers face the task of delivering crystal-clear, frustration-free communication.
Here are six things network managers can do to keep their VoIP systems running smoothly.
1.) Deploy monitoring points far and wide
You can't troubleshoot a problem that you can't see. Yet network administrators continually run into infrastructure that is insufficiently instrumented, with monitoring points set up only at the perimeter.
To get a broader view into potential bottlenecks, network managers should use sources such as NetFlow or sFlow, which most routers and switches inside the network can generate. Because the data comes from equipment you already own, this is a quick way to add visibility with little or no extra investment.
NetFlow and sFlow act as a pen register for individual communications at specific points. By recording device, volume, and time, they track the ongoing competition for network resources and provide fast answers to questions like:
- Is a link down, overloading an alternate?
- Is someone listening to Internet radio, watching Netflix or YouTube, or downloading large files?
- Is the CEO’s VoIP call being assigned the correct QoS level?
- Who is using bandwidth at each link, and what are they doing?
- Are the highest consumers of bandwidth internal or external?
- Is the high bandwidth sustained (leading to more uniform choppiness), or are there sporadic peaks (leading to bad-connection bursts)?
Knowing the type of traffic (VoIP) and the time of the choppy call makes it simple to zero in on the links that call used, whether or not they are the expected links for that traffic, and to see at a glance which segments are carrying high traffic and what that traffic is. In this way, network operators make the best use of their time and stand a much better chance of clearing up poor performance fast.
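As a rough illustration of the kind of question flow data can answer, the sketch below totals bytes per source and application on one link during the window of a choppy call. The `FlowRecord` fields, link names, and sample values are hypothetical; real NetFlow or sFlow exports carry more fields and would normally be queried through a collector.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowRecord(NamedTuple):
    """Simplified, hypothetical flow record (not a real NetFlow/sFlow layout)."""
    link: str    # name of the monitored link or interface
    src: str     # source IP address
    app: str     # application label assigned by the collector
    start: int   # flow start time, seconds since epoch
    octets: int  # bytes transferred


def top_talkers(flows, link, t_start, t_end, n=3):
    """Return the n (source, app) pairs that moved the most bytes on `link` in [t_start, t_end)."""
    totals = defaultdict(int)
    for f in flows:
        if f.link == link and t_start <= f.start < t_end:
            totals[(f.src, f.app)] += f.octets
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Made-up sample data: the choppy call happened between t=90 and t=200.
flows = [
    FlowRecord("wan-1", "10.0.0.5", "voip", 100, 80_000),
    FlowRecord("wan-1", "10.0.0.9", "video", 110, 9_000_000),
    FlowRecord("wan-1", "10.0.0.9", "video", 130, 6_000_000),
    FlowRecord("wan-1", "10.0.0.7", "backup", 400, 50_000_000),
    FlowRecord("lan-2", "10.0.0.8", "web", 120, 2_000_000),
]
result = top_talkers(flows, "wan-1", 90, 200)
for (src, app), octets in result:
    print(f"{src:<12} {app:<8} {octets:>12,d} bytes")
```

In this made-up data set, the video stream from 10.0.0.9 dominates the link during the call, while the large backup falls outside the window and is ignored.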
2.) Understand typical link usage and learn users’ peak times
Knowing your network is key to keeping your VoIP infrastructure up to par. Generally, there are times throughout the day when problems are most likely to occur or when users need more bandwidth. Additionally, certain locations within the network tend to be more prone to bottlenecks.
The human mind is the best anomaly detector, so knowing what typical link usage looks like is crucial for quickly finding and solving potential issues. For example, you may determine that any link that approaches 70 percent peak utilization is a potential hazard to your VoIP calls. Or you may find that specific links are more or less tolerant of high utilization and so need different rules, or that a specific link in the critical path is prone to bursts of high traffic.
While some network tasks generate bursts of traffic by design – automated backups, database synchronizations, operating system updates, and other scheduled tasks that can be turned off or rescheduled – much of a network’s emergent behavior stems from human users. Break times during which people surf the web, meeting times prior to which the participants all download materials, quitting time when people start checking in code and saving files to shared drives, all can seem to synchronize out of nowhere and cause bottlenecks. Zooming out to see multiple days’ traffic levels divided by application will make these patterns clear, and show where a confluence of factors leads to a perfect bandwidth storm.
It’s also important to ask which recent VoIP calls were clear. If 70 percent peak utilization is a potential hazard, even 65 percent could be fine. Knowing which times to examine and what the outcomes were will help you learn exactly how your network affects your mission.
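The 70 percent rule of thumb can be turned into a simple check. The sketch below is one way to do it, assuming you already have per-interval throughput samples for each link; the link names, capacities, and sample values are invented for illustration. It also reports how often a link sat above the threshold, which helps separate sustained load from sporadic peaks.

```python
def peak_utilization(samples_bps, capacity_bps):
    """Highest observed utilization, as a fraction of link capacity."""
    return max(samples_bps) / capacity_bps


def fraction_above(samples_bps, capacity_bps, threshold):
    """Share of sampling intervals at or above the utilization threshold."""
    over = sum(1 for s in samples_bps if s / capacity_bps >= threshold)
    return over / len(samples_bps)


def links_at_risk(link_samples, capacities, threshold=0.70):
    """Return {link: peak utilization} for links whose peak meets the threshold."""
    at_risk = {}
    for link, samples in link_samples.items():
        peak = peak_utilization(samples, capacities[link])
        if peak >= threshold:
            at_risk[link] = peak
    return at_risk


# Hypothetical 100 Mbps links with a handful of throughput samples (bits per second).
capacities = {"wan-1": 100_000_000, "wan-2": 100_000_000}
link_samples = {
    "wan-1": [40_000_000, 72_000_000, 55_000_000, 71_000_000],
    "wan-2": [30_000_000, 65_000_000, 20_000_000, 25_000_000],
}

at_risk = links_at_risk(link_samples, capacities)
busy_share = fraction_above(link_samples["wan-1"], capacities["wan-1"], 0.70)
print(at_risk, busy_share)
```

The threshold is a parameter rather than a constant because, as noted above, different links may tolerate different levels of utilization.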
3.) Watch for bandwidth hogs
Unmetered content that can be downloaded at maximum line rate can pose a major risk to your crystal-clear VoIP calls. YouTube videos, streaming Netflix, WebEx and Facebook can temporarily soak up available bandwidth and degrade voice calls, even when marked as bulk traffic. Priority is not destiny: Your devices may make a best-effort attempt to prioritize VoIP traffic, but even polite protocols can, in sufficient quantity, clog a network.
Network managers should use their widely deployed sensor networks to watch for the common paths this bulk traffic takes. They can then determine whether they are able to isolate it, traffic-shape it, rate-limit it, reclassify it as scavenger or best-effort, or reroute the traffic altogether. Expanding the range of quality of service classes in use can allow for more fine-tuned traffic management. Rather than adding to the monitoring burden, this can actually make monitoring easier.
Knowing details of the traffic at each QoS level will also help. Bulk TCP traffic will attempt to manage congestion by backing off, but will also attempt redelivery of lost packets, leading to greater bandwidth use over time. Streaming or interactive video can be a one-two punch of high and unpredictable bandwidth with higher DSCP numbers than any user traffic except voice, and yet when deployed is often valuable. This is where bandwidth and policy intersect, and where scrupulous traffic monitoring can inform decisions when mission priorities conflict.
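To see how traffic divides across QoS levels, flow data can be grouped by DSCP value. The sketch below assumes a commonly used marking scheme (EF = 46 for voice, AF41 = 34 for interactive video, AF11 = 10 for bulk data, CS1 = 8 for scavenger, 0 for best-effort); your network's markings may differ, and the sample flows are invented.

```python
from collections import Counter

# Assumed marking scheme; adjust to match the classes your network actually uses.
DSCP_CLASSES = {
    46: "voice",        # EF
    34: "video",        # AF41
    10: "bulk",         # AF11
    8: "scavenger",     # CS1
    0: "best-effort",   # default
}


def bytes_by_class(flows):
    """Sum bytes per QoS class from (dscp, octets) pairs."""
    totals = Counter()
    for dscp, octets in flows:
        totals[DSCP_CLASSES.get(dscp, f"other({dscp})")] += octets
    return totals


def class_shares(flows):
    """Return each class's share of total bytes, as a fraction between 0 and 1."""
    totals = bytes_by_class(flows)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {cls: octets / grand_total for cls, octets in totals.items()}


# Hypothetical (dscp, octets) pairs observed on one link.
sample = [(46, 1_000_000), (34, 5_000_000), (0, 3_000_000), (8, 1_000_000)]
shares = class_shares(sample)
for cls, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{cls:<12} {share:.0%}")
```

In this example, video carries half of the bytes at a priority just below voice, which is exactly the kind of situation where policy and bandwidth need to be weighed together.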
4.) Beware of log anomalies
Most switches start dropping packets well before they fail. Additionally, most routers show a high CPU load well before they go offline. Anomalies such as packet drops or high latencies, especially during low traffic loads, are often an indication that you’re about to stay at work for the night.
Being aware of what has been working well, and what has given you trouble in the past, will help you solve log anomaly problems more easily.
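Drops during low load are easy to pick out once counters are collected at regular intervals. The following sketch assumes samples of the form (timestamp, utilization, dropped packets) and treats anything under 30 percent utilization as low load; both the data layout and the cutoff are illustrative assumptions, not recommendations.

```python
def drops_under_low_load(samples, low_load=0.30):
    """Return samples that show packet drops while utilization is below `low_load`.

    Each sample is a (timestamp, utilization, dropped_packets) tuple,
    with utilization expressed as a fraction of link capacity.
    """
    return [
        (ts, util, drops)
        for ts, util, drops in samples
        if drops > 0 and util < low_load
    ]


# Hypothetical interface samples.
samples = [
    ("09:00", 0.82, 120),  # drops under heavy load: expected congestion
    ("09:05", 0.15, 0),    # quiet and clean
    ("09:10", 0.12, 37),   # drops while nearly idle: worth investigating
    ("09:15", 0.28, 4),    # also suspicious
]
suspicious = drops_under_low_load(samples)
for ts, util, drops in suspicious:
    print(f"{ts}: {drops} drops at {util:.0%} utilization")
```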
5.) Secure your network
If systems on your network have been subverted by a hacker, they may be used in a distributed denial of service attack. This would result in the immediate loss of many VoIP calls as network resources are consumed outside the expected bounds and without regard to ordinary policy. Such traffic may be tagged with high QoS levels or may otherwise masquerade as high-priority traffic. Knowing your network will help you spot these fakes.
Keep your security tools up to date for maximal protection. Subscribe to malware detection mailing lists to keep your finger on the pulse of the threats seen by other companies and to learn what warning signs to look for in your traffic. Lists of known-bad IP addresses and suspicious ports are easy to turn into monitoring or firewall rules.
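As a minimal sketch of turning such a list into a monitoring rule, the example below checks flow endpoints against a blocklist using Python's standard `ipaddress` module. The addresses come from ranges reserved for documentation, and the port number is only a placeholder for whatever your threat feeds flag.

```python
import ipaddress


def load_blocklist(entries):
    """Parse single addresses or CIDR ranges into network objects."""
    return [ipaddress.ip_network(entry, strict=False) for entry in entries]


def flag_flows(flows, blocklist, suspicious_ports=frozenset()):
    """Return flows that touch a blocklisted address or a suspicious destination port.

    Each flow is a (src_ip, dst_ip, dst_port) tuple.
    """
    flagged = []
    for src, dst, dst_port in flows:
        addrs = (ipaddress.ip_address(src), ipaddress.ip_address(dst))
        hits_blocklist = any(addr in net for addr in addrs for net in blocklist)
        if hits_blocklist or dst_port in suspicious_ports:
            flagged.append((src, dst, dst_port))
    return flagged


# Hypothetical blocklist and flows.
blocklist = load_blocklist(["203.0.113.0/24", "198.51.100.7"])
flows = [
    ("10.0.0.5", "203.0.113.40", 443),   # destination is inside a blocklisted range
    ("10.0.0.6", "192.0.2.10", 5060),    # ordinary SIP signaling
    ("10.0.0.7", "192.0.2.11", 6667),    # placeholder suspicious port
    ("198.51.100.7", "10.0.0.8", 22),    # source is a blocklisted host
]
flagged = flag_flows(flows, blocklist, suspicious_ports={6667})
for flow in flagged:
    print("ALERT", flow)
```

The same matching logic could feed a firewall rule generator, but alerting first lets you confirm that a list does not catch legitimate traffic.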
6.) Plan, plan, plan
Capacity planning is everything. If you build up an understanding of the maximum capacity you need, then planning your network becomes much easier. Here are a few things to keep in mind:
- the maximum traffic capacity required if all phones were used at the same time;
- the paths this traffic takes and the required capacity;
- your failover options;
- security defenses that directly examine traffic and generate alerts, and the network manager time needed to investigate those alerts; and
- a monitoring infrastructure that will grow with your network and allow you to assess the effectiveness of the above points.
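The first item, worst-case voice capacity, can be estimated with simple arithmetic. The sketch below assumes the G.711 codec (64 kbps) with 20 ms packetization, 40 bytes of IPv4/UDP/RTP headers and 18 bytes of Ethernet framing per packet, and that every phone is on its own call crossing the link in question. Other codecs, tunnels, or link types change the numbers, so treat the defaults as assumptions to replace with your own.

```python
def call_bandwidth_bps(codec_bps=64_000, packet_ms=20, l3_overhead=40, l2_overhead=18):
    """Per-direction bandwidth of one call, in bits per second.

    l3_overhead: IPv4 (20) + UDP (8) + RTP (12) header bytes per packet.
    l2_overhead: Ethernet header plus frame check sequence, in bytes.
    """
    payload_bytes = codec_bps * packet_ms // 8000  # voice bytes carried per packet
    packets_per_second = 1000 // packet_ms
    return (payload_bytes + l3_overhead + l2_overhead) * 8 * packets_per_second


def worst_case_bps(phones, per_call_bps):
    """Per-direction load if every phone is on a call at the same time."""
    return phones * per_call_bps


per_call = call_bandwidth_bps()
total = worst_case_bps(200, per_call)  # 200 phones is an illustrative figure
print(f"per call: {per_call / 1000:.1f} kbps, 200 phones: {total / 1_000_000:.2f} Mbps")
```

With these assumptions each call needs about 87 kbps in each direction, so 200 simultaneous calls need roughly 17.4 Mbps each way before any headroom for other traffic.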
Failover options are of utmost importance. Whatever your failover strategy may be, you must test it. There is nothing worse than a backup plan that doesn't work because it wasn't tested. If you study your network and know when it’s busy, the best times to test will be clear.
By keeping an eye on your network and planning ahead, complaints about choppy VoIP calls should decrease, and you’ll have a lot more peace of mind.
Vincent Berk is founder and CEO of FlowTraq (http://www.flowtraq.com/).
Edited by Stefania Viscusi