Business Intelligence At Cox: A Technical Case Study
TMCnet - The World's Largest Communications and Technology Community
TMC Launches New Sites ::  NGC  |  4GWE  |  Green Tech  |  Satellite  |  IT |  ITEXPO  |  Healthcare  |  Smart Grid  |  M2M  |  Smart Products  |  AstriCon News  |  SATCON News
Share
TMCnews

[October 22, 2003]

Business Intelligence At Cox: A Technical Case Study

Cox Communications is a Fortune 500 Network Services Provider (NSP) that delivers cable television, local and long-distance telephone, and high-speed Internet to more than six million customers across the US. The key to their success is the performance and availability of their service-providing infrastructure. Cox uses a number of point solutions to monitor network devices, but until the deployment of the Network Monitoring Analysis Solution (NetMon) lacked a clear understanding of the health and performance of their infrastrucure, as a whole, over time. The solution lay in the power of business intelligence: Cox invested in Microsoft technology and Gold-Certified Partner Intellinet to build an industry-standard, enterprise data warehouse to give them the knowledge and insight needed to achieve service excellence and maintain a competitive advantage.

CUSTOMER OVERVIEW
Cox�s service-providing infrastructure (hereafter referred to as simply �infrastructure�) is made up of many �sites� that provide cable television, telephone, and high-speed Internet to customers in a specific geographic region. A Site Operations Center (SOC) monitors each site and is ultimately responsible for that site�s infrastructure�these are the people who actually service the equipment. Cox�s Network Operations Center (NOC), located in Atlanta, monitors all of these sites and all �upstream� infrastructure not visible to the sites.

WHY BUSINESS INTELLIGENCE
The NOC is flooded with thousands of device alarms every day. They use several network monitoring tools that provide �point in time� device event and configuration data, but while useful these tools do not allow NOC staff to perform sophisticated data correlation or trending. They can�t analyze device performance over time, assess the performance of the devices within a specific system or geographic region, or evaluate the performance and efficiency of the NOC itself. Before NetMon, NOC staff analyzed data through �one-off� reports that were both time-consuming to build and of questionable accuracy. The bottom line: from executives to engineers, nobody at Cox had ready access to the kind of information required to consistently and cost-effectively provide a high level of service to their customers.

Intellinet introduced Cox to the power of business intelligence in 2001 with the development of the Cornerstone Voice (CSV) data warehouse. Cornerstone Voice technology delivers telephony over hybrid fiber-coax networks and is a core component of Cox�s local and long-distance telephone service. CSV logs are extracted from thousands of devices across the country using a distributed, scalable architecture with multiple data collection points. The logs are parsed and loaded into a data warehouse against which NOC staff run reports on device health and configuration.

But why did Cox see a need to invest in another business intelligence system like NetMon? Because NetMon is the next logical step in the evolution of Cox�s business intelligence strategy. To Cox management, the real question was, �What is the cost of not knowing about our infrastructure?� Says Cox NOC Tools Manager Tim Winebarger, �[We] want to shift to a proactive approach� What are the devices doing? What areas need to be tweaked? What kind of growth should we expect?� CSV addressed one subject area: telephony. NetMon completed the picture by adding Cox�s two other core services: cable television and high-speed Internet.

KEY POINT
A successful business intelligence system often inspires the addition of more subject areas as users discover what can be accomplished with useful data.

ARCHITECTURAL OVERVIEW
NetMon merges device alarm, configuration, and help desk data from six data sources; loads it into a data warehouse; from it builds a sophisticated OLAP environment; and delivers web-based turnkey and custom reports to thousands of users. Even more impressive is the fact that it was built almost exclusively with �out-of-the-box� business intelligence features of Microsoft SQL Server 2000 Enterprise Edition: Data Transformation Services, Analysis Services, Reporting Services, and Notification Services.

DATA SOURCES
NetMon extracts data from six sources: three primary and three contextual. The primary data sources are the NOC�s element management system for cable television, telephone, and high-speed Internet device alarms; CSV for Cornerstone device alarms and configurations; and the help desk system for trouble tickets. The contextual data sources provide additional alarm and configuration data from the cable television and high-speed Internet networks, allowing NetMon users to analyze and prioritize device alarms in the context of their overall impact on Cox�s infrastructure.
Extract, Transform, and Load (ETL) Process
The following are key features of the NetMon�s ETL process:

� Data Transformation Services (DTS) is the basis for the entire ETL process.
� Several top-level workflow packages manage the ETL process.
� DTS package configuration settings are stored in database tables.
� The workflow packages log ETL process status in database tables.
� The DTS packages use parallel database connections to improve performance.

NetMon�s ETL process is one of its most innovative features and illustrates the power and flexibility of DTS. NetMon�s DTS packages are modular in design, allowing for efficient code reuse and minimizing the impact of changes in underlying data sources. Several top-level workflow packages execute and monitor all of the data-source-specific ETL sub-processes and stores process state in audit tables; a simple query will return the status of a given process and performance metrics such as number of records processed and the total processing time for a job. The Microsoft SQL Server Agent schedules and executes the workflow packages.

BEST PRACTICE
Modular Application Design. NetMon�s modular design is inherently easier to troubleshoot, enhance, and manage. The hierarchical DTS packages help to insulate the bulk of the ETL logic from changes in source data structures.

The largest of NetMon�s data sources is the NOC�s element management system, from which it retrieves a batch of 15,000 device alarms every ten minutes for a daily total of approximately 1.9M records. The following steps give an overview of the ETL process:

1. Retrieve configuration. The ETL process beings by querying configuration tables to dynamically set package properties, e.g. database connection settings.
2. Establish database connections. Using OLE-DB providers, it establishes three parallel connections to both the element management system�s Oracle database and NetMon�s staging database: one each for inserts, updates, and deletes. Parallel database connections reduce processing time, thereby increasing overall system performance.
3. Extract data. Next, it determines the time of the most recently extracted batch and retrieves the next batch of records.
4. Transform data. It then applies the appropriate transformation logic, using Microsoft SQL Server 2000 DTS transformation tasks and ActiveX script transformations, to process the data and load the staging database.
5. Load data warehouse. Next, it queries the staging database and loads the data warehouse, which contains multiple fact tables for Active and Historical Alarms. The DTS process handles changes to slowly changing dimensions (SCD) and adds any new members to the 38 dimensions tables using a set-based approach for fact table loading that is more efficient than the traditional cursor-loop approach. The DTS process creates temporary dimension members should a fact arrive for which no corresponding dimension member has been defined. This occurs occasionally due to network interruptions and volatile source data.
6. Load cubes and dimensions. Finally, it feeds data into Analysis Services, which continuously processes 14 cubes with between 12 and 18 dimensions each. Historical alarms are partitioned into cubes, based on alarm date, which are processed in parallel.

Parallel Processing and Process Monitoring. NetMon�s modular architecture allows multiple instances of the same DTS package to be run in parallel, reducing overall execution time and making data available to users more quickly. Experienced data warehouse architects will not only automate ETL processing, but also will design mechanisms by which these processes can be monitored.

DATA STRUCTURES
The data warehouse is a hybrid star-snowflake schema with 11 fact tables supporting the alarm and ticket subject areas which are surrounded by 38 dimension tables. The fact tables have a cluster index on date; since alarm and ticket data is typically queried by date or date range, this improves system report performance.

Key alarm dimensions include:

� Alert Types�Categories and types of alarms, e.g. Loss of Service, System Standby or Low Battery.
� Dates�Standard year-quarter-month-day and year-week-day date hierarchies.
� Hours�Hierarchies organizing time of day by shift and hour.
� Location�Describes the physical location of a network device.
� Network Devices�Individual network devices, from voice ports to switches. It contains two hierarchies: device type and physical topology. The former allows devices to be grouped by type; the latter allows NetMon users to drill-down through the network to individual devices.
� Severity Levels�Describes the severity of an alarm, e.g. minor or critical.

The Analysis Services environment overall contains 14 cubes, each with between 12 and 18 dimensions to support analysis. Alarms are grouped into Active, Recent, and Historical cubes which are partitioned according to the following logic:

� Active Alarms cube contains alarms which are currently open in the system without resolution. The single active alarm partition is processed hourly.
� Recent Alarms cube contains two partitions, one containing alarms from the last two days; Another containing alarms from the first of the current month through two days prior to the current date. Recent Alarms cubes are processed daily.
� The first 11 partitions of the Historical Alarms cube contain data from the last 11 months; the 12th contains alarms 12 months and older. Historical Alarm cubes are processed monthly.

Alarm cube base measures include:

� Event Count�The number of events.
� Alarm Count�The number of alarms for each event.
� Time To Resolve�The total alarm duration from the initial alarm until resolution.
� Time To Acknowledge�The time it took to acknowledge an alarm.
� Acknowledge Time To Resolve�The time from alarm acknowledgement to resolution.

Relative Time Dimension References. The NetMon data structures includes dynamic dates calculated members of the Dates dimension, which include Today, Yesterday, and the Last 30 Days. These dynamic time references are integrated in all of the cubes to provide means for viewing current and relative time activity. Reports can be generated based on these time references for ongoing current use.

The Analysis Services environment also contains several other cubes related to the ticket and telephony subject areas, including Frequency Hops, Open and Closed Tickets, Outages, and Device Counts.

Key ticket dimensions include:

� Priority�The priority level of the ticket.
� Assigned To Queue�The assigned group of technicians responsible for followup and resolution.
� Device Vendor�Details of the device referenced in the ticket.
� Discipline�The service refernced in the ticket: telephony, video, or data.
� Problem Type�The category and type of network problem.

Ticket cube base measures include:

� Count of Tickets�The number of tickets.
� Time To Referral�The time from ticket opening to until the issue was correctly referred to technitians in the field.

Data Partitioning. Partitioning fact tables and cubes based on data type and usage patterns can improve data availability, data loading, and reporting performance. It also allows partitioned data sets to be processed in parallel.

REPORTING
NetMon provides users with access to predefined reports, powered by Reporting Services, and sophisticated data analysis using TARGIT 2K3� AnalysisNET�, a customized thin-client Analysis Services front end from TARGIT A/S®. The NetMon portal is a custom web-based application developed for Cox using Microsoft Visual Basic.Net.

"With the addition of Reporting Services, SQL Server has continued to be our choice for new applications and an extensible platform for application consolidation. Now we can add the solid, scalable reporting infrastructure of Reporting Services to our Business Intelligence environment where we have already fully leveraged Data Transformation Services, the SQL Server relational engine, Analysis Services for OLAP and Data Mining and Notification Services for alert applications," says John Hall, Cox NOC Director.

NetMon�s predefined reports include Average Outage Times, Average Acknowledgment Times, Network Device Typology, Outage Counts By Hour/Day, and Acknowledgement Times By Site/Device/Type/User.

Analysis Services allows users to drill down from cube summary data through to the underlying detail records in the data warehouse.

A select group of power users have access to TARGIT 2K3� AnalysisNET. Integrated into the NetMon user interface, it provides a rich decision-making environment with multiple interactive and linked data visualization objects, views that call one another and support for member properties and cell-level actions.

Iterative Software Development. In order to address Cox�s needs but still deliver a working system in a timely manner, Intellinet developed NetMon in phases, clearly defining all requirements and deliverables for each phase. This approach helps to minimize risk for all parties involved.

NOTIFICATION
When the NOC�s average acknowledgement time exceeds an acceptable threshold, management needs to know. Since Cox already owned Notification Services, They configured it to send email notifications on the threshold exceptions. Notification Services also has the ability to communicate using a variety of other mechanisms including cell phones and pagers, which Cox may exploit later. Notification Services has also been leveraged to send alerts to NetMon administrators anytime an ETL process fails.

Notification Self-Enrollment. Cox plans to implement self-enrollment for NetMon notifications�NetMon users will be able to specify those notifications they wish to receive.

SECURITY
NetMon is available to 20,000 Cox employees, select partners, and subsidiaries. In order to access NetMon, users must authenticate to the Windows 2000 domain via Active Directory.

There are five types of NetMon users: NOC executives, management and engineers as well as site-level management and engineers. Users in different groups are typically interested in different types of reports, so when a user authenticates they are presented with reports that are most relevant to their job function.


Interface Personalization. NetMon presents users with the reports that are most relevant to their job function.

User-Configurable Dashboard. Cox plans to implement a user-configurable NetMon dashboard�users will be able to create a home page displaying those reports that, combined, create the most meaningful picture of the performance and health of the infrastructure.

NetMon has been in production since Q1 2003. The development team has made incremental updates to it since then, including refining the ETL logic; adding data structures, OLAP cubes, and dimensions; and enhancing the user interface. An early adopter, Cox added reports using Reporting Services and 64-bit Windows 2003 Server in Q2 2003. Says Cox�s John Hall: �The graduation to 64-bit has given us seemingly endless performance overhead where we may not have been able to scale our solution to the same levels on 32-bit. I am just not sure we could have done it without Windows 2003 and SQL Server 2000�s 64-bit edition.�

HARDWARE AND SOFTWARE
The Staging and Production environments consist of two four-processor Xeon servers running Windows 2000, SQL Server 2000 Enterprise Edition, with Multi-Instance SQL Clustering on a RAID 1+0 disk array. The Analysis Environment is a 64-bit 1.5Ghz four-processor Itanium2 server with 6MB on-board cache and 16GB RAM running Windows Server 2003 and SQL Server 2000 Analysis Services. The Presentation Environment consists of three four-processor Xeon servers running Windows 2000, Reporting and Notification Services, and TARGIT 2K3� AnalysisNET. The Presentation Environment resides behind a load-balancing switch.

Managing the high volume and poor quality of incoming event data was the single greatest challenge faced by the NetMon project team. The 1.9M event records extracted from the NOC�s element management system every day require extensive cleansing during ETL processing. To overcome this obstacle, a two-server cluster dedicated to ETL processing serves as the Staging Environment before data is loaded into the data warehouse. In addition, NetMon manages server load intelligently by concurrently executing DTS packages on multiple servers. This approach helps to minimize any impact to the performance of source systems or the data warehouse by offloading ETL processing.

NetMon extracts records from Cox�s element management system in batches of 15,000 every ten minutes, for a daily total of approximately 1.9M that is projected to grow to 7.5M by year end. Data is available in the warehouse in less than 10 minutes, and in the cubes in less than one hour. Cornerstone Analysis contributes 780,000 records per day and the contextual data sources contribute an additional 555,000 records per day. Under peak loads, NetMon processes 135 transactions per second.

NetMon is already a large system: the data warehouse is approximately 500GB and expanding at a rate of 4.5GB per week. The largest cube contains 290 million rows and is approximately 1GB. At the time of this writing, NetMon has approximately 2,000 named users and comfortably supports up to 50 concurrent, simultaneously active users.

NetMon�s distributed architecture is designed for scalability and availability: both the Staging and Production environments are clustered. Although the Analysis Environment is housed on a single server, Cox maintains a standby server that can be put into service on a moment�s notice. Finally, the Presentation Environment is distributed across three load-balanced servers.

Scalable, Available Architecture. NetMon uses a tiered architecture that can be scaled both horizontally and vertically. The use of server clustering eliminates single points of failure.

Microsoft Windows Server 2003. Cox plans to leverage the 64-bit power of Windows Server 2003 for the entire NetMon environment.

Before NetMon, Cox had poor visibility into their infrastructure. �[We] now understand how the company lives and breathes� Cox is winning awards for customer service,� says Cox�s Tim Winebarger. Microsoft technology and Intellinet�s business intelligence expertise were the keys to NetMon�s success: �I believe NetMon is built on a technology that offers us incredible flexibility,� Winebarger says. As for Intellinet, �[They] gave us value for the dollar�it was an accelerated value because it was something that didn�t take months and months�it took days� Intellinet wants to walk in the door, provide value, empower the customer, and walk away.�

Cox is ready to take over the maintenance and enhancement of NetMon. They chose to use industry-standard Microsoft technology and an IT consulting firm committed to the success of their client. �We�re confident [that we can support] 80% of it right now� We�ll be able to [hire] the people [to do the rest],� says Winebarger.

Return on investment from NetMon was realized quickly: once the level of visibility it affords into Cox�s service providing infrastructure became apparent, management accelerated the project schedule to make its reports available for critical 2002 end-of-year business. �I had several of the directors of engineering for the whole company.. they control the engineering budget.. they�ve come and looked at this data� and it wasn�t a painful process for them to get the information they needed to plan to spend or save, � says Winebarger. NetMon has given Cox management a centralized source of reliable, timely data with which to support business decisions. Now, says Winebarger, �[Management is] willing to say, �What do you need to make this [solution] really thrive?��
Best Practice

Client Empowerment. Intellinet ensured that Cox IS staff were integral to the development of the NetMon, received the appropriate technical training, and are well-prepared to take on its maintenance and to perform future enhancements.

About Intellinet
Intellinet (http://www.intellinet.com) is a Microsoft consulting firm located in the Southeastern US and holds Gold Partner Certifications in Business Intelligence, E-Commerce Solutions, Enterprise Systems and Security Solutions. With their Intelligent Enterprise integration solution, clients benefit from systems that meet current needs and can scale to future demands. Intellinet�s proven approach integrates enterprise systems including networking, security, messaging, collaboration, custom applications, data, and mobility. Celebrating 10 years as a professional technology services firm, Intellinet has offices in Atlanta, GA and Charlotte, NC.

[ Back To TMCnet.com's Homepage ]


Discussions:
Be the first to post a comment on this page!
 
By  
TMCnet
Featured White Papers
Top Stories
Related VoIP News

Today @ TMC
Upcoming Events
ITEXPO East 2010
January 20-22, 2010
Miami Beach Convention Center
Miami, FL
4G Wireless Evolution Conference
January 20-22, 2010
Miami Beach Convention Center
Miami, FL
Subscribe FREE to all of TMC's monthly magazines. Click here now.