The Rise of Open Systems in Network Management and Design
BY STEPHEN B. JOHNSON
With the worst of the telecom doldrums behind them, service providers are once again hard at work building out their packet networks. This time, however, cost containment is a top priority. So telecom OEMs (TEMs) who want to provide the equipment for these new packet networks will have to offer solutions that help service providers minimize both capital expenditures (CAPEX) and operational expenditures (OPEX).
As competition among TEMs stiffens, traditional home-grown solutions are quickly giving way to open systems approaches that reduce cost and speed time to market. Using standard platforms like PICMG 2.16 (CompactPCI Packet Switching Backplane, or cPSB) and its heir apparent, AdvancedTCA (ATCA), TEMs are finding that they can produce versatile, scalable, high-availability systems more quickly and inexpensively by leveraging plug-and-play, best-of-breed, off-the-shelf products.
The economies of scale that TEMs are exploiting as they transition to open systems will ultimately be reflected in reduced CAPEX for service providers. Of equal if not greater importance to service providers, however, are the long-term savings that open systems offer for OPEX.
The standard management framework defined by PICMG for cPSB and ATCA systems offers substantial savings on both counts. Based on the Intelligent Platform Management Interface (IPMI), the PICMG management framework reduces CAPEX by reducing the time and cost associated with building telecom equipment. It also reduces OPEX by increasing RASM (Reliability, Availability, Scalability, and Manageability), which makes it possible to deliver scalable services at a lower cost with greater uptime.
RASM & IPMI
RASM is a system-level term that encompasses hardware and software reliability combined with active monitoring and controlling of individual components. Active monitoring and control is especially important for high-density systems utilizing large numbers of high-performance processors, where thermal control and power management are major concerns.
IPMI provides a standard approach to system management that enhances RASM by allowing multiple boards and subsystems to be integrated and managed effectively using standard messaging. IPMI uses an I2C-based physical interface known as the Intelligent Platform Management Bus (IPMB) to link chassis management with board-level FRUs (field-replaceable units) that support IPMI.
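To make the "standard messaging" concrete, here is a minimal sketch of how a single IPMB request might be framed before it is written to the I2C bus. It follows the commonly documented IPMB layout (responder address, network function and LUN, a header checksum, then requester address, sequence number, command, data, and a second checksum). The function names, the requester address 0x82, and the sequence number are assumptions made for illustration; this is not a driver and does not touch real hardware.

```python
def checksum(covered: bytes) -> int:
    """Two's-complement checksum: covered bytes plus checksum sum to 0 mod 256."""
    return (-sum(covered)) & 0xFF


def build_ipmb_request(rs_addr: int, net_fn: int, rq_addr: int, rq_seq: int,
                       cmd: int, payload: bytes = b"",
                       rs_lun: int = 0, rq_lun: int = 0) -> bytes:
    """Assemble one IPMB request frame with both checksums."""
    # Header: responder address, then 6-bit NetFn packed with the 2-bit LUN.
    header = bytes([rs_addr, ((net_fn & 0x3F) << 2) | (rs_lun & 0x03)])
    # Body: requester address, 6-bit sequence number packed with LUN, command, data.
    body = bytes([rq_addr, ((rq_seq & 0x3F) << 2) | (rq_lun & 0x03), cmd]) + payload
    return header + bytes([checksum(header)]) + body + bytes([checksum(body)])


def is_valid_frame(frame: bytes) -> bool:
    """Both the 3-byte header and the remainder must sum to 0 mod 256."""
    return (len(frame) >= 7
            and sum(frame[:3]) % 256 == 0
            and sum(frame[3:]) % 256 == 0)


# Get Device ID (NetFn 0x06 "App", command 0x01) sent to the controller at 0x20.
# The requester address 0x82 and sequence number 1 are made up for this example.
frame = build_ipmb_request(rs_addr=0x20, net_fn=0x06, rq_addr=0x82, rq_seq=1, cmd=0x01)
print(frame.hex(" "))  # 20 18 c8 82 04 01 79
```

Because every board frames its messages the same way, a chassis manager can validate and route a request without knowing anything about the vendor of the board that sent it.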
IPMI can be used to monitor physical system health characteristics such as voltages, fan speeds, temperatures, and power supply status. It can also be used for automatic event notification, remote shutdown/restart (typically on a slot-by-slot basis), and maintaining a system event log (SEL). Together, these facilities give system engineers and administrators flexible, interoperable access to a broad range of platform information. This, in turn, enables the active monitoring and proactive fault detection that contribute to increased system reliability.
IPMI system management is primarily event-driven: the system under management itself initiates events, such as temperature and voltage sensor events. PICMG-compliant single-board computers (SBCs) use a dedicated IPMI controller to gather readings for parameters deemed vital to the board’s normal operation, such as temperatures and voltage levels, and convey them to chassis management over the I2C-based IPMB. This controller operates independently of the rest of the SBC and draws power directly from the backplane, which enhances reliability and availability by making the system management process impervious to board-level power failures.
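The event-driven model can be illustrated with a small sketch of a threshold sensor that reports only when its state changes, rather than on every reading. The class name, the threshold values, and the event dictionary are hypothetical, and the logic is deliberately simplified: real IPMI sensors also support lower thresholds, hysteresis, and separate assertion and deassertion events.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThresholdSensor:
    """Simplified upper-threshold sensor (hypothetical, for illustration only)."""
    name: str
    upper_noncritical: float
    upper_critical: float
    state: str = "ok"

    def classify(self, reading: float) -> str:
        if reading >= self.upper_critical:
            return "critical"
        if reading >= self.upper_noncritical:
            return "noncritical"
        return "ok"

    def update(self, reading: float) -> Optional[dict]:
        """Return an event only when the reading moves the sensor to a new state."""
        new_state = self.classify(reading)
        if new_state == self.state:
            return None  # nothing changed, so nothing is sent to chassis management
        event = {"sensor": self.name, "from": self.state,
                 "to": new_state, "reading": reading}
        self.state = new_state
        return event


# Thresholds of 60 and 70 degrees C are invented values for this example.
cpu_temp = ThresholdSensor("CPU temp", upper_noncritical=60.0, upper_critical=70.0)
events = [e for e in (cpu_temp.update(r) for r in [45, 62, 63, 75, 50]) if e is not None]
for e in events:
    print(e)
```

Five readings produce three events here, which is the point of the design: the management bus carries state transitions, not a continuous stream of measurements.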
IPMI event information conveyed over IPMB is sent to a chassis management system, which makes decisions and takes the necessary system actions to ensure proper operation and to flag boards that need attention or servicing. The chassis management system typically records the events in the system event log (SEL), which enables technical personnel to look through the logs, detect trends and patterns, and determine the sequence of events that led up to a given fault condition.
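As a rough sketch of how a log like this supports fault analysis, the following models a bounded event log and a query that returns the events on the same slot that preceded a given fault. The record fields, class names, and sample events are assumptions for illustration and do not reproduce the actual SEL record format. Dropping the oldest entry when the log is full is also a simplification; real implementations may handle a full log differently.

```python
from collections import deque
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SelEntry:
    """Hypothetical, simplified log record (not the real SEL record layout)."""
    record_id: int
    timestamp: int   # seconds, from whatever clock the chassis manager uses
    slot: int
    sensor: str
    severity: str


class SystemEventLog:
    def __init__(self, capacity: int = 512):
        # Assumption: oldest entries are discarded once capacity is reached.
        self._entries = deque(maxlen=capacity)
        self._next_id = 1

    def __len__(self) -> int:
        return len(self._entries)

    def record(self, timestamp: int, slot: int, sensor: str, severity: str) -> SelEntry:
        entry = SelEntry(self._next_id, timestamp, slot, sensor, severity)
        self._next_id += 1
        self._entries.append(entry)
        return entry

    def history_before(self, fault: SelEntry, window_s: int) -> List[SelEntry]:
        """Earlier events on the same slot within window_s seconds of the fault."""
        return [e for e in self._entries
                if e.slot == fault.slot
                and e.record_id < fault.record_id
                and e.timestamp >= fault.timestamp - window_s]


sel = SystemEventLog()
sel.record(100, 3, "Fan 1 speed", "noncritical")
sel.record(130, 5, "12V rail", "noncritical")
sel.record(160, 3, "CPU temp", "noncritical")
fault = sel.record(190, 3, "CPU temp", "critical")

history = sel.history_before(fault, window_s=120)
for entry in history:
    print(entry.timestamp, entry.sensor, entry.severity)
```

In this invented scenario the query surfaces a fan warning followed by a temperature warning on slot 3, the kind of sequence a technician would look for when tracing a thermal fault back to its cause.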
The FRU inventory data provided by field-replaceable SBCs enables operators and technicians to easily query what is present in a particular system. This ability not only reduces service trips (to locate a particular board), but also provides a means of reading board-specific information that is normally found only on labels, without having to pull the board from the system. The FRU data typically contains board and product information such as the manufacturer’s name, part number, serial number, manufacture date, and revision level.
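A remote inventory query of this kind might look like the sketch below, which models the fields listed above and finds every slot holding a given part number. The record type, function names, and sample boards are hypothetical. The date helper assumes the encoding used in the IPMI FRU storage definition, where the manufacture date is a 3-byte, least-significant-byte-first count of minutes since 00:00 on 1 January 1996.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List

FRU_EPOCH = datetime(1996, 1, 1)  # assumed epoch for the FRU manufacture date field


def decode_mfg_date(raw: bytes) -> datetime:
    """Decode a 3-byte little-endian count of minutes since the FRU epoch."""
    if len(raw) != 3:
        raise ValueError("manufacture date field must be exactly 3 bytes")
    return FRU_EPOCH + timedelta(minutes=int.from_bytes(raw, "little"))


@dataclass(frozen=True)
class FruInfo:
    """Hypothetical in-memory view of one board's FRU data."""
    manufacturer: str
    part_number: str
    serial_number: str
    mfg_date: datetime
    revision: str


def find_slots(inventory: Dict[int, FruInfo], part_number: str) -> List[int]:
    """Return the slots whose board matches part_number, in ascending order."""
    return sorted(slot for slot, fru in inventory.items()
                  if fru.part_number == part_number)


# Sample chassis contents; every value here is invented for the example.
built = decode_mfg_date(bytes([0xA0, 0x05, 0x00]))  # 1440 minutes after the epoch
inventory = {
    3: FruInfo("ExampleCo", "SBC-1000", "SN0001", built, "B"),
    5: FruInfo("ExampleCo", "SW-2400", "SN0417", built, "A"),
    7: FruInfo("ExampleCo", "SBC-1000", "SN0002", built, "C"),
}

slots = find_slots(inventory, "SBC-1000")
print(slots, [inventory[s].revision for s in slots])
```

An operator could run a query like this against many chassis to find every board of a given revision before dispatching anyone to the site.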