
Architect of the Java software libraries Hemi and Dalli and the Java application Compass. Hemi is used to remotely control and monitor thousands of servers in Data Centers. Member of the Open Compute Project. Java programmer since 2001. Hank has posted 3 posts at DZone.

Scaling IPMI to the Data Center: Object Building Blocks

03.06.2013
It's hard to overestimate the number of smart devices and computers monitored by Java applications. If Java is the standard for Write Once Run Anywhere then its counterpart in monitoring and controlling smart devices and computers is the Intelligent Platform Management Interface (IPMI) standard. This article is the first in a series on how to design a Java application that scales to manage thousands of IPMI enabled devices.

Since its introduction in 1998 IPMI has taken over the world. It's found on AMD, ARM, Intel, Power and FPGA architectures and in most Data Center servers, including Open Compute, AdvancedTCA and VPX systems. There is no widely accepted alternative to IPMI. IPMI can be broken down into data retrieval and control functions.

IPMI Data retrieval functions:

  • CPU/DIMM temperatures
  • air inlet/exhaust temperatures
  • power levels
  • fan speeds
  • air flow
  • blown fuses
  • LED state and color
  • model/serial and firmware version numbers

IPMI control functions:

  • turn LEDs on/off
  • set fan speeds
  • write text to LCDs
  • reboot CPUs even when no operating system is present
  • remotely display the BIOS screen
  • set boot parameters
  • set LAN IP, gateway, netmask and VLAN IDs

IPMI and Java were introduced at approximately the same time and could not be more opposite. Java, with its object-oriented hierarchy of classes, consumes dramatically more resources than IPMI's bit-mapped architecture. They don't even agree on the numeric range of a byte. IPMI treats an 8-bit value as the range 0 to 255 (unsigned), while Java's native byte ranges from -128 to +127 (signed). IPMI is little-endian, and Java's ByteBuffer is big-endian by default. The IPMI specification does not even follow the usual programming convention for writing a hexadecimal number. For example, to represent the decimal value 10, instead of writing 0x0A IPMI writes 0Ah.
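The byte range and byte order differences can be bridged with a mask and an explicit ByteOrder. The following is a minimal sketch, not code from the Hemi library; the class and method names are hypothetical and chosen only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class IpmiByteDemo {

    /** Widen a signed Java byte to the unsigned 0..255 value IPMI uses. */
    static int toUnsigned(final byte raw) {
        return raw & 0xFF;
    }

    /** Read a 16-bit unsigned field stored least significant byte first. */
    static int readUnsignedShortLE(final byte[] data, final int offset) {
        // ByteBuffer defaults to big-endian, so the order is set explicitly.
        final ByteBuffer buffer = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        return buffer.getShort(offset) & 0xFFFF;
    }

    public static void main(final String[] args) {
        final byte raw = (byte) 0xA0;            // A0h in IPMI notation
        System.out.println(raw);                 // prints -96, the signed view
        System.out.println(toUnsigned(raw));     // prints 160, the IPMI view

        final byte[] wire = {(byte) 0x34, (byte) 0x12};
        // prints 1234: the low byte 0x34 arrived first on the wire
        System.out.println(Integer.toHexString(readUnsignedShortLE(wire, 0)));
    }
}
```

Masking with 0xFF at the point where a byte leaves the buffer keeps the signed value from leaking into array indexes or comparisons later on.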

This article is about how to design a Java application, called a System Manager, that runs in a single JVM and scales to monitor tens of thousands of IPMI-enabled devices. The System Manager communicates with a Service Processor that can power the main CPU on and off. The monitored device can be as simple as a digital sign or as complex as a Data Center server. Since this is machine-to-machine communication, the rate of packets exchanged between the System Manager and monitored devices is tremendous. A typical large-scale System Manager receives 3 million sensor readings per minute of CPU temperatures, DIMM temperatures and fan speeds. At the same time it collects inventory data such as model and serial numbers.

The System Manager to Service Processor block diagram.

IPMI Protocol

IPMI defines not only the data structures used to represent a computer's state but also the protocol used to communicate with the server. IPMI's Remote Management Control Protocol (RMCP) is carried in User Datagram Protocol (UDP) packets, which in Java are sent and received with java.nio.channels.DatagramChannel. This has an important side effect for Java. TCP provides reliable end-to-end delivery, and that work is done by the operating system kernel or a specialized chip. UDP provides no such guarantee, so retry and timeout handling has to be written by the application. The RMCP engine is therefore usually implemented in the same JVM as the System Manager. To make this more complex, the System Manager must respond to each packet sent by the monitored device within five seconds or the packet may be permanently lost.
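As a rough illustration of what sending over RMCP looks like from Java, the sketch below wraps a payload in the four-byte RMCP header and hands it to a DatagramChannel. The header values (version 06h, sequence number FFh, message class 07h for IPMI) and UDP port 623 are taken from my reading of the IPMI 2.0 specification and should be checked against it. The class name, the method names and the empty payload are hypothetical, and the IPMI session and message layers that follow the header are left out.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;

final class RmcpSendSketch {

    /** UDP port used by RMCP (assumption: the primary RMCP port, 623). */
    static final int RMCP_PORT = 623;

    /** Prefix a payload with the four-byte RMCP header. */
    static ByteBuffer wrapInRmcp(final byte[] payload) {
        final ByteBuffer packet = ByteBuffer.allocate(4 + payload.length);
        packet.put((byte) 0x06);   // RMCP version
        packet.put((byte) 0x00);   // reserved
        packet.put((byte) 0xFF);   // sequence number FFh: no RMCP ACK wanted
        packet.put((byte) 0x07);   // message class: IPMI
        packet.put(payload);
        packet.flip();             // switch the buffer from writing to reading
        return packet;
    }

    /** Send one packet; returns the number of bytes handed to the socket. */
    static int sendTo(final DatagramChannel channel, final String host,
                      final byte[] payload) throws IOException {
        return channel.send(wrapInRmcp(payload), new InetSocketAddress(host, RMCP_PORT));
    }

    public static void main(final String[] args) {
        // Only build the packet here; sendTo() needs a reachable device.
        final ByteBuffer packet = wrapInRmcp(new byte[0]);
        System.out.println("RMCP header length: " + packet.remaining());
    }
}
```

Because UDP gives no delivery guarantee, a real engine would also have to track each outstanding request and retry or time it out itself.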

Fortunately, RMCP packets are relatively small. Without any type of security the packets are all 32 bytes or less, and with security they are between 78 and 140 bytes. The challenge is scaling Java to handle the quantity of these small packets. For example, to read three million sensors per minute the JVM must handle three million outgoing packets and three million incoming packets every minute.
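To put those figures in per-second terms, here is a small back-of-the-envelope calculation. It is only a sketch: it assumes one request and one response per sensor reading, counts RMCP payload bytes only (no UDP, IP or Ethernet headers), and the class and method names are made up for the example.

```java
final class PacketRateEstimate {

    /** Each sensor reading costs one request packet and one response packet. */
    static long packetsPerSecond(final long readingsPerMinute) {
        return readingsPerMinute * 2 / 60;
    }

    /** Payload bytes per second, ignoring UDP, IP and Ethernet headers. */
    static long payloadBytesPerSecond(final long packetsPerSecond, final int bytesPerPacket) {
        return packetsPerSecond * bytesPerPacket;
    }

    public static void main(final String[] args) {
        final long rate = packetsPerSecond(3000000L);
        System.out.println("packets per second: " + rate);
        System.out.println("bytes per second at 32 bytes: " + payloadBytesPerSecond(rate, 32));
        System.out.println("bytes per second at 140 bytes: " + payloadBytesPerSecond(rate, 140));
    }
}
```

Three million readings per minute works out to 100,000 packets per second in total, which is between 3.2 and 14 million payload bytes per second depending on packet size. The byte volume is modest; the per-packet object and thread handoff cost is what has to scale.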

Concurrency

Building a multithreaded application is scary. Building one to handle 10,000 or 50,000 Service Processors is terrifying. Before examining how to scale a System Manager to handle these large packet rates, we need immutable objects that can be passed between threads. Let's look at Java objects crafted from the IPMI specification.

Enumeration

Start with the basics because IPMI quickly gets complex. To scale a Java IPMI System Manager enumerate everything, absolutely everything. It's surprising how many enum types are in IPMI. The Hemi IPMI library implements 280 pages of the 600 page IPMI specification and there are 94 enum types. Almost one enumeration every 3 pages.

The enum types are an excellent starting point for making immutable objects that can be passed between threads without synchronization. An enum type is noteworthy because of the restrictions placed on it by the Java Language Specification.

Restriction | Page # in JLS(1)
An enum can not be declared abstract, and it can declare an abstract method only if every constant has a class body that implements it | 182
Compile time error if a class extends the enum type's super class java.lang.Enum | 190
Enum types can not be cloned | 254
An enum type is implicitly final (unless a constant has a class body), making sub classes impossible | 254
Reflective instantiation of enum types is prohibited | 254
Deserialization mechanism ensures that duplicate instances are never created(2) | 254
The constructor must not be public or protected | 246
The equals(Object obj) method can not be overridden | 256

  1. The Java Language Specification, Java SE 7 Edition, 2012-07-27
  2. Why this is in the JLS and not the JVM specification is strange. If anyone knows why, please comment!

While the Java compiler enforces these restrictions, an enum type is still not automatically immutable. Adding the restriction that all fields are marked final, and refer only to immutable values, ensures that Java re-entrant locks (or any other form of synchronization) are not needed.

The class data for every enum type is stored in the “method area”, which the JVM specification describes as logically part of the heap. The enum instances are created once, during class initialization, and stay reachable from static fields for as long as the class is loaded, so they never become garbage while the System Manager runs. Multiple threads can read them without locking; locking is only used while a class is being initialized. The JVM specification allows each implementation to choose whether it garbage collects or compacts the “method area”. If your JVM vendor does not garbage collect the “method area”, this helps in performance testing by keeping the class data out of the garbage collection process.

Using the enum type's ordinal() to return a value found in the IPMI specification is very error-prone. IPMI has an unusual technique for allocating numeric values to fields that can be enumerated. IPMI allocates values, not only byte values but also bit fields, from both the low and the high end of a range, leaving a gap between the two as reserved or undefined. For example, IPMI may allocate a byte value and define only five values: 0, 1, 3, 254 and 255.

The solution is a pattern for the enum class called IPMI Value Enum Lookup (IVEL). The pattern defines a final int field containing an IPMI defined value and an array of enum instances to map between the IPMI numeric value and instance of the enum type. The array contains a null for each value not defined in IPMI. While wasteful in space the approach allows constant time lookup speeds independent of the number of instances and the speed does not vary when new versions of IPMI introduce new values for the enum type. An additional benefit is that the size of the “method area” does not change when IPMI introduces new values for the enum type.

This is an example of an enum type implementing IVEL using an IPMI Sensor Type Code. See the IPMI 2.0 Version 4 (June 12, 2009), Table 42-3, Sensor Type Codes (page 502). The table allocates these three ranges:

  • 0x00 to 0x2C are sensors for Temperature, ACPI power state, LAN heartbeat, Boot error, etc.
  • 0x2D to 0xBF are reserved
  • 0xC0 to 0xFF are OEM defined

Listing 1 shows the IVEL pattern for the IPMI Sensor Type Code.

package com.jblade.lib.hemi.cx.doc;

/**
 * Identifies an IPMI 2.0, Table 42-3, Sensor Type. This enum
 * implements the IVEL pattern. Do not use ordinal(); instead
 * use getValue() to return the IPMI defined numeric
 * value. When using toString() be aware that specifications
 * such as PICMG 3.0 AdvancedTCA redefine how a numeric value
 * maps to a string name. For example,
 * PICMG 3.0 redefines the sensor types 0xf0, 0xf1 and 0xf2.
 * These are the PICMG sensor number to sensor name mappings:
 * 0xf0 FRU Hot Swap
 * 0xf1 IPMB-0 Physical Link
 * 0xf2 Telco Alarm Input
 *
 * @see <a href="http://download.intel.com/design/servers/ipmi/IPMI2_0E4_Markup_061209.pdf">IPMI 2.0, Table 42-3, Sensor Type Codes</a>
 */
public enum IpmiSensorTypeEnum {

    /**
     * IPMI Sensor Type Code 0x00.
     */
    RESERVED(0x00),
    /**
     * IPMI Sensor Type Code 0x01.
     */
    TEMPERATURE(0x01),
    /**
     * IPMI Sensor Type Code 0x02.
     */
    VOLTAGE(0x02),
    /**
     * IPMI Sensor Type Code 0x03.
     */
    CURRENT(0x03),
    /*
     * Other IPMI values omitted for this sample
     * implementation. The values between 0x2D and 0xBF are
     * undefined and reserved by IPMI.
     */
    /**
     * OEM Sensor 0xFE
     */
    OEM_FE(0xfe),
    /**
     * OEM Sensor 0xFF
     */
    OEM_FF(0xff);
    /**
     * An array holding all possible instances of this class.
     * The indexes from 0x2D to 0xBF contain null because
     * IPMI 2.0, Table 42-3 does not define any values. This
     * array is used as a high speed way to find an instance of
     * this class. It is final so the reference can never be
     * replaced after class initialization.
     */
    private static final IpmiSensorTypeEnum[] VALUE_ARRAY =
                                      new IpmiSensorTypeEnum[256];

    static {
        /*
         * Not all of the 256 sensor values are present as
         * enum. Some are reserved. Walk through the list of
         * enums and populate the array with the valid sensor
         * types.
         */
        for (final IpmiSensorTypeEnum sType : IpmiSensorTypeEnum.values()) {
            VALUE_ARRAY[sType.getValue()] = sType;
        }
    }
    private final int value;

    private IpmiSensorTypeEnum(final int value) {
        this.value = value;
    }

    /**
     * Convert from a numeric value to the enumeration.
     *
     * @param sensorTypeIndex the value of the enum as it 
     * appears in IPMI 2.0, Table 42-3. This value is not
     * the ordinal value.
     * @return the enum equivalent to the sensorTypeIndex. 
     * A null is returned if the sensor type is between 0x2D 
     * to 0xBF because these
     * values are undefined by IPMI 2.0, Table 42-3. Do not 
     * call this with Java byte type. If the top bit of the 
     * byte is set Java will
     * silently sign extend the value to a negative number.
     * @throws IllegalArgumentException if the sensorTypeIndex
     * is outside the range of 0x0 to 0xff.
     */
    public static IpmiSensorTypeEnum getInstance(final int sensorTypeIndex)
            throws IllegalArgumentException {
        IpmiSensorTypeEnum tmpSensorType = null;

        /*
         * Validate the argument sensorTypeIndex is within
         * the bounds of the array.
         */
        if ((sensorTypeIndex >= 0) && (sensorTypeIndex <= 255)) {
            tmpSensorType = VALUE_ARRAY[sensorTypeIndex];
        } else {
            throw new IllegalArgumentException(
                    "Invalid sensor type value. Expected a value between 0 and 0xff. Actual value "
                    + sensorTypeIndex);
        }

        return tmpSensorType;
    }

    /**
     * the numeric value of the sensor type as defined in
     * the IPMI specification. This value has no relation
     * to the Java ordinal() value.
     *
     * @return the numeric value of the sensor type as 
     * defined in IPMI 2.0, Table 42-3, Sensor Type Codes.
     * The range of the return value is from 0x00 to 0xff.
     */
    public int getValue() {
        return value;
    }
}
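The warning about passing a Java byte to getInstance() is easy to trip over when the value comes straight out of a packet. The sketch below uses a trimmed-down, hypothetical enum (SensorTypeSketch, with only three constants) that follows the same IVEL pattern, and adds a fromWire() helper that masks the byte before the lookup. The helper is an illustration, not part of Listing 1.

```java
enum SensorTypeSketch {
    TEMPERATURE(0x01),
    VOLTAGE(0x02),
    OEM_FF(0xFF);

    /** Lookup table indexed by the IPMI value; undefined values stay null. */
    private static final SensorTypeSketch[] BY_VALUE = new SensorTypeSketch[256];

    static {
        for (final SensorTypeSketch type : values()) {
            BY_VALUE[type.value] = type;
        }
    }

    private final int value;

    SensorTypeSketch(final int value) {
        this.value = value;
    }

    /** Constant time lookup by IPMI value; null when IPMI defines nothing. */
    static SensorTypeSketch getInstance(final int ipmiValue) {
        if (ipmiValue < 0 || ipmiValue > 0xFF) {
            throw new IllegalArgumentException("Expected 0 to 0xff, got " + ipmiValue);
        }
        return BY_VALUE[ipmiValue];
    }

    /** Mask the signed byte first so 0xFF is looked up as 255, not -1. */
    static SensorTypeSketch fromWire(final byte raw) {
        return getInstance(raw & 0xFF);
    }

    int getValue() {
        return value;
    }

    public static void main(final String[] args) {
        final byte raw = (byte) 0xFF;
        System.out.println(fromWire(raw));       // prints OEM_FF
        System.out.println(getInstance(0x2D));   // prints null (reserved value)
        System.out.println(OEM_FF.ordinal());    // prints 2, unrelated to 0xFF
    }
}
```

Calling getInstance(raw) directly with the byte above would pass -1 after sign extension and throw IllegalArgumentException, which is why the mask lives in one helper instead of at every call site.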

The quantity of enum types for IPMI and the IVEL pattern have a cost. For the Hemi library the 94 enum types use 1,070,212 bytes. On an Intel Xeon CPU E5-2620 @ 2.00GHz, the elapsed time to load and verify them and place them in the “method area” is 263 milliseconds.

The results were obtained using the Oracle JVM Java HotSpot 64-Bit Server VM 23.5-b02 and the classes were compiled using the Oracle Java version 1.7.0_09.

IPMI Addressing

There are three types of numeric addresses defined by IPMI. Each is an enum type, and they can be combined to provide a unique address for any device attached to the IPMB bus within a Service Processor.

Name | Range | Use
IPMB Address | 200 values out of a possible 255, with restrictions on odd values. See IPMI 2.0, Table 5-4, System Software IDs | A location on the physical IPMB bus called an IPM Controller, or a logical software device
LUN ID | 0 to 3 | A logical device within an IPM Controller
FRU ID | 0 to 255 | A device within a LUN ID

The various address combinations are heavily used in the System Manager. Not only are addresses shared between threads, they are also shared between Service Processors. It is desirable that the addresses are created only once, to minimize memory use, and are immutable, so they can be shared among threads without synchronization. These immutable address classes are the fundamental building blocks that will be used with the java.util.concurrent package to transfer data between threads.

The simplest combination of the enum address types is an IPMB address and a LUN ID, which form a LUN Address. This address is mandatory for all IPMB commands moving to/from the System Manager. A large-scale System Manager may have between two and four million LUN addresses in use at a time. Since there are only 800 distinct instances (200 IPMB addresses * 4 LUN IDs), statically allocating all of them is often the best approach.
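Static allocation of LUN addresses could look something like the sketch below. It makes two simplifying assumptions that differ from the text: plain int values stand in for the IPMB and LUN enum types, and because the list of the 200 valid IPMB addresses is not reproduced here, it allocates an instance for all 256 byte values (1,024 instances rather than 800). The class name LunAddressSketch is hypothetical.

```java
final class LunAddressSketch {

    private static final int IPMB_VALUES = 256;
    private static final int LUN_VALUES = 4;

    /** Every instance that will ever exist, built once at class load. */
    private static final LunAddressSketch[] ALL =
            new LunAddressSketch[IPMB_VALUES * LUN_VALUES];

    static {
        for (int ipmb = 0; ipmb < IPMB_VALUES; ipmb++) {
            for (int lun = 0; lun < LUN_VALUES; lun++) {
                ALL[index(ipmb, lun)] = new LunAddressSketch(ipmb, lun);
            }
        }
    }

    private final int ipmbAddress;
    private final int lunId;

    private LunAddressSketch(final int ipmbAddress, final int lunId) {
        this.ipmbAddress = ipmbAddress;
        this.lunId = lunId;
    }

    /** The LUN ID fits in two bits, so it forms the low bits of the index. */
    private static int index(final int ipmbAddress, final int lunId) {
        return (ipmbAddress << 2) | lunId;
    }

    /** Return the shared instance; no locking is needed after class load. */
    static LunAddressSketch of(final int ipmbAddress, final int lunId) {
        if (ipmbAddress < 0 || ipmbAddress >= IPMB_VALUES
                || lunId < 0 || lunId >= LUN_VALUES) {
            throw new IllegalArgumentException(
                    "Invalid address " + ipmbAddress + "/" + lunId);
        }
        return ALL[index(ipmbAddress, lunId)];
    }

    int getIpmbAddress() {
        return ipmbAddress;
    }

    int getLunId() {
        return lunId;
    }

    public static void main(final String[] args) {
        final LunAddressSketch a = of(0x20, 0);
        final LunAddressSketch b = of(0x20, 0);
        System.out.println(a == b);   // prints true: one shared instance
    }
}
```

Since of() always returns the same instance for the same pair, millions of references cost only the reference itself, and == can be used for comparison.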

The most difficult addressing combination is when an IPMB address, LUN ID and FRU ID are used to form a FruAddress class. The FRU address can identify an EPROM, AdvancedMC, Fan Tray, etc. There are many FRU addresses in a computer system.

IPMI allows for 200 IPMB addresses * 4 LUN IDs * 255 FRU IDs = 204,000 FruAddress class instances.

Using a 64-bit JVM, the memory use is then 204,000 instances * 32 bytes/instance = 6,528,000 bytes.

The majority of System Managers do not need to allocate all 204,000 FRU addresses and normally use fewer than 600 addresses. A System Manager with large amounts of memory can call an initialization method that allocates the FRU addresses into static fields while the System Manager still has a single thread. The alternative is to create a FRU address cache guarded by a ReentrantLock.
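A lock-guarded FRU address cache might be sketched as follows. This is an assumption about how such a cache could be built, not the Hemi implementation: plain int values stand in for the enum address types, the class name FruAddressSketch is hypothetical, and a single ReentrantLock protects an ordinary HashMap.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

final class FruAddressSketch {

    /** Instances created so far, keyed by the packed address. */
    private static final Map<Integer, FruAddressSketch> CACHE = new HashMap<>();
    private static final ReentrantLock LOCK = new ReentrantLock();

    private final int ipmbAddress;
    private final int lunId;
    private final int fruId;

    private FruAddressSketch(final int ipmbAddress, final int lunId, final int fruId) {
        this.ipmbAddress = ipmbAddress;
        this.lunId = lunId;
        this.fruId = fruId;
    }

    /** Return the one shared instance for this address, creating it on first use. */
    static FruAddressSketch of(final int ipmbAddress, final int lunId, final int fruId) {
        if (ipmbAddress < 0 || ipmbAddress > 0xFF || lunId < 0 || lunId > 3
                || fruId < 0 || fruId > 0xFF) {
            throw new IllegalArgumentException("Address component out of range");
        }
        // Pack the parts into one key: 8 bits IPMB, 2 bits LUN, 8 bits FRU.
        final Integer key = (ipmbAddress << 10) | (lunId << 8) | fruId;
        LOCK.lock();
        try {
            FruAddressSketch address = CACHE.get(key);
            if (address == null) {
                address = new FruAddressSketch(ipmbAddress, lunId, fruId);
                CACHE.put(key, address);
            }
            return address;
        } finally {
            LOCK.unlock();   // always release, even if an exception is thrown
        }
    }

    int getIpmbAddress() { return ipmbAddress; }
    int getLunId() { return lunId; }
    int getFruId() { return fruId; }

    public static void main(final String[] args) {
        System.out.println(of(0x82, 0, 5) == of(0x82, 0, 5));   // prints true
    }
}
```

The lock is only taken when an address is looked up, not when it is used, so threads that already hold a reference share it freely. A java.util.concurrent.ConcurrentHashMap with putIfAbsent() would be another way to get the same create-once behaviour.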

IPMI Commands

IPMI commands define the actions executed on the Service Processor, and a result is returned to the System Manager. IPMI breaks a command into two parts. The request command is sent by the System Manager to the Service Processor. The response command is the result returned to the System Manager.

The techniques used to create the immutable request and response objects are very different. They should be. Sending an IPMI Request is a conversion from an Object to a ByteBuffer. An IPMI Response is the opposite, a conversion from a ByteBuffer to an immutable Object.

While many of the IPMI Request commands contain data too diverse to make them candidates for immutable Objects, the following IPMI Requests contain zero or one byte of data and should be static immutable objects.

  • Get Device ID
  • Get System GUID
  • Get SEL Allocation Info
  • Get POH Counter
  • Reserve Device SDR Repository
  • Get Device SDR Info
  • Get SEL Info
  • Reserve SDR Repository
  • Get SEL Time
  • Get Channel Info
  • Get Sensor Data

The IPMI Request command “Get Sensor Data” is the most important command to implement as a static immutable. Each “Get Sensor Data” request contains a single sensor number. IPMI allows 255 sensor numbers, resulting in 255 “Get Sensor Data” objects. This is the most important immutable IPMI Request because a large System Manager may have 700,000 references to these 255 objects at any given time.
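The idea of 255 shared request objects can be sketched as below. The class name GetSensorDataRequestSketch is hypothetical, and the sketch assumes the 255 sensor numbers are 0 through 254. It holds only the sensor number and leaves out the conversion to a ByteBuffer.

```java
final class GetSensorDataRequestSketch {

    /** Assumption: the 255 sensor numbers are 0 through 254. */
    private static final int SENSOR_COUNT = 255;

    private static final GetSensorDataRequestSketch[] REQUESTS =
            new GetSensorDataRequestSketch[SENSOR_COUNT];

    static {
        for (int number = 0; number < SENSOR_COUNT; number++) {
            REQUESTS[number] = new GetSensorDataRequestSketch(number);
        }
    }

    private final int sensorNumber;

    private GetSensorDataRequestSketch(final int sensorNumber) {
        this.sensorNumber = sensorNumber;
    }

    /** Return the pre-built request for a sensor; nothing is allocated. */
    static GetSensorDataRequestSketch forSensor(final int sensorNumber) {
        if (sensorNumber < 0 || sensorNumber >= SENSOR_COUNT) {
            throw new IllegalArgumentException("Invalid sensor number " + sensorNumber);
        }
        return REQUESTS[sensorNumber];
    }

    int getSensorNumber() {
        return sensorNumber;
    }

    public static void main(final String[] args) {
        // 1,000 simulated Service Processors all polling sensor 5
        final GetSensorDataRequestSketch[] pending = new GetSensorDataRequestSketch[1000];
        for (int i = 0; i < pending.length; i++) {
            pending[i] = forSensor(5);
        }
        // Every slot refers to the same immutable object.
        System.out.println(pending[0] == pending[999]);   // prints true
    }
}
```

Each queued request then costs one reference rather than one object, and nothing on the request path has to be garbage collected.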

Selecting IPMI Response commands for static allocation is much more difficult. Each response contains data specific to the Service Processor or an error code.

While responses from two identical Service Processors could be converted to a single static object, the overhead of comparing all potential responses often uses more CPU than creating a unique response object that must go through garbage collection. If a System Manager is going to implement a cache of static IPMI Response objects, these should be included:

  • Get Device ID
  • Get Session Info
  • Get Channel Info
  • Get User Name
  • Chassis Control
  • Clear SEL
  • Set SEL Time

Conclusions

Immutable objects are the fundamental building blocks for large System Managers. They are proven design principles for Java applications. Implementing IPMI using these techniques uses less than 5 MB of memory and scales to hundreds of threads.

Published at DZone with permission of its author, Hank Bruning.
