Modius Data Center Blog

Data Center Monitoring: Out-of-Band versus In-Band.

Posted by Mark Harris on Wed, Jun 02, 2010 @ 12:02 PM

There was a time when x86 hardware, and the operating systems and applications installed on it, were considered good, but not 'bet your business' great. Reliability was less than ideal. Early deployments involved smaller numbers of servers, and each and every server counted. The applications themselves were not decomposed well enough to share transaction processing, so the failure of any one server impacted actual production. Candidly, I am not sure whether hardware or software was mostly at fault, or a combination of both, but server system failure was a very real topic. High Availability ("HA") configurations were considered standard operating procedure for most applications.

The server vendors responded to this challenge by upping their game: designing much more robust server platforms and using higher-quality components, connectors, and designs. The operating system vendors rose to the challenge by segmenting their offerings into industrial-strength 'server' distributions backed by 'certified platform' hardware compatibility programs. This made a huge difference, and TODAY modern servers rarely fail. They run, they run hard, and they are perceived to be rock solid if provisioned properly.

Why the history? Because in those early days, servers' less-than-favorable reliability required some form of auxiliary, bare-metal 'out-of-band' access to correct operational failures at the hardware level. Technologies such as Intel's IPMI and HP's iLO became commonplace in discussions about building data center solutions with remote remediation capabilities. The access was provided by a small additional CPU, the Baseboard Management Controller (BMC), which needed nothing but standby power (no host operating system, no software to load) to report sensor and status data to the outside world. The ability to reboot a server in the middle of the night, over the Internet, from the sysadmin's house was all the rage. Technologies like serial console and KVM were the starting point, followed by these out-of-band technologies (iLO and IPMI).
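
To make that concrete, here is a minimal sketch (not Modius code) of out-of-band access: Python driving ipmitool, the common open-source IPMI client, against a server's BMC over the LAN. The BMC address and credentials are hypothetical placeholders; the point is that these commands talk to the management chip directly, whether or not the server's own OS is alive.

```python
# Out-of-band access sketch: query and control a server via its BMC.
import subprocess

BMC_HOST = "192.168.100.42"   # hypothetical out-of-band (BMC) address
BMC_USER = "admin"            # hypothetical credentials
BMC_PASS = "secret"

def ipmi(*args):
    """Run one ipmitool command against the BMC using IPMI-over-LAN."""
    cmd = ["ipmitool", "-I", "lanplus",
           "-H", BMC_HOST, "-U", BMC_USER, "-P", BMC_PASS, *args]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

# The BMC answers these even if the host OS is hung or powered off:
print(ipmi("chassis", "power", "status"))  # e.g. "Chassis Power is on"
print(ipmi("sdr", "list"))                 # temperatures, fans, voltages
# ipmi("chassis", "power", "cycle")        # the 3 a.m. remote reboot
```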

Move the clock forward to today, and you'll see that KVM, IPMI, and iLO remain interesting technologies, and essential ones for devices still considered critical to core businesses, because they apply mostly when a server is NOT running any operating system, or has halted and is no longer 'on the net'. At most other times, when the operating system IS running and the server is on the network and accessible, server makers supply standard drivers that expose all of the sensors and other hardware features of the motherboard, allowing in-band remote access with technologies such as SSH and RDP.
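
For contrast, here is an equally minimal in-band sketch, assuming a running Linux server reachable over SSH (via the paramiko library) with the common lm-sensors utility installed; the hostname and credentials are placeholders. The same class of sensor data now flows through the operating system and its standard drivers.

```python
# In-band access sketch: read motherboard sensors through the running OS.
import paramiko

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect("server01.example.com", username="monitor", password="secret")

# Ask the running operating system for its hardware sensor readings.
stdin, stdout, stderr = client.exec_command("sensors")
print(stdout.read().decode())   # CPU temps, fan speeds, voltages
client.close()
```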

Today, it makes very little difference whether a monitoring system uses operating-system calls or out-of-band access tools. The same sensor and status information is available through both sets of technologies; the choice depends more on how the servers are physically deployed and connected. Remember, a huge percentage of out-of-band ports remain unconnected on the backs of production servers. Many customers consider the second OOB connection costly and redundant in all but the most extreme failure conditions (BUT critically important for certain types of equipment, such as any in-house DNS servers, or perhaps a SAN storage director).

Topics: data center monitoring, data center temperature sensors, Protocols-Physical-Layer-Interfaces

Data Center Infrastructure: Monitoring via LAN or Serial Interfaces

Posted by Mark Harris on Wed, Feb 24, 2010 @ 07:00 PM

When the topic of data center infrastructure comes up, there is some confusion about how the two technologies, Serial and LAN, relate. Let me start by saying that nearly every piece of equipment built in the last 20 years includes at least one form of core controller interface. In fact, the engineering teams that build this type of equipment will tell you that one of the very first portions of a control system developed is the console/monitor access interface, because that interface is typically used to continue developing and debugging the controller itself (as well as to check it along the way for proper operation). Hence every server, switch, router, and firewall, as well as every PDU, UPS, CRAC, and generator, has one: some form of interface exists in all Enterprise-class devices!

That said, the technology for these interfaces has changed over time. RS232 (and the related RS485) was all the rage for connectivity in the '70s and '80s, thanks to its simplicity and low cost. With the advent of true 'networking', Ethernet became popular in the early '90s (ironically, for the very same reasons) and continues to be the widely deployed interface standard to this day. However, the mechanisms for interacting with these two types of interface are vastly different.

With serial interfaces, the most common protocol is an ASCII-based 'text' command-line protocol. Commands are built as strings of characters, and results come back as strings of characters. For instance, a user could send an 18-character text command such as "SHOW SYSTEM UPTIME" and get back the 9-character response "1D 23H10M", meaning 1 day, 23 hours, and 10 minutes. The key point about a command-line protocol is that it is specific to each and every vendor, and in many cases to each model number within a vendor's catalog. This ultimately requires very model-specific device awareness in order to communicate with this type of serial interface. Ultimately, the information retrieved from these interfaces will be consumed by network-attached servers and monitoring applications. Consequently, two steps are needed to deal with serial interfaces: 1) a physical conversion to get the information into a format the network can transport; and 2) the logical translation of ASCII commands and responses into networked packet values in tables. This is done using devices sometimes referred to as "gateways." While these two conversions could be separated, they are typically combined in a vendor-supplied gateway device with an RS232 or RS485 port on one side, a small conversion processor inside, and a LAN port on the other.
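
As a sketch of what that looks like in practice, here is the hypothetical "SHOW SYSTEM UPTIME" exchange above, driven from Python with the pyserial library. The port name, baud rate, line endings, and command syntax are all illustrative assumptions, and that is precisely the problem: every vendor (and often every model) has its own dialect.

```python
# Serial command-line sketch: send an ASCII command, read an ASCII reply.
import serial  # the pyserial package

with serial.Serial("/dev/ttyS0", baudrate=9600, timeout=2) as port:
    port.write(b"SHOW SYSTEM UPTIME\r\n")  # the 18-character text command
    reply = port.readline().decode(errors="replace").strip()
    print(reply)  # e.g. "1D 23H10M" -> 1 day, 23 hours, 10 minutes
```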

With LAN-based (Ethernet) interfaces it is much easier. Many standardized protocols exist to communicate natively from the device to the network, the most common being SNMP with its MIBs, which describe how the informational packets are organized. SNMP (and the MIB) allows a network inquiry to be made against a table of operational values within the target device, with the results formatted as expected values in the returned data packet. While there are some peculiarities in the details, network-based protocols are in general much more standardized and widely accepted as the modern means.
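
Here is a minimal sketch of that kind of network inquiry using the pysnmp library. The device hostname and community string are placeholders, and sysUpTime is just a standard MIB-II object standing in for whatever operational value (outlet power, temperature) a real device's vendor MIB would expose.

```python
# Native SNMP inquiry sketch: fetch one operational value from a device.
from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectType, ObjectIdentity, getCmd)

errorIndication, errorStatus, errorIndex, varBinds = next(getCmd(
    SnmpEngine(),
    CommunityData("public"),                         # SNMPv2c community
    UdpTransportTarget(("pdu01.example.com", 161)),  # hypothetical device
    ContextData(),
    ObjectType(ObjectIdentity("SNMPv2-MIB", "sysUpTime", 0)),
))

if errorIndication:
    print(errorIndication)                # network / timeout problems
else:
    for name, value in varBinds:          # the MIB defines the meaning
        print(f"{name} = {value}")
```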

What does this mean to you? If a device has a network interface, there is a high probability that you can easily access and understand its performance values without any conversion whatsoever. Any modern intelligent iPDU (or power strip) is a great example: it has a LAN connection and can report, in a known format, the power at each outlet and the temperature of the unit itself in response to a simple SNMP inquiry. Devices like these have IP addresses and appear on the corporate network just like any other component. Conversely, if a particular device has ONLY a serial interface, look for a physical-and-logical gateway solution to do the conversion. These gateways are very specific (purpose-built) for each device model and are usually supplied by the application provider that intends to consume the performance information.

Topics: data center monitoring, BACnet, Protocols-Physical-Layer-Interfaces, device interfaces, modbus
