Blade servers are now an integral part of datacenter infrastructure. The demand for space and power is ever on the increase, and switching to an alternative technology that addresses both is a natural choice. Space, power, cost, reduced cabling, and easy manageability are the key factors that drive datacenters to switch to blade servers. A blade server system is a chassis that holds multiple server blades, with each ‘blade’ being a server in itself. Resources are centralized, and the non-core computing services required to manage the blades, such as power, temperature, and connectivity management, are pushed to the blade chassis/enclosure.
The overall health and performance of blade servers is ensured by monitoring blade health, chassis temperature, and the power module, along with the other system hardware resources. Temperature and the blowers are the components that most frequently have issues. The important variables that reflect the health and proper functioning of IBM BladeCenter H Series devices, including these components, are:
- Temperature: The chassis temperature (which rises with the heat generated by the active blades) must be maintained at an acceptable level, and the administrator needs to be notified if it exceeds a certain threshold. When the temperature exceeds the limit, the entire unit is shut down, leading to downtime. For the blade server to function at full capacity, effective cooling of the chassis is important. Improper chassis cooling raises the chassis temperature and degrades the performance of the blades, which can ultimately result in downtime.
- Blower: The blowers in BladeCenter servers cool the chassis. The health of a blower is determined by its speed and capacity to move air (which is 325 cubic feet per minute according to IBM), and also by its state. The BladeCenter H series has 2 high-speed blowers for redundancy.
- Power: The chassis provides power services for the blades that it encloses, eliminating the need to manage power on/off operations and maintenance on individual blades. Watching the health and performance of the power module is therefore important. LEDs indicate the state of the module.
- Blade Health: The health of a blade on the chassis is determined by the availability of the blade and the performance of its hardware resources. Monitoring the status of each blade and the performance of its resources therefore indicates blade health.
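The checks in the list above can be sketched as a small set of rules. The following is a minimal illustration, not OpManager's implementation: the `ChassisReading` structure, the `evaluate_health` function, the state string `"good"`, and the temperature thresholds are all hypothetical and would need to be replaced with values appropriate to the actual chassis.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ChassisReading:
    """One snapshot of the monitored variables (hypothetical structure)."""
    temperature_c: float            # chassis temperature in degrees Celsius
    blower_states: List[str]        # one state string per blower
    power_module_states: List[str]  # one state string per power module
    blades_available: List[bool]    # True if the blade responds


def evaluate_health(
    reading: ChassisReading,
    temp_warn_c: float = 35.0,   # assumed warning threshold, not an IBM value
    temp_crit_c: float = 45.0,   # assumed critical threshold, not an IBM value
) -> List[Tuple[str, str]]:
    """Return a list of (severity, message) pairs; empty means healthy."""
    issues: List[Tuple[str, str]] = []

    # Temperature: warn before the limit, escalate once it is crossed.
    if reading.temperature_c >= temp_crit_c:
        issues.append(("critical", f"chassis temperature {reading.temperature_c} C"))
    elif reading.temperature_c >= temp_warn_c:
        issues.append(("warning", f"chassis temperature {reading.temperature_c} C"))

    # Blowers: any blower not reporting a good state needs attention.
    for n, state in enumerate(reading.blower_states, start=1):
        if state != "good":
            issues.append(("critical", f"blower {n} state is {state}"))

    # Power modules: same rule as for blowers.
    for n, state in enumerate(reading.power_module_states, start=1):
        if state != "good":
            issues.append(("critical", f"power module {n} state is {state}"))

    # Blade health: flag every blade that is not available.
    for n, available in enumerate(reading.blades_available, start=1):
        if not available:
            issues.append(("critical", f"blade {n} is unavailable"))

    return issues
```

An empty result means every check passed; otherwise each entry names the component that needs attention, which is the kind of information an administrator would want in a notification.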
OpManager monitors all these critical resources using SNMP. The relevant SNMP OIDs are implemented in the BLADE MIB and BLADESPPALT MIB. Visit this page for the particulars on the OIDs for which monitors are configured, and how these resources determine the health and performance of a blade server.
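As a rough sketch of what SNMP-based polling of these resources involves, the example below maps monitor names to OIDs and queries each one in turn. It is an illustration under stated assumptions, not how OpManager works internally: the OID strings are placeholders rather than real BLADE MIB entries, and `snmp_get` stands in for whatever SNMP client function is available in the environment.

```python
from typing import Callable, Dict, Optional

# Placeholder OIDs (hypothetical). Real values come from the BLADE MIB
# and BLADESPPALT MIB and are deliberately not guessed here.
MONITORS: Dict[str, str] = {
    "chassis_temperature": "<temperature-oid>",
    "blower_state": "<blower-state-oid>",
    "power_module_state": "<power-module-state-oid>",
    "blade_health": "<blade-health-oid>",
}


def poll_once(
    snmp_get: Callable[[str], str],
    monitors: Dict[str, str],
) -> Dict[str, Optional[str]]:
    """Query every configured OID once.

    snmp_get is an assumed callable that takes an OID and returns its
    value as a string, raising OSError (or a subclass such as
    TimeoutError) when the device does not answer.
    """
    results: Dict[str, Optional[str]] = {}
    for name, oid in monitors.items():
        try:
            results[name] = snmp_get(oid)
        except OSError:
            # A failed poll is recorded as missing data, not as a value.
            results[name] = None
    return results
```

Passing the SNMP getter in as a parameter keeps the sketch independent of any particular SNMP library and makes it easy to exercise with a stub.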