Enterprises use clustered environments to scale and to provide failover support, so it is critical to monitor every node within the cluster. Here is a diagrammatic representation of a 2-node SQL cluster.

Monitoring a clustered network adds new challenges for the Operations Team. For example, if you have a 4-node cluster, it may be enough for 3 of the nodes to be active. If one node is down for any reason, you do not have to worry much about it. Even so, the Operations Team should be notified with the appropriate priority or status message.

Let us now work through this scenario with Applications Manager.

Step 1: Add all nodes of the cluster through New Monitor.

Step 2: Create a New Monitor Group, say SQL Cluster Group 1.

Step 3: Associate all nodes within the SQL cluster to this group using the ‘Associate Monitors’ option.

Let us now monitor Health and Availability for this cluster, treating every node as equally important. If any one node in the cluster is down, the cluster is in Warning state. If 2 or more nodes are down, the cluster is in Critical state. To configure this rule, click Configure Alarms on the Monitor Group Snapshot page and select ‘Configure Health’. In the popup window, choose Define Alarm Rules.
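To make the rule concrete, here is a minimal Python sketch of the logic described above. It is only an illustration of the thresholds (1 node down means Warning, 2 or more means Critical) and not Applications Manager's own code or API. The names `Status` and `cluster_health` and the node names are hypothetical.

```python
from enum import Enum


class Status(Enum):
    CLEAR = "clear"
    WARNING = "warning"
    CRITICAL = "critical"


def cluster_health(node_up: dict[str, bool]) -> Status:
    """Map per-node up/down flags to a cluster status (illustrative only)."""
    # Count the nodes that are currently reported as down.
    down = sum(1 for up in node_up.values() if not up)
    if down >= 2:
        return Status.CRITICAL  # 2 or more nodes down
    if down == 1:
        return Status.WARNING   # exactly one node down
    return Status.CLEAR         # every node is up


# Hypothetical 4-node cluster with one node down.
nodes = {"sql-node1": True, "sql-node2": True, "sql-node3": False, "sql-node4": True}
print(cluster_health(nodes))
```

With one of the four nodes down, the cluster is reported as Warning rather than Critical, which matches the notification level the Operations Team would want in the 4-node example.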

You can choose to configure the status of the Cluster Group as Critical or Warning, based on certain rules. A few examples:

  1. Monitor Group is Critical if any 2 monitors’ Health status is Critical.
  2. Monitor Group is Critical if any selected monitor’s Availability status is Critical.
  3. Monitor Group is Warning if any 1 monitor’s Health status is Warning.
  4. Monitor Group is Down if any 1 monitor’s Availability is Down.
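Rules like these can be pictured as data: each one names a group status, the attribute to inspect, the monitor status to look for, and how many monitors must match. The Python sketch below models rules 1, 3 and 4 that way. It is an assumption-laden illustration and not how Applications Manager stores or evaluates rules. The `Rule` class, the `evaluate` function, the first-match-wins ordering and the lowercase status strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    group_status: str    # status given to the group when the rule fires
    attribute: str       # "health" or "availability"
    monitor_status: str  # status to look for on individual monitors
    min_count: int       # how many monitors must show that status


def evaluate(rules: list[Rule], monitors: dict[str, dict[str, str]]) -> str:
    """Return the group status of the first rule that matches, else 'clear'."""
    for rule in rules:
        matches = sum(
            1 for m in monitors.values()
            if m.get(rule.attribute) == rule.monitor_status
        )
        if matches >= rule.min_count:
            return rule.group_status
    return "clear"


# Assumed ordering: most severe rule first, so it wins when several match.
RULES = [
    Rule("down", "availability", "down", 1),   # rule 4
    Rule("critical", "health", "critical", 2), # rule 1
    Rule("warning", "health", "warning", 1),   # rule 3
]

monitors = {
    "sql-node1": {"health": "critical", "availability": "up"},
    "sql-node2": {"health": "critical", "availability": "up"},
}
print(evaluate(RULES, monitors))
```

Listing the rules from most to least severe is a design choice in this sketch. It keeps the evaluation to a single pass and makes the outcome predictable when more than one rule would fire.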

You can configure the status of the Cluster Group based on Health and/or Availability of the individual monitors.

You can extend these Alarm Rules and define advanced cluster relationships by adding Sub Groups to existing Monitor Groups. This helps you monitor business services such as CRM, Email, Failover and Backup more effectively.
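One simple way to picture Sub Groups is as a tree in which a parent group takes the worst status found among its children. The Python sketch below assumes that worst-status rollup purely for illustration. The actual behaviour in Applications Manager depends on the Alarm Rules you define for each group. The group names, the `SEVERITY` ordering and the `rollup` function are hypothetical.

```python
# Assumed severity ordering, from least to most severe.
SEVERITY = {"clear": 0, "warning": 1, "critical": 2}


def rollup(group) -> str:
    """Return the worst status in a tree of groups (illustrative only).

    A leaf is a status string for a single monitor; a group is a dict
    mapping child names to leaves or to further sub groups.
    """
    if isinstance(group, str):
        return group
    return max(
        (rollup(child) for child in group.values()),
        key=SEVERITY.__getitem__,
        default="clear",  # an empty group is treated as clear
    )


# Hypothetical CRM business service made of two sub groups.
crm_service = {
    "SQL Cluster Group 1": {"sql-node1": "clear", "sql-node2": "warning"},
    "Web Tier": {"web1": "clear"},
}
print(rollup(crm_service))
```

Here a single Warning inside the SQL sub group surfaces at the CRM service level, while the Web Tier stays clear.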

Isn’t it interesting to configure such alarm rules? Try this out for a SQL or Exchange cluster and share your experiences with me.

Kevin