What to monitor on IBM BladeCenter H Series?

Jan 23 2012 02:03:34 AM Posted By : vidya

Blade servers are now an integral part of datacenter infrastructure. The demand for space and power is ever on the increase, and switching to an alternative technology that addresses both is only natural. Space, power, cost, reduced cabling, and easy manageability are the key factors that drive datacenters to switch to blade servers. A blade server is a server chassis that holds multiple server blades, with each 'blade' being a server by itself. The resources are centralized, and the non-core computing services required to manage the blade servers, such as power, temperature, and connectivity management, are pushed to the blade chassis/enclosure.

The overall health and performance of blade servers is ensured by monitoring the blade health, chassis temperature, and power module, besides the other system hardware resources. The temperature and the blowers are the components that most frequently have issues. The important variables that reflect the health and proper functioning of IBM BladeCenter H Series devices, including these components, are:

  • Temperature: The chassis temperature (driven by the heat generated by the active blades) must be maintained at an acceptable level, and the administrator needs to be notified if it exceeds a certain threshold. When the temperature exceeds the limit, the full unit is shut down, leading to downtime. For the blade server to function to its capacity, effective cooling of the chassis is important. Improper chassis cooling raises the chassis temperature and degrades the performance of the blades, and can ultimately result in downtime.

  • Blower: The blowers in BladeCenter servers cool the chassis. The health of a blower is determined by its airflow capacity (325 cubic feet per minute according to IBM) and by its state. The BladeCenter H series has 2 high-speed blowers for redundancy.

  • Power: The chassis provides the power services for the blades that it encloses, eliminating the need to manage power on/off operations and maintenance on individual blades. Watching the health and performance of the power module is therefore important. There are LEDs to indicate the state of the module.

  • Blade Health: The health of a blade on the chassis is determined based on the availability of each blade on the system and the performance of hardware resources on each blade. Monitoring the status of the blade and the resource performance is therefore indicative of the blade health.

OpManager monitors all these critical resources using SNMP. The relevant SNMP OIDs are implemented in the BLADE MIB and BLADESPPALT MIB. Visit this page for the particulars on the OIDs for which monitors are configured, and how these resources determine the health and performance of a blade server.
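To make the four checks above a little more concrete, here is a minimal Python sketch of how they could be combined into a list of alerts. It is only an illustration: the `ChassisReading` fields, the blower state strings, and the 40 degree threshold are hypothetical placeholders, not values from the BLADE MIB or from OpManager. In practice the readings would come from SNMP polls.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChassisReading:
    """One poll of a blade chassis (all field names are hypothetical)."""
    temperature_c: float       # chassis temperature, assumed to be in Celsius
    blower_states: List[str]   # one state per blower, e.g. ["good", "good"]
    power_module_ok: bool      # True when the power module reports healthy
    blades_up: int             # blades currently available
    blades_total: int          # blades installed in the chassis


def chassis_alerts(reading: ChassisReading, temp_threshold_c: float = 40.0) -> List[str]:
    """Return the names of the checks that failed for this reading."""
    alerts = []
    if reading.temperature_c > temp_threshold_c:   # placeholder threshold
        alerts.append("temperature")
    if any(state != "good" for state in reading.blower_states):
        alerts.append("blower")
    if not reading.power_module_ok:
        alerts.append("power")
    if reading.blades_up < reading.blades_total:
        alerts.append("blade")
    return alerts


if __name__ == "__main__":
    sample = ChassisReading(45.0, ["good", "bad"], True, 14, 14)
    print(chassis_alerts(sample))
```

The threshold is passed in as a parameter because the acceptable temperature differs from one environment to another.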

Jeff’s free feedback grabs a bunch

Apr 11 2011 06:41:37 AM Posted By : Kalyan Ram
Though it is an old post, it is blog-worthy. I take this opportunity to thank Jeff for his genuine feedback. What really interests me is seeing more customers share similar feedback on the ease of use.
....I needed a quick, and preferably easy one to set up.....It grabs a bunch of the normal statistics but where I found it most useful was the alarms and notifications.... I am liking it way better than WebCacti!

In most cases, what Jeff has explained holds good. It is not feasible for admins to spare a huge amount of time to build what they want, especially when they have limited resources.

I happened to speak to a few of our customers who had used traditional script-based monitoring tools before choosing OpManager. Their concerns were similar to what Jeff has stated above. In addition, there was another common concern: customers felt that passing on technical knowledge was the biggest barrier of all. That is, when a new hire or a less experienced IT professional joined the team, it took a long time to train them on the complex scripts that had already been created. Rather than concentrating on their new roles or activities, the original authors were still held responsible for modifying or correcting the scripts they had created.

So, all they need is an easy to deploy network monitoring software that works out-of-the-box.

OpManager’s bundled web server and database let you quickly deploy the product for production. Further, the automatic network discovery and wide collection of monitoring templates help admins start monitoring their devices in a minute.

I know what you will ask next!

There is a Free Edition as well and, as the name says, it never expires. It just has a cap on the number of devices (10 devices) and the number of users (1 user). Apart from that, all other features available in the Professional Edition are available in the Free Edition too. For more information on what's included in the Professional Edition, please refer to our edition comparison page.

Once again, thanks to Jeff for pointing this out.

-
Kalyan Ram
Team OpManager
Network monitoring software from ManageEngine


Have you ever chuckled at a person wearing a full rain coat with the hood but also carrying an umbrella in his hand on a rainy day? Actually, this analogy is no different from your company's Internet service.

With the demand for 100% availability of Internet service, every organization goes for multiple Internet services. However, due to cost constraints, the speed and bandwidth may vary between service providers. This is understandable for a backup link, because you don’t want your users browsing social or unproductive web sites when your main service link is down.

Recently, I happened to visit one of our customers for training and implementation. Their Internet Service Provider (ISP) had provided them with 3 types of Internet services to ensure 100% uptime:
  • An Optical Carrier [OC] based internet - Primary
  • A Radio Frequency [RF] based internet - Secondary
  • A VSAT based internet – Tertiary – Backup link

With these 3 Internet services, the ISP was able to provide high redundancy and availability to this customer at all times. All 3 services were active at all times, with traffic normally flowing only across the OC link.

If the primary link goes down, a static route in the routing table on the core router switches traffic over to the secondary link, without any interfaces coming up or IP addresses changing. Needless to say, the ISP's SLA for network uptime is 97% and is always achieved.

The real catch here is the response time, which is very poor for the RF and VSAT links compared to the OC link. This is especially true of VSAT, which has high latency and a minimum Round Trip Time (RTT) of 550 ms.

The customer was unhappy with the slow internet connectivity with its branches, because at times it took more than 20 minutes to complete a business transaction from the branch offices.

Before upgrading the links or contacting their ISP, the IT team wanted to get all the Ws right. I mean…
  1. Who causes the delay? Is it the application or users?
  2. Whether it is possible to achieve some trade-off in bandwidth usage to provide a better service?
  3. Where does the latency happen? Is it at the service provider’s end or something internal?
  4. When and for how long has the high latency been prevailing?
The IT team was able to answer the first two questions using the traffic and bandwidth analysis module, NetFlow from ManageEngine. However, they were not aware of Cisco IP SLAs and how it can help them monitor their WAN links.

Like NetFlow, Cisco IP SLAs is part of Cisco IOS. Cisco IP SLAs uses active monitoring techniques to let you know how the link is performing! To learn which Cisco IP SLAs features are supported in which IOS versions, click here

More at http://www.cisco.com/en/US/products/ps6602/products_ios_protocol_group_home.html

I created WAN RTT monitors from their core router (Cisco 1841, IOS v12.4) to the branch offices and set a threshold of 150 ms. This works because the path between the source and the destination is always the same; only the quality (latency/RTT) of the Internet connection changes between OC, RF and VSAT.

Now, if the link latency increases beyond the threshold, an alarm is raised immediately to the network team. When they receive an alert from OpManager, the first thing they check these days is the HOP graph, which lets them identify exactly where the high latency was introduced. They can also deduce which type of Internet service is currently in use by measuring the RTT as well as checking the IP addresses in the HOP graph.
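The reasoning in this use case can be sketched in a few lines of Python. This is only an illustration of the logic, not how OpManager implements it: the 150 ms threshold and the 550 ms VSAT minimum come from the description above, while the function name and the wording of the notes are my own assumptions. An RTT reading alone cannot prove which link is active, which is why the team also checks the IP addresses in the HOP graph.

```python
RTT_THRESHOLD_MS = 150   # alarm threshold set on the WAN RTT monitors
VSAT_MIN_RTT_MS = 550    # minimum RTT quoted above for the VSAT link


def assess_rtt(rtt_ms: float):
    """Return (alarm, note) for one RTT sample; a hint, not a diagnosis."""
    alarm = rtt_ms > RTT_THRESHOLD_MS
    if rtt_ms >= VSAT_MIN_RTT_MS:
        note = "consistent with the VSAT backup link"
    elif alarm:
        note = "above threshold; check the HOP graph (RF link or congestion)"
    else:
        note = "within threshold"
    return alarm, note


if __name__ == "__main__":
    for sample in (40, 220, 600):
        print(sample, assess_rtt(sample))
```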

OpManager's WAN RTT dashboards let administrators monitor response time and availability round the clock. To know more about OpManager’s WAN monitoring, click here

Now, they use OpManager’s WAN RTT monitoring extensively with over 100 WAN links to monitor.

I am sure this use case gives you a fair idea of where and how you can use OpManager’s WAN RTT monitoring to isolate and eliminate latency and RTT issues across WAN links.

Signing off for now… and Wishing you happy holidays!

- S Arun Kumar

Image courtesy: chumpysclipart.com

Year 2010 - A quick roundup

Dec 02 2010 05:28:23 AM Posted By : vidya
It has been an exciting year at the OpManager bay. With a couple of awards to its credit, OpManager is all set for the next major release in the new year to come. We would like to take this opportunity to thank all our users for the wonderful support extended to us, and we look forward to delighting you with new features in the coming year!

While there is little doubt that you, our invaluable users, have kept us going, here's a quick summary of what has kept us busy this year:

Our Oscars - 2010

Not to blow our own trumpet, but we can do with some nominations, awards, and press attention every now and then :-) The credit goes to you wonderful people for helping us reach here:

1. 2010 Global Product Excellence Award Winners

Info Security Products Guide, the industry's leading information security research and advisory guide, has named OpManager a winner of the 2010 Global Product Excellence Award. This customer trust honor is the greatest endorsement of the fact that OpManager is ahead of the curve among products that provide holistic network management.



2. WindowsNetworking.com Readers' Choice Award

 
ManageEngine OpManager was voted WindowsNetworking.com Readers’ Choice Award Winner – Second Runner Up. OpManager becomes the only awardee to have remained in the top 5 ranks over three consecutive years while improving rank position each year.


3. Network Management Software - Review

OpManager really hits the mark for reliable network monitoring. It has a great range of features, and is versatile enough to monitor a wide range of devices. It does a great job alerting administrators to network problems. Read more.


From the Bee-hive

Our developers, the busy-bees in the team, together with the testing team, doled out 8 different releases this year. The release numbers and the highlights are given below:

1. Hotfix 8051 (view details)

  • Faster device discovery
  • Improved WMI monitoring on 64 bit Windows server OpManager installations
  • Intermittent issues with OpManager & SDP integration (both applications in latest builds) have been handled
  • Smooth alarm escalation configuration
  • French - locale specific issues handled

2. Hotfix 8052 (view details)
  • A new CLI-based monitor for Partition Details of a device is included.
  • Alarms Details Page Enhancements
  • The name of the log rule that triggers alarms for Syslogs, Event Logs and Traps is shown now.
  • Provision included to edit the rules.
  • Down Time Schedule – Status of the Downtime Schedule (in progress or not) is shown in the schedule listing page.
  • Re–branding – Option added to revert to default settings
3. NCM Plug-in 5450 (view details)
  • Provision for viewing the configuration versions of all/any device(s) by specifying a custom date range
  • Provision to mark configuration changes as authorized/unauthorized in bulk
  • Support for several new device models.
4. GA release of OpManager 8.7 (view details)
5. NCM Plug-in 5500 (view details)
  • Support for MSSQL back-end  (Note: If you are looking to make use of MSSQL back-end, please upgrade to OpManager 8.7 and apply the latest version of NCM Plug-in over that). 
  • Support for additional Syslog message formats for real time change detection for Cisco and Enterasys devices
  • Support for installing the NCM Plug-in on Windows 64-bit machines.
6. Hotfix 8721 - (view details)
  • This particular hotfix had a record number of enhancements, changes and fixes.
7. NetFlow Plugin 8500 (view details)

8. Hotfix 8722 - We are a futuristic lot if you have been following our forums. This release is expected in a couple of weeks, a promise we plan to keep this year

We are curious like that!

Well, not exactly prying, but we certainly wanted to have a peep into your network management needs. Here are the few surveys we carried out. Each one has yielded a wealth of information and is sure to keep the team on its toes for several months from now!

Most of the survey results have been useful development inputs and have helped align our development goals. Besides, the insights from the surveys were used in a couple of webinars we recently hosted.


The buzz on the web

Webinars this year have turned out to be a huge hit, and thanks again for the overwhelming response. We opted not to host a session this month as you guys have a bigger priority ahead - the holiday season :)

OpManager Webinars & Videos

Useful Resources

And finally, here's for some stage presence :)

Interop 2010 




OpManager at the Interop


Gitex 2010



OpManager at the Gitex

Thanks for coming this far! Have a happy weekend and a wonderful season ahead!!

Vidya Vasudevan
OpManager Team

Smart Network Monitoring

Nov 19 2010 01:57:29 AM Posted By : vidya


Go shopping, catch your favorite game of football, party hard, or simply chill out this Thanksgiving.

With OpManager SmartPhone GUI, manage IT from anywhere, anytime. Check out the iPhone GUI:



Happy Thanksgiving!

Vidya
OpManager Team

OpManager has a long list of WMI monitors that cover even Active Directory, MSSQL, Exchange etc. Here are a few self-learned tips to solve some of those common WMI issues easily. 

First, the WMI Diagnosis Utility: an all-out troubleshooter.


WMIDiag.vbs is a VBScript script designed to help you ascertain the current state of the WMI service on a computer. The download package includes the utility itself, a ReadMe file that discusses how the tool works (and how to best use it), and sample spreadsheets that provide information about the default WMI configuration on various versions of the Microsoft Windows operating system.


from WMI Diagnosis Utility

When you run this, it checks the WMI service and generates a report of what is missing and what needs to be done.

Okay. WMI is working fine. What next if you find that some WMI counters are not showing values for a particular device? How do you check whether the device has problems? Is there an easy way to query the device?

Yes, you can do that by using WMI Administrative Tools. Here is the overview from Microsoft's site:

WMI Tools include: 
WMI CIM Studio: view and edit classes, properties, qualifiers, and instances in a CIM repository; run selected methods; generate and compile MOF files. 
WMI Object Browser: view objects, edit property values and qualifiers, and run methods.

Download the tool from WMI Administrative Tools. You can use this to query the WMI classes on the device and get the values for those classes. It's easier than the default wbemtest tool located in C:\Windows\System32\wbem, where you need to type the query in the SQL-like WMI query format (WQL).
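For those who do end up typing queries into wbemtest, the format is simple enough. Here is a small Python sketch that builds such a query string; the helper name `build_wql` is hypothetical, and the example uses the standard `Win32_LogicalDisk` class with its `DeviceID`, `FreeSpace` and `DriveType` properties. The sketch only builds the text of the query; it does not connect to any device.

```python
from typing import Optional, Sequence


def build_wql(wmi_class: str,
              properties: Optional[Sequence[str]] = None,
              where: Optional[str] = None) -> str:
    """Build a WQL SELECT statement as plain text (hypothetical helper)."""
    columns = ", ".join(properties) if properties else "*"
    query = f"SELECT {columns} FROM {wmi_class}"
    if where:
        query += f" WHERE {where}"
    return query


if __name__ == "__main__":
    # Free space on local fixed disks (DriveType 3 means local disk).
    print(build_wql("Win32_LogicalDisk", ["DeviceID", "FreeSpace"], "DriveType = 3"))
```

The printed string can be pasted into the query dialog of wbemtest to see whether the device returns values for that class.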

Happy Monitoring.

Rajasankar

When to define a new Device Template?

Oct 12 2010 01:17:57 AM Posted By : vidya

The device templates in OpManager contain predefined rules based on which a device is categorized and relevant monitors are associated soon after discovery. The ability to define custom templates for new device types or modify an existing template to accommodate yet another device type in the same template provides great flexibility to administrators.

That said, as an administrator, you get the most out of this feature only when you use it to its full potential. For instance, some might end up defining a template for every variant of a device type instead of creating one template that encompasses all the variants.

Ideally, defining device templates before you initiate discovery helps in proper classification. Over 650 device templates are available out of the box. If SNMP is enabled on the monitored devices, and if proper credentials are configured in OpManager, most devices fall into the correct category. Modify an existing template or create a new one based on need.

1. When should I modify/update an existing template?

Assume you have purchased a new Cisco 805 router and you would like to monitor it using OpManager. OpManager already has a device template for Cisco 800 series routers, with a few sysOIDs from this series already included in the template. All you need to do is edit this template to include the sysOID of the Cisco 805 router if it is not present already.
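Conceptually, a template is matched to a device by its sysOID, so updating a template just means adding one more sysOID to its list. The Python sketch below illustrates the idea only; it is not OpManager's implementation, and the sysOID values in it are made-up placeholders rather than real Cisco identifiers.

```python
from typing import Dict, Optional, Set

# Hypothetical template table. The sysOID values are placeholders.
templates: Dict[str, Set[str]] = {
    "Cisco 800 Series": {".1.3.6.1.4.1.9.1.90001", ".1.3.6.1.4.1.9.1.90002"},
}


def find_template(sys_oid: str, table: Dict[str, Set[str]]) -> Optional[str]:
    """Return the name of the template that lists this sysOID, if any."""
    for name, oids in table.items():
        if sys_oid in oids:
            return name
    return None


def add_sysoid(template_name: str, sys_oid: str, table: Dict[str, Set[str]]) -> None:
    """Extend an existing template (or start a new one) with a sysOID."""
    table.setdefault(template_name, set()).add(sys_oid)


if __name__ == "__main__":
    new_router_oid = ".1.3.6.1.4.1.9.1.90003"   # placeholder for the new router
    print(find_template(new_router_oid, templates))   # not classified yet
    add_sysoid("Cisco 800 Series", new_router_oid, templates)
    print(find_template(new_router_oid, templates))   # now matches the template
```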


2. When should I create new templates?

Scenario 1

Let us now assume you have purchased a Cisco 11000 Series Content Services Switch. OpManager does not have a template for it yet (as of Build No. 8721!). Just go ahead and create a new template.

Scenario 2

Assume you have a whole new set of environment sensors that are manageable (that is, they support SNMP). These devices cannot be classified under any of the default categories like servers, routers, switches etc., and deserve a separate category. The parameters to be managed also differ for this new device type. This is an ideal situation in which to define a new category view (e.g. Sensors) and a fresh device template. You can have different models of sensors from the same vendor in a template or even combine sensors from multiple vendors in the same template.

The steps you'd follow here would be as follows:





3. How to check for SNMP response to sysOID

Even before you proceed to add or modify a template to accommodate a new device type, make sure the device is SNMP-enabled and responds to queries from OpManager. Invoke the MibBrowser utility bundled with OpManager to check for a response. Let me show you how.

1. From OpManager/bin, double-click to execute MibBrowser.bat.
2. In the MibBrowser GUI, enter the device name, the SNMP port, and the read community string (default is PUBLIC).
3. The RFC1213 MIB is loaded by default. Expand the MIB to org -> dod -> internet -> mgmt -> mib-2 -> system and select sysObjectID from this table.
4. From the toolbar above, click on the Get SNMP Variable icon (7th icon from the left) to see the response in the text area to your right.
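If you prefer to query by number, the tree path in step 3 translates directly into a numeric OID. The short Python sketch below shows the mapping; the sub-identifiers are the standard MIB-II ones from RFC 1213, while the function and variable names are just for illustration. The trailing .0 is the instance suffix used for scalar objects such as sysObjectID.

```python
# Each node on the path from the root down to sysObjectID, with its number.
SYS_OBJECT_ID_PATH = [
    ("iso", 1), ("org", 3), ("dod", 6), ("internet", 1),
    ("mgmt", 2), ("mib-2", 1), ("system", 1), ("sysObjectID", 2),
]


def numeric_oid(path, instance=0):
    """Join the sub-identifiers on the path and append the instance suffix."""
    return ".".join(str(number) for _, number in path) + f".{instance}"


if __name__ == "__main__":
    print(numeric_oid(SYS_OBJECT_ID_PATH))   # prints 1.3.6.1.2.1.1.2.0
```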


I trust you will find the details shared in this post useful. Feel free to raise any questions you have. We will be glad to assist you!

Regards

Vidya

OpManager Team

Thanks, everyone, for attending the webinar on "Microsoft Infrastructure Monitoring using OpManager" and making it a success. Here is the presentation used in the session. Hope you find it useful.



To see it on Slideshare.net, click here.

If you’ve missed it, no worries! Here is a copy of the recorded webinar. This will also be shared in our upcoming newsletter.



As usual, feel free to share your feedback and suggestions with me at “kalvin@manageengine.com”.

Cheers,
Kalvin Ram
Team OpManager
Network monitoring software from ManageEngine
As a part of routine IT maintenance, it is common to bring a device down or restart it. This applies even to your critical IT devices such as a router, switch, or server. Though you are aware of such maintenance activities, what if your network monitoring software floods you with alerts associated with them?

Most applications let you pass on this on-field knowledge by scheduling a downtime. It enables the network monitoring software to pause routine monitoring of the selected devices during the maintenance time slot.

Are there any fixed patterns to schedule such IT maintenance activities? 

Absolutely not! At times, they are specific to the industry, organization, or IT infrastructure, or even to an ad hoc maintenance plan that has to be scheduled instantly.

As you all know, OpManager has the Downtime Scheduler ("Advanced fault management") functionality with options to schedule a maintenance period Once, Every day or Every week. With the latest hotfix, we’ve also included an option to schedule downtime Every month, for instance on the 1st of every month, the last day of every month, or even the third Sunday of every month, and more...
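To see what rules like "last day of every month" or "third Sunday of every month" resolve to on the calendar, here is a small Python sketch using only the standard library. It is an illustration of the date arithmetic, not of how OpManager's scheduler is implemented, and the function names are my own.

```python
import calendar
from datetime import date


def last_day_of_month(year: int, month: int) -> date:
    """Return the date of the last day of the given month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return date(year, month, days_in_month)


def nth_weekday_of_month(year: int, month: int, weekday: int, n: int) -> date:
    """Return the n-th (1-based) given weekday of the month.

    weekday uses the calendar module's constants (calendar.MONDAY is 0,
    calendar.SUNDAY is 6).
    """
    first_weekday, days_in_month = calendar.monthrange(year, month)
    offset = (weekday - first_weekday) % 7   # days from the 1st to the first match
    day = 1 + offset + 7 * (n - 1)
    if day > days_in_month:
        raise ValueError("the month has no such weekday occurrence")
    return date(year, month, day)


if __name__ == "__main__":
    print(last_day_of_month(2010, 9))
    print(nth_weekday_of_month(2010, 9, calendar.SUNDAY, 3))
```

Note that a rule such as "fifth Sunday" does not exist in every month, which is why the sketch raises an error in that case instead of silently rolling into the next month.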

Here is a snapshot of the new downtime scheduler screen. Hope you find it useful.



So, for those who are using downtime scheduler, feel free to try the new inclusion and share your feedback with us at “opmanager-support@manageengine.com”. If you’ve not tried it before, give it a try and let us know how it worked for you.

---- 
Kalvin
Team OpManager
Network monitoring software from ManageEngine 

VMware Monitoring - what to watch out for

Aug 12 2010 03:58:02 AM Posted By : sreelesh
  • When any VM has a CPU Ready time of over 20% and the host CPU utilization is also over 90%, it signals CPU overcommitment, and you would need to add CPU or enable VMotion
  • More than 1MB/s of swap in or swap out rate signals memory overcommitment
  • From an EMA benchmark report, average performing enterprises had physical CPU utilization of 45% while best performing enterprises had this at 70%. Where do you stand?
The above were just a few “rule of thumb” points and VMware metric descriptions that were covered in our webinar titled “VMware Performance Monitoring – the Must Haves”. The presentation that was used for the webinar is embedded below. While we get ready to host the recorded version, you can write to us if you are interested in the transcript of the webinar: opmanager-marketing[at]manageengine[dot]com
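The first two rules of thumb are easy to express as checks. Here is a minimal Python sketch, assuming you already have the four metrics in hand; the function and parameter names are hypothetical, and the thresholds are the ones quoted in the list above.

```python
from typing import List


def vmware_warnings(cpu_ready_pct: float, host_cpu_pct: float,
                    swap_in_mbps: float, swap_out_mbps: float) -> List[str]:
    """Apply the rule-of-thumb thresholds and return any warnings."""
    warnings = []
    # CPU Ready over 20% on a VM while the host CPU is over 90% busy.
    if cpu_ready_pct > 20 and host_cpu_pct > 90:
        warnings.append("cpu_overcommitment")
    # More than 1 MB/s of swap in or swap out.
    if swap_in_mbps > 1 or swap_out_mbps > 1:
        warnings.append("memory_overcommitment")
    return warnings


if __name__ == "__main__":
    print(vmware_warnings(25, 95, 0.2, 0.1))
    print(vmware_warnings(5, 50, 0.0, 2.5))
```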