News spreads like fire, and bad news spreads like wildfire. The whole world witnessed the panic that a network outage caused among United Airlines passengers last week, and its impact has been more than just financial.

Every news website has covered the issue, and that has meant bad press. On top of that, social media sites like Facebook and Twitter erupted with posts and tweets from frustrated passengers fuming at the airline over the disruptions and delays in service.

Long queues, embarrassed front-line employees, clueless middle managers, frustrated passengers, crowded airports, anxious people, a sensation-hungry media and the modern-day ranting grounds of Twitter and Facebook: one can imagine the chaos.

Let us try conducting a root cause analysis.

The network outage hit one of the most critical systems within an airline network: the passenger reservation and ticketing system. Why did it happen? I could find no information on the cause. Probably nobody knows yet.

Traffic volumes can be quite volatile in a network that carries passenger reservation and ticketing systems. In an uncertain economy, spending patterns, and hence demand patterns, are quite unpredictable. One must also understand that the hospitality industry is extremely competitive, and that bad customer service, disruption of services and the associated bad press are sure ways to invite brickbats.

The costs of unavailability are not only financial. Reputation, customer loyalty, customer retention and brand image are all at stake.

The only way to ensure round-the-clock availability of the network is to keep firm control over it. The first step towards taking control of your network is to monitor traffic and stay aware of the applications, protocols and conversations running on it.
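To make "applications, protocols and conversations" concrete, here is a minimal Python sketch of how exported flow records could be summed up along those three dimensions. The `FlowRecord` shape, the `PORT_TO_APP` table and the `summarize` function are hypothetical names made up for illustration; they are not the data model or API of NetFlow Analyzer or any other product.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class FlowRecord(NamedTuple):
    """Hypothetical, simplified flow record (an assumption, not a real export format)."""
    src: str
    dst: str
    protocol: str
    dst_port: int
    byte_count: int


# Illustrative mapping from well-known destination ports to application names.
PORT_TO_APP = {25: "smtp", 53: "dns", 80: "http", 443: "https"}


def summarize(flows: Iterable[FlowRecord]):
    """Total bytes per application, per protocol and per conversation."""
    by_app, by_protocol, by_conversation = Counter(), Counter(), Counter()
    for flow in flows:
        app = PORT_TO_APP.get(flow.dst_port, "other")
        by_app[app] += flow.byte_count
        by_protocol[flow.protocol] += flow.byte_count
        # Sort the endpoints so both directions count as one conversation.
        pair = tuple(sorted((flow.src, flow.dst)))
        by_conversation[pair] += flow.byte_count
    return by_app, by_protocol, by_conversation


flows = [
    FlowRecord("10.0.0.1", "10.0.0.2", "TCP", 443, 1000),
    FlowRecord("10.0.0.2", "10.0.0.1", "TCP", 51000, 500),
    FlowRecord("10.0.0.3", "10.0.0.4", "UDP", 53, 100),
]
by_app, by_protocol, by_conversation = summarize(flows)
print(by_app.most_common(3))
print(by_conversation.most_common(1))
```

Sorting each summary by byte count gives a rough "top talkers" view, which is the kind of picture one would want on screen before anything goes wrong.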

How to fight the battle?

Any system is vulnerable to outages. We do not live in a perfect world, and disturbances are inevitable however undesirable they may be. I am not getting philosophical here, but the fact remains that there is always a small possibility of something going wrong, and the smart ones prepare themselves for such incidents. It could be a minor fault in the system, some junk traffic or an application that slowed down the network by eating up far too much bandwidth. No matter what the cause of the problem is, the impact is huge.

Being prepared for such an incident means having the right tools in place and knowing what to do, and how to do it, in the shortest possible time.

When an outage occurs, the first step is to troubleshoot it, and that takes a great deal of skill and expertise. The shorter the troubleshooting time, the shorter the downtime and the lower the cost. To accelerate troubleshooting, one needs the right tools.

Having a network traffic analytics tool in place is one way of ensuring damage control. With a tool like NetFlow Analyzer, one can see at a glance which applications and conversations are active in the network. This way, a developing abnormality can be identified early and dealt with before it becomes an outage.
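One simple way to spot an abnormality early is to compare the current traffic level with a recent baseline. The sketch below is only an illustration of that idea, assuming per-interval byte counts are already being collected; the function name `is_abnormal` and the threshold of three standard deviations are assumptions chosen for the example, not how NetFlow Analyzer or any particular tool works.

```python
from statistics import mean, stdev
from typing import Sequence


def is_abnormal(history: Sequence[float], current: float, threshold: float = 3.0) -> bool:
    """Flag a traffic sample that sits far above the recent baseline.

    history:   byte counts from earlier intervals (the baseline)
    current:   byte count for the interval being checked
    threshold: how many standard deviations above the mean counts as abnormal
               (3.0 is an assumed default for illustration)
    """
    if len(history) < 2:
        return False  # not enough data to form a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Perfectly flat baseline: any increase at all stands out.
        return current > baseline
    return (current - baseline) / spread > threshold


recent = [100, 110, 90, 105, 95]  # made-up sample values
print(is_abnormal(recent, 110))
print(is_abnormal(recent, 200))
```

A check like this only catches sudden spikes. In practice one would also want to account for daily and weekly traffic patterns before raising an alert.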

In the case of United Airlines, it took 3 hours to restore services, and that was enough time to frustrate both employees and customers. The troubleshooting time could have been cut, and maybe the entire incident avoided, if a powerful tool had been in place.

Learning from the Story:

Knowing what is happening in the network, proactively, is mandatory for any industry. It is especially significant in hospitality services, where the slightest network disturbance can have a rippling and even a crippling effect, leading to financial loss, damaged brand reputation, bad PR and lost customer loyalty.

Use a monitoring tool and stay away from being in the news for the wrong reasons.

Reference links:

http://news.cnet.com/8301-1001_3-57502077-92/united-airlines-network-outage-snarls-air-travel/

http://www.cbsnews.com/8301-505123_162-57502163/united-hit-by-network-outage-widespread-delays/