Last week, what seemed like a routine update from CrowdStrike, a leading cybersecurity firm that usually works behind the scenes, turned into a massive IT catastrophe, causing system outages that affected businesses and individuals globally. Imagine waking up on a typical Friday morning to learn that a security update has caused computers to crash, limiting your ability to serve customers, work, travel, and carry out many routine activities. That’s exactly what happened when CrowdStrike pushed what should have been a routine update to its Falcon security software running on Microsoft Windows machines, leading to the infamous “Blue Screen of Death” (BSOD) we all fear. This wasn’t a small hiccup; it disrupted operations across numerous industries, public infrastructure, and even some personal computers, causing chaos that some organizations have not yet fully recovered from. CrowdStrike quickly rolled back the update, but machines that had already crashed required manual intervention: rebooting into Safe Mode and deleting specific files. This was no small feat, especially for large organizations with thousands of devices.
The impact on airlines was significant, with over 5,000 flights (and counting) canceled or delayed. Or, in the case of many: delayed, delayed, delayed, and then canceled. We’ve all seen the media footage of airports with long lines, frustrated passengers, and airline staff scrambling to manage the situation with the limited manual tools and resources available. For instance, at Hong Kong International Airport, staff had to use handwritten signs to direct passengers because their check-in systems were down. Major airlines like Cathay Pacific faced booking system failures, making the situation even more chaotic, and the crew scheduling systems of many of our domestic airlines were down as well.
With systems down, airlines struggled to handle the flood of inquiries. Call centers and airline websites were overwhelmed, leading to long wait times and mounting frustration among passengers. Without their usual channels, airlines found it hard to provide timely updates. This left passengers in the dark, often stuck in airports for days, which only added to their frustration and stress. Ancillary businesses such as hotels, car rental services, trains, and other alternate means of travel or travel comfort were affected too, leaving travelers with few ways to solve their own problems.
Of course, this happened while I was traveling. I was on three flights in four days and was lucky not to have any disruptions, but many of my friends did not have the same luck and suffered through the madness. I heard on the news about a family on their way to vacation who were rerouted because of this fiasco. The layover in their connecting city was over 300 hours, almost two weeks! That may have been longer than their vacation. This is just one of thousands of stories of travelers’ plans and lives disrupted. It’s easy to see why frustration runs high for both customers and employees during an event like this.
Hindsight is always 20/20. Airlines could have used multiple channels like social media, email, and SMS to keep passengers informed. Real-time updates go a long way in reducing anxiety and confusion, as would simple loudspeaker announcements at the airports offering clear instructions on how to handle common problems. Having teams specifically trained for crises can make a huge difference. These teams can focus on managing the surge in inquiries and ensuring that consistent information is provided across all platforms. Backup systems or manual check-in processes could have been rolled out more smoothly, and training staff on these procedures beforehand might have ensured a more organized response. Temporary call center staff or AI-driven chatbots could help manage the increased volume of inquiries, so passengers can still get the help they need even when human agents are swamped. Regular drills that simulate various IT failure scenarios can prepare teams for real-life incidents; they help identify potential weaknesses and improve overall readiness. And if all else fails, simple empathy, kindness, and leadership from each employee, and of course from the passengers affected, can go a very long way.
The CrowdStrike outage was a tough lesson for many businesses, highlighting the need not just for robust IT infrastructure and effective disaster recovery plans, but also for better training for their employees. For airlines and other affected sectors, it underscored the importance of being prepared for unexpected disruptions and having solid customer service strategies in place. It should also underscore the importance of employee professionalism and grace under pressure. As passengers, we understand that the problems are out of the employees’ control, but most of us react differently when we are treated with empathy rather than as an annoyance. Empathy doesn’t solve the problem, but it does make for a better experience and may be the difference between a lost customer and a loyal one. It’s all about being ready, being responsive, and above all, being empathetic to those who are affected.
As customers, we should remember that during a crisis, everyone is doing their best to resolve the issues. Patience and understanding can go a long way in making a difficult situation more manageable for everyone involved. Sometimes, even a little humor helps.
“When you assume negative intent, you’re angry. If you take away that anger and assume positive intent, you will be amazed.” — Indra Nooyi
Let’s turn lessons into action. If you’ve faced similar IT disruptions or customer service challenges, share your strategies and tips for handling such crises. Your experiences can provide valuable insights for others. Leave a comment below or share this blog to spark a discussion within your community!
Have an unchaotic weekend!!
-Vijay