Airline And Emergency Services Halted Worldwide Thanks to A Simple Update 

On Friday morning, Karen came to work for Delta Airlines at 4:30AM like she always did to help the early bird travelers check in and catch their flights.  When she booted up her computer, she saw something she had not seen in 20 years.  It was the “Blue Screen of Death.”   She asked a co-worker, and her computer was showing the same thing.   What was she going to do with all those travelers that can’t check in?  By 10:00AM EDT, Delta had cancelled more than 600 flights.    By Saturday, July 20th, over 4,000 flights would be cancelled throughout the airline industry globally leaving passengers stranded or dealing with hours of delay.   

What happened?  Shortly after midnight, CrowdStrike, a security software provider, pushed out a single content update to its 24,000 customers worldwide.  It was a small update designed to stop new attacks hackers have been using.   On installation, the configuration update triggered a logic error that resulted in the famous Blue Screen of Death.  CrowdStrike could not just back out the patch.  The customer computers were inoperable.  There is no automated way to back out the software.  It required a “Safe Mode” boot which requires someone to be physically next to the device and enter a set of keystrokes during boot.  Only then could the bogus file be removed allowing the computer to operate as normal.   

The impact of this mistake was felt worldwide.  Several states, including Arizona, experienced 911 service outages.   By 3:00AM, the Federal Aviation Administration announced that all Delta, United, Allegiant, and American flights were grounded.  Transportation services in the Northeast, including trains and buses were experiencing delays.  Global banks reported services disruptions, from Australia, South Africa, Israel, and New Zealand.  Hospitals in Germany and the UK were cancelling all non-urgent surgeries due to the event.   Even locally, Maricopa County reported that their Dominion voting machines were malfunctioning due to the automatic update.   

CrowdStrike is a leader in the cybersecurity space.   Their Falcon Sensor product is an endpoint detection response tool.  It goes onto each individual computer and searches and stops known malware from firing.  The company was founded in 2011.   Some may recall that CrowdStrike was called to investigate the alleged Democratic National Convention server hack in 2016.  Since then, the small company has enjoyed tremendous growth and success.  The company says its customers include 298 Fortune 500 companies, eight out of the top 10 financial services firms, seven out of the top 10 manufacturers, six of the top 10 healthcare providers and eight out of the top 10 food and beverage companies.  With this many big names, you can see why the impact of this failed Falcon Sensor update caused such a huge problem.  

It is appalling that any company, much less a global leader like this, would automatically push out software which they had not validated.      There have been rumblings on the internet that this could have been done on purpose for some nefarious reason, but I disagree.   CrowdStrike should have manually validate their software at the developer level and then again at an independent test and verification department level and then again at a pilot customer site before pushing anything out to the world.    

As for the customers caught up in this, we would not recommend immediate auto-updates for anything.   While working in the industry, we regularly waited a day to test the vendor updates and ran through a suite of tests before releasing it to our customers.  The fact that there was no control at the customer level made this event that much worse. 

This event shows us the need for every business to have disaster recovery and contingency plans. Whether it’s due to cyberattacks, technical issues, or natural disasters, having an effective plan is crucial for maintaining business continuity and minimizing downtime. 

In a world where we are increasingly dependent on computers for our businesses to function, be ready to run the old school way as a backup – just in case.    

The original article was published in the Sierra Vista Herald and can be found here.