CrowdStrike has hired two outside security firms to review the Falcon sensor code that sparked a global IT outage last month – but it may not have an awful lot to find, because CrowdStrike has identified the simple mistake that caused the incident.
News of the review emerged in a root causes analysis [PDF] published on Tuesday.
As we learned from CrowdStrike's first post-incident review of the flawed code – which bricked millions of Windows machines worldwide – the problem began back in February.
That was when the security vendor added a sensor to Falcon to help the software detect novel attack techniques that abuse named pipes and other Windows interprocess communication (IPC) mechanisms. The update went through the usual development and testing, and then CrowdStrike pushed a new "Template Type" including the IPC-related info to its Falcon sensors in a "Channel File" numbered 291. That file was periodically revised to add new "Template Instances" that improve Falcon's attack-detection prowess. Info in Template Instances is processed by a "Content Interpreter" when Falcon swoops into action.
The root causes analysis provided a deeper look at what went wrong next.
Then, as CrowdStrike also previously explained, two further IPC-related Template Instances were automatically deployed to Falcon users on July 19. One of these used a non-wildcard matching criterion for the 21st input. This resulted in a new version of the Channel File that required Falcon sensors to inspect the 21st input – but the Content Interpreter expected only 20 values.
"Therefore, the attempt to access the 21st value produced an out-of-bounds memory read beyond the end of the input data array and resulted in a system crash," the security shop explained in the root cause analysis.
CrowdStrike has coded a fix to ensure that mismatches between the number of inputs validated and the number of inputs actually supplied don't happen again. It's a patch for the Sensor Content Compiler – this is the function that validates the number of inputs provided by the template type – and it went into production on July 27.
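A build-time check of the kind described might look roughly like the following. This is an illustrative sketch only: `TemplateType`, its fields, and `validate_template_type` are assumed names, and the article does not describe how the real Sensor Content Compiler is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemplateType:
    """Hypothetical description of a template type."""
    name: str
    defined_fields: int    # fields a Template Instance may set criteria on
    supplied_inputs: int   # values the sensor actually hands over at runtime


def validate_template_type(template_type: TemplateType) -> None:
    """Reject a template type whose field count and input count disagree."""
    if template_type.defined_fields != template_type.supplied_inputs:
        raise ValueError(
            f"{template_type.name}: defines {template_type.defined_fields} "
            f"fields but only {template_type.supplied_inputs} inputs are supplied"
        )


ipc = TemplateType(name="IPC", defined_fields=21, supplied_inputs=20)
try:
    validate_template_type(ipc)
except ValueError as exc:
    print("rejected at build time:", exc)
```

Catching the disagreement here means a mismatched definition never reaches a customer machine, whatever criteria later instances happen to use.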
CrowdStrike also wrote that it has added runtime input array bounds checks to the Content Interpreter for Rapid Response updates, to ensure the size of the input array matches the number of expected inputs. These fixes are currently being backported to all Windows sensor versions 7.11 and above with a sensor software hotfix. The release will be generally available by August 9.
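A runtime guard along those lines can be sketched in a few lines of Python. The function name `matches_checked`, the `WILDCARD` marker, and the choice to return `False` on a size mismatch are all assumptions made for illustration; the article does not say how the real Content Interpreter reacts when the check fails.

```python
WILDCARD = "*"  # hypothetical "match anything" marker


def matches_checked(criteria, inputs):
    """Evaluate criteria only if the input array has the expected size."""
    if len(inputs) != len(criteria):
        # Size mismatch: refuse to evaluate rather than index past the end.
        return False
    return all(
        criterion == WILDCARD or value == criterion
        for criterion, value in zip(criteria, inputs)
    )


inputs = ["value"] * 20                          # only 20 values supplied
instance = [WILDCARD] * 20 + ["pipe-name"]       # criteria for 21 fields

print(matches_checked(instance, inputs))         # False: sizes disagree
print(matches_checked([WILDCARD] * 20, inputs))  # True: sizes agree, all wildcards
```

The point of the guard is that a bad content file degrades to a failed match instead of a crash.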
Additionally, the chastened security vendor is doing more tests – including some that test non-wildcard matching criteria for each field across all template types, and new checks to ensure that flawed files aren't pushed to Falcon customers in the future.
Further, as CrowdStrike had noted in its earlier analysis, every Template Instance will henceforth be deployed to customers in a staged rollout, rather than being pushed to all users all at once.
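For readers unfamiliar with the idea, a staged rollout can be sketched as below. Everything here is hypothetical, including the stage percentages, the host naming, and the hashing scheme; CrowdStrike has not published how its deployment rings are assigned.

```python
import hashlib

STAGE_PERCENTAGES = [1, 10, 50, 100]  # hypothetical share of hosts per stage


def bucket(host_id: str) -> int:
    """Map a host ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(host_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def receives_update(host_id: str, stage: int) -> bool:
    """True if the host falls inside the given stage's share of the fleet."""
    return bucket(host_id) < STAGE_PERCENTAGES[stage]


hosts = [f"host-{n}" for n in range(1000)]
for stage, percent in enumerate(STAGE_PERCENTAGES):
    count = sum(receives_update(host, stage) for host in hosts)
    print(f"stage {stage} (target {percent}%): {count} of {len(hosts)} hosts")
```

Because each host's bucket is fixed, a host that receives the update at one stage keeps it at every later stage, and a rollout halted at an early stage limits the damage to a small slice of machines.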
It's worth noting that the company is being sued by investors for not originally using this type of phased approach in sending updates to customers.
"Looking ahead, CrowdStrike is focused on using the lessons learned from this incident to better serve our customers," a spokesperson declared. "CrowdStrike remains steadfast in our mission to protect customers and stop breaches."
But not so steadfast that it’s naming the partners it hired to review its code.
Those reviews have commenced, and are focused on the code and processes that led to the July 19 fiasco.
"We are not providing information on the vendors who are doing work for us beyond what is referenced in the RCA," the CrowdStrike spokesperson told The Register. ®