A widespread Blue Screen of Death (BSOD) issue on Windows PCs disrupted operations across various sectors, notably impacting airlines, banks, and healthcare providers. The issue was caused by a problematic channel file delivered in an update from the cybersecurity provider CrowdStrike. CrowdStrike confirmed that this crash did not affect Mac or Linux machines.
Although many may view this as an isolated incident, similar problems have been occurring for months with little public attention. Users of Debian and Rocky Linux also experienced significant disruptions as a result of CrowdStrike updates, raising serious concerns about the company's software update and testing procedures. These occurrences highlight potential risks for customers who rely on its products daily.
In April, a CrowdStrike update caused all Debian Linux servers in a civic tech lab to crash simultaneously and refuse to boot. The update proved incompatible with the latest stable version of Debian, despite the specific Linux configuration being supposedly supported. The lab's IT team discovered that removing CrowdStrike allowed the machines to boot and reported the incident.
A team member involved in the incident expressed dissatisfaction with CrowdStrike's delayed response. The company acknowledged the issue a day after it was reported but took weeks to provide a root cause analysis. The analysis revealed that the Debian Linux configuration was not included in CrowdStrike's test matrix.
"Crowdstrike's model seems to be 'we push software to your machines any time we want, whether or not it's urgent, without testing it'," lamented the team member.
This was not an isolated incident. CrowdStrike users also reported similar issues after upgrading to Rocky Linux 9.4, with their servers crashing due to a kernel bug. CrowdStrike support acknowledged the issue. Taken together, these incidents point to a pattern of inadequate testing and insufficient attention to compatibility across different operating systems.
To avoid such issues in the future, CrowdStrike should prioritize rigorous testing across all supported configurations. Additionally, organizations should approach CrowdStrike updates with caution and have contingency plans in place to mitigate potential disruptions.
Source: Y Combinator, Rocky Linux