The CrowdStrike Incident: What Lessons Can SMEs Draw? | British Chamber Of Commerce Singapore

August 19, 2024


When software malfunctions, the impact can be significant for businesses of all sizes. A recent example comes from CrowdStrike, a leading cybersecurity company, whose outage is now a lasting case study in how disastrously a routine update can go. On 19 July 2024, what should have been a routine content update for CrowdStrike’s Falcon sensor turned into a global IT fiasco, crashing Windows hosts around the world. This wasn’t a cyberattack but a self-inflicted error: the update itself triggered an out-of-bounds memory read.

On 6 August 2024, CrowdStrike released its Root Cause Analysis (RCA), shedding light on the incident. While the RCA is a crucial reminder for software companies about the importance of robust development, testing, and deployment practices, it also gives every business, whatever its size or sector, pause for thought. What vital measures are needed to secure our increasingly connected world?

Lessons for SMEs: Prevention Over Cure

Testing Is Critical: For SMEs, thorough testing should never be seen as an optional extra. Test updates yourself and ask your IT providers for full test reports. Combine automated and manual testing to confirm that updates are reliable. A minor glitch can escalate quickly if it is not identified and fixed early.
Phased Rollouts: Software updates should be rolled out in phases, and this remains practical even where cost is a concern. For SMEs, this means starting with a small segment of your systems or users before a full-scale deployment. This approach helps catch potential issues early, minimising the risk of widespread disruption. Where a phased approach is not possible, plan for a significant amount of extra testing.
Know Your IT Landscape: While you don’t need to grasp every detail of your systems, it’s essential to have a clear understanding of your high-level architecture. Know who owns each system, what type of access and update permissions are in place, and which applications are critical to your operations. Additionally, identify your most relied-upon applications and have contingency plans in place should they experience an outage.
System Redundancy Is a Must: Building redundancy into your IT systems is not just for large corporations. SMEs can also benefit from having backup systems and processes in place. This ensures that a single point of failure doesn’t cripple your entire operation.
Robust Support Coverage Is Crucial: Nobody likes paying for insurance or support cover they might not need, but the cost of being unprepared can be catastrophic. Recovering systems in one day rather than three can make or break your business. Having robust support in place isn’t just about mitigating immediate damage; it’s about protecting your reputation.
Communication Is Key During Crises: When an issue arises, keeping customers and stakeholders informed is crucial. Transparent communication can help mitigate damage to your business’s reputation. SMEs should proactively establish clear communication protocols for when things go wrong. These protocols must take into account any legal obligations you have to report IT outages.
Continuous Improvement Is Non-Negotiable: How your business responds to crises is a true measure of resilience. Use every near miss as an opportunity to refine your processes and improve. This mindset will help your SME grow stronger over time.
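
For readers who manage their own updates, or who want to ask an IT provider how theirs are handled, the phased rollout idea above can be sketched in a few lines of Python. This is a minimal illustration, not a description of CrowdStrike's or any vendor's process: the function name phased_rollout, the apply_update and is_healthy callbacks, and the 5% / 25% / 100% wave sizes are all assumptions chosen for the example.

```python
from typing import Callable, Sequence


def phased_rollout(
    hosts: Sequence[str],
    apply_update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    phase_fractions: Sequence[float] = (0.05, 0.25, 1.0),  # assumed wave sizes
) -> list[str]:
    """Update hosts in growing waves and stop as soon as a wave looks unhealthy.

    Returns the hosts that received the update before the rollout
    finished or was halted.
    """
    updated: list[str] = []
    for fraction in phase_fractions:
        # Cumulative number of hosts that should be updated once this wave ends.
        target = max(1, round(len(hosts) * fraction))
        wave = hosts[len(updated):target]
        for host in wave:
            apply_update(host)
            updated.append(host)
        # Check the whole wave before widening the rollout.
        if not all(is_healthy(host) for host in wave):
            return updated  # halt: later waves never receive the update
    return updated


if __name__ == "__main__":
    # Hypothetical fleet of 20 machines; "pc-03" fails after updating.
    machines = [f"pc-{n:02d}" for n in range(1, 21)]
    done = phased_rollout(
        machines,
        apply_update=lambda host: None,           # stand-in for the real update step
        is_healthy=lambda host: host != "pc-03",  # stand-in for a real health check
    )
    print(f"Rollout stopped after {len(done)} of {len(machines)} machines")
```

In this example the fault appears in the second wave, so the rollout halts with 5 of the 20 machines updated and the remaining 15 are never touched. In practice the health check would be something concrete, such as confirming that a machine reboots and reports in, and the wave sizes would depend on how many systems you run.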

The SME Takeaway

Mistakes are inevitable in today’s fast-paced tech environment, but how you respond to them matters. The CrowdStrike incident is a case study in the strategies SMEs should adopt: phased rollouts, rigorous testing, clear communication, system redundancy, robust support coverage, and a focus on continuous improvement. By implementing these practices, your business can better navigate unexpected disruptions and emerge stronger.

By Daisy Radford
Technology Committee Chair