CrowdStrike outage incident response – content update and future risk mitigation


Candice Wilson | Partner | Western Cape Consulting Leader | Ernst & Young (EY)


As organisations emerge from the recent CrowdStrike Falcon incident, I have taken some time to reflect on what we can learn from it.

Over the past few days, we have seen many responses and opinions apportioning liability and estimating the impact of the incident. This point of view takes a different perspective, reflecting on how to respond positively and in a forward-looking way.

CrowdStrike reflections – incident response learning through adversity

Interconnected ecosystems and supply chains

If we were not already aware of how interconnected we are with our ecosystems and supply chain partners, this incident underscores it. By Microsoft's estimate, fewer than 1% of all Windows machines were impacted, yet nearly everyone across the world was affected in some way.

Inaccessible banking apps, retail stores that couldn’t trade, cancelled flights and meetings, and delayed ‘go-lives’ are just a few examples of how the average person was affected.

‘Fix it Broken’ reactions

There have been some alarming, panicked responses to the incident. Granting everyone privileged access, deleting system files and restoring systems to prior days are examples of how organisations responded ‘in the moment’, under mounting pressure from executives and customers to restore operations.

These panicked reactions have exposed organisations to additional, compounding vulnerabilities, the impact of which may be more significant and far-reaching than the Falcon incident itself.

Resilience is more than disaster recovery

Our ecosystems and technology landscapes have evolved to a point where traditional disaster recovery (DR) and business continuity planning (BCP) are, on their own, no longer sufficient.

The recent incident has shown that some organisations have effectively pivoted to a more holistic approach to resilience, while others remain stuck in traditional DR and BCP mindsets. Those organisations with a more holistic approach were able to respond more decisively, pragmatically and quickly, whereas those following a more traditional DR approach tended to respond ‘blindly’ with panicked, ‘in-the-moment’ decision-making.

Next steps – emerging from the CrowdStrike crisis stronger

Residual mop up and remediation

As we emerge from the storm, it’s important to assess the potential damage and implement robust, sustainable measures to remediate whatever was impacted. Careful and thorough assessments of the residual impact need to be carried out. This goes beyond just the CrowdStrike fixes and extends across the technology and business landscape.

Where workarounds were implemented, these need to be carefully investigated, not only to replace them with more permanent solutions but also to assess whether they introduced vulnerabilities or exposed the environment to potential threats.

Decisions made during the incident need to be revisited to ensure that nothing was inadvertently broken or compromised as a result.
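
One way to keep track of this review is a simple register of the emergency changes made during the incident. The following is a minimal sketch only: the `Workaround` record, its fields, the `open_items` helper and the sample entries (including the dates) are all hypothetical and would need to be adapted to your own change and incident tooling.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Workaround:
    """One emergency change made during the incident (hypothetical record)."""
    description: str
    owner: str
    applied_on: date
    reverted_on: Optional[date] = None   # None means the workaround is still in place
    security_reviewed: bool = False      # has someone assessed the exposure it created?


def open_items(register: List[Workaround]) -> List[Workaround]:
    """Return workarounds that are still in place or have not been security-reviewed."""
    return [w for w in register if w.reverted_on is None or not w.security_reviewed]


# Illustrative entries only; not taken from any real environment.
register = [
    Workaround(
        description="Local administrator rights granted to all service desk users",
        owner="IT Ops",
        applied_on=date(2024, 7, 19),
    ),
    Workaround(
        description="Servers restored from the previous day's backup",
        owner="Infrastructure",
        applied_on=date(2024, 7, 19),
        reverted_on=date(2024, 7, 22),
        security_reviewed=True,
    ),
]

for item in open_items(register):
    print(f"Follow up with {item.owner}: {item.description}")
```

The point of the design is that an item only drops off the list once it has been both reversed and reviewed, so a workaround that was quietly removed without anyone assessing what it exposed still surfaces for follow-up.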

Cybersecurity – new threat profile considerations

An onslaught of attacks is expected as a result of the incident. This is because organisations have inadvertently revealed the inner workings of their defences (both those running CrowdStrike and those that are not). This inevitably heightens and changes the threat profile.

It is critically important that organisations respond by adapting their detection and response to the new threat profile.

Evaluate crisis response

An actual crisis is always the best test of resilience and crisis response plans. In this sense, it’s an opportunity to identify areas where response plans can be improved.

Were decisions and actions clearly defined and understood? Did the response plans provide practical and fit-for-purpose support in dealing with the incident? Was information readily available and accurate to support decision-making? Was the response dependent on a couple of key people (what would happen if they weren’t available)? How did your vendors and ecosystem partners respond (in restoring the services that you require as well as in supporting your recovery)?

Answering these and other key questions is an opportunity to refine and improve your response plans.

In conclusion

To enhance your organisation’s security, it’s important to perform detailed impact assessments and create remediation plans. Continual refinement of tools and processes is essential for robust security.

Improving threat profiling and response strategies is key, as is regularly reviewing and practising crisis and incident response plans for prompt, effective action during security events.



Related FAQs: CrowdStrike incident – outage and response

Q: What was the recent CrowdStrike outage incident about?

A: The recent CrowdStrike outage incident involved a problem with a configuration update that caused system crashes on affected Windows hosts.

Q: How was the CrowdStrike incident response handled?

A: CrowdStrike reverted the faulty content update, published remediation guidance for affected hosts, and conducted a preliminary post-incident review to understand what went wrong.

Q: Were all CrowdStrike customers affected by the outage?

A: No. Only customers running the Falcon sensor on Windows hosts that received the faulty update were affected; Mac and Linux hosts were not impacted.

Q: What was the impact of the outage on the affected systems?

A: The outage caused blue screen of death (BSOD) errors due to out-of-bounds memory access by the CrowdStrike content interpreter.

Q: How does CrowdStrike plan to mitigate future risks of such incidents?

A: CrowdStrike is implementing more rigorous testing procedures for content updates to prevent similar incidents in the future.

Q: Did CrowdStrike publicly address the issue and take responsibility?

A: Yes, CrowdStrike CEO George Kurtz publicly acknowledged the issue, apologised for the disruption caused, and detailed the steps being taken to avoid such incidents in the future.

Q: Is this outage the first of its kind in CrowdStrike’s history?

A: No, this outage is not the first in CrowdStrike’s history, but it has prompted the company to enhance its incident response services and content update processes.
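
For readers who want a concrete picture of the out-of-bounds access and the pre-release testing mentioned in the answers above, the following is a simplified sketch of the general defensive pattern. It is not CrowdStrike's code: the function names, the `EXPECTED_FIELDS` value and the sample update are hypothetical. Python itself raises an error rather than reading stray memory, so the sketch illustrates the checks, not the crash.

```python
from typing import List, Optional

# Hypothetical: the number of input fields the interpreter is built to read.
EXPECTED_FIELDS = 5


class ContentValidationError(Exception):
    """Raised when a content update does not match what the interpreter expects."""


def validate_update(field_values: List[str], expected_fields: int = EXPECTED_FIELDS) -> None:
    """Pre-release check: reject an update whose field count does not match."""
    if len(field_values) != expected_fields:
        raise ContentValidationError(
            f"update supplies {len(field_values)} fields, interpreter expects {expected_fields}"
        )


def read_field(field_values: List[str], index: int) -> Optional[str]:
    """Runtime check: never read past the end of the supplied fields."""
    if not 0 <= index < len(field_values):
        return None  # fail safe instead of reading outside the data
    return field_values[index]


# Illustrative update that is one field short of what the interpreter expects.
short_update = ["a", "b", "c", "d"]

try:
    validate_update(short_update)
except ContentValidationError as err:
    print(f"Update rejected before release: {err}")

# Even if a bad update slipped through, the runtime check returns None
# rather than reading beyond the supplied data.
print(read_field(short_update, 4))
```

The two checks are deliberately independent: validation before release catches a malformed update early, and the bounds check at read time limits the damage if validation is ever bypassed.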



 


