IR and the Bathtub Curve

Over the last decade, cyber-attacks have become one of the biggest threats to organizations, and many publications (including this one) have given good advice about how to implement an information security program. Many of you will have taken these concepts and applied them within your organization, but how do you know that your SOC or incident response team is functioning as expected?

Reliability engineering has a widely used term, “the bathtub curve”, which describes how failure rates change over the life of a system: high at first, low and stable in the middle, and rising again toward the end. The term is derived from the cross section of a bathtub (steep sides and a flat bottom). This model does a very good job of showing how a standard security monitoring program functions over time. The image below shows a classic bathtub curve for a SOC:

Figure 1: The bathtub curve

In the first phase we see a typical ad hoc approach to incident handling and response, which leads to delays in remediating attacks. Over time, as more procedures, processes, and runbooks are put in place, the team is able to respond in an organized manner, greatly reducing the number of errors and the delay in threat remediation.

However, as time goes on, processes and procedures become outdated and new, evolving threats are not scoped or identified, leading to a gradual decrease in the effectiveness of the team. To ensure that SOC and incident response programs stay on track, there must be an emphasis on keeping documentation and procedures up to date, and there must also be a testing process to confirm that the teams are functioning as expected.
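The three phases described above can be sketched numerically. In reliability engineering, a bathtub-shaped failure rate is often approximated as the sum of a decreasing term, a constant term, and an increasing term. The following is a minimal illustration of that idea and not a measured model of any SOC: the function names, the Weibull form chosen for the early and late phases, and every parameter value are assumptions made for this example.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate (shape / scale) * (t / scale) ** (shape - 1), for t > 0.

    shape < 1 gives a falling rate, shape > 1 gives a rising rate.
    """
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_error_rate(month, early=(0.5, 10.0), baseline=0.01, drift=(4.0, 40.0)):
    """Illustrative relative error rate of a SOC at a given month (month > 0).

    early    -- (shape, scale) of the falling ad hoc phase (hypothetical values)
    baseline -- constant rate once runbooks are in place (hypothetical value)
    drift    -- (shape, scale) of the rising phase as procedures go stale
                (hypothetical values)
    """
    return weibull_hazard(month, *early) + baseline + weibull_hazard(month, *drift)


if __name__ == "__main__":
    for month in (1, 6, 15, 36, 48):
        print(f"month {month:2d}: relative error rate {bathtub_error_rate(month):.3f}")
```

With these illustrative parameters the rate is higher at month 1 and at month 48 than at month 15, which is the bathtub shape. Regular testing can be thought of as an attempt to keep the rising third term from taking over.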

Looking once more at reliability engineering, this concept is best shown by the steps the aircraft industry takes to ensure that all systems and infrastructure on an aircraft are tested. These tests were introduced in the 1950s following two crashes of the de Havilland Comet airliner, both of which were attributed to metal fatigue.

In modern testing, each part of the airframe is stress tested to identify the point at which components will fail. Figure 2 shows a dramatic example of this, with the wings of a test Boeing 787 being put under load:

Figure 2: Stress testing Boeing 787 wings

So, if this is how the aerospace industry conducts its testing, how do we in the cybersecurity field conduct our reliability testing? In short, we don’t. Yes, red teaming and penetration testing have their place, but they test the overall security architecture and not the security monitoring and incident response functions or their staff.

What is called for is an Orchestrated Maturity Assessment or purple team engagement to test all of the incident handling processes and procedures in a live exercise (ideally using the production network).

While the Orchestrated Maturity Assessment should not be considered a penetration test, there are elements of the exercise that can be described as “ethical hacking”. It is therefore important that the scope of the exercise, and detailed descriptions of the rules and methodology under which the testing will take place, are agreed in advance.
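One way to make an agreed scope enforceable during the exercise is to record it in machine-readable form and check every target against it before acting. The sketch below is only an assumption about how that might look: the `ExerciseScope` class, its fields, and the addresses shown are hypothetical and do not come from any particular tool or engagement.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network


@dataclass
class ExerciseScope:
    """Hypothetical record of the targets agreed for the exercise."""

    in_scope_networks: list                              # CIDR strings agreed in advance
    excluded_hosts: set = field(default_factory=set)     # hosts that must never be touched

    def allows(self, host: str) -> bool:
        """Return True only if the host is inside an agreed network and not excluded."""
        if host in self.excluded_hosts:
            return False
        addr = ip_address(host)
        return any(addr in ip_network(net) for net in self.in_scope_networks)


if __name__ == "__main__":
    # Example values only; a real scope comes from the signed agreement.
    scope = ExerciseScope(["10.20.0.0/16"], excluded_hosts={"10.20.0.5"})
    for host in ("10.20.1.7", "10.20.0.5", "192.168.1.1"):
        print(host, "in scope" if scope.allows(host) else "OUT OF SCOPE")
```

Keeping the exclusion list separate from the network ranges means a critical production host can be carved out without redrawing the whole scope.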

What makes this type of assessment unique is that each SOC/IR team’s design and implementation is different, as are its processes and procedures. Building a customized live exercise for the analysts in their production environment gives them a real-world scenario to be tested against in a safe yet effective manner.

An organization’s management gets a good understanding of where gaps currently exist, and the security monitoring program is stress tested to identify weaknesses. The after-action report highlights the areas that need attention so that operations continue to function productively. The following tasks are recommended when conducting an Orchestrated Maturity Assessment:

  • Agree a scope for the testing
  • Review all pertinent processes and procedures
  • Design an assessment scenario that aligns with those processes and procedures
  • Gather open source intelligence on the target organization
  • Create the scenario and exercise binaries, and compile the list of email addresses to be targeted in the phishing attack
  • Identify flags for the SOC/IR staff to discover
  • Map each flag to the appropriate procedure
  • Conduct the exercise
  • Assess the flags captured and the processes followed
  • Deliver a formal report with findings and recommendations
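The flag-related steps in the list above (identifying flags, mapping them to procedures, and assessing what was captured) can be sketched as a small data structure. This is an illustrative outline under stated assumptions, not a prescribed tool: the `Flag` class, the `assess` function, and all flag and runbook names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flag:
    """A planted indicator and the procedure expected to lead analysts to it."""

    name: str
    procedure: str


def assess(flags, captured, followed):
    """Group exercise results by procedure.

    flags    -- every Flag planted in the scenario
    captured -- names of the flags the SOC/IR staff actually found
    followed -- names of the procedures observers saw being followed
    """
    report = {}
    for flag in flags:
        entry = report.setdefault(
            flag.procedure,
            {"captured": [], "missed": [], "procedure_followed": flag.procedure in followed},
        )
        entry["captured" if flag.name in captured else "missed"].append(flag.name)
    return report


if __name__ == "__main__":
    # Hypothetical scenario data for illustration only.
    planted = [
        Flag("phishing-email-reported", "Phishing runbook"),
        Flag("beacon-detected", "Malware runbook"),
        Flag("lateral-movement-detected", "Malware runbook"),
    ]
    results = assess(
        planted,
        captured={"phishing-email-reported", "beacon-detected"},
        followed={"Phishing runbook"},
    )
    for procedure, entry in results.items():
        print(procedure, entry)
```

Grouping by procedure rather than by flag keeps the focus of the final report on which runbooks need attention, not on individual analyst scores.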

The results of the assessment give an organization’s management team the information required to adapt their response capabilities and remain focused on the specific threats to their organization, thus prolonging the flat, low-error portion of their bathtub curve.

It goes without saying that this procedure should be conducted on a regular basis (at least twice a year), with new and varied scenarios, to continually assess and improve your team’s effectiveness.
