Identifying the Problem of Disruption

Recently, Dyn, the internet performance management company, published findings on the impact internet disruptions have on UK organizations. You can read a summary here, but it is interesting to note that:

  1. 57% of the internet disruptions UK organizations experienced in the past year have occurred outside their network control
  2. A quarter of UK organizations find it extremely or very difficult to monitor and identify when an internet disruption occurs outside of their network control, whilst nearly a third (31%) find it extremely or very difficult to resolve an issue outside their network control
  3. When asked about the biggest impact and risks resulting from disruptions, nearly a third of UK organizations (30%) have experienced, or would expect to experience, loss of revenue and nearly a quarter (23%) would expect a loss of new business as a result of an internet disruption

Yet, despite this, just four in 10 UK companies (39%) monitor their network activity and identify patterns.

The situation is better in the US when it comes to remediation of incidents, something that US organizations do twice as well as UK businesses. However, the pertinent point still is: why is this happening, and more importantly, what can organizations do to insulate themselves from the real business risk?

Let’s answer the first question: why do so many disruptions occur outside of the organization’s network control, and why are they hard to monitor and identify? The answer to the first part of the question is intuitive: organizations have moved to the cloud, outsourced operations and services, and hired best-in-class third parties to do everything from development to billing. When an IT supply chain is as stretched and diverse as this, there are bound to be gaps or faults in it. We have migrated from an IT supply chain that was:

  • Self-hosted data center
  • Redundant connections to the internet (one dependency)

To:

  • Cloud hosted data center (one dependency)
  • Credit card processing (another dependency)
  • Development performed by a third party (a third dependency, plus all the dependencies that third party has)
  • Content served on Content Delivery Networks (CDNs) (a fourth dependency)

This is not necessarily bad – best-in-class providers are motivated to provide superior service, but there are also areas of vulnerability in an IT supply chain that need to be carefully analyzed and addressed.

Which brings us to the second part of the question: why are they hard to monitor and identify? This is also somewhat intuitive to answer – organizations make assumptions. They assume that a cloud data center is fully redundant (which it generally is), but they don’t plan continuity contingencies if an entire cloud region goes down. They don’t monitor outside of their own network, because they assume their provider is doing it (which it generally is), but they don’t plan to have access to this instrumentation during outages. When an incident does occur, they scramble because half their services are outside their network, and they never planned for failure of the other parts of the IT supply chain.
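
To make the idea of monitoring outside your own network concrete, here is a minimal Python sketch of an independent check that probes each third-party endpoint directly rather than relying on the provider's own dashboards. The function names, the injectable `probe` parameter and any URLs you pass in are illustrative assumptions, not part of Dyn's findings or of any vendor's tooling.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, List


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers successfully within the timeout.

    Assumption: each provider exposes some URL that is meaningful to poll.
    urlopen raises HTTPError for 4xx/5xx responses, which is caught below.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout, HTTP error or bad URL.
        return False


def check_dependencies(
    endpoints: Dict[str, str],
    probe: Callable[[str], bool] = http_probe,
) -> Dict[str, bool]:
    """Probe every supply chain endpoint and map its name to up/down."""
    return {name: probe(url) for name, url in endpoints.items()}


def failing(results: Dict[str, bool]) -> List[str]:
    """List the dependencies that did not respond, in alphabetical order."""
    return sorted(name for name, ok in results.items() if not ok)
```

Run from a location outside the organization's own network and its providers (a scheduler on a separate host, for example), a check like this still reports when a provider's own instrumentation is unreachable. The `probe` argument is injectable so that the logic can be exercised without real network calls.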

So, what’s an organization to do if they’re in this situation? Clearly, it’s not something that can be ‘fixed’ overnight, but perhaps there are some tactical and strategic steps that can be taken to lessen the risk over time. An organization will want to:

  1. Perform a complete data and process flow, identifying points of business continuity vulnerability, including third party services
  2. For each third party service or IT supply chain integration point, review and rate continuity options (for example, for cloud service providers, the organization may have the option to have failover systems in different regions of the world to mitigate the risk; for third party development, they may have the option to check in escrow copies of code on a nightly basis)
  3. Collaborate with business management to define an acceptable level of business continuity risk
  4. Define the types of controls to be implemented to meet the business requirement:
    • Preventive controls, such as redundant installations, hot backups, etc.
    • Detective controls, such as independent monitoring functions that will alert if any part of the supply chain goes down
    • Mitigating controls, such as the ability to use slightly less functional ‘copies’ until the core systems are back online (e.g. phone in for credit card orders rather than online)
  5. Using the risk-based rating method from step 2, address the highest risk supply chain element, then the next, until the business risk is considered acceptable
  6. Consider codifying the requirements of their service providers into contractual language that addresses the types of controls the service providers are expected to implement, including collaboration during incidents and compensation for lost services.
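
The rating and prioritization in steps 2, 3 and 5 can be sketched in a few lines of Python. This is only an illustration: the 1 to 5 scales, the likelihood-times-impact score and the example ratings are assumptions chosen for the sketch, and an organization would substitute whatever rating method it agrees with business management.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Dependency:
    """One third-party service or integration point in the IT supply chain."""
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (frequent)
    impact: int      # assumed scale: 1 (minor) to 5 (severe)

    @property
    def risk(self) -> int:
        # Assumed scoring: a simple likelihood x impact product.
        return self.likelihood * self.impact


def remediation_order(deps: List[Dependency], acceptable_risk: int) -> List[Dependency]:
    """Return dependencies above the agreed threshold, highest risk first."""
    above = [d for d in deps if d.risk > acceptable_risk]
    return sorted(above, key=lambda d: d.risk, reverse=True)


# Hypothetical ratings for the supply chain described earlier in the article.
supply_chain = [
    Dependency("cloud hosting", likelihood=2, impact=5),
    Dependency("card processing", likelihood=2, impact=4),
    Dependency("third-party development", likelihood=3, impact=3),
    Dependency("CDN", likelihood=2, impact=2),
]

for dep in remediation_order(supply_chain, acceptable_risk=6):
    print(f"{dep.name}: risk {dep.risk}")
```

With these made-up ratings the cloud host would be addressed first, then third-party development, then card processing, while the CDN already falls within the accepted level of risk. Re-rating each element after controls are added and re-running the ordering mirrors the "highest risk first, then the next" loop in step 5.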

To summarize: as companies evolve in their use of technology and service providers, including cloud, third party outsourcing, CDNs and SaaS/PaaS/IaaS, the old methods of monitoring their networks are becoming obsolete. These newly sophisticated and complex IT supply chains require additional due diligence, including understanding IT process flows, implementing additional monitoring and mitigation controls, and requiring third party service providers to assist in incident investigation and remediation.
