How to Achieve Cloud Operational Excellence

In the mid-1990s, Gartner acquired an IT metrics firm called Real Decisions. The firm offered benchmarking services so customers could compare their IT efficiency with that of similar organizations.

With hundreds of ‘Global 2000’-sized customers, their database was rich. Over time, Real Decisions refined its catalog of metrics and enriched the data with repeat studies to develop indices of efficiency. 

Within their user population, the cost per unit of productive work was 11 times lower in the top 10% than in the bottom 10%. That doesn't mean 11% better; it means the bottom group's unit cost was 1,100% of the top group's.
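
The distinction between an 11-fold gap and an 11% gap can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `efficiency_gap` and the unit costs of 1.0 and 11.0 are assumptions chosen to reproduce the ratio quoted above, not figures from the Real Decisions database.

```python
def efficiency_gap(top_unit_cost: float, bottom_unit_cost: float) -> dict:
    """Compare the unit costs of two groups (hypothetical helper, not from the study)."""
    ratio = bottom_unit_cost / top_unit_cost
    return {
        # How many times more the bottom group pays per unit of work
        "ratio": ratio,
        # The bottom group's unit cost expressed as a percentage of the top group's
        "bottom_as_pct_of_top": ratio * 100,
        # How much less the top group spends per unit, as a percentage
        "top_saving_pct": (1 - top_unit_cost / bottom_unit_cost) * 100,
    }


# Assumed figures, picked only so that the ratio comes out at 11.
gap = efficiency_gap(top_unit_cost=1.0, bottom_unit_cost=11.0)
print(
    f"{gap['ratio']:.0f}x gap; bottom cost is {gap['bottom_as_pct_of_top']:.0f}% of top; "
    f"top spends {gap['top_saving_pct']:.1f}% less per unit"
)
```

Under these assumed numbers, the top group spends about 91% less per unit of work than the bottom group, which is a far larger difference than an 11% improvement would be.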

Note that the user base was self-selected. All participants wanted to get objective metrics of performance relevant to their business goals and paid for extensive studies involving questionnaires, financial audits and technical benchmarks.

In short, these were all industry leaders. The bottom 10% of this segment is still within the top 10% of the IT industry, and their score is 11 times worse than the best of the best. 

This raises the question: What is the industry average for IT efficiency? Is it possible that the hundreds of benchmarking users are all doing IT wrong, and the search for relevant metrics is misguided? I think not. 

Choosing the Right Metrics

In 1911, Frederick W. Taylor published The Principles of Scientific Management, which discussed approaches to optimizing two important variables: output quality and worker compensation. Taylor recognized that successful firms work collaboratively, with management and workers jointly setting goals and developing methods and tools to achieve both profitability and proportionate compensation.

Nowhere in this text – or in any of his recorded speeches or documents – does he say, “If you can’t measure it, you can’t manage it.” He didn’t say that for two reasons: First, he did not believe it; second, it is not true.

What he did emphasize was that if you do measure it, you will manage it. That was a warning: Pick the right metrics or you will exert effort pursuing a meaningless goal. 

The journey to cloud excellence starts with developing the metrics that are most relevant to your business goals.

Sustaining Excellence 

After you agree on a set of metrics that are statistically reliable, repeatable, objective and aligned with your mission, how do you achieve and sustain excellence? 

In the 1970s, the U.S. Department of Defense bought a lot of custom software. Sometimes it worked well, other times it didn’t. So, the DoD funded research on code quality. It turns out that the key difference between great and mediocre code flowed from how the organization managed problems. The spectrum runs from confusion and dismay through to calm, rational assessment and remediation.

This study produced the Capability Maturity Model (CMM), which was created by the Software Engineering Institute at Carnegie Mellon University and later succeeded by Capability Maturity Model Integration (CMMI). The framework identifies five levels of process maturity.

A level one organization has no standard method in place to deal with problems. When something goes wrong, everybody grabs tools and tries to figure out what went wrong and how to fix it. Organizations like this do not spend much on training or analysis. Their focus is continuing to produce whatever they are trying to make.

Over time, an individual may develop expertise in diagnosing a component, and when things go wrong, the call goes out to “Get Fred in here!” to troubleshoot the problem. Organizations with pockets of expertise are moving into level two.

Most organizations fall within one of these two levels. 

Management rewards heroes who can shoot the most difficult bugs, and this reinforces the culture of heroes. But moving forward requires the heroes to take on a new role. To proceed beyond level two, training and communications skills are crucial: the heroes must write down what they know and teach it to others. Once the organization creates this documentation, it is on the path to level three.

Note that these transformations are wrenching. It is not easy to tell the heroes that their greatest value to the organization is now how well they can write or teach. But with proper management attention, it can be done. The benefits of moving forward are many: 

  1. Dramatically fewer crises: Staff can make plans and keep them – no emergencies interrupting a family gathering, a school event, or a get-together with friends. 
  2. High-quality code: Maintenance tasks become much simpler, and customers experience improved reliability. Documentation means that troubleshooting becomes routine rather than overwhelming. 
  3. Reliable planning: In a mature organization, plans and estimates hold true because they are based on proven metrics, continuously validated processes and a competent team.

Cloud excellence is not a phantom or an unachievable goal. It is the result of clear thinking and sound documentation. Over time, as practices improve, skills build.

To quote Macklemore, “The greats weren’t great because at birth they could paint, the greats were great because they painted a lot.”  
