DevOps Metrics

Summary

Figure: An example DevOps scorecard of important metrics. Source: Chakravarty 2014.

DevOps encourages incremental changes and faster releases while also improving quality and satisfaction. But how do we know if DevOps is making an impact? How do we decide what needs to change? We need to measure, and this is where DevOps metrics come in.

Metrics give insights into what's happening at all stages of the DevOps pipeline, from design to development to deployment. Metrics are objective measures. They strengthen the feedback loops that are essential to DevOps. Collecting metrics and displaying them via dashboards or scorecards should be automated. It's important to map these metrics to business needs.

Milestones

2009

DevOps has its beginnings at the O'Reilly Velocity conference where John Allspaw and Paul Hammond present a talk titled 10+ Deploys Per Day: Dev and Ops Cooperation at Flickr. Even in these early days, the importance of metrics is recognized. Some metrics identified include CPU load, memory usage, network throughput, and aggregated job queue.

2016

There's a growing realization among practitioners that we can end up collecting many of the wrong DevOps metrics. It's important to relate metrics to business values, needs or outcomes. One such proposal is the value-based approach that measures how value flows through the DevOps pipeline.

Jul 2017

Gartner publishes a report titled Data-Driven DevOps: Use Metrics to Guide Your Journey. This report includes a pyramid of metrics for DevOps.

Discussion

  • What factors make for a good DevOps metric?

    A good DevOps metric must ideally be all of these:

    • Obtainable: A metric that can't be measured is useless.
    • Reviewable: It must be relevant to the business and stand up to scrutiny.
    • Incorruptible: It should be free from influence of teams and team members.
    • Actionable: It should suggest improvements to workflows, policies, incentives, tools, etc.
    • Traceable: It should be possible to trace the metric to root causes.
  • What's the process of working with DevOps metrics?
    Figure: Key process steps with DevOps metrics. Source: Pfeiffer 2017.

    A typical process involves identifying the metrics, putting in place methods to measure them, measuring and displaying them on dashboards, evaluating the metrics in terms of status and trends, acting on the metrics to effect change, and continually assessing whether the metrics are aligned to business goals.

    Since DevOps is cross-functional (process, people, tools) and cross-team (dev, ops, testing), metrics should not focus narrowly on only some parts of the value chain. They should capture a holistic view of the entire value chain.

  • What are some important DevOps metrics?
    Figure: Metrics across the DevOps pipeline. Source: Electric Cloud 2017.

    There are dozens of metrics spread across all phases of a DevOps pipeline. Some have attempted to group them into categories:

    • Velocity: lead time, change complexity, deployment frequency, MTTR
    • Quality: deployment success rate, application error rate, escaped defects, number of support tickets, automated test pass percentage
    • Performance: availability, scalability, latency, resource utilization
    • Satisfaction: usability, defect age, subscription renewals, feature usage, business impact, application usage and traffic

    Another grouping can be host-based metrics, application metrics, network metrics, server pool metrics and external dependency metrics.

    There are also metrics for application build cycles, metrics for application performance, metrics for delivery performance, metrics organized by infrastructure, system and team health, and metrics for building or running apps.

    At a minimum, aim for more deployments per week, shorter lead time from code commit to deployment, lower failure rate in production, and shorter time to repair failures. Have metrics to measure these.
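
    As a rough illustration, these four measures can be computed from a log of deployments. The sketch below is a minimal example rather than a standard implementation: the record fields (committed, deployed, failed, restored) and the sample data are hypothetical, and time to repair is assumed to run from the failed deployment to the moment service is restored.

```python
from datetime import datetime, timedelta

# Hypothetical deployment log for one week; field names are illustrative only.
DEPLOYMENTS = [
    {"committed": datetime(2018, 10, 1, 9), "deployed": datetime(2018, 10, 1, 15),
     "failed": False, "restored": None},
    {"committed": datetime(2018, 10, 2, 10), "deployed": datetime(2018, 10, 3, 10),
     "failed": True, "restored": datetime(2018, 10, 3, 12)},
    {"committed": datetime(2018, 10, 4, 8), "deployed": datetime(2018, 10, 4, 14),
     "failed": False, "restored": None},
    {"committed": datetime(2018, 10, 5, 9), "deployed": datetime(2018, 10, 5, 21),
     "failed": True, "restored": datetime(2018, 10, 6, 1)},
]


def average(deltas):
    """Mean of a list of timedeltas, or None when the list is empty."""
    return sum(deltas, timedelta()) / len(deltas) if deltas else None


def delivery_metrics(deployments, weeks):
    """Summarize deployment frequency, lead time, failure rate and repair time."""
    lead_times = [d["deployed"] - d["committed"] for d in deployments]
    failures = [d for d in deployments if d["failed"]]
    # Assumption: a failed deployment fails at deploy time and ends at "restored".
    repair_times = [d["restored"] - d["deployed"] for d in failures if d["restored"]]
    return {
        "deployments_per_week": len(deployments) / weeks,
        "mean_lead_time": average(lead_times),
        "failure_rate": len(failures) / len(deployments) if deployments else None,
        "mean_time_to_repair": average(repair_times),
    }


metrics = delivery_metrics(DEPLOYMENTS, weeks=1)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

    Tracking how these four values move from week to week is usually more informative than any single reading.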

  • What are the important metrics in the world of microservices and serverless architectures?

    For microservices, metrics to include are number of requests per second, number of failed requests per second, and distribution of request service times.
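
    To make these three measures concrete, the sketch below derives them from a small request log. It is only an illustration: the log format, the sample values and the choice of a nearest-rank percentile are assumptions, and a real service would normally rely on its monitoring tool to aggregate these figures.

```python
# Hypothetical request log for a 5-second window: (duration in ms, succeeded?)
REQUESTS = [
    (12, True), (15, True), (20, True), (22, True), (25, True),
    (30, True), (35, True), (40, True), (120, False), (900, False),
]


def percentile(sorted_values, p):
    """Nearest-rank percentile of an ascending list, for integer p in 1..100."""
    rank = max(1, (p * len(sorted_values) + 99) // 100)  # integer ceiling
    return sorted_values[rank - 1]


def request_metrics(requests, window_seconds):
    """Request rate, failed-request rate and a few duration percentiles."""
    durations = sorted(ms for ms, _ in requests)
    failed = sum(1 for _, ok in requests if not ok)
    return {
        "requests_per_second": len(requests) / window_seconds,
        "failed_per_second": failed / window_seconds,
        "p50_ms": percentile(durations, 50),
        "p90_ms": percentile(durations, 90),
        "p99_ms": percentile(durations, 99),
    }


summary = request_metrics(REQUESTS, window_seconds=5)
for name, value in summary.items():
    print(f"{name}: {value}")
```

    Percentiles are used here instead of a mean because a few slow requests can dominate an average while leaving the median unchanged.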

    For serverless, the concern shifts from monitoring infrastructure to the application itself. Metrics include performance such as function runtime; scaling such as concurrency limits or memory limits; tracing event-triggered call flows across services or functions; and errors such as code bug, wrong invocation or function timeout.

    For both microservices and serverless, it's important to instrument the code. OpenTracing provides a vendor-neutral API for distributed tracing. Observability is an important aspect: communication across services and functions needs to be visible, and a single request must be correlated to the sequence of service calls that follow it. Istio is a service mesh that supports observability through metrics, logs and distributed traces.
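
    The idea of correlating one request with the service calls it triggers can be sketched in a few lines. The example below does not use the OpenTracing API; it assumes a simplified, hypothetical span record (trace_id, span_id, parent_id, service, start_ms) of the kind a tracing backend might store, and rebuilds the call tree for a single request.

```python
from collections import defaultdict

# Hypothetical span records; every span carries the ID of the request (trace)
# it belongs to and the ID of the span that called it.
SPANS = [
    {"trace_id": "t1", "span_id": "a", "parent_id": None, "service": "gateway", "start_ms": 0},
    {"trace_id": "t1", "span_id": "b", "parent_id": "a", "service": "orders", "start_ms": 5},
    {"trace_id": "t2", "span_id": "x", "parent_id": None, "service": "gateway", "start_ms": 2},
    {"trace_id": "t1", "span_id": "c", "parent_id": "b", "service": "payments", "start_ms": 9},
    {"trace_id": "t1", "span_id": "d", "parent_id": "a", "service": "notifications", "start_ms": 20},
]


def call_tree(spans, trace_id):
    """Return indented lines showing the service calls made for one request."""
    children = defaultdict(list)
    for span in spans:
        if span["trace_id"] == trace_id:
            children[span["parent_id"]].append(span)

    lines = []

    def walk(parent_id, depth):
        # Visit the calls made by one span in the order they started.
        for span in sorted(children[parent_id], key=lambda s: s["start_ms"]):
            lines.append("  " * depth + span["service"])
            walk(span["span_id"], depth + 1)

    walk(None, 0)  # root spans have no parent
    return lines


tree = call_tree(SPANS, "t1")
print("\n".join(tree))
```

    The shared trace ID is what makes the correlation possible, which is why each service has to propagate it when calling the next one.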

  • Could you describe some DevOps metrics adopted from traditional engineering practices?
    Figure: Some metrics borrowed from engineering systems. Source: Woo 2017, fig. 3.9.

    From traditional engineering, DevOps has adopted the following metrics:

    • Mean Time To Detect (MTTD): This is the average time to discover a problem. It indicates how effective your incident management tools and processes are.
    • Mean Time To Failure (MTTF): This is an indication of how long on average the system or a component can run before failing. This can suggest preventive maintenance. This metric relates to improving system uptime.
    • Mean Time Between Failures (MTBF): This is the average time between failures. It's a measure of reliability and availability.
    • Mean Time To Repair (MTTR): This is the average time to repair/resolve/recover after failure is detected. This metric relates to reducing system downtime. Code complexity is one aspect that affects MTTR.

    The goal is to reduce MTTD and MTTR while increasing MTTF and MTBF. DevOps is about incremental changes. If many changes are introduced at once, it will take longer to detect and fix issues.
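
    The four metrics can be derived from incident timestamps, as in the minimal sketch below. The incident records and their field names are hypothetical, and the definitions are one common convention among several: MTTD runs from failure to detection, MTTR from detection to resolution, MTTF from one resolution to the next failure, and MTBF from the start of one failure to the start of the next. At least two incidents are needed for MTTF and MTBF.

```python
from datetime import datetime, timedelta

# Hypothetical incident log; field names are illustrative only.
INCIDENTS = [
    {"failed": datetime(2018, 10, 1, 0, 0), "detected": datetime(2018, 10, 1, 0, 10),
     "resolved": datetime(2018, 10, 1, 1, 0)},
    {"failed": datetime(2018, 10, 3, 1, 0), "detected": datetime(2018, 10, 3, 1, 20),
     "resolved": datetime(2018, 10, 3, 2, 0)},
    {"failed": datetime(2018, 10, 5, 2, 0), "detected": datetime(2018, 10, 5, 2, 30),
     "resolved": datetime(2018, 10, 5, 4, 0)},
]


def mean_delta(deltas):
    """Mean of a non-empty list of timedeltas."""
    return sum(deltas, timedelta()) / len(deltas)


def reliability_metrics(incidents):
    """Compute MTTD, MTTR, MTTF and MTBF under the conventions stated above."""
    ordered = sorted(incidents, key=lambda i: i["failed"])
    pairs = list(zip(ordered, ordered[1:]))  # consecutive incidents
    return {
        "MTTD": mean_delta([i["detected"] - i["failed"] for i in ordered]),
        "MTTR": mean_delta([i["resolved"] - i["detected"] for i in ordered]),
        "MTTF": mean_delta([nxt["failed"] - prev["resolved"] for prev, nxt in pairs]),
        "MTBF": mean_delta([nxt["failed"] - prev["failed"] for prev, nxt in pairs]),
    }


reliability = reliability_metrics(INCIDENTS)
for name, value in reliability.items():
    print(f"{name}: {value}")
```

    Under these conventions, MTBF equals MTTF plus the average downtime (detection plus repair), so shortening MTTD and MTTR improves availability even when failures occur just as often.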

  • Are there DevOps metrics that one should avoid?

    Teams transitioning to DevOps might end up adopting the wrong metrics. In fact, traditional metrics such as MTBF could be seen as irrelevant for DevOps, where some failures are expected due to the speed of delivery. Look beyond such costs and aim instead to improve total economic impact. Other metrics to avoid are those that focus on business velocity at the expense of quality or culture, and those that are optimized for one team but have a negative impact on others.

    Avoid conflict metrics that promote individuals rather than teams or pit one team versus another. These include ranking individuals or teams based on failure metrics (broken builds, etc.), rewarding top performers who don't collaborate or having different standards for different teams.

    Avoid vanity metrics that promote quantity or speed over quality: number of lines of code, number of deployments per week, number of bugs fixed, number of tests added.

    Don't collect a specific metric just because it's easy. Don't use a metric that encourages negative behaviours. It's been said,

    Human beings adjust behavior based on the metrics they’re held against... What you measure is what you’ll get.

  • Could you mention some best practices when using DevOps metrics?

    For those new to DevOps metrics, start with metrics that are simpler to collect and manage. Get the momentum going. For better focus, don't apply too many metrics. Choose metrics aimed at broader organizational goals or process health issues. Measure fast to enable real-time feedback loops.

    Because automated system-based metric collection is hard to do, you may want to start with surveys. In fact, the two are complementary: surveys are good for metrics on culture or things outside the system.

    Use metrics that suit your business model. Adopt value stream mapping in which each metric is mapped to business values. For example, measuring website responsiveness becomes more useful if you can map it to business outcomes such as customer churn or abandoned shopping carts.

    Metrics can also be role-based (business vs engineering): give teams the choice to customize their own dashboards. In fact, dashboards are essential for tracking all metrics in one place. Compare trends, not teams. Look for outliers. Measure lead time to production, not just completion.
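
    Looking at trends and outliers can be as simple as the sketch below, which takes a weekly series of one metric and reports its direction and any unusual weeks. The sample lead times are made up, and both techniques (a least-squares slope for the trend, distance from the median for outliers) are illustrative choices rather than methods prescribed by the sources.

```python
from statistics import mean, median

# Hypothetical weekly lead times in hours, oldest first.
LEAD_TIME_HOURS = [30, 28, 27, 25, 60, 23, 22, 20]


def trend_slope(values):
    """Least-squares slope per step; negative means the metric is falling."""
    xs = range(len(values))
    x_bar, y_bar = mean(xs), mean(values)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def outlier_indexes(values, k=3):
    """Indexes of values more than k median absolute deviations from the median."""
    mid = median(values)
    mad = median(abs(v - mid) for v in values)
    return [i for i, v in enumerate(values) if abs(v - mid) > k * mad]


slope = trend_slope(LEAD_TIME_HOURS)
outliers = outlier_indexes(LEAD_TIME_HOURS)
print(f"slope per week: {slope:.2f} hours")
print(f"outlier weeks: {outliers}")
```

    An outlier week is a prompt to investigate what happened in that week, not a score to hold against a team.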

    Evolve your metrics as new technologies and tools enter your DevOps pipeline.

  • Are there tools to help teams collect metrics for DevOps?

    Many tools are available for various DevOps tasks. Some of these show metrics and even do real-time monitoring. We briefly mention some of them. In any case, teams need a single unified dashboard regardless of which tool collects the metrics.

    Nagios is widely used for IT infrastructure monitoring. Zabbix, Sensu and Prometheus are alternatives. Prometheus is for service monitoring. It's often used with the visualization and analytics of Grafana.

    For application performance monitoring, there are New Relic, AppDynamics, Compuware and Boundary. For deeper integration, cross-platform data aggregation and monitoring, there are BigPanda and PagerDuty.

    JIRA Software does issue and project tracking. Code Climate automates code review and analysis. OverOps detects bugs proactively. For build automation, there's Apache Ant. Jenkins is useful for continuous integration and delivery. Ansible, Chef and Puppet help with continuous deployment. Ganglia is for cluster and grid monitoring. Snort is for real-time security. For logging, we have Logstash. Monit does system monitoring and recovery.

    Cloud providers offer their own monitoring tools, such as AWS CloudWatch from Amazon and Stackdriver from Google.

References

  1. AlertOps. 2018. "MTTD vs. MTTF vs. MTBF vs. MTTR." AlertOps, May 07. Accessed 2018-10-09.
  2. Allspaw, John. 2009. "10+ Deploys Per Day: Dev and Ops Cooperation at Flickr." SlideShare, June 23. Accessed 2018-10-12.
  3. BlazeMeter. 2016. "Top 14 Monitoring Tools that Every DevOps Needs." BlazeMeter, February 11. Accessed 2018-10-09.
  4. Boyd, Mark. 2018. "Serverless Analytics: Metrics, Collection and Visibility." The New Stack, September 04. Accessed 2018-10-09.
  5. Chakravarty, Payal. 2014. "The DevOps Scorecard." DevOps.com, November 10. Accessed 2018-10-11.
  6. Coffman, Mason. 2017. "Seven Metrics That Matter When Measuring DevOps Success." Riverbed Blog, December 11. Accessed 2018-10-09.
  7. Cole, Arthur. 2018. "What Are the Right Metrics for DevOps?" ITBusinessEdge, April 20. Accessed 2018-10-09.
  8. Ehle, Dennis. 2016. "Measuring DevOps Performance Using a Value-Based Approach." Blog, VersionOne, October 18. Accessed 2018-10-09.
  9. Electric Cloud. 2017. "So how do you measure DevOps?" Electric Cloud, December 15. Accessed 2018-10-09.
  10. Ellingwood, Justin. 2017. "An Introduction to Metrics, Monitoring, and Alerting." Digital Ocean, December 05. Accessed 2018-10-09.
  11. Forsgren, Nicole and Mik Kersten. 2017. "DevOps Metrics." ACM Queue, vol. 15, no. 6, November-December. Accessed 2018-10-09.
  12. Haff, Gordon. 2017. "DevOps metrics: Are you measuring what matters?" The Enterprisers Project, July 10. Accessed 2018-10-09.
  13. Haff, Gordon. 2018. "3 warning flags of DevOps metrics." Opensource, February 21. Accessed 2018-10-09.
  14. Kowall, Jonah. 2017. "Why Metrics Must Guide Your DevOps Initiative." Blog, AppDynamics, November 17. Accessed 2018-10-09.
  15. Little, Mark. 2018. "Observability and Microservices: The Need for Effective Tracing and Metrics." InfoQ, June 17. Accessed 2018-10-09.
  16. McLaughlin, John. 2017. "The Dangers of DevOps Metrics." JMAC Labs, June 16. Accessed 2018-10-09.
  17. Melendez, Christian. 2018. "Which DevOps Metrics Matter?" DZone, May 03. Accessed 2018-10-09.
  18. New Relic. 2018. "Measuring DevOps." New Relic. Accessed 2018-10-09.
  19. Paul, Fredric. 2014. "The Incredible True Story of How DevOps Got Its Name." New Relic Blog, May 16. Accessed 2018-10-12.
  20. Pfeiffer, Mike. 2017. "Nine DevOps metrics you should use to gauge improvement." TechTarget, October. Accessed 2018-10-09.
  21. Ravichandran, Aruna. 2017. "Beware False DevOps Metrics." DevOps.com, March 21. Accessed 2018-10-09.
  22. Riley, Chris. 2015. "Metrics for DevOps." DevOps.com, January 26. Accessed 2018-10-09.
  23. Shabe, Charlie. 2017. "Understanding DevOps metrics." Beta News, August 25. Accessed 2018-10-09.
  24. Spafford, George and Ian Head. 2017. "Data-Driven DevOps: Use Metrics to Guide Your Journey." Gartner, July 13. Accessed 2018-10-11.
  25. Stackify. 2017. "Top DevOps Tools: 50 Reliable, Secure, and Proven Tools for All Your DevOps Needs." Stackify, March 10. Accessed 2018-10-09.
  26. Taylor, Twain. 2018. "Monitoring tools for serverless environments and AWS Lambda." Rollbar, January 08. Accessed 2018-10-09.
  27. Wallgren, Anders. 2018. "From Measurement to Insight: Put DevOps Metrics to Work." InformationWeek, March 14. Accessed 2018-10-09.
  28. Watson, Matt. 2017. "15 Metrics for DevOps Success." Stackify, December 11. Accessed 2018-10-09.
  29. Wilkie, Tom. 2017. "The RED Method: key metrics for microservices architecture." Blog, Weaveworks, May 13. Accessed 2018-10-09.
  30. Willie, Nigel and Sacha Labourey. 2018. "Enterprise DevOps: Principles of Meaningful Metrics Measurement, Part 1." CloudBees, April 18. Accessed 2018-10-09.
  31. Woo, Seongwoo. 2017. "Reliability Design of Mechanical System for mechanical civil Engineer." ResearchGate, January. Accessed 2018-10-09.

Further Reading

  1. José, Fábio. 2018. "DevOps KPI in Practice — Chapter 1 — Deployment Speed, Frequency and Failure." Medium, March 22. Accessed 2018-10-09.
  2. Ellingwood, Justin. 2017. "An Introduction to Metrics, Monitoring, and Alerting." Digital Ocean, December 05. Accessed 2018-10-09.
  3. Schlossnagle, Theo. 2018. "Monitoring in a DevOps World." ACM Queue, vol. 15, no. 6, January 08. Accessed 2018-10-10.
  4. Robertson, Eric. 2017. "DevOps and value stream mapping: Why you need metrics." TechBeacon, July 12. Accessed 2018-10-10.
  5. Ehle, Dennis. 2016. "Measuring DevOps Performance Using a Value-Based Approach." Blog, VersionOne, October 18. Accessed 2018-10-09.

Top Contributors

Last update: 2018-10-12 15:21:50 by arvindpdmn
Creation: 2018-10-11 10:27:23 by lokesh.rawat

Cite As

Devopedia. 2018. "DevOps Metrics." Version 3, October 12. Accessed 2018-11-14. https://devopedia.org/devops-metrics