Hey team, are we moving in the right direction?

On the trickiness of measuring performance of engineering teams — Part 2 of 2: Five Key Metrics for Software Delivery and Operational Performance / Why we have to dig even deeper / Summary

In the previous post (part 1 of 2), I discussed the importance of having a good reason to measure performance, the attributes of good performance metrics, and why sprint velocity does not qualify as a performance metric.

Four magic numbers

So if sprint velocity does not help us with measuring performance, what does? The DevOps Research and Assessment (DORA) program has some answers. DORA is the “longest running academically rigorous research investigation into the capabilities and practices that predict software delivery performance and productivity”. The research was initiated by Dr. Nicole Forsgren, Jez Humble and Gene Kim and resulted in the book “Accelerate” by the same authors. In early 2019 DORA was acquired by Google Cloud, which now continues the research (you can download the latest State of DevOps report here).

Based on data insights collected from thousands of DevOps practitioners and companies across the globe, DORA originally identified four key metrics that indicate software delivery performance. In their latest research they have added a fifth one and extended the notion of performance to delivery and operational performance. Here are the five metrics in all their beauty:

[Image] DORA key metrics predicting Software Delivery and Operational performance

Let’s have a quick look at what each metric means:

  1. Deployment Frequency — the frequency of deploying code changes into production

  2. Lead Time for code changes — the time it takes from code committed to code successfully running in production

  3. Time to Restore — the time it takes to restore “normal” service after an incident or a defect that impacts users occurs

  4. Change Fail — the percentage of production releases that result in degraded service and require remediation (rollback, fix forward, patch etc.)

  5. Availability — the percentage of time a primary application or service is available for its users

According to DORA, these five metrics are the best indicators for system-level outcomes that predict software delivery and operational performance, which, in turn, predicts the ability of an organisation to achieve its commercial and non-commercial goals. Certainly sounds like something worth keeping an eye on, doesn’t it?

Now, if you want to monitor these metrics, the first thing to check is whether you have the relevant data. The good news is, if you work in a company that does not completely ignore DevOps, chances are that you do. You will probably use version control and track all code changes. You will probably track production deployments, whether you do them manually or they are automated as part of your continuous delivery pipeline. You will probably monitor service availability and document incidents in some way, or the tools you use will do it for you.
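To make this concrete, here is a minimal sketch in Python of how such exported data could be rolled up into the key metrics. Everything in it is hypothetical: the record fields, the numbers and the aggregation choices are placeholders you would adapt to your own tooling.

```python
from datetime import datetime, timedelta

# Hypothetical exports from a delivery pipeline and incident tooling.
# Each deployment record carries the commit timestamp of the latest change it shipped.
deployments = [
    {"deployed_at": datetime(2020, 9, 1, 10), "committed_at": datetime(2020, 8, 31, 16), "failed": False},
    {"deployed_at": datetime(2020, 9, 2, 15), "committed_at": datetime(2020, 9, 2, 9), "failed": True},
    {"deployed_at": datetime(2020, 9, 4, 11), "committed_at": datetime(2020, 9, 3, 14), "failed": False},
]
incidents = [
    {"started_at": datetime(2020, 9, 2, 15, 30), "resolved_at": datetime(2020, 9, 2, 17, 0)},
]
observation_period = timedelta(days=7)

# Deployment Frequency: deployments per day over the observation period.
deployment_frequency = len(deployments) / observation_period.days

# Lead Time for changes: commit-to-production time, averaged here for simplicity.
lead_times = [d["deployed_at"] - d["committed_at"] for d in deployments]
avg_lead_time = sum(lead_times, timedelta()) / len(lead_times)

# Change Fail: share of deployments that degraded the service and needed remediation.
change_fail_rate = sum(d["failed"] for d in deployments) / len(deployments)

# Time to Restore: how long it took to get back to "normal" service after incidents.
restore_times = [i["resolved_at"] - i["started_at"] for i in incidents]
avg_time_to_restore = sum(restore_times, timedelta()) / len(restore_times)

# Availability: share of the observation period not spent in incident state (a crude approximation).
downtime = sum(restore_times, timedelta())
availability = 1 - downtime / observation_period

print(f"Deployment frequency: {deployment_frequency:.2f} per day")
print(f"Average lead time:    {avg_lead_time}")
print(f"Change fail rate:     {change_fail_rate:.0%}")
print(f"Avg time to restore:  {avg_time_to_restore}")
print(f"Availability:         {availability:.3%}")
```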

That doesn’t mean that setting up monitoring for these metrics will be easy. Relevant data can be scattered across many tools, from Grafana dashboards to Jira tickets. Some of the data may be of poor quality, e.g. it can be incomplete or erroneous. The way the data is gathered may not be standardized across teams, making it difficult to create reusable solutions.

In other words, a lot of work will be involved in making it possible to monitor these metrics. So we should make sure that this is time well invested and that these metrics share the characteristics of good performance metrics discussed in the previous post.

Limited benefits of measuring the five key metrics

In fact, the DORA metrics focus on global outcomes, and they are leading indicators for software delivery performance. They can provide insights without additional context, although they are far from being non-contextual. However, when it comes to being actionable, there is a catch. “Let’s reduce time to restore!” — a great plan, but where to start? “Let’s deploy to production more often!” — sure thing, but our regression testing is manual and takes a full day, and we have dependencies with two other teams for this release…

If improving the five DORA metrics were easy, all companies would be elite performers. Development speed and operational performance do not come for free. They are the result of a multitude of good architectural, engineering, product, process, business and organisational decisions and of execution on all levels. Measuring the DORA metrics will give companies insights into their level of performance and help them see whether they are improving over time and whether the sum of their improvements is yielding good results. This is valuable in itself. However, it will not tell teams which decisions they should rethink or where exactly they need to improve in terms of execution.

Digging deeper

Looking at the DORA research we see that, while the DORA metrics are leading indicators for organisational performance, they are, in fact, lagging indicators for concrete capabilities in engineering, architecture, product and processes, product management and organisational culture. To identify areas for improvement, we need to look at more granular constructs in order to discover more granular objectives that we can act upon.

[Image] Extract from DORA graph describing predictive relationships between different constructs with relevancy for Software Delivery and Operational performance

For example, we can go through a list of technical practices (see image above) that drive continuous delivery and thus predict better software delivery performance, less rework, lower deployment pain and less burnout, and see which of them are in place and which of them we could consider implementing. While some practices may require high-level architectural decisions (e.g. shifting left on security, improving data architecture), others may be well tackled within the scope of a team (e.g. introducing trunk-based development, improving code maintainability, increasing test automation coverage).

Identifying suitable metrics for each practice would go way beyond the scope of this post. I will limit myself to mentioning static code analysis tools as a means to measure test coverage or code maintainability. Relying too heavily on static code analysis tools has its dangers, but that is true for any metric, and it is a topic for a different discussion anyway.
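As a small illustration, a coverage check can be wired into a build with a couple of commands. The sketch below assumes pytest and coverage.py are installed; the 80% threshold is an invented example, not a recommendation.

```python
import subprocess
import sys

MIN_COVERAGE = 80  # hypothetical team-agreed minimum, not a recommendation

# Run the test suite under coverage measurement.
subprocess.run(["coverage", "run", "-m", "pytest"], check=True)

# "coverage report --fail-under" exits non-zero when total coverage is below
# the threshold, which makes the CI step fail.
result = subprocess.run(["coverage", "report", f"--fail-under={MIN_COVERAGE}"])
sys.exit(result.returncode)
```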

Checklists, maturity models and gut feeling

Capability checklists are an easy and helpful means to measure the adoption of good practices in a team. Such lists can be great leading indicators for desired results, simply by surfacing which practices are in place and which are still missing. Certainly it is important that any suggested capabilities are well researched, and that possible improvements are reviewed against the backdrop of specific teams and organisations. One particularly well-researched capability checklist I came across while researching this topic is the DevOps Checklist by Steve Pereira.
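As a trivial sketch of the idea, a checklist can live as plain data and the gaps can be surfaced automatically; the practice names below are invented examples, not a researched list.

```python
# Hypothetical capability checklist: practice -> currently in place?
checklist = {
    "trunk-based development": True,
    "automated deployment pipeline": True,
    "test automation for critical paths": False,
    "monitoring and alerting on key services": True,
    "blameless incident reviews": False,
}

missing = [practice for practice, in_place in checklist.items() if not in_place]
adoption = sum(checklist.values()) / len(checklist)

print(f"Adoption: {adoption:.0%}")
print("Still missing:", ", ".join(missing) or "nothing")
```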

It is also worth mentioning self-checks and maturity models in this context. As Henrik Kniberg notes, such models “can help boost and focus your improvement efforts. But they can also totally screw up your culture if used inappropriately”. Used correctly, namely as a means to learn and not to judge, the Squad Health Check model developed by Kniberg allows teams to surface areas where action is needed and to trigger discussions that lead to concrete improvements.
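To illustrate how such a self-assessment could be aggregated, here is a small sketch that assumes each team member votes green, yellow or red per area; the areas and votes below are invented for the example.

```python
from collections import Counter

# Hypothetical squad health check votes: area -> one vote per team member.
votes = {
    "easy to release":    ["green", "green", "yellow", "green"],
    "health of codebase": ["yellow", "red", "yellow", "yellow"],
    "fun":                ["green", "green", "green", "yellow"],
    "mission":            ["red", "yellow", "red", "green"],
}

def red_share(area_votes):
    """Share of red votes in one area, used to rank discussion candidates."""
    return Counter(area_votes)["red"] / len(area_votes)

# Surface the areas with the most red votes first.
for area in sorted(votes, key=lambda a: red_share(votes[a]), reverse=True):
    counts = Counter(votes[area])
    print(f"{area:20s} red={counts['red']} yellow={counts['yellow']} green={counts['green']}")
```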

Like all survey-based “health checks”, the exercise is self-diagnostic and therefore prone to error, as it largely relies on people’s subjective perception in the absence of factual data. That said, where we do not have, or perhaps do not need, precise measurements, gut feeling can be a strong corrective factor, especially when combined with some quantified data. Many areas that predict organisational performance, such as having fun at work, strategic alignment or a clear mission, can hardly be measured other than by collecting subjective perceptions.

Summary

So, what stands at the end of this excursion into the world of measuring performance?

  • It is important to remember why we are measuring performance — namely, to learn what we can improve and to test the results of improvement experiments. Measuring performance can be costly, so ask yourself how the results will help you learn and improve before you invest time and effort into measuring anything

  • Good performance metrics are actionable, leading indicators that focus on global outcomes over local optimisations and provide value without much additional context

  • Sprint velocity is not a (good) performance metric. It is volatile, contextual and does not predict performance

  • Lead Time, Deployment Frequency, Change Fail, Time to Restore and Availability predict software delivery and operational performance. They are, however, difficult to act on because they abstract away more concrete and granular capabilities

  • To identify possible improvements, we need to dig deeper and look at more specific data. Capability checklists and maturity models can help uncover areas for improvement, and we should not dismiss subjective self-assessment either, especially in combination with quantified data

This got rather long. If you’ve made it here, congratulations and thank you very much for your attention. I would love to hear what you think, whether you can relate to this post, whether there is something you disagree with and whether you can take away something useful for you and your teams. So just leave a comment or reach out via LinkedIn or Twitter.

Translated from: https://medium.com/serious-scrum/hey-team-are-we-moving-in-the-right-direction-958e55bd0d88
