Making metrics matter: what can we measure in a Scrum team?

In agile development, metrics carry real weight. They shape what teams pay attention to, how managers decide what “progress” looks like, and whether the Sprint Review ends in a genuine conversation or a defensive performance. Get the metrics right and they become a lens for improvement. Get them wrong, and they become ammunition.

This post is about the measures that actually matter in a Scrum team, and the ones that quietly do damage while looking productive.

Let me start with the one everyone keeps reaching for. In Scrum, velocity is not an output metric. The current Scrum Guide does not mention velocity even once. It is a forecasting aid popularised outside the framework, not a measure of performance. Points are a planning tool. The moment you start comparing velocity across teams, or tracking it as a KPI, you have turned a navigation instrument into a scoreboard. Teams adapt, as they always do. They inflate estimates. They split stories to pad numbers. Goodhart’s law (“when a measure becomes a target, it ceases to be a good measure”) quietly kicks in, and no one admits it out loud.

Flow, not volume

A better lens is flow. Instead of asking “how much did we produce?” ask “how well does work move through our system?”

Cycle time tells you how long a piece of work takes from the moment someone picks it up to the moment it ships. Lead time tells you how long it takes from the moment the work is requested to the moment it ships, queue time included. Both are flow metrics that describe the system, not the individual worker. They are feedback on the process. The gap between the two is the time work spends waiting, and that is often where the real problem lives. If a team’s cycle time is two days and their lead time is six weeks, the bottleneck is not in the doing. It lives in the waiting.
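To make the two-days-versus-six-weeks example concrete, here is a minimal Python sketch. It assumes lead time runs from request to ship and cycle time from pick-up to ship, and that your tracker can export three timestamps per item. The `WorkItem` fields and the sample dates are invented for illustration, not taken from any particular tool.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass
class WorkItem:
    # Hypothetical field names; map them to whatever your tracker exports.
    requested: datetime  # when the item entered the backlog
    started: datetime    # when someone picked it up
    shipped: datetime    # when it reached users


def days(delta):
    """Convert a timedelta to fractional days."""
    return delta.total_seconds() / 86400


def flow_summary(items):
    """Median cycle, lead and wait time (in days) for a list of work items."""
    return {
        "median_cycle_days": median([days(i.shipped - i.started) for i in items]),
        "median_lead_days": median([days(i.shipped - i.requested) for i in items]),
        "median_wait_days": median([days(i.started - i.requested) for i in items]),
    }


# Invented sample data: each item waits 40 days, then takes 2 days to do.
items = [
    WorkItem(datetime(2024, 1, 1), datetime(2024, 2, 10), datetime(2024, 2, 12)),
    WorkItem(datetime(2024, 1, 8), datetime(2024, 2, 17), datetime(2024, 2, 19)),
    WorkItem(datetime(2024, 1, 15), datetime(2024, 2, 24), datetime(2024, 2, 26)),
]
summary = flow_summary(items)
print(summary)
```

The median is used rather than the mean so that one stuck item does not distort the picture. The wait figure is simply lead time minus cycle time, and it is usually the number worth bringing to the Retrospective.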

Quality metrics belong in the same category. Defects discovered after a story is closed, escaped bugs, rework rate. These reveal whether speed is being borrowed from the future. A team accelerating its velocity while defect counts climb has not gotten faster. It has simply borrowed against tomorrow.

For teams that want a more industrial frame of reference, DORA metrics are worth knowing. The four DORA metrics are deployment frequency, lead time for changes, change failure rate, and time to restore service. The DORA (DevOps Research and Assessment) research programme identified these as key indicators of software delivery performance. They are not Scrum metrics as such, but they pair well with empirical process control. They measure the system, not the sprinter.
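As a rough illustration, the four measures can be derived from a plain deployment log. The sketch below is an assumption-laden simplification: the `Deployment` record, its field names and the sample data are hypothetical, and real pipelines will define "failure" and "restored" in their own way.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    # Hypothetical record; adapt the fields to your own pipeline's data.
    committed: datetime                  # first commit of the change
    deployed: datetime                   # when it reached production
    failed: bool = False                 # did it cause a production failure?
    restored: Optional[datetime] = None  # when service was restored, if it failed


def hours(delta):
    """Convert a timedelta to fractional hours."""
    return delta.total_seconds() / 3600


def dora_summary(deployments, period_days):
    """Summarise the four DORA measures over an observation period."""
    if not deployments:
        raise ValueError("no deployments in the period")
    failures = [d for d in deployments if d.failed]
    restore = [hours(d.restored - d.deployed) for d in failures if d.restored]
    return {
        "deploys_per_week": len(deployments) / (period_days / 7),
        "median_lead_time_hours": median(
            [hours(d.deployed - d.committed) for d in deployments]
        ),
        "change_failure_rate": len(failures) / len(deployments),
        "median_restore_hours": median(restore) if restore else None,
    }


# Invented sample: four deployments in a two-week window, one of which failed.
log = [
    Deployment(datetime(2024, 3, 1, 9), datetime(2024, 3, 1, 15)),
    Deployment(datetime(2024, 3, 4, 9), datetime(2024, 3, 5, 9),
               failed=True, restored=datetime(2024, 3, 5, 11)),
    Deployment(datetime(2024, 3, 8, 10), datetime(2024, 3, 8, 14)),
    Deployment(datetime(2024, 3, 12, 9), datetime(2024, 3, 12, 19)),
]
dora = dora_summary(log, period_days=14)
print(dora)
```

Note that nothing in the record identifies a person. That is deliberate: the numbers describe the delivery system as a whole.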

Individual metrics are a minefield

Now for the dangerous part. Management almost always wants individual metrics. Who is pulling their weight? Who is the star? Who is slacking?

Be careful here. Individual productivity metrics in the classic sense (story points per person, lines of code, hours logged) are surveillance dressed up as insight. They destroy the conditions that make a Scrum team work. In *Drive* (2009), Daniel Pink argues that autonomy, mastery, and purpose sustain intrinsic motivation. External measurement of individuals against quantified targets tends to replace intrinsic drive with compliance theatre. People stop helping each other because helping does not show up in the dashboard. They stop pairing. They stop reviewing each other’s work carefully. The team becomes a collection of individually optimised ghosts.

If you want to track individual growth, track it *with* the individual, not *on* them. What are they learning? Where are they stretching? What feedback have they received and integrated? These are conversation prompts, never leaderboard entries. A proper one-on-one will tell you more than any chart.

Measuring the team

Team-level measures are where it gets interesting, because this is where cohesion either shows up or it doesn’t.

Team morale is worth measuring, but not through a quarterly engagement survey that everyone games. Ask a simpler question at the end of each Sprint: “how did this Sprint feel?” Over time the pattern tells you something real. A team with collapsing morale across three consecutive Sprints has a problem that no velocity chart will surface.
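If the answers are collected as anonymous scores, spotting that pattern takes very little code. This is a minimal sketch, assuming a 1 to 5 scale and treating three Sprint-over-Sprint drops in a row as the signal. The scale, the threshold and the sample scores are all assumptions for illustration.

```python
from statistics import mean


def sprint_averages(responses):
    """Average the anonymous scores of each Sprint, oldest Sprint first."""
    return [mean(scores) for scores in responses]


def consecutive_declines(averages):
    """Count how many Sprints in a row, ending at the latest, scored lower
    than the Sprint before them."""
    run = 0
    for previous, current in zip(averages, averages[1:]):
        run = run + 1 if current < previous else 0
    return run


# Invented sample: one list of 1-5 scores per Sprint, no names attached.
responses = [
    [4, 4, 5, 4],
    [4, 3, 4, 4],
    [3, 3, 4, 3],
    [2, 3, 3, 2],
]
averages = sprint_averages(responses)
declines = consecutive_declines(averages)

if declines >= 3:  # assumed threshold, tune it to your own team
    print(f"Morale has dropped {declines} Sprints in a row. Talk about it.")
```

The output is a prompt for a conversation in the Retrospective, not a score to report upward.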

Collaboration quality is harder to quantify, and that is precisely why it matters. Does conflict surface during Refinement, or does it leak out into private channels afterwards? Does feedback happen in the room, or in the DMs? You can feel this in a Retrospective long before any instrument picks it up.

Shared learning might be the most underrated measure in Scrum. A team where only one person knows how to deploy is not a resilient team. A team where knowledge moves between members, where pairing happens without being mandated, where someone can take a week off without the Sprint derailing: that is a team with real depth. You can track it in simple ways. How often does documentation get updated? Are pull requests reviewed by more than the same one or two people? Who can answer which questions when the usual person is offline?
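The pull request question can be answered from review history. The sketch below assumes you can export (author, reviewer) pairs from your code host and measures how much of the reviewing lands on one person. The function name, the names in the sample and the data itself are hypothetical.

```python
from collections import Counter


def review_concentration(reviews):
    """Given (author, reviewer) pairs, return the busiest reviewer and the
    share of all reviews they handled. Self-reviews are ignored."""
    counts = Counter(
        reviewer for author, reviewer in reviews if reviewer != author
    )
    total = sum(counts.values())
    if total == 0:
        return None, 0.0
    reviewer, handled = counts.most_common(1)[0]
    return reviewer, handled / total


# Invented sample data with made-up names.
reviews = [
    ("ana", "ben"),
    ("cy", "ben"),
    ("dee", "ben"),
    ("ben", "ana"),
    ("ana", "ben"),
]
top_reviewer, share = review_concentration(reviews)
print(f"{share:.0%} of reviews go through one person")
```

A high share is a statement about the team's knowledge spread, not about the reviewer. Treat it as a reason to pair more widely, and keep it off any leaderboard.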

The art of presentation

The hardest part of metrics is what happens when you hand them over.

Context almost always matters more than the chart itself. A cycle time of four days means nothing in isolation. What trend is it on? What did the team just try? What dependency got removed last Sprint? The chart is the opening sentence, not the whole paragraph.

Be careful who receives the numbers and how they will use them. In *A Typology of Organisational Cultures* (2004), Ron Westrum distinguishes pathological, bureaucratic, and generative cultures by how they handle information. In pathological cultures, messengers are shot. In generative cultures, information is actively sought because it produces learning. In a generative culture, data triggers curiosity. In a pathological one, it triggers blame. A Scrum Master handing metrics upward into a blame culture stops being a reporter. They become a supplier of targets for a firing squad.

The team conversation is a different beast. In the Retrospective, metrics should feed the discussion, never replace it. A graph is a question, never an answer.

So… what is worth measuring?

Nothing, if the system cannot act on what it learns. Most metrics conversations in agile teams are only superficially about metrics. They sit on top of a deeper question about trust, and what happens when something goes wrong. Teams that trust each other measure to learn. Teams that do not, measure to defend.

Pick your metrics, but pick your battles first. If management is asking for a number that will clearly be misused, stop debating which metric to choose. The conversation needed is a coaching one.

 
