What Is an Agile Maturity Model?

An Agile maturity model is a structured framework for evaluating how effectively a software team or organisation practises Agile — not just whether it uses the rituals, but whether those rituals produce the outcomes they're designed to produce. The goal is to move from performing Agile (going through the motions) to being Agile (delivering continuously improving outcomes).

The concept draws on the Capability Maturity Model (CMM), developed at Carnegie Mellon University's Software Engineering Institute in the late 1980s and later succeeded by CMMI, adapted for the Agile context. Most Agile maturity models use a 5-level scale, though the naming and weighting of levels vary by framework. Which model you use matters less than measuring the right things consistently.

Used correctly, a maturity assessment gives leadership three things: a current-state baseline, a prioritised list of improvement areas, and a common vocabulary for discussing delivery quality without blame.

Important distinction: A high maturity score doesn't automatically mean fast delivery. A team can be highly mature and still be slow because of organisational constraints, tool debt, or product complexity. Maturity measures process health — not speed. Speed is a downstream outcome of process health, not a direct measurement of it.

The Major Agile Maturity Frameworks Compared

There is no single universally adopted Agile maturity model. The most commonly referenced frameworks are:

| Framework | Levels | Best For | Key Focus |
|---|---|---|---|
| Scrum Alliance Maturity Model | 5 | Scrum teams | Ceremony quality, team values, continuous improvement |
| SAFe Business Agility Model | 5 | Large enterprises, ARTs | Portfolio alignment, PI planning, organisation-wide agility |
| DORA Metrics Framework | 4 performance tiers | Engineering/DevOps teams | Deployment frequency, lead time, MTTR, change failure rate |
| Spotify Model Maturity | Informal | Product-led companies | Squad autonomy, chapter alignment, tribe coordination |
| Klarheit Delivery X-Ray™ | 5 | Scale-ups & enterprise | 8 dimensions: ceremonies, metrics, team dynamics, backlog, technical practices, release, stakeholders, improvement |

For most software delivery teams — particularly those at Series A–C scale-ups and enterprise organisations not running SAFe at scale — a practitioner-built model like the Delivery X-Ray™ framework is more useful than a certification-aligned model. Certification models optimise for theoretical correctness. Practitioner models optimise for what actually causes delivery to improve or degrade.

Why Most Agile Assessments Fail

Agile assessments are common. Useful ones are rare. The reasons are consistent across organisations:

  • Self-assessment bias. Teams rate themselves higher than they perform. Scrum Masters — whose professional identity is tied to team performance — are particularly susceptible to this. Independent assessment removes the conflict of interest.
  • Surveying instead of observing. A questionnaire tells you what people think is happening. Ceremony observation, Jira data analysis, and stakeholder interviews tell you what's actually happening. The gap between these two is often where the real problems hide.
  • No prioritisation of findings. An assessment that returns 18 equal-weight recommendations is not useful. A useful assessment returns the top 3 changes with the highest expected impact given the team's current context.
  • Measuring compliance, not outcomes. "Do you run retrospectives?" is not a useful question. "Do your retrospectives generate action items that ship in the next sprint?" is.
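
The gap between what a team reports and what an observer sees can be made explicit. The sketch below is one minimal way to do that in Python, assuming each dimension is scored on a 1-5 scale twice: once by the team (questionnaire) and once by an independent observer. The function name `perception_gaps`, the dimension names, and the scores are all illustrative and not part of any published framework.

```python
def perception_gaps(self_scores, observed_scores):
    """Return (dimension, gap) pairs, largest overestimate first.

    gap = self-assessed score minus observed score, so a positive value
    means the team rates itself higher than the observer does.
    Dimensions missing from either input are skipped.
    """
    gaps = {
        dim: self_scores[dim] - observed_scores[dim]
        for dim in self_scores
        if dim in observed_scores
    }
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores on an assumed 1-5 scale.
self_scores = {"ceremonies": 4, "metrics": 3, "backlog": 4}
observed_scores = {"ceremonies": 2, "metrics": 3, "backlog": 3}

for dimension, gap in perception_gaps(self_scores, observed_scores):
    print(f"{dimension}: {gap:+d}")
```

Sorting by gap rather than by raw score puts the dimensions where perception and reality diverge most at the top, which is usually where the assessment conversation should start.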

The 5 Agile Maturity Levels in Depth

Level 1 — Initial: The Heroic Phase

At Level 1, delivery is chaotic. Processes exist on paper — if at all — but are rarely followed. Success in any given sprint depends on who is available, how hard they work, and whether the right person happens to know the answer. This is sometimes called the "heroic" phase because individual effort compensates for system failure.

Signs of Level 1: sprint scope changes daily, standups are status reports to management rather than team synchronisation, retrospectives either don't happen or produce the same list every quarter, and no one can reliably forecast what will be delivered two weeks from now.

The move from Level 1 to Level 2 requires establishing basic ceremony discipline before anything else. Timeboxing, defined roles, and a working Definition of Done are the minimum viable changes.

Level 2 — Developing: Going Through the Motions

Level 2 is where most teams arrive after their first Agile training. The ceremonies exist, the board is updated, the retrospectives happen. But the ceremonies don't produce outcomes — they produce outputs. Standups report yesterday's work. Retrospectives list the same three problems. Planning is optimistic and consistently overcommitted.

This is the most common level for teams that have adopted Agile without changing the underlying system. They have the vocabulary but not the principles. They're performing Agile, not being Agile.

The move from Level 2 to Level 3 requires focusing on retrospective follow-through above all else. Nothing changes until the feedback loop closes. A team that ships one retrospective action per sprint is already accelerating past most Level 2 peers.
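
Retrospective follow-through is easy to measure once action items are recorded with the sprint they were raised in and the sprint they were completed in. The following is a rough Python sketch under that assumption; the `RetroAction` record, the `follow_through_rate` function, and the sample actions are hypothetical, and "shipped in the next sprint" is interpreted here as "done by the end of the sprint after it was raised".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RetroAction:
    title: str
    raised_in_sprint: int
    done_in_sprint: Optional[int] = None  # None means still open


def follow_through_rate(actions, sprint):
    """Fraction of actions raised in `sprint` that were done by the end
    of the following sprint. Returns None if nothing was raised."""
    raised = [a for a in actions if a.raised_in_sprint == sprint]
    if not raised:
        return None
    shipped = [
        a for a in raised
        if a.done_in_sprint is not None and a.done_in_sprint <= sprint + 1
    ]
    return len(shipped) / len(raised)


# Hypothetical retrospective actions.
actions = [
    RetroAction("Limit WIP to 3 per developer", raised_in_sprint=12, done_in_sprint=13),
    RetroAction("Add acceptance criteria template", raised_in_sprint=12),
    RetroAction("Automate smoke tests", raised_in_sprint=12, done_in_sprint=15),
    RetroAction("Timebox standup to 15 minutes", raised_in_sprint=13, done_in_sprint=13),
]

print(follow_through_rate(actions, sprint=12))
```

Tracking this number sprint over sprint gives the team a leading indicator it can act on directly, unlike velocity.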

Level 3 — Defined: Predictable Delivery

Level 3 is the first level where Agile is producing its intended outcomes. Delivery is predictable within a range. Stakeholders trust sprint commitments because commitments are consistently met. Retrospectives drive measurable change. The team has a working Definition of Ready and a Definition of Done that both mean something.

Level 3 is the minimum viable state for a scale-up that has investor expectations to manage, a product roadmap to commit to, or regulatory requirements that depend on predictable delivery cadences.

The move from Level 3 to Level 4 requires introducing flow metrics — particularly cycle time and throughput — and using them to drive planning decisions, not just retrospective conversation.
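
As an illustration of what "introducing flow metrics" can look like in practice, here is a small Python sketch that derives cycle time and weekly throughput from ticket start and finish dates. The ticket data is invented, cycle time is assumed to mean elapsed calendar days from start to finish, and the nearest-rank percentile is one common convention among several; a real Jira or Azure DevOps export would need its own parsing.

```python
import math
from datetime import date
from statistics import median

# Hypothetical tickets: (started, finished) dates exported from a tracker.
tickets = [
    (date(2024, 3, 4), date(2024, 3, 6)),
    (date(2024, 3, 4), date(2024, 3, 11)),
    (date(2024, 3, 5), date(2024, 3, 8)),
    (date(2024, 3, 11), date(2024, 3, 13)),
    (date(2024, 3, 12), date(2024, 3, 22)),
]


def cycle_times(items):
    """Elapsed calendar days from start to finish for each ticket."""
    return [(finished - started).days for started, finished in items]


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct%
    of the data at or below it."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


def throughput_per_week(items):
    """Count of finished tickets keyed by ISO (year, week)."""
    counts = {}
    for _, finished in items:
        year, week, _ = finished.isocalendar()
        counts[(year, week)] = counts.get((year, week), 0) + 1
    return counts


days = cycle_times(tickets)
print("median cycle time (days):", median(days))
print("85th percentile (days):", percentile(days, 85))
print("throughput by ISO week:", throughput_per_week(tickets))
```

A percentile is often more useful for planning than an average: "85% of tickets finish within N days" is a statement stakeholders can plan around, and it makes long-tail tickets visible instead of hiding them in a mean.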

Level 4 — Managed: Metrics-Driven Optimisation

At Level 4, the team makes decisions based on data. Velocity trends inform capacity planning. Cycle time analysis identifies bottlenecks before they become crises. DORA metrics are tracked: deployment frequency, lead time for changes, change failure rate, and mean time to recovery. Stakeholder trust is high because forecasts are reliable and communicated proactively.
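
The four DORA metrics named above can be approximated from a simple log of production deployments. The Python sketch below assumes one record per deployment with a commit time, a deploy time, a failure flag, and a restore time; the `Deployment` record and `dora_summary` function are hypothetical names, the data is invented, and real pipelines will define "failure" and "recovery" in their own terms.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime                  # first commit in the change
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # caused a production incident
    restored_at: Optional[datetime] = None  # when service was restored


def dora_summary(deployments, period_days):
    """Summarise the four DORA metrics over a window of `period_days`."""
    if not deployments:
        raise ValueError("no deployments in the period")
    lead_hours = [
        (d.deployed_at - d.committed_at).total_seconds() / 3600
        for d in deployments
    ]
    failures = [d for d in deployments if d.failed]
    recovery_hours = [
        (d.restored_at - d.deployed_at).total_seconds() / 3600
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deploys_per_day": len(deployments) / period_days,
        "mean_lead_time_hours": mean(lead_hours),
        "change_failure_rate": len(failures) / len(deployments),
        "mean_time_to_recovery_hours": mean(recovery_hours) if recovery_hours else None,
    }


# Hypothetical two-week deployment log.
t0 = datetime(2024, 3, 4, 9, 0)
deployments = [
    Deployment(t0, t0 + timedelta(hours=24)),
    Deployment(t0, t0 + timedelta(hours=48), failed=True,
               restored_at=t0 + timedelta(hours=50)),
    Deployment(t0 + timedelta(days=2), t0 + timedelta(days=2, hours=12)),
    Deployment(t0 + timedelta(days=5), t0 + timedelta(days=5, hours=36)),
]

summary = dora_summary(deployments, period_days=14)
for name, value in summary.items():
    print(name, value)
```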

Level 4 is where technical practices start to become a constraining factor. Teams at this level typically have the process maturity to expose technical debt clearly — and the organisational maturity to do something about it.

Level 5 — Optimising: Continuous Improvement Culture

Level 5 is less a destination than a way of operating. Improvement is continuous, not episodic. The team doesn't wait for retrospectives to fix problems — it builds feedback loops into daily work. Experimentation is safe: teams try things, measure outcomes, and adapt without needing leadership approval for every change. Team autonomy is high because the system is trustworthy enough to support it.

Very few teams reach Level 5 and sustain it. It requires not just technical and process maturity but organisational maturity — leadership that trusts data over anecdote, and a culture that treats process improvement as product work, not overhead.

How to Conduct an Agile Maturity Assessment

A rigorous maturity assessment combines four data sources. Using any one in isolation produces an incomplete picture:

  1. Pre-session questionnaire. Sent to the Scrum Master, Product Owner, and 2–3 developers before any calls. Covers all 8 dimensions with specific, outcome-oriented questions (not "do you run standups?" but "what percentage of standup action items are resolved before the next standup?").
  2. Ceremony observation. Silent observation of at least one sprint planning, one standup, and one retrospective. This is where the gap between the questionnaire and reality typically becomes visible.
  3. Tooling data analysis. Jira or Azure DevOps sprint history, velocity trends, cycle time distributions, backlog aging, and board configuration. Quantitative data that can't be influenced by perception.
  4. Stakeholder interviews. 15-minute conversations with 1–2 stakeholders outside the team. Their experience of predictability, communication quality, and delivery trust is a leading indicator of team effectiveness that the team itself often can't see.

The output is a scored report across all 8 dimensions, an overall Agile Health Score, and a prioritised improvement roadmap with specific, actionable recommendations for the top 3 focus areas.
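
To make the shape of that output concrete, here is a minimal Python sketch of a scored report. It is not Klarheit's scoring method, which the text does not specify. It assumes each of the 8 dimensions is scored from 1 to 5, takes the overall score as an unweighted mean, maps it to a level by rounding down, and uses the lowest scores as a stand-in for "highest expected impact". The function name `health_report` and all scores are invented for illustration.

```python
# Level names as used in this article.
LEVELS = {1: "Initial", 2: "Developing", 3: "Defined", 4: "Managed", 5: "Optimising"}

# Hypothetical dimension scores on an assumed 1-5 scale.
scores = {
    "ceremonies": 3.5,
    "metrics": 2.0,
    "team dynamics": 4.0,
    "backlog": 2.5,
    "technical practices": 3.0,
    "release": 3.0,
    "stakeholders": 3.5,
    "improvement": 1.5,
}


def health_report(dimension_scores, focus_count=3):
    """Build a simple report: overall score, level, and focus areas.

    Assumptions (not from any published model): unweighted mean,
    level = overall score rounded down, focus areas = lowest scores.
    """
    overall = sum(dimension_scores.values()) / len(dimension_scores)
    level = min(5, max(1, int(overall)))
    focus = sorted(dimension_scores, key=dimension_scores.get)[:focus_count]
    return {"overall": overall, "level": LEVELS[level], "focus_areas": focus}


report = health_report(scores)
print(f"Overall: {report['overall']:.2f} ({report['level']})")
print("Focus areas:", ", ".join(report["focus_areas"]))
```

Rounding down rather than to the nearest level is a deliberately conservative choice in this sketch: a team is only credited with a level once it has fully reached it.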

How long does it take? A Klarheit Agile Maturity Assessment typically takes 1–2 days of engagement time spread over 1–2 weeks (to allow for ceremony observation at natural sprint boundaries). The written report is delivered within 2 business days of the final session. Compare this to a full Delivery X-Ray™ audit, which takes 2–4 weeks and covers all dimensions in significantly greater depth.

Common Mistakes at Each Level — and How to Avoid Them

Every maturity level has characteristic failure modes — the traps teams fall into as they try to advance:

  • Level 1 → 2 mistake: Installing ceremonies without agreeing on their purpose. Teams start running standups before anyone agrees what a good standup looks like. The ceremony exists but produces nothing. Fix: agree on a Definition of Done for each ceremony before introducing it.
  • Level 2 → 3 mistake: Focusing on velocity before fixing retrospective follow-through. Velocity is a lagging indicator. Retrospective action completion is a leading indicator of everything else. Fix: make retrospective action items first-class backlog items with owners and acceptance criteria.
  • Level 3 → 4 mistake: Tracking metrics without connecting them to decisions. Cycle time charts appear on dashboards but don't change planning behaviour. Fix: introduce a weekly flow review — 20 minutes where the team looks at one metric and decides one thing.
  • Level 4 → 5 mistake: Mistaking process compliance for continuous improvement. Teams at Level 4 often have excellent processes but conservative improvement culture — changes require approval, experiments feel risky. Fix: create a formal "safe to fail" experiment framework with lightweight hypothesis documentation and time-boxed trials.

What to Do With Your Assessment Results

An assessment score without a prioritised action plan is just a number. The framework for converting results into movement is straightforward:

  1. Identify your lowest-scoring dimension. That's where you start — not the easiest to fix, and not the most visible to leadership, but the one that is costing you the most in delivery quality right now.
  2. Identify the single change with the highest expected impact in that dimension. Not a programme. One change. Typically this is a ceremony modification (changing how retrospectives work), a metric introduction (adding cycle time to the sprint review), or a tooling fix (reconfiguring the Jira board to reflect actual workflow).
  3. Run it for two sprints and measure. Two sprints is enough to see whether a change is working. If it is, compound it. If it isn't, adapt it — don't abandon it.
  4. Only then move to the next priority. Compound improvement. Most teams try to fix everything at once and change nothing permanently. One change, measured, compounded, is how Level 2 teams become Level 3 teams inside six months.
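
The "run it for two sprints and measure" step can be reduced to a simple before/after comparison. The Python sketch below is one possible decision rule, not a statistical test: it compares the mean of a metric over the trial sprints with the mean over earlier baseline sprints and suggests "compound" or "adapt". The function name `evaluate_change`, the 10% improvement threshold, and the sample cycle times are assumptions made for illustration, and the baseline mean is assumed to be positive.

```python
from statistics import mean


def evaluate_change(baseline, trial, lower_is_better=True, min_improvement=0.10):
    """Compare a metric before and after a change.

    Returns (decision, relative_improvement), where decision is
    "compound" if the metric improved by at least `min_improvement`
    (as a fraction of the baseline mean) and "adapt" otherwise.
    """
    before, after = mean(baseline), mean(trial)
    if lower_is_better:
        improvement = (before - after) / before
    else:
        improvement = (after - before) / before
    decision = "compound" if improvement >= min_improvement else "adapt"
    return decision, improvement


# Hypothetical median cycle times (days) per sprint.
baseline_sprints = [6.0, 7.0, 6.5]  # three sprints before the change
trial_sprints = [5.0, 5.5]          # the two sprints with the change

decision, improvement = evaluate_change(baseline_sprints, trial_sprints)
print(f"{decision}: cycle time improved by {improvement:.0%}")
```

With only two trial sprints the comparison is noisy, so the threshold should be set high enough that ordinary sprint-to-sprint variation does not trigger a "compound" decision on its own.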

The Klarheit Approach to Agile Maturity

Klarheit's Delivery X-Ray™ framework is built on 15 years of hands-on delivery experience across banking, payments, healthcare, telecom, and IoT — including four years as Scrum Master and Agile Coach at PwC. It is designed specifically for software delivery teams at scale-ups and enterprise organisations that need an objective, data-driven baseline — not a certification-aligned score that tells them what they want to hear.

The assessment is independent by design. Klarheit has no interest in selling you an ongoing coaching programme or a transformation engagement. The goal is a clear, honest picture of where you are, what it's costing you, and exactly what to change next. If a full Delivery X-Ray™ audit would add value after the assessment, we'll say so. If the maturity assessment is enough to get your team moving, we'll say that too.

The free discovery session is the right place to start. Thirty minutes. No pitch. Just an honest conversation about your team's delivery challenges and whether a structured assessment would give you the clarity you need.
