How to Measure L&D ROI: A Practical Guide for HR and L&D Leaders
"If your ROI calculation relies on completion rates and satisfaction scores, you are measuring the training, not the development."
L&D ROI is the question that makes most HR and L&D professionals uncomfortable, not because they do not care about it, but because the honest answer is that most organisations are not measuring it properly. They are measuring activity. They are measuring satisfaction. They are measuring inputs rather than outputs. And when the CHRO asks whether the leadership development spend produced results, the answer is usually a carefully assembled package of proxies that point in the right direction without actually proving anything.
This article is about how to measure L&D ROI in a way that is honest, defensible, and actually useful for making better investment decisions. It covers the most widely used framework, its limits, and what a more rigorous approach looks like when behaviour change is the goal.
The Kirkpatrick Model: What It Measures and What It Misses
Most L&D professionals are familiar with the Kirkpatrick four-level model: Reaction, Learning, Behaviour, Results. It is the dominant framework for training evaluation because it provides a logical structure and because the first two levels are easy to measure.
Level 1 (Reaction) captures participant satisfaction, typically through a post-session survey. Did participants enjoy the session? Did they find it valuable? Would they recommend it? This is the most commonly collected data point in corporate L&D. It is also the least predictive of outcomes. High satisfaction scores correlate weakly with behaviour change. They correlate strongly with facilitation quality, physical comfort, and whether the session ended at a reasonable hour.
Level 2 (Learning) measures whether participants acquired new knowledge, skills, or attitudes, typically through pre- and post-tests, skill assessments, and reflection exercises. This level is also widely used, and it is also only weakly predictive of what participants will actually do when they return to their jobs. Knowing the right answer to a question about active listening and genuinely listening differently in a difficult conversation are not the same thing.
Level 3 (Behaviour) is where the measurement becomes hard, and it is the point at which most organisations stop measuring. Behaviour change requires follow-up after the training, in the actual job context, by people who can observe the leader in action. The usual instruments are manager observations, 360-degree feedback, structured interviews with direct reports, and performance reviews. All of them are time-consuming, require coordination, and often reveal that the training had less effect than the Level 1 and 2 data suggested. This is uncomfortable, which is why Level 3 is so often skipped.
Level 4 (Results) connects behaviour change to organisational outcomes: team performance, retention, promotion rates, productivity measures, and customer satisfaction in teams led by programme participants versus comparable teams whose leaders did not take part. This is the level that CFOs and CHROs ultimately care about, and it is the level that is almost never rigorously measured, because attribution is difficult and the data collection horizon is long.
Why Most L&D Measurement Stops at Level 2
The structural reason is incentive misalignment. L&D teams are typically evaluated on programme delivery: training hours completed, satisfaction scores, number of participants trained. These are activity counts and Level 1 measures; none of them goes beyond Level 2. The people who commission training programmes are typically not the people who would observe the behaviour change six months later, and the systems that would capture that data are often not in place.
The result is that organisations spend significant budgets on leadership development and have no credible evidence about whether it worked. They have strong evidence that the training happened. They have reasonable evidence that participants liked it and felt they learned something. They have very limited evidence about whether anything actually changed.
A More Honest Measurement Approach
A more honest approach starts by defining success before the programme begins, not after. This requires L&D teams and the business stakeholders who commissioned the programme to agree, in advance, on three things: what specific behaviours should change, how those changes will be observed, and what observable change would constitute a successful return on the investment.
This conversation is harder than it sounds. Business stakeholders often cannot articulate specific behaviour changes. They want "better leaders" or "stronger collaboration" or "more strategic thinking" without being able to specify what those phrases would look like in observable terms. The L&D professional's job, at this stage, is to push until the articulation becomes specific enough to measure.
"Better leaders" is not measurable. "Managers who hold structured one-on-ones with their direct reports at least fortnightly" is measurable. "Stronger collaboration" is not measurable. "Cross-functional project teams that resolve resource conflicts without escalating to director level" is measurable. The specificity feels pedantic until you try to measure whether things changed. Then the specificity is what makes measurement possible.
Practical Measurement Approaches for Different Interventions
For skills-based training (technical skills, process adoption), pre- and post-assessment with a follow-up check at 60 days is usually sufficient. If the skill is not being applied at 60 days, the training either failed to produce learning or produced learning that is not transferring to the job context. Both are diagnostic signals.
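The two diagnostic signals can be told apart with three scores per participant. The following is a minimal sketch, assuming each participant has a pre-assessment score, a post-assessment score, and a 60-day on-the-job score on the same scale. The function name, the 0-100 scale, and the `min_gain` threshold are illustrative assumptions, not part of any standard.

```python
def diagnose_skill_training(pre: float, post: float, day_60: float,
                            min_gain: float = 10.0) -> str:
    """Classify one participant's result from three scores on the same scale.

    min_gain is a hypothetical threshold for a meaningful improvement.
    """
    learned = (post - pre) >= min_gain
    applied = (day_60 - pre) >= min_gain
    if not learned:
        # No meaningful gain straight after training.
        return "no learning"
    if not applied:
        # Gain was present after training but is gone at 60 days.
        return "learning not transferring"
    return "applied on the job"


# Hypothetical scores on a 0-100 scale.
print(diagnose_skill_training(pre=50, post=55, day_60=54))
print(diagnose_skill_training(pre=50, post=80, day_60=52))
print(diagnose_skill_training(pre=50, post=80, day_60=75))
```

The first case points at the training content or delivery; the second points at the job context, for example a manager or process that does not support the new skill.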
For behaviour-focused development (leadership, communication, influence, decision-making), the measurement must happen at Level 3. This requires a baseline before the intervention and follow-up observations after. Practical approaches include: 180-degree or 360-degree assessments with participants' teams before and three to six months after, structured interviews with direct reports and peers, facilitator observations during subsequent programme sessions, or manager assessments against specific behaviour criteria.
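A baseline and a follow-up only help if they rate the same behaviour criteria. As a rough sketch, assuming each assessment round is summarised as a mean rating per behaviour criterion on a 1-5 scale (the criteria names and ratings below are invented for illustration), the change per criterion could be computed like this:

```python
def behaviour_shifts(baseline: dict[str, float],
                     follow_up: dict[str, float]) -> dict[str, float]:
    """Return follow-up minus baseline for criteria rated in both rounds."""
    shared = baseline.keys() & follow_up.keys()
    return {name: round(follow_up[name] - baseline[name], 2)
            for name in sorted(shared)}


# Hypothetical mean ratings from a participant's team (1-5 scale).
baseline = {
    "holds fortnightly one-on-ones": 2.4,
    "resolves conflicts without escalating": 3.0,
}
follow_up = {
    "holds fortnightly one-on-ones": 3.6,
    "resolves conflicts without escalating": 3.1,
}

for criterion, shift in behaviour_shifts(baseline, follow_up).items():
    print(f"{criterion}: {shift:+.1f}")
```

Criteria that appear in only one round are dropped rather than guessed at, which keeps the comparison honest when the questionnaire changed between rounds.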
For team dynamics and culture change, the measurement window is longer and the attribution challenge is greater. Team performance metrics, engagement scores, and retention data are all relevant, but distinguishing the effect of the programme from other environmental factors requires either a comparison group (teams with similar characteristics who did not receive the intervention) or a longer observation window that can separate signal from noise.
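One common way to use a comparison group is a difference-in-differences estimate: the change in the programme teams minus the change in the comparison teams over the same period. The article does not prescribe this method; it is shown here as one option. The sketch below assumes a single metric per team (hypothetical engagement scores) and assumes that both groups would have moved in parallel without the programme, which is a strong assumption with small numbers of teams.

```python
from statistics import mean


def difference_in_differences(treated_before, treated_after,
                              comparison_before, comparison_after):
    """Change in programme teams minus change in comparison teams."""
    treated_change = mean(treated_after) - mean(treated_before)
    comparison_change = mean(comparison_after) - mean(comparison_before)
    return treated_change - comparison_change


# Hypothetical engagement scores per team (0-100), before and after.
treated_before = [60.0, 62.0, 58.0]
treated_after = [70.0, 71.0, 69.0]
comparison_before = [61.0, 59.0, 60.0]
comparison_after = [64.0, 62.0, 63.0]

effect = difference_in_differences(
    treated_before, treated_after, comparison_before, comparison_after
)
print(f"Estimated programme effect: {effect:+.1f} points")
```

Here the programme teams improved by 10 points and the comparison teams by 3, so only 7 points are attributed to the programme. The raw before-and-after figure would have overstated the effect.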
The ROI Calculation
If you need to produce an ROI number for a CHRO or finance team, the Phillips ROI model extends Kirkpatrick by adding a Level 5 that converts Level 4 results into financial terms. The formula is standard: ROI (%) = ((Benefits - Costs) / Costs) x 100.
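As a worked illustration of the formula, here is a minimal sketch. The cost and benefit figures are hypothetical; in practice the benefit figure is the monetised Level 4 result and the cost figure should include all programme costs.

```python
def roi_percent(benefits: float, costs: float) -> float:
    """ROI (%) = ((benefits - costs) / costs) * 100."""
    if costs <= 0:
        raise ValueError("costs must be positive")
    return (benefits - costs) / costs * 100


# Hypothetical figures for illustration only.
programme_costs = 80_000.0
monetised_benefits = 120_000.0

print(f"ROI: {roi_percent(monetised_benefits, programme_costs):.0f}%")
```

With these figures the net benefit is 40,000 on costs of 80,000, giving an ROI of 50%. A programme whose benefits only equal its costs has an ROI of 0%, not 100%.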
The hard part is not the formula. It is isolating the financial value of the behaviour change. This requires putting a number on things like reduced escalation time in conflict situations, improved decision quality in resource allocation meetings, or faster onboarding of new team members into a more cohesive team. These numbers exist; they are just not naturally surfaced by most management information systems.
A practical approach: focus the calculation on one or two high-value behaviour changes where the financial stakes are clearest. A senior manager who resolves cross-functional conflicts more effectively is preventing escalations that consume director and VP time. A leadership team that makes resourcing decisions without the dysfunction of territorial negotiation saves planning cycles. Estimate conservatively, document the methodology transparently, and present the ROI as a range rather than a point estimate. Ranges are more honest and, counterintuitively, often more credible with finance audiences than precise numbers.
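Presenting ROI as a range can be as simple as carrying a conservative and an optimistic estimate for each behaviour change through the calculation. The sketch below assumes that approach; the `BenefitEstimate` class, the labels, and every figure are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BenefitEstimate:
    label: str
    low: float   # conservative annual value
    high: float  # optimistic annual value


def roi_range(estimates: list[BenefitEstimate], costs: float) -> tuple[float, float]:
    """Return (low ROI %, high ROI %) from per-behaviour benefit ranges."""
    if costs <= 0:
        raise ValueError("costs must be positive")
    if any(e.low > e.high for e in estimates):
        raise ValueError("each low estimate must not exceed its high estimate")
    low_total = sum(e.low for e in estimates)
    high_total = sum(e.high for e in estimates)
    return ((low_total - costs) / costs * 100,
            (high_total - costs) / costs * 100)


# Hypothetical estimates for two high-value behaviour changes.
estimates = [
    BenefitEstimate("Fewer escalations to director level", 30_000, 60_000),
    BenefitEstimate("Shorter resourcing planning cycles", 20_000, 50_000),
]
low, high = roi_range(estimates, costs=40_000)
print(f"ROI between {low:.0f}% and {high:.0f}%")
```

Keeping each estimate labelled with the behaviour it comes from makes the methodology easy to document, because a finance reader can challenge one line without discarding the whole calculation.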
What Good L&D ROI Measurement Requires of the Intervention Itself
The final point is often overlooked. Effective ROI measurement requires that the intervention was designed to produce measurable outcomes. If the training had no specific behavioural objectives, you cannot measure whether it achieved them. If the programme was designed primarily around content and engagement rather than behaviour change, the measurement will accurately reflect that: good satisfaction scores, uncertain behaviour outcomes.
This is why the design question and the measurement question are inseparable. Interventions designed around specific observable behaviour changes are both easier to measure and more likely to produce what they promise. Serious games are one example: each game is built around the specific behaviour patterns it is designed to surface. That means the facilitator can observe whether those patterns appeared, the debrief can connect them to real work contexts, and the follow-up measurement has a specific thing to look for.
The measurement framework should be designed at the same time as the programme, not retrofitted afterward. When it is, measuring L&D ROI becomes a standard part of programme delivery rather than an uncomfortable question the CHRO asks that nobody can answer well.
