Visible Learning, hidden error
A critical examination of John Hattie's Visible Learning methodology
John Hattie's Visible Learning has been treated like a Bible for education reformers since its publication in 2009. By synthesizing over 800 meta-analyses, Hattie delivers a grand list of "what works" in education, dressed up in neat effect sizes and anchored to a magical "hinge point" of 0.4. [1] But dig a little deeper, and cracks quickly start to show. Hattie's grand edifice is perhaps not as sturdy as it seems.
We need to talk about methodology
Simpson (2020) points out that Hattie's "meta-meta-analysis" is like blending apples, oranges, and maybe a few wrenches. Mashing together wildly different studies without accounting for variations in quality, context, or methods is a recipe for confusion, not clarity. [2] Eacott (2018) goes further, calling out how Hattie's data glosses over contradictions, offering a false sense of certainty where none should exist. [3]
Then there's the famous hinge point: 0.4. Kamenetz (2015) skewers this arbitrary cut-off, noting that it falsely divides good from bad teaching strategies. Education is messy; boiling it down to a single number insults the complexity of classrooms. Worse, relying on these simplified metrics can lead schools down the wrong path, chasing "high effect size" interventions without understanding the why or how. [4]
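To see what that single number actually is, here is a minimal sketch, with invented scores rather than data from any study Hattie cites, of how a standardized effect size (Cohen's d) is typically calculated and how the 0.4 hinge point would then be applied to it.

```python
# A toy effect-size calculation. The scores are made up for illustration;
# they are not drawn from any study in Visible Learning.
import math
import statistics

control = [52, 48, 55, 47, 50, 53, 49, 51]       # post-test scores, control group
intervention = [53, 49, 56, 48, 51, 54, 50, 52]  # every student scores one point higher

mean_c, mean_i = statistics.mean(control), statistics.mean(intervention)
sd_c, sd_i = statistics.stdev(control), statistics.stdev(intervention)
n_c, n_i = len(control), len(intervention)

# Pooled standard deviation, then Cohen's d = difference in means / pooled SD
pooled_sd = math.sqrt(((n_c - 1) * sd_c**2 + (n_i - 1) * sd_i**2) / (n_c + n_i - 2))
d = (mean_i - mean_c) / pooled_sd
print(f"Cohen's d = {d:.2f}")

# Hattie's hinge point then turns this into a binary verdict. Here a consistent
# one-point gain for every single student still lands "below" the line.
print("above the hinge point" if d > 0.4 else "below the hinge point")
```

The point of the toy example is that whether the same one-point gain counts as "what works best" depends entirely on the spread of the scores it is divided by, which is one reason critics find a single cut-off so unsatisfying.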
Hattie's statistical blunders don't help either. Wikipedia's entry on Visible Learning documents errors in calculating the Common Language Effect Size (CLE). If the math is wrong, how much faith should we have in the conclusions? When education policy hangs on shaky calculations, it's more than just an academic slip: it's a real-world problem.
Beyond the numbers, Hattie's vision of education is possibly too narrow. Scott Eacott criticises Hattie for focusing almost exclusively on test scores, ignoring vital aspects like emotional development, creativity, and civic engagement. Teaching isn't just about moving test scores; it's about shaping human beings. I'm less convinced by this critique, though: while it's no doubt true that the scope of education should be much wider than exam results, all these wonderful but hard-to-measure attributes we value need to show up somewhere. It's no good saying that your approach is great for improving creativity or emotional learning if you can't design a test to prove it.
Meanwhile, back at the chalkface, Hattie's ideas have led to some questionable practices. Robert Slavin pulls no punches in his baldly titled blog post, John Hattie is Wrong:
"Hattie is profoundly wrong. He is merely shoveling meta-analyses containing massive bias into meta-meta-analyses that reflect the same biases."
Slavin argues that Hattie's approach combines studies of varying quality without sufficient scrutiny, leading to inflated effect sizes and misleading conclusions about educational interventions. He emphasises that many of the underlying studies are small-scale, short-term, or lack rigorous controls, which compromises the validity of the synthesised findings. He also shows how the exhortation to "make learning visible" has sometimes morphed into rigid learning objectives and checklists. Teachers become box-tickers rather than responsive educators, their creativity stifled by bureaucratic demands.
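To make the pooling worry concrete, here is a small sketch with invented effect sizes: a few small, favourable studies and one large, well-controlled trial. A simple unweighted average of the effect sizes, roughly the kind of aggregation the critics object to, is pulled up by the small studies, while a standard inverse-variance weighted estimate is dominated by the large trial.

```python
# Toy illustration (invented numbers) of why pooling studies of very different
# size and quality without weighting can inflate an average effect size.
from dataclasses import dataclass

@dataclass
class Study:
    name: str
    d: float        # reported standardized effect size
    n_per_arm: int  # participants per group

studies = [
    Study("Small pilot A", 0.85, 15),
    Study("Small pilot B", 0.70, 20),
    Study("Medium quasi-experiment", 0.45, 60),
    Study("Large randomised trial", 0.10, 1200),
]

# Naive unweighted mean: every study counts equally, however small or noisy.
naive = sum(s.d for s in studies) / len(studies)

# Fixed-effect meta-analysis: weight each study by the inverse of the
# (approximate) variance of its effect size, so precise studies count more.
def d_variance(s: Study) -> float:
    n1 = n2 = s.n_per_arm
    return (n1 + n2) / (n1 * n2) + s.d**2 / (2 * (n1 + n2))

weights = [1 / d_variance(s) for s in studies]
weighted = sum(w * s.d for w, s in zip(weights, studies)) / sum(weights)

print(f"Unweighted mean d: {naive:.2f}")                # pulled up by the small studies
print(f"Inverse-variance weighted d: {weighted:.2f}")   # dominated by the large trial
```

None of these numbers are Hattie's; the sketch only illustrates why "shoveling" study results together without attending to their size and quality can flatter an intervention.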
In 2016, I even wrote my own criticism in which I challenged Hattie’s high effect size attributed to “self-reported grades.” I pointed out that the studies Hattie cites don’t actually measure the impact of students predicting their own grades on achievement. Instead, these studies often assess the correlation between students’ self-assessments and their actual performance, without establishing causation. I argued that Hattie’s interpretation conflates correlation with causation, leading to misleading conclusions about the efficacy of self-assessment strategies.
On top of all this, Rømer (2018) argues that John Hattie's way of thinking about education makes it all about personal achievement, instead of helping students connect to shared knowledge, culture, and values. If learning is just about everyone building their own version of the world, Rømer warns, then education stops being about bigger truths and becomes just a way to boost individual scores. He believes Hattie's approach risks turning schools into places that only train students to perform, not places that teach them how to think, belong, and understand the world around them. [5]
This is a pretty interesting take and deserves some thought. While Hattie borrows language from constructivism, particularly the idea that students must be active learners, he is far from a radical constructivist. Hattie champions explicit instruction, clear goals, rigorous feedback, and measurable outcomes, tools that pure constructivists often resist. His work is driven by pragmatic concerns about what maximises achievement, not by loyalty to any educational philosophy. I think Rømer’s critique goes too far in painting Hattie as philosophically extreme, when in reality, he is far more concerned with evidence-based impact than with ideological purity.
In an interview with the TES, Hattie insisted that his work is meant to inform, not prescribe. "Visible Learning was never about telling teachers what to do. It's about helping them ask the right questions." Professional judgment should still rule the day. On the hinge point, he has denied ever telling teachers to "do everything above 0.4 and nothing below it", calling that reading "nonsense" because "the context always matters". And on the complexity of teaching he argues, "Teaching is not about applying a recipe. It's about understanding your impact." The clear implication is that his work allows teachers to better understand their impact. But the way Visible Learning is used, often uncritically, suggests that nuance gets lost somewhere between theory and practice.
To his credit, Hattie has acknowledged certain methodological flaws in his Visible Learning research, particularly concerning the calculation of the Common Language Effect Size (CLE). In his book, Hattie presented CLE values that were mathematically incorrect, with some exceeding 100% or being negative, an obvious impossibility for probabilities. [6] However, he downplayed the significance of his errors, asserting that the incorrect CLE values did not substantially affect the overall conclusions of his work.
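For readers who want to see the arithmetic, here is a sketch of the standard conversion from an effect size to a Common Language Effect Size, alongside my reconstruction (an assumption on my part, based on Topphol's description in note [6]) of the shortcut that produces the impossible values.

```python
# The Common Language Effect Size (CLE) is the probability that a randomly chosen
# student from the intervention group outscores a randomly chosen control student.
# Done correctly, the z-score d / sqrt(2) is converted to a probability via the
# normal CDF. The "wrong" version below is my reconstruction of the mistake Topphol
# describes: reading the z-score itself as if it were already a probability.
from math import sqrt
from statistics import NormalDist

def cle_correct(d: float) -> float:
    return NormalDist().cdf(d / sqrt(2))

def cle_wrong(d: float) -> float:
    return d / sqrt(2)   # a z-score, not a probability

for d in (-0.3, 0.4, 1.0, 2.0):
    print(f"d = {d:+.1f}: correct CLE = {cle_correct(d):.0%}, "
          f"'z-as-probability' = {cle_wrong(d):.0%}")
# The correct column always stays between 0% and 100%; the wrong one goes
# negative for negative effects and past 100% for large ones - exactly the
# kind of impossible values reported in the book.
```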
The Sequel: a response to criticism?
In 2023, Hattie released a substantially updated version of his book, Visible Learning: The Sequel. The new edition introduces a greater emphasis on the story behind effect sizes, not just the numbers themselves. Hattie acknowledges that context, implementation fidelity, and teacher expertise dramatically affect the success of interventions, an admission that moves him closer to his critics, who argued that his earlier work implied a kind of educational “shopping list” detached from the realities of classrooms. It offers a sharper focus on equity and learning processes rather than only outcomes. Hattie now warns readers against misusing effect sizes as absolute measures of quality, and explicitly encourages readers to think about why certain practices work, not just whether they seem to work. There’s also a greater emphasis on using longitudinal, large-scale, well-controlled research, rather than just any old meta-analyses.
Does this address all the criticisms? Not entirely. The core methodology remains. And while the tone is more cautious, the problem of educational complexity resisting neat quantification still haunts the project. Education is messy. While research can help us to filter out a signal from the noise, it always resists easy answers.
Although we must treat Hattie's conclusions with a healthy dose of skepticism, I still want to acknowledge that he has played a crucial role in making education more research-aware. Visible Learning helped spark a major shift in education toward valuing evidence over intuition. Hattie's work, despite its flaws, has made teachers and school leaders realise that not all interventions are equally effective and that teaching practices should be subjected to rigorous scrutiny.
For me personally, as well as for many others, John Hattie opened the door to a more thoughtful, research-literate profession.
Notes

[1] Perhaps Hattie's most famous idea is the "hinge point," which he uses to separate average teaching effects from truly impactful ones. In his analysis, an effect size of 0.4 represents more than the typical learning gains students make over a year; anything above it is flagged as especially worth pursuing. However, critics have pointed out that this hinge point is essentially arbitrary; a convenient line rather than a deeply grounded benchmark. It flattens the complex landscape of education into a binary of "good" and "not good enough," ignoring that smaller effect sizes can still be critically important depending on context, equity goals, or subject matter.
[2] In The Misdirection of Public Policy (2020), Adrian Simpson critiques the way educational meta-analyses, particularly those like Hattie's Visible Learning, combine and compare effect sizes across vastly different studies. Simpson argues that treating standardized effect sizes as universally comparable is a major methodological flaw because differences in study design, populations, and outcome measures can heavily distort results. As a result, interventions may appear more or less effective not because of their true educational value, but because of quirks in how they were studied. Simpson warns that this creates misleading "league tables" of strategies and risks steering educational policy toward interventions that look good statistically but may fail in real-world classrooms. He calls for much greater scrutiny of how evidence is synthesised before it is used to shape educational decisions.
[3] In Meta-critique of Visible Learning (2018), Scott Eacott argues that John Hattie's Visible Learning reduces the complexity of education to simplistic, quantifiable metrics without adequately defining what "learning" actually means. Eacott critiques Hattie's work as more of a theory of evaluation than a genuine educational theory, and he challenges the philosophical foundations underpinning it, particularly Hattie's use of radical constructivism and a flawed reading of Karl Popper's "three worlds" theory. He warns that Hattie's approach risks promoting a narrow, hierarchical, data-driven model of education that undervalues the broader, messier realities of teaching and learning. Overall, Eacott calls for a more philosophically rigorous and context-sensitive approach to educational leadership and research.
[4] In Five Big Ideas That Don't Work in Education (2015), Anya Kamenetz dismantles some of education's most sacred cows, arguing that popular reforms like smaller class sizes, more funding, tougher standards, school choice, and standardized testing often sound good but do little to boost real learning. Drawing on Hattie's research, she shows that these big-ticket ideas are more about political appeal than proven effectiveness, and warns that clinging to them wastes time, money, and opportunity in education reform.
[5] Rømer also accuses Hattie of misusing Karl Popper's "Three Worlds" theory, shrinking education into a narrow focus on personal cognition while neglecting the richness of cultural and scientific knowledge.
[6] This error was identified by Arne Kåre Topphol et al. (2011), who noted that Hattie had misapplied the formula by using z-scores directly as probabilities without proper conversion.
Comments

Thanks for taking the time to critique Hattie's work; given Hattie's influence, this is really important.
Hattie argued for a move from "What Works" to "What Works Best":
"The major message is that we need a barometer of what works best..." (preface)
"One aim of this book is to develop an explanatory story about the key influences on student learning - it is certainly not to build another 'what works' recipe." (p. 6)
"When teachers claim that they are having a positive effect on achievement or when a policy improves achievement, this is almost always a trivial claim: Virtually everything works. One only needs a pulse and we can improve achievement." (p. 16)
"Instead of asking 'What works?' we should be asking 'What works best?'" (p. 18)
Hattie cleverly argued that effect size determines what works best, but the significant critiques of his work now seriously question that.
I thought Slavin's critique showing that Hattie mixed up positive and negative effect sizes was interesting. Hagemeister (2020) and also Sundar & Agarwal (2021) have shown this mistake in the Class Size category. Also, in Reducing Disruptive Behavior, one of the three studies Hattie reports has a huge negative effect size. How can it be that reducing disruptive behavior decreases achievement by a lot?
As a result of this critique, as you say, Hattie has changed his emphasis to "the story".
When Ollie Lovell questioned him about all the critiques of his method, Hattie replied, "It’s not the numbers it’s the story... that’s why this will keep me in business to keep telling the story."
Given Hattie's significant influence, he should be challenged more about this change in narrative as it means his claims in Visible Learning are wrong and misleading.
The more educational research one reads and enacts (while others ignore it and do what they "feel" works), and the more one reads books while also reading the class in front of you, the more you realise that reading and writing about teaching very often makes little difference, and you find yourself back at square one. It's like the progress in philosophy over the ages: none.