Cognitive load, conceptual fog: when ideology masquerades as evidence
Alfie Kohn’s critique of explicit instruction and Cognitive Load Theory reveals more about his educational philosophy than about how learning actually works
Recently I wrote a critique of Alfie Kohn’s argument against Cognitive Load Theory.
I set out to approach it as respectfully and even-handedly as possible. But as I began digging into the footnotes - of which there are 37 - my patience wore thin. The deeper I read, the clearer it became that many of his citations are little more than rhetorical scaffolding. And frankly, when someone cites Guy Claxton’s clownish posturing as serious support, it’s hard not to question the intellectual footing of the entire argument. You can read Kohn’s original article here.
The effect of Kohn’s 37 citations, whether by design or accident, is to build an impenetrable rampart behind which his trenchant critique of Cognitive Load Theory is defended. It functions like a ring of warning signs: proceed no further unless you’re willing to engage in a laborious trawl through citations, half-claims, and borrowed authority.
Because this rampart demanded a comprehensive rebuttal, the post below is necessarily long (I’m afraid there’s almost 20,000 words of it!) Each of Kohn’s footnotes had to be traced back to its original source and scrutinised to see whether it actually supports the claims he makes. I don’t expect many readers to make it to the end, but the cumulative effect is hard to ignore.
Each of the 37 sections begins with a short extract from Kohn’s essay in bold, followed by the original footnote in italics. My commentary appears in plain text.
1. [As] early-childhood expert Alison Gopnik put it, “direct instruction really can limit young children’s learning.” The quotation is from Alison Gopnik, “Why Preschool Shouldn’t Be Like School,” Slate, March 16, 2011. Among the studies she cites is Elizabeth Bonawitz et al., “The Double-Edged Sword of Pedagogy: Instruction Limits Spontaneous Exploration and Discovery,” Cognition 120 (2011): 322-30. Her summary is consistent with my own review of multiple studies that contrast explicit, teacher-centered instruction of young children with child-centered or constructivist approaches: See this lengthy excerpt from Alfie Kohn, The Schools Our Children Deserve (Houghton Mifflin, 1999). Separately, a 2022 review of research conducted with children from toddlers to eight-year-olds found that guided play was more effective than direct instruction: Kayleigh Skene et al., “Can Guidance During Play Enhance Children’s Learning and Development in Educational Contexts?” Child Development 93 (2022): 1162-80.
In a pattern that will very quickly become clear, the claim that explicit instruction limits children’s learning rests on a brittle scaffold of selective evidence and questionable interpretation. Gopnik’s Slate article, though elegantly written, is a journalistic polemic rather than a rigorous review. The Bonawitz et al. 2011 study she invokes is frequently cited by constructivists as evidence that instruction “limits exploration.” Yet what the study actually found was extraordinarily context-dependent. Children shown how a toy worked were less likely to explore its other functions, but that’s hardly a wholesale repudiation of instruction. It simply shows that children infer pedagogical purpose: if the adult tells me this is how it works, that must be what I need to know. In other words, the finding is as much about social inference as it is about cognition.
Moreover, the Bonawitz paper studied four-year-olds with a novel, ambiguous toy. It tells us very little about how children learn number bonds, letter-sound correspondence, or syntactic rules. The leap from ‘children taught one function of a toy don’t explore it as much’ to ‘formal instruction stifles thinking’ is, frankly, acrobatic.
Kohn’s invocation of his own writing doesn’t help the case. The Schools Our Children Deserve is a rhetorical treatise that makes no pretence of neutrality. His cherry-picking of studies that favour child-led learning is consistent with his ideological stance but inconsistent with the weight of cognitive science research. Kohn may be persuasive to those who already agree with him, but his interpretations collapse under the weight of careful reading or controlled replication.
The 2022 Skene et al. review is a more serious piece of work, but even here the devil is in the detail. “Guided play” is not the same as unguided exploration. It typically involves adults scaffolding learning within a play context, often drawing explicitly on knowledge of developmental psychology and instructional design. In many cases, guided play looks suspiciously like carefully planned, responsive teaching with a playful veneer. And crucially, its effects tend to depend on domain: it may be helpful in early numeracy, for example, but far less so in phonics or vocabulary acquisition, where structured instruction outperforms discovery.
At best, the evidence cited here suggests that the manner and timing of instruction matters in early childhood, and that over-didactic teaching may crowd out curiosity. But it does not justify the wholesale rejection of direct instruction, nor does it refute the growing consensus that novices benefit most from clarity, structure, and intentional modelling.
2. A review of 225 studies showed that active learning results in “strong increases in student performance” when compared to traditional lecture-based teaching. Scott Freeman et al., “Active Learning Increases Student Performance in Science, Engineering, and Mathematics,” PNAS, May 12, 2014. The result was so clear, in fact, that the researchers remarked that if this had been a medical study, it would have been halted because of ethical concerns about continuing with the inferior approach — i.e., lecturing. Also see discussions of research on higher-education pedagogy in Maryellen Weimer, Learner-Centered Teaching, 2nd ed. (Jossey-Bass, 2013) and in my essay about the ineffectiveness of lecturing (and efforts to move beyond it): “Don’t Lecture Me!”, blog post, June 24, 2017.
This oft-cited passage from Freeman et al. has become a talisman for advocates of “active learning,” but like many talismans, its power lies more in its symbolic value than its interpretive clarity. To begin, it’s worth scrutinising the rhetorical flourish: “if this were a medical trial, it would have been stopped early.” This analogy, though dramatic, is misleading. Education is not medicine. The variables are messier, the interventions less isolable, and the outcomes more contested. You don’t double-blind a seminar, nor randomise instruction in the way you can randomise pill-taking. Invoking medical ethics is a rhetorical hammer trying to drive in a pedagogical thumbtack.
Freeman’s 2014 meta-analysis does indeed show that “active learning,” broadly defined, tends to outperform didactic lecturing in STEM courses at the university level. But there are two sleights of hand here. First, the comparison is not between explicit instruction and constructivist learning, but between static, unresponsive lectures and lessons in which students are asked to think, discuss, and practise while being guided. All effective explicit instruction already includes these features. Second, the term “active learning” is maddeningly imprecise. It can mean peer instruction, clicker questions, problem-solving, group work, discussion, all of which vary enormously in quality and intent.
More to the point, this research is focused on adults in postsecondary science courses, not on novice readers or primary pupils grappling with the mechanics of arithmetic. The leap from university-level engineering lectures to five-year-olds learning phonics is as unhelpful as it is misleading.
The references to Maryellen Weimer and Kohn’s own blog post titled “Don’t Lecture Me!” do little to bolster the claim. Weimer’s advocacy of “learner-centred teaching” is grounded more in ideology than rigorous research, and the blog, by the author’s own admission, is an opinion piece. Neither offers the kind of robust empirical evidence needed to counter the deep body of research from cognitive science showing that novices benefit from carefully sequenced, explicit instruction, with ample modelling, worked examples, and gradually released responsibility.
Lecturing, in its worst form, is indeed an inferior method, especially when used as the sole means of instruction. But what this critique fails to grasp is that explicit instruction isn’t synonymous with passive listening. Done well, it is interactive, responsive, and grounded in how people actually learn.
To critique lecturing is not to critique instruction. The real question isn’t ‘Should we lecture less?’ It’s ‘What kinds of instruction best support learning, for whom, and under what conditions?’ On that, the evidence points consistently towards structured guidance for novices, and greater autonomy for experts.
3. [The ‘strong increases’ in active learning’s results] holds true across diverse populations, including with low-income and minority students - “There is growing evidence from large-scale experimental and quasi-experimental studies demonstrating that inquiry-based instruction results in significant learning gains in comparison to traditional instruction and that disadvantaged students benefit most from inquiry-based instructional approaches” (Cindy E. Hmelo-Silver et al., “Scaffolding and Achievement in Problem-Based and Inquiry Learning,” Educational Psychologist 42 [2007], p. 104). Two studies in 2021, one with third graders and one with high schoolers, confirmed the benefits of project-based, student-centered approaches with diverse students, notably low-income kids of color. See, respectively, Joseph Krajcik et al., “Assessing the Effect of Project-Based Learning on Science Learning in Elementary Schools,” Technical Report, Michigan State University, January 11, 2021, which is summarized here; and Anna Rosefsky Saavedra et al., “Knowledge in Action Efficacy Study Over Two Years,” Center for Economic and Social Research, University of Southern California, February 22, 2021. From the latter: “The traditional ‘transmission’ model of instruction…may be suboptimal for supporting students’ ability to think and communicate in sophisticated ways, demonstrate creativity…and transfer their skills, knowledge, and attitudes to new contexts.”
The 2007 Hmelo-Silver et al. paper is often invoked to suggest that problem-based and inquiry-led approaches are not only effective but especially beneficial for low-income or minority students. Yet the claim that “disadvantaged students benefit most” is based on a handful of correlational studies, many of which suffer from the usual problems of educational research: self-selection bias, context specificity, and wildly variable definitions of “inquiry.” The paper itself is more circumspect than the selective quotation implies, noting that success in inquiry learning depends heavily on the nature and extent of scaffolding (a key feature of explicit instruction). This is a tacit admission that unguided or poorly guided inquiry is not likely to yield the same benefits.
The 2021 Krajcik study and the Saavedra Knowledge in Action report are both well-meaning, well-funded, and pedagogically interesting. But both involve highly resourced, tightly supported implementations of project-based learning, often with significant teacher training and expert-designed materials. These are not your average classrooms. Nor are the results universally conclusive: while some gains were seen, especially in engagement and higher-order outcomes, the picture is patchier when it comes to core knowledge retention, long-term academic outcomes, or scalability. Most tellingly, the Saavedra report criticises the “transmission” model of teaching as suboptimal. But this sets up a straw man: nobody seriously advocating explicit instruction believes in inert information dumping. Explicit instruction involves deliberate explanation, questioning, modelling, checking for understanding, and guided practice, often far more cognitively demanding than many so-called inquiry activities, which too often devolve into groupwork theatre or surface-level engagement.
Finally, the idea that disadvantaged students benefit most from discovery-based methods is not only questionable but directly contradicted by other evidence. The Matthew effect in education tells us that students with more prior knowledge benefit more from less structured approaches, while novices - particularly those from disadvantaged backgrounds - depend more heavily on clear, structured instruction to build the foundations others may already possess.
So, some forms of inquiry can work, especially when expertly designed, richly resourced, and carefully scaffolded. But the idea that such methods inherently work better for disadvantaged pupils is at best an overstatement, and at worst a dangerous misconception. If we really care about equity, we should ensure all children get the explicit knowledge and practice they need to flourish, not romanticise methods that too often widen gaps under the banner of student-centred idealism.
4. [The ‘strong increases’ in active learning’s results holds true] at least in some circumstances, with low-achieving students. There is some evidence that more proficient students are better able to take advantage of inquiry learning — which, in light of all the other data attesting to its benefits, is an argument for providing more scaffolding for struggling students, not for subjecting them to direct instruction instead. But one interesting study of college math education (which comprised more than one hundred sections of forty courses at four universities) found that, while all students in inquiry-oriented sections “succeeded at least as well as their peers in later courses,” it was the students with poorer academic records who benefited the most. Their performance in subsequent courses reflected “sizable and persistent” improvement “relative both to their own previous performance and to [traditionally taught] peers” (Marina Kogan and Sandra L. Laursen, “Assessing Long-Term Effects of Inquiry-Based Learning,” Innovative Higher Education 39 [2014], pp. 195, 194, 196).
This footnote offers a more measured defence of inquiry-based learning by acknowledging its limitations - particularly for lower-achieving students - while nevertheless insisting that these limitations justify more scaffolding, not a shift toward explicit instruction (the distinction is unclear). But the argument, while superficially balanced, is ultimately circular and unconvincing when held up to closer scrutiny.
Let’s start with the modest concession: more proficient students tend to benefit more from inquiry-based approaches. The expertise reversal effect is a well-documented finding and echoes decades of research in cognitive psychology. Expertise allows learners to engage in problem-solving, transfer, and creative application because they already have a secure knowledge base to draw upon. For novices and struggling students who lack this foundation, inquiry becomes an exercise in frustration: what Sweller would call “problem-solving search,” where cognitive resources are spent flailing rather than learning.
The response offered here - more scaffolding - is not wrong, but it does raise a question: what does effective scaffolding look like for students who lack prior knowledge? Often, it closely resembles explicit instruction. If your inquiry-based lesson begins with teacher modelling, followed by structured questioning, worked examples, and gradually released independence, then congratulations: you’re engaging in explicit instruction in disguise.
The Kogan and Laursen (2014) study offers, at first glance, a striking finding: struggling students in inquiry-oriented college maths classes did better in later courses than their peers. But dig into the details and a more qualified picture emerges. First, the study takes place in higher education, among students who are, by definition, already academically selected. Even the “low achievers” in this context are not directly comparable to pupils struggling with literacy or numeracy at school. Second, the study’s methodology - like much research in education - is quasi-experimental, not randomised, and it relies heavily on institutional grade data, which can be noisy, inconsistent, and influenced by local marking practices.
Moreover, what the study categorises as “inquiry-based” instruction is often carefully curated and supported by instructors trained in the pedagogy. Again, we are not looking at everyday classroom practice in under-resourced schools. The interventions are exceptional in both design and delivery, conditions unlikely to be replicated widely or reliably without significant investment.
The final move in this argument - that the evidence is a call for better scaffolding, not direct instruction - is more ideological than logical. It assumes that inquiry is the pedagogical good, and any challenge to its efficacy for certain students is a problem to be worked around, not a reason to re-evaluate. But if explicit instruction consistently outperforms inquiry for students who need the most help, then insisting they would succeed in inquiry if only the scaffolding were perfect is a faith-based position, not one amenable to logic.
In the end, the question isn’t whether inquiry can work in ideal conditions, but whether it works reliably, equitably, and scalably in the real classrooms where most students - especially the struggling ones - are taught. On that, the evidence remains stubbornly in favour of clarity, structure, and teacher-led explanation.
5. [The ‘strong increases’ in active learning’s results holds true] … in STEM subjects - In a comprehensive review of the evidence in STEM fields that was published in 2023, an international group of 13 researchers argued for some role for direct instruction, but in the end they emphasized, “Overall the literature persuasively shows the benefits of inquiry-based instruction over direct instruction for acquiring conceptual knowledge” (Ton de Jong et al., “Let’s Talk Evidence – The Case for Combining Inquiry-Based and Direct Instruction,” Educational Research Review 39 (2023), pp. 9-10). The huge Freeman et al. metaanalysis that I mentioned in note 2 demonstrates the superiority of the inquiry approach at the university level, but it’s just as true for younger students according to many, many studies. In elementary school, for example, see E. M. Granger et al., “The Efficacy of Student-Centered Instruction in Supporting Science Learning,” Science 338 (October 5, 2012): 105-108. Also, “students tended to score higher on the 4th grade and 8th grade NAEP science tests when they had experienced science instruction centered on projects in which they took a high degree of initiative” (Harold Wenglinsky, “Facts or Critical Thinking Skills? What NAEP Results Say,” Educational Leadership, September 2004, p. 33). And it was true both in middle school and high school science when students were evaluated on their conceptual understanding; no difference showed up on multiple-choice tests: Marcia C. Linn et al., “Teaching and Assessing Knowledge Integration in Science,” Science 313 (August 25, 2006): 1049-50. Incidentally, one early review of 57 studies of elementary science programs, which found benefits for an inquiry-based approach across the board — and particularly for disadvantaged students — offered an important caution: The advantages offered by such student-centered teaching “may be lost” if students are subsequently taught “in classrooms where more traditional methods prevail” (Ted Bredderman, “Effects of Activity-Based Elementary Science on Student Outcomes,” Review of Educational Research 53 [1983], p. 513).
This flourish draws together a familiar chorus of references to make a sweeping claim: that inquiry-based instruction is not only effective in STEM, but broadly superior, especially for developing conceptual knowledge, even among younger and disadvantaged students. But once again, the evidentiary net is cast wide and thin, and Kohn’s conclusion, though emphatic, rests on a patchwork of equivocations, misreadings, and wishful interpretation.
The 2023 de Jong et al. review is often quoted as a clinching argument for inquiry, but in truth it presents a more nuanced case. The authors do not so much endorse inquiry over direct instruction as advocate for an integrated approach. They explicitly acknowledge that direct instruction is necessary in some situations, especially for novices and when efficiency is a priority. Their key claim that inquiry is more effective for acquiring conceptual knowledge must be understood in context, namely once foundational understanding is already in place. They are at pains to point out that effective inquiry involves significant structure and prompting, and often begins with explicit explanation. This is a call for sequenced instruction, not a repudiation of teacher-led methods.
The same caveats apply to the Freeman et al. 2014 meta-analysis. As previously discussed, it found that active learning, broadly defined, led to improved performance in university-level STEM courses. But university students are not primary school pupils. Their prior knowledge, maturity, and motivation are vastly different. Extrapolating these findings to younger students requires empirical support that is often assumed rather than demonstrated.
We then get a flurry of studies: Granger et al. 2012, Wenglinsky 2004, Linn et al. 2006, and even Bredderman 1983. Each, when closely examined, adds less than claimed. Granger’s study, for instance, focused on a very particular implementation of student-centred science instruction, supported by curriculum developers and ongoing teacher training. Its apparent success is hard to disentangle from the professional development and resourcing that surrounded it. Wenglinsky’s NAEP analysis, meanwhile, is correlational: higher NAEP scores among students who reported more project-based science might reflect many confounding factors, such as teacher expertise, school culture, or parental involvement, rather than any causal effect of the pedagogy. Linn’s paper reports stronger conceptual understanding from inquiry but no difference on objective test outcomes, which suggests again that inquiry may complement, but not replace, more traditional methods.
As for Bredderman’s 1983 review, it is both venerable and vulnerable. The finding that inquiry benefits may be lost when students return to traditional classrooms is presented as a cautionary tale, yet it could just as easily be read as an indictment of the fragility of those gains. If the efficacy of inquiry collapses the moment students leave a controlled environment, we are not dealing with a robust, generalisable pedagogy but an instructional orchid: beautiful, rare, and desperately in need of careful cultivation.
What is consistently overlooked in all of this is the thorny issue of transfer. Inquiry methods often result in higher engagement and richer classroom talk, but their effects on long-term knowledge retention, adaptability, and progression are mixed at best. Moreover, it is precisely the students with the least background knowledge - the disadvantaged, the struggling, the young - who are most dependent on direct, structured teaching to build the schema necessary for later conceptual insight.
To return to de Jong’s own title: let’s talk evidence. Fine, but let’s do it honestly. Properly defined and scaffolded, inquiry has a place, but the claim that it is broadly superior to explicit instruction, even for conceptual understanding, is simply not supported by the weight of evidence. In practice, it is structured, intentional instruction, not loosely guided exploration, that most reliably closes gaps, builds mastery, and equips students, especially those least advantaged, with the knowledge they need to thrive.
6. [The ‘strong increases’ in active learning’s results also holds true for] reading instruction, where … “The more a teacher was coded as telling children information, the less [they] grew in reading achievement.” Barbara M. Taylor et al., “Looking Inside Classrooms: Reflecting on the ‘How’ as Well as the ‘What’ in Effective Reading Instruction,” The Reading Teacher, November 2002, p. 278. Also see Karen Eppley and Curt Dudley-Marling, “Does Direct Instruction Work?” Journal of Curriculum and Pedagogy 16 (2019); and Randall J. Ryder et al., “Longitudinal Study of Direct Instruction Effects from First Through Third Grades,” Journal of Educational Research 99 (2006): 179-91. The latter two studies investigated a version of direct instruction for teaching reading that is known as Direct Instruction (with capital letters) or Reading Mastery.
By this point it will be unsurprising to find that Kohn’s attempt to extend the case for ‘active learning’ into the domain of reading instruction depends heavily on conflation, selective quotation, and a lack of conceptual clarity about what is meant by direct instruction.
The key quotation from the 2002 Taylor et al. study, “The more a teacher was coded as telling children information, the less they grew in reading achievement,” is offered as a decisive strike against explicit teaching. But there is no serious attempt to define what kind of “telling” is being measured. Was this carefully sequenced explanation with worked examples and modelling? Or was it directionless teacher talk, disconnected from structured practice? The study leans heavily on observational coding, a method that can capture superficial correlations without unpacking causal mechanisms. More importantly, ‘reading development’ is not a single, monolithic skill. What counts as effective instruction depends entirely on what is being taught. Decoding and phonemic awareness benefit from precision, repetition, and structured routines. Inferencing and comprehension may call for richer, dialogic exploration. To imply that telling children information is uniformly detrimental is both simplistic and misleading.
The Eppley and Dudley-Marling article is similarly polemical. It asks whether Direct Instruction (DI) works but largely rehearses familiar ideological objections. The authors reject Engelmann’s scripted, behaviourist model of DI, not explicit instruction in its broader and more nuanced sense. They focus on cultural transmission, teacher autonomy, and aesthetic concerns rather than robust empirical findings. Their critique amounts to this: DI offends progressive sensibilities. That may be true, but it is hardly a reason to dismiss decades of controlled trials showing its positive effects on early literacy, particularly for disadvantaged pupils.
The Ryder et al. longitudinal study is more empirical in approach. It does report that gains from DI fade somewhat after early implementation, and that effects vary by context. But the study also acknowledges positive short-term outcomes in reading accuracy and fluency. It is telling that even the more cautious studies do not dispute the efficacy of early explicit phonics-based instruction. What they question is its long-term impact if not reinforced by broader, richer reading experiences later on. In other words, the real problem is not direct instruction in itself but the absence of a well-sequenced, knowledge-rich curriculum to follow it.
What is striking across all these citations is the failure to distinguish between DI as a scripted programme and explicit instruction as a pedagogical approach. The former may have limitations. The latter is an essential part of how novices acquire complex knowledge. In reading, this is especially acute. Decoding does not emerge spontaneously from exposure to texts. It must be taught clearly, cumulatively, and systematically. Vocabulary does not grow through guessing alone. It is built through repeated, deliberate encounters with words in meaningful contexts. Even comprehension depends on background knowledge, which is not evenly distributed and cannot be constructed from first principles by children who have never heard of the things they are being asked to infer about.
In short, these studies reinforce the need to match method to content. Obviously, ‘telling’ cannot always be harmful - how would we have evolved the capacity to tell each other stuff if it were so counterproductive to our survival? - but its effect depends on what is being told, and how. The question is not whether children are being told things, but whether what they are being told is accurate, well sequenced, and supported with structured practice and feedback.
7. [The ‘strong increases’ in active learning’s results] holds true when judged by how long students retain knowledge. One review of research relevant to this point explained the finding as follows: “Instructional strategies that actively involve students in learning may result in qualitatively different memories that are more resistant to forgetting than memories acquired through more traditional instructional methods” (George B. Semb and John A. Ellis, “Knowledge Taught in School: What Is Remembered,” Review of Educational Research 64 [1994], p. 279). Similarly, a study of math instruction in a “constructivist learning environment showed better retention of almost all the concepts than…in the traditional [lecture-based] class” (Serkan Narli, “Is Constructivist Learning Environment Really Effective on Learning and Long-Term Knowledge Retention in Mathematics?” Educational Research and Reviews 6 [2011]: 36-49). And Australian middle-school students in more progressive classrooms (featuring active learning and group discussions) remembered significantly more geography content than those in conventional classrooms (Andrew A. Mackenzie and Richard T. White, “Fieldwork in Geography and Long-Term Memory Structures,” American Educational Research Journal 19 [1982] 623-32).
Kohn attempts to clinch the case for ‘active learning’ by invoking one of education’s holy grails: long-term retention. He claims that students taught through active or constructivist methods remember more, for longer, than those taught through traditional lecture-based instruction (NB: lecture-based teaching is nothing advocates would accept as a definition of explicit instruction). But like many such claims, it relies on a string of dated or narrowly framed studies and fails to engage with the broader, more robust body of evidence from cognitive psychology and instructional science.
Start with the Semb and Ellis review from 1994. The quoted line suggests that active learning (when compared to lecturing) produces qualitatively different memories that are more resistant to forgetting. This claim is hedged in conditional language (“may result”), and the review is careful to acknowledge that retention depends on many factors, including the nature of the content, the frequency of retrieval, and whether the learning was embedded in meaningful context. In fact, much of the evidence cited in the review points to the importance of overlearning, spaced practice, and testing effects. These are features more commonly associated with direct, structured instruction than with open-ended discovery or loosely guided activities.
The Narli study on mathematics, cited as evidence that constructivist classrooms outperform traditional ones in retention, suffers from several limitations. First, its sample size is small and its design quasi-experimental. Second, ‘constructivist’ in this context refers not to unguided exploration but to a carefully scaffolded environment with clear learning goals, which sounds very like most definitions of explicit instruction. There is little detail about the nature of the so-called traditional class, which feels like it’s being used as a straw man. More importantly, the study is based on a short intervention with specific content. The question of whether the same results would hold across domains, teachers, and year groups remains unanswered and is probably unanswerable.
The Mackenzie and White study from 1982 is even more context dependent. It finds that Australian middle school students in more progressive classrooms retained more geography content after fieldwork and group discussions. But again, this is not an argument against explicit instruction. It is an argument for rich, varied experiences embedded within a knowledge-rich curriculum. There is no evidence that the retention was due to reduced teacher explanation or the absence of modelling and practice. In fact, fieldwork is most effective when underpinned by strong instructional framing. Without that, students are likely to remember the trip but forget the point.
What these studies have in common is a tendency to compare well-designed active learning with poorly executed traditional teaching. When explicit instruction is presented as passive lecturing, of course it performs poorly. But that is not how effective direct instruction actually works. It includes retrieval practice, spaced review, formative checks for understanding, and carefully sequenced curriculum design. These are precisely the ingredients that cognitive science has shown to enhance long-term retention.
The broader point is this: retention is not a matter of method alone; it is a function of how well new information is encoded, how often it is retrieved, and how clearly it connects to existing knowledge. The techniques that best support memory - retrieval practice, interleaving, spacing, and elaboration - can be embedded in any instructional model, but are most easily and reliably implemented through explicit, structured teaching. Active learning may engage students in the moment, but engagement is not the same as durability.
In short, the claim that constructivist approaches to instruction improve retention may hold in particular cases, but it cannot be generalised without qualification. When judged by the standards of what sticks, what transfers, and what builds over time, structured instruction still holds the strongest ground.
8. The more emphasis one places on long-term outcomes, on deep understanding, on the ability to transfer ideas to new situations, or on fostering and maintaining students’ interest in learning, the more direct instruction (DI) comes up short. This is clear from many of the studies cited throughout this section. The phenomenon is explained lucidly by Daniel L. Schwartz et al., “Constructivism in an Age of Non-Constructivist Assessments,” in Sigmund Tobias and Thomas M. Duffy, eds., Constructivist Instruction: Success or Failure? (Routledge, 2009). They cite research showing that student-centered classrooms help students to construct meaning while also helping them to learn standard material “without an appreciable cost in overall instructional time.” Simply telling students the standard procedure “blocked student learning” — a result that comes into focus only if the assessment is rich enough to capture something more than short-term retention of facts. One legacy of direct instruction is that students “learned the solution procedure but they did not learn about the structure of situations for which that procedure might be useful” (pp. 55, 59).
This argument leans heavily on a common but ultimately fragile premise: that “direct instruction” (or DI - it’s important to emphasise that these are not the same) is effective for short-term retention but fails when it comes to deep understanding, transfer, and long-term interest. This feels superficially persuasive. Who wouldn’t prefer a classroom that fosters meaning-making over rote recall? But a closer reading of the claims and the studies used to support them reveals more assertion than substance.
The quotation from Schwartz et al. is taken from a thoughtful chapter, but one that is explicitly speculative. Their argument rests on a familiar contrast between surface learning and deep learning, with the implication that student-centred, constructivist classrooms naturally promote the latter. But the devil, as ever, is in the detail. When they say that telling students the standard procedure “blocked student learning,” they are referring to narrowly defined tasks in experimental contexts. The conclusion does not generalise to normal forms of explanation, nor to the broad family of techniques that fall under explicit instruction.
There is also a sleight of hand in the idea that constructivist methods allow students to learn standard material without additional time. What is often left unexamined is how much of that learning is contingent on prior knowledge, teacher skill, and curriculum design. In many cases, constructivist approaches appear to work best with students who already know enough to make sense of the activity. In such cases, it is not the method doing the heavy lifting but the knowledge students bring with them.
The legacy of DI, we are told, is that students learn procedures but not when or why to use them. Yet this critique depends on a narrow definition of what DI (or direct instruction; it’s not clear which is being referred to) entails. Good instruction includes modelling, practising, checking for understanding, and carefully fading support, as well as carefully sequenced explanation. It is precisely this structured approach that builds the mental models required for transfer. Far from blocking learning, a clear procedural foundation is often a necessary precondition for the kind of flexible thinking that transfer demands.
There is also a deeper problem in this line of argument: the assumption that transfer and deep understanding are the natural products of constructivist classrooms. In reality, transfer is rare and difficult. It depends less on abstract reasoning and more on how richly connected and securely embedded knowledge is in long-term memory. Decades of research in cognitive science have shown that students must first automatise core knowledge and procedures before they can apply them effectively in unfamiliar contexts.
As for interest, the idea that direct instruction dampens curiosity has intuitive appeal, but little empirical grounding. In fact, many students report greater satisfaction and confidence when they know what they are doing and why. Struggle without success tends to erode motivation rather than build it. There is nothing intrinsically engaging about floundering through a task you do not yet understand. Epistemic curiosity is not sparked by ignorance but by a conscious awareness of a gap in knowledge. You cannot feel curious about something you do not yet recognise as meaningful. In this sense, curiosity often follows instruction. It is by knowing a little that students come to want to know more.
9. [The ‘strong increases’ in active learning’s results] appear to hold true … because active learning and inquiry are beneficial… Indeed, this brief summary of the research only scratches the surface, not just because many other studies have replicated the same basic finding but because separate strands of evidence independently attest to the benefits of each of the components of student-centered learning and active exploration. These include the multiple advantages of “autonomy support” (that is, giving people more say about what they’re doing); learning from, and in collaboration with, one’s peers; and supportive student-teacher relationships.
Here Kohn attempts to appeal not only to replication but to converging evidence from supposedly distinct strands of educational research. The implication is that active learning works because it incorporates a cluster of beneficial features such as autonomy, peer collaboration, and supportive relationships. But this conflation conceals more than it reveals.
First, replication in education is a notoriously fragile thing. Many so-called replications tweak the method, change the age group, or shift the subject matter. What they replicate, more often than not, is a general pattern of engagement rather than a robust improvement in long-term learning outcomes. The “same basic finding” that active methods outperform traditional ones is typically context-sensitive, modest in effect size, and highly variable depending on implementation quality.
Second, the idea that separate strands of evidence confirm the value of components like autonomy support, collaboration, and warm teacher relationships does not automatically validate active learning as a unified method. These elements are not exclusive to inquiry-based or student-centred models. In fact, some of the most effective examples of explicit instruction include structured opportunities for student choice, well-managed pair work, and strong relational trust between teacher and class. It is entirely possible to design highly effective teacher-led lessons that build motivation, encourage dialogue, and support autonomy within clear boundaries.
Autonomy, for instance, is beneficial only when learners have enough knowledge to make informed choices. Without prior understanding, freedom can be overwhelming rather than empowering. Peer collaboration, likewise, is most effective when students already possess the requisite knowledge and vocabulary to engage meaningfully with each other’s ideas. Otherwise, misconceptions spread faster in groups than in solitude. And while teacher-student relationships certainly matter, they are a necessary condition for effective learning, not a sufficient one. A supportive relationship may increase a student’s willingness to try, but it does not eliminate the need for clear explanation and guided practice.
In short, the components listed here - autonomy, collaboration, relational warmth - are not the preserve of active learning. They are general features of effective teaching, and their presence tells us little about the merits of inquiry as a method. The risk is that active learning becomes a catch-all label for any practice that appears humanising or engaging, while more structured approaches are dismissed as mechanistic or cold. That is not only unfair but educationally incoherent. The real task is to design instruction that matches cognitive architecture, supports novice learners, and creates the conditions in which curiosity, confidence, and capability can grow together. That is not the exclusive domain of any one method. It is the product of deliberate, intelligent teaching.
10. [The ‘strong increases’ in active learning’s results appear to hold true] because explicit telling can be actively harmful… See the discussion of possible mechanisms for that counterproductive effect in Gopnik, op. cit., and Bonawitz et al., op. cit. Another sort of evidence comes from research demonstrating that the beneficial impact of teaching that focuses on helping students to understand mathematical principles (which they then have to figure out how to apply) is undermined when they are also taught step-by-step problem-solving procedures. See Michelle Perry, “Learning and Transfer: Instructional Conditions and Conceptual Change,” Cognitive Development 6 (1991): 449-68. Also see the second study cited in note 14, below, which produced a similar finding: A positive outcome often requires not only the presence of student-centered teaching but the absence of explicit instruction.
This claim crosses a line. To suggest not just that explicit instruction is less effective, but that it is actively harmful, is not only unsupported by the evidence but ethically suspect. It implies that clarity, guidance, modelling, and practice do damage to students, and that the best outcomes occur when these things are deliberately withheld. That is not a pedagogical argument but an act of bad faith.
The Gopnik and Bonawitz studies, once again, involve narrow tasks with very young children, often using novel toys or ambiguous problems in controlled settings. At most, they suggest that overly prescriptive demonstration can limit exploration within a specific context. They do not show that telling children things causes harm to their overall intellectual development. To extrapolate from these limited findings to a blanket condemnation of explanation is not scientific reasoning. It is ideology masquerading as evidence.
The Perry study is frequently cited by critics of explicit instruction because it appears to show that teaching step-by-step problem-solving procedures can undermine conceptual understanding. But here again, the interpretation depends entirely on context. If procedural instruction is front-loaded without attention to underlying structure, then yes, students may fail to grasp the big picture. But that is not a flaw in explicit teaching per se. It is a failure of curriculum sequencing. There is nothing in the research that suggests conceptual insight and procedural fluency are mutually exclusive. In fact, when well taught, they reinforce each other. Students are more likely to understand a principle when they have successfully applied it, and more likely to apply it correctly when they understand what it is for.
The final suggestion - that explicit instruction must be removed for student-centred teaching to work - is not just absurd, it is grotesque. It places the burden of sense-making entirely on the student and treats the teacher’s knowledge as a contaminant rather than a gift. It would be unthinkable in medicine, or law, or engineering. Only in education do we find people arguing that novices learn best when experts keep their insights to themselves.
This line of thinking is not simply wrong. It is dangerous. It encourages practices that widen gaps, undermine confidence, and leave students - especially those without cultural capital or strong prior knowledge - adrift in well-meaning confusion. To call instruction harmful is to mistake clarity for coercion and support for control.
11. [The ‘strong increases’ in active learning’s results appear to hold true because explicit telling can be actively harmful] sometimes in ways that extend beyond the academic realm. Several studies cited in my review of early-childhood research (see note 1) found that children whose preschool had used DI (compared with a child-centered or constructivist approach) fared more poorly as teenagers and then as adults on a range of psychological, social, and other measures. For an updated summary of the most ambitious of those investigations, see Lawrence J. Schweinhart, The High/Scope Perry Preschool Study Through Age 40 (Ypsilanti, MI: HighScope, n.d.). A more recent longitudinal study didn’t find a negative effect from teacher-directed instruction but confirmed that “child-initiated instruction in preschool is a robust predictor of adulthood well-being” (Jasmine R. Ernst and Arthur J. Reynolds, “Preschool Instructional Approaches and Age 35 Health and Well-Being,” Preventive Medicine Reports 23 [2021]).
Now we have reached peak parody: explicit instruction, it seems, does not merely endanger a child’s curiosity. It lurks in the background of their future divorce, midlife health issues, and general psychological ruin. Apparently, telling four-year-olds how to hold a pencil triggers a slow-burn moral collapse that only fully manifests around age 35. One half expects the next study to link phonics lessons to receding hairlines and a mild distrust of labradors.
The claim here is not just implausible but ludicrous. A handful of studies from the 1960s and 70s are dragged out as evidence that teacher-led preschool causes long-term damage to the soul. Not less enjoyment of story time. Not a slight dip in standardised test scores. We are talking life outcomes: mental health and social function. Presumably because someone once said “This is the letter B” instead of letting the child feel their way towards it using finger paint and ambient jazz.
Let’s take the High/Scope Perry Preschool Study. It is indeed ambitious, and the people behind it were dedicated, serious researchers. But it was a tiny, highly localised programme in Ypsilanti, Michigan, run more than half a century ago with enormous wraparound services and a heavy emphasis on parental involvement. To treat it as a clean test of pedagogical method is absurd. The fact that its graduates fared better as adults tells us something about long-term investment in early education. It tells us nothing about the relative consequences of a more structured instructional approach.
As for the Ernst and Reynolds study, the quoted line is oddly evasive. It says child-initiated instruction is a predictor of well-being but carefully avoids stating that teacher-led instruction causes harm. And of course it avoids it, because no one wants to make that claim too directly for fear of sounding like a Victorian pamphlet warning about the soul-corroding effects of industrial chimney sweeps. Instead, we get the whisper: perhaps if we had let little Kai explore the concept of quantity through pinecones and interpretive dance, he wouldn’t now be struggling with existential dread and a high cholesterol count. This is melodrama in peer-reviewed form. The idea that giving young children clear, structured instruction leads to emotional or psychological impairment later in life is not only unsupported, it is offensive. It caricatures teaching as dominance and learning as submission. It treats guidance as oppression and knowledge as a pollutant. And it ignores entirely the very real damage caused by leaving children to guess at meaning, stumble through half-formed concepts, and fail to build the secure foundations they need to thrive.
If explicit instruction is the villain of this tale, it is a curious one: calm, consistent, informative, and entirely lacking in the wild-eyed zealotry of its critics. It tells children what they need to know. It shows them how to do things. It corrects them gently and builds them up slowly. If that is harmful, then we may as well conclude that shoes inhibit foot freedom and maps spoil the fun of getting lost.
12. Unfortunately, [all the research Kohn has cited] hasn’t prevented traditionalists from defiantly doubling down. For example, Richard E. Mayer, “Should There Be a Three-Strikes Rule against Pure Discovery Learning? The Case for Guided Methods of Instruction,” American Psychologist 59 (2004): 14–19; Paul A. Kirschner, John Sweller, and Richard E. Clark, “Why Minimal Guidance During Instruction Does Not Work,” Educational Psychologist 41 (2006): 75-86 (and virtually everything else written by those three authors); and Lin Zhang et al., “There Is an Evidence Crisis in Science Educational Policy,” Educational Psychology Review 34 (2022): 1157-76. (Strong rebuttals to the latter two articles were subsequently published in the same journals. See especially de Jong et al., op. cit.)
This complaint rests on the idea that, despite what Kohn presents as a substantial body of research in favour of student-centred learning, traditionalists continue to resist. Richard Mayer, Paul Kirschner, John Sweller, and Richard Clark are singled out as exemplars of this resistance, their work portrayed as a kind of defiant retrenchment. Yet this characterisation overlooks the fact that these scholars are not, in fact, “doubling down” in the face of overwhelming contradictory evidence; they are continuing to articulate a position that remains strongly supported by research in cognitive science.
Mayer’s 2004 paper is often cited not because it is controversial, but because it captures a clear, evidence-informed conclusion: pure discovery learning, especially for novices, is generally less effective and more time-consuming than guided methods of instruction. Kirschner, Sweller, and Clark build on this by drawing attention to how human working memory functions and why this matters for learning. Their argument is not that all forms of inquiry are ineffective, but that minimal guidance is unlikely to work well for most students most of the time. This is not ideology. It is grounded in decades of empirical work.
Zhang et al.’s 2022 paper extends this concern into the realm of science education policy. It does not dismiss all inquiry-based approaches, but it does raise serious questions about how evidence is selected, interpreted, and applied. The call for more rigorous, transparent standards in educational research is not an attack on innovation. It is a plea for clarity.
It is true that scholars like de Jong have published responses defending more structured versions of inquiry. These are often presented as balanced or integrative, but in reality they are underpinned by a strong bias in favour of constructivist methods. While such responses acknowledge some of the findings from cognitive science, they frequently minimise or sidestep the implications. Rather than engaging directly with the challenge posed by cognitive load theory or the expertise reversal effect, they tend to reframe the debate in terms that continue to privilege inquiry as the default.
What is typically proposed is a hybrid model: some explicit instruction at the outset, followed by student-led exploration. But this structure, though rhetorically framed as a compromise, usually places far more emphasis on discovery than the evidence warrants. The knowledge that is front-loaded is often too thin or superficial to support the complex tasks that follow. The result is a method that appears reasonable but still demands that novices engage in higher-order thinking before they have secured the foundational knowledge such thinking depends on.
This is not genuine synthesis. It is a repositioning of inquiry in slightly more palatable terms. And it does not escape the original problem. Without well-sequenced, carefully modelled instruction, many students - particularly those without strong prior knowledge - struggle to construct the very understanding that inquiry is supposed to promote. Rebranding the approach does not resolve its core limitations. It simply makes them harder to spot.
What Kohn interprets as stubbornness is in fact a consistent application of the principle that learning depends on prior knowledge, that novices need structure, and that clear teaching remains one of the most reliable tools we have. To present this position as reactionary is not only deeply hypocritical; it is to misread both the tone and substance of the debate.
13. As one group of researchers put it, “Studies favoring direct instruction tend to be small-scale, use limited measures and time horizons, [and rely on] ‘skill acquisition’ or simple concepts as the learning goals…” Schwartz et al., op. cit., p. 58. The same point is made by a pair of researchers who conducted a metaanalysis on just this question: Ard W. Lazonder and Ruth Harmsen, “Meta-Analysis of Inquiry-Based Learning: Effects of Guidance,” Review of Educational Research 86 (2016), especially pp. 684, 704, 706.
The quotation from Schwartz et al. suggests that studies favouring explicit instruction are biased towards simple outcomes and lack the scope to capture deeper learning. Yet this overlooks a basic point: most classroom teaching is concerned with building foundational skills. You cannot leap to sophisticated reasoning or complex problem-solving without first securing the basics. And those basics - whether decoding, number facts, or sentence structure - are not trivial. They are the enabling knowledge that makes later insight possible.
It also ignores the vast scale and significance of Project Follow Through, the largest and most expensive educational study ever conducted. Of all the models tested, only Direct Instruction consistently improved both basic skills and higher-order outcomes, including self-esteem and problem-solving ability. No other approach came close. For critics of explicit instruction to pretend this landmark study does not exist, or to dismiss it on the basis that it focused on so-called simple skills, is to engage in wilful amnesia.
The Lazonder and Harmsen meta-analysis is often cited in support of inquiry-based approaches, but its conclusions are more cautious than they are often made to sound. It finds that inquiry can be effective when it is accompanied by substantial guidance. In other words, it works best when it stops being minimal. The strongest results come from approaches that provide worked examples, feedback, and direct support at key stages. These are not incidental features. They are precisely the features championed by advocates of fully guided instruction.
To argue that studies of explicit instruction are inherently limited because they focus on measurable outcomes is to misunderstand the role of measurement in educational research. Of course short-term studies cannot capture the whole arc of learning. But they can tell us whether a particular method helps students learn something clearly and efficiently. And in many cases, the things they are learning - how to read, how to solve equations, how to structure an argument - are not only measurable, but essential.
The idea that these studies are only concerned with “skill acquisition” smuggles in a false dichotomy between skills and understanding. In practice, the two are entwined. You understand something better when you can use it fluently. You can apply knowledge flexibly only when it has been well secured. And the methods that best support that process are rarely discovery-driven.
What the critics often want is not better research, but different outcomes. They value creativity, self-direction, collaboration, and deep conceptual understanding. These are worthy aims. But they are not incompatible with explicit instruction. Nor are they reliable indicators of success in the absence of secure knowledge. The complaint here is not really about the size or scope of the studies. It is about a discomfort with what they consistently show. And what they show is that for novices, clear, structured, fully guided teaching remains the surest route to long-term success.
14. DI’s defenders triumphantly cited a 2004 experiment in which science students who received “an extreme type of direct instruction in which the goals, the materials, the examples, the explanations, and the pace of instruction [were] all teacher controlled” did better on a test than their classmates who designed their own procedures. But three years later, another pair of researchers returned to the same question in the same discipline with students of the same age. This time, though, they measured the effects after six months instead of only a week and they used a more sophisticated assessment of learning. It turned out that any advantage produced by DI quickly evaporated. And on one of the outcome measures, exploration ultimately proved to be not only more impressive than DI but also more impressive than a combination of the two — further reason to believe that DI not only is less effective but can actually be counterproductive. The original study: David Klahr and Milena Nigam, “The Equivalence of Learning Paths in Early Science Instruction,” Psychological Science 15 (2004): 661-67. The follow-up study: David Dean, Jr. and Deanna Kuhn, “Direct Instruction vs. Discovery: The Long View,” Science Education 91 (2007): 384-97. DI proponents continue to cite the first study but, as far as I can tell, have never even acknowledged the existence of the second. For other examples of how they have repeatedly ignored “a massive number of controlled studies that have shown the benefits of inquiry-based instruction in comparison with direct instruction” — or, on other occasions, misrepresented research that contradicts their position — see de Jong et al., op. cit., pp. 3-4.
This is a common rhetorical move: cite an early study that appears to support explicit instruction, then point to a follow-up that supposedly overturns it, and finally accuse the defenders of never acknowledging the update. It is tidy, dramatic, and entirely misleading.
The 2004 Klahr and Nigam study is indeed frequently cited, and for good reason. It offered a rare example of a tightly controlled comparison between fully guided and minimally guided science instruction. The finding was clear: students who received explicit goals, explanations, and examples outperformed those who had to figure things out for themselves. Crucially, the study demonstrated that novices benefit from clarity, not open-ended exploration. That conclusion is entirely consistent with what cognitive science has repeatedly shown.
Now enter the 2007 follow-up by Dean and Kuhn. This study asked a different question, over a longer time period, using different outcome measures. Its authors claim that students who engaged in exploration performed better after six months than those who received explicit instruction. But the interpretation is not so straightforward. The measures used were not of factual recall or conceptual understanding, but of students’ ability to design and interpret scientific investigations. In other words, the assessment was designed to favour the skill set practised in the inquiry condition. That is not surprising. If you train one group to juggle and another to play the violin, and then test them on juggling six months later, the first group will do better. That does not mean juggling is a superior pedagogical method.
More to the point, the exploratory group in Dean and Kuhn’s study did not engage in pure discovery. They were supported, prompted, and encouraged to reflect on their processes. In other words, they received structured guidance, just not in the form of DI. This is an important distinction, and it is precisely what defenders of fully guided instruction have long recognised. The key variable is not whether the method has the label “inquiry” or “instruction” but whether it provides the right kind of support at the right time, matched to the learner’s level of expertise.
The suggestion that proponents of explicit instruction have “never even acknowledged the existence” of the Dean and Kuhn study is also unfair. The paper is well known, and where it has been discussed, it is typically treated as one contribution among many, not a definitive rebuttal. Nor is it true that there is a “massive number” of controlled studies showing the superiority of inquiry over explicit instruction. Meta-analyses, such as those by Alfieri et al. and Lazonder and Harmsen, consistently find that inquiry with guidance can be effective, but they do not support the idea that guidance is unnecessary. If anything, they reinforce the case for structured teaching, especially for novices.
Finally, the claim that defenders of explicit instruction routinely misrepresent research is ironic, given the selective framing on display here. To cite a single follow-up study as proof that explicit instruction is not only less effective but potentially counterproductive, while ignoring the broader sweep of evidence in its favour, is not a worthy or reputable rebuttal.
15. The second problem with evidence said to favor DI reflects the way its proponents tend to structure what happens in the two teaching conditions they’re comparing. On the one hand, they’re apt to set up inquiry learning for failure by using a caricatured version of it, a kind of pure discovery rarely found in real-world classrooms, with teachers providing no guidance at all so that students are left to their own devices. On the other hand, the version of DI they test sometimes sneaks in a fair amount of active student involvement — to the point that the two conditions may just amount to “different forms of constructivist instruction.” Alyssa Friend Wise and Kevin O’Neill, “Beyond More Versus Less: A Reframing of the Debate on Instructional Guidance,” in Tobias and Duffy, eds., op. cit., p. 87. On this point, also see Michele T. H. Chi, “Active-Constructive-Interactive,” Topics in Cognitive Science 1 (2009), pp. 92-3; and Schwartz et al., op. cit., p. 49.
This line of attack is both tedious and breathtakingly hypocritical. The argument goes like this: studies supporting explicit instruction are rigged. Inquiry is set up to fail by reducing it to a straw man version of pure discovery, with no guidance, no scaffolding, and no meaningful structure. Meanwhile, the explicit condition quietly includes opportunities for discussion, questioning, and active engagement. In other words, the charge is that the comparison is unfair because both groups are, in fact, doing some form of constructivist learning.
But this objection misunderstands both the nature of empirical comparison and the substance of what explicit instruction actually involves. It is not a flaw that explicit instruction includes questioning, modelling, discussion, or student activity; that is what explicit instruction actually is. Fully guided instruction is not the absence of thinking. It is thinking structured around well-sequenced explanation and guided practice. If an instructional condition involves teacher modelling, scaffolding, formative feedback, and opportunities for students to work through problems with support, that is not constructivism in disguise.
The real caricature here is not of inquiry, but of explicit instruction. Critics frequently claim that its defenders create a straw man version of discovery learning, yet they themselves often present explicit instruction as a kind of script-following obedience drill. In reality, explicit instruction includes many forms of student activity, but these are purposeful, aligned, and guided. What distinguishes it from minimally guided approaches is not the presence or absence of activity, but the source and structure of that activity. In explicit instruction, the teacher bears responsibility for sequencing, modelling, and guiding thinking. In inquiry approaches, students are often expected to discover the logic themselves, a process that is inefficient at best and exclusionary at worst when prior knowledge is lacking.
The charge of unfair comparison also rings hollow when applied to defenders of explicit instruction. If there is a double standard, it is that critics like Kohn routinely do exactly what they accuse others of doing. As we’ve seen, Kohn selects studies that test highly structured, teacher-supported versions of inquiry — often with clear goals, ongoing feedback, and well-designed materials — and contrasts them with an unrecognisable version of explicit instruction in which the teacher dominates, the students are passive, and understanding is reduced to rote compliance.
The claim that comparisons are unfair because one condition uses an unrealistic version of inquiry deserves scrutiny. In fact, many studies that compare guided and unguided approaches do not set up inquiry to fail. They test real-world implementations or draw directly from classroom practice. Even when inquiry methods are reasonably well supported, they typically underperform relative to structured, teacher-led approaches when judged by accuracy, efficiency, or long-term retention, particularly for novice learners.
The Wise and O’Neill paper, along with Chi’s active-constructive-interactive framework, argues that different forms of engagement lead to different learning outcomes. Fair enough. But that framework reinforces the importance of support. Interactive learning, in Chi’s taxonomy, is most effective when students are responding to prompts, receiving feedback, and working with models. In other words, what Chi calls interactive learning looks remarkably like high-quality explicit instruction.
So yes, the line between active learning and explicit instruction can sometimes blur, but that is not a weakness in the research. It is a reflection of the fact that well-designed instruction is never passive, regardless of the label. What matters is who provides the structure, how it is sequenced, and whether students are supported at the point of need. On that front, the evidence continues to show that fully guided instruction remains the most reliable path to success, particularly for those who need it most.
16. A mostly student-centered approach really does make more sense most of the time. While some researchers talk about looking for ways to combine inquiry and direct instruction (for example, de Jong et al., op. cit.), it’s important to keep in mind that just because the latter can be used in certain circumstances — notably, if a teacher’s goal is to transmit facts rather than to help students learn how to think — that doesn’t prove that it needs to be used or that it’s more effective than student-centered learning (when the latter is accompanied by adequate guidance). Moreover, while a recommendation to find a place for both may appeal to us as a reasonable compromise, it deflects attention from what remains a fundamental divergence in one’s point of departure: Is learning understood mostly as memorizing facts and practicing skills to produce right answers, or as constructing meaning and understanding ideas? Are students primarily seen as passive receptacles or active meaning makers?
Similarly, even though few schools exemplify the pure version at one pole or the other, the vast majority tend to privilege teacher talk over student talk, impose a curriculum that students have little opportunity to help design, and continue to rely on instruments of traditional pedagogy such as lectures, worksheets, quizzes, textbooks, and practice homework. (For evidence of the continued traditionalism of U.S. schools, if anyone really requires it, see my The Schools Our Children Deserve, op. cit., chap. 1; and Robert Pianta et al., “Opportunities to Learn in America’s Elementary Classrooms,” Science 315 [March 30, 2007]: 1795-96.)
Nor should we lose sight of how radically this residual traditionalism diverges from what evidence shows are usually more advantageous strategies. Elsewhere, I — and, of course, many other authors — have discussed how teachers can provide guidance, model problem-solving, elicit students’ questions about the world and create a curriculum with them (rather than just for them), helping students to acquire intellectual proficiencies and to think in increasingly sophisticated ways. Teachers can provide limited direct instruction when necessary, but the bottom line is that their role should not consist chiefly of dispensing information. High-quality teaching is usually more facilitative than directive and more implicit than explicit.
The suggestion that explicit instruction is only appropriate when the goal is to “transmit facts” reflects a profound misunderstanding of how learning works. The implication throughout is that explicit instruction is appropriate only for the unambitious, the unimaginative, or the educationally regressive. The deeper assumption is that the only thing standing between students and intellectual flourishing is the stubborn traditionalism of their teachers. But this framing misrepresents both the nature of explicit instruction and the evidence about how students learn.
The idea that explicit instruction is merely about transmitting facts reflects a caricature, not a serious engagement with how the approach actually works. Fully guided instruction involves modelling thought processes, breaking down complex ideas into teachable steps, checking for understanding, and giving students structured opportunities to practise and consolidate their learning. It is not antithetical to thinking. It is the groundwork that makes thinking possible. You cannot help students become sophisticated thinkers unless you first teach them the knowledge that sophisticated thought depends on.
Nor is the dichotomy between teacher talk and student talk especially helpful. What matters is not who is speaking, but whether what is being said contributes meaningfully to learning. In a well-run classroom, explicit instruction always includes dialogue, questioning, elaboration, and discussion, all carefully designed and facilitated by the teacher. If most classrooms still lean towards teacher-led methods, perhaps that reflects not outdated pedagogy, but a recognition of what works for the students in front of them. The idea that this represents a failure to modernise assumes what it sets out to prove.
The appeal to “evidence” that supposedly confirms the superiority of student-centred approaches is likewise overconfident. Much of the cited research conflates different forms of guidance, or compares highly designed, resource-rich examples of inquiry with implausibly reductive forms of teacher-led instruction. It is not that inquiry never works. It is that its success depends heavily on prior knowledge, expert teaching, and careful planning. In practice, most students - especially those with weaker starting points - benefit more from well-sequenced, explicit instruction than from loosely structured exploration.
To say that good teaching should be “more facilitative than directive and more implicit than explicit” is to confuse tone with method. A calm, questioning, open-minded atmosphere is desirable. But it does not follow that instruction itself should be implicit. Quite the opposite. The clearer the explanation, the more secure the understanding. When teachers are reduced to facilitators of student discovery, too many students are left to grope in the dark, some successfully, others not at all.
This is not to argue for dogma. High-quality teaching includes a range of approaches. But to suggest that explicit instruction is merely about facts, that students are passive when not designing their own curriculum, or that traditional methods are inherently inferior, is to rely more on slogan than on substance. The real question is not whether instruction is explicit or implicit, but whether it is effective. And on that front, the evidence points in one direction more clearly than some would like to admit.
17. [How can ‘traditionalists’] get away with saying DI is “evidence-based” or supported by the “science of learning”? “Science” is also misleadingly invoked in support of a reductive phonics-centered method of teaching reading. For more on that issue, see these lengthy excerpts from Kohn, op. cit., and the following more recent rebuttals by experts to the so-called “science of reading” campaign: Robert J. Tierney and P. David Pearson, Fact-Checking the Science of Reading (Literacy Research Commons, 2024); David Reinking et al., “Legislating Phonics: Settled Science or Political Polemics?”, Teachers College Record 125 (2023): 104-31; Peter Johnston and Donna Scanlon, “An Examination of Dyslexia Research and Instruction with Policy Implications,” Literacy Research: Theory, Method, and Practice 70 (2021): 107-28; Jeffrey S. Bowers, “Reconsidering the Evidence That Systematic Phonics Is More Effective Than Alternative Methods of Reading Instruction,” Educational Psychology Review 32 (2020): 681-705; Stephen Krashen, “Beginning Reading,” Language Magazine, April 2019; Dominic Wyse and Alice Bradbury, “Reading Wars or Reading Reconciliation?”, Review of Education 10 (2022): e3314; and Catherine Compton-Lilly et al., “Stories Grounded in Decades of Research: What We Truly Know About the Teaching of Reading,” The Reading Teacher 77 (2023): 392-400. Also see a series of three essays by literacy expert Maren Aukerman on the media’s coverage of reading instruction, all titled “The Science of Reading and the Media” and subtitled, respectively, “Is Reporting Biased?”, “Does the Media Draw on High-Quality Research?”, and “How Do Current Reporting Patterns Cause Damage?” Literacy Research Association Critical Conversations, 2022.
This line of questioning - how can traditionalists get away with claiming that explicit instruction or phonics-led reading is “evidence-based” - is not so much a challenge to the science as a refusal to accept what that science, on the whole, actually says. The implication is that talk of “the science of learning” or “the science of reading” is a public relations campaign for reactionary pedagogy, a political manoeuvre cloaked in lab coats.
Let’s take that claim seriously. Is there a reductive version of the science of reading? Yes. Are there cheerleaders who overstate or oversimplify? Absolutely. But none of that alters the core fact that the most robust and replicated findings in cognitive science do support explicit instruction in general and systematic phonics in particular, especially for novice readers and those at risk of falling behind.
The list of counter-citations offered here, drawn from a familiar circle of critics, does little to shift the weight of that evidence. Bowers, for instance, is widely cited for questioning the phonics consensus, but his work has been repeatedly and carefully rebutted by reading researchers who have pointed out serious methodological flaws and selective interpretations. Tierney and Pearson, respected figures in literacy, challenge what they see as a narrowing of the reading curriculum, but they do not provide strong evidence that systematic phonics is ineffective, only that it should not be all there is. On that point, virtually no one disagrees.
What Kohn and his preferred sources consistently overlook is the distinction between sufficiency and necessity. Phonics is not sufficient for reading comprehension, but it is necessary. Children need to learn how the alphabetic code works before they can begin to access and engage with meaning. Without that, fluency stalls, motivation declines, and the rich world of texts remains locked behind a door they cannot yet open.
The appeal to “constructivist” alternatives, often based on immersion in rich literature, assumes that children will intuit the structure of written language from exposure alone. This may work for some. It fails many others. And it is those children - those least likely to arrive at school already immersed in books, talk, and print - who pay the price when instruction prioritises philosophy over effectiveness.
As for the broader charge that traditionalists misuse the “science of learning,” it is worth remembering that the core findings in this field are not controversial. Working memory is limited. Prior knowledge enables comprehension. Spaced retrieval supports long-term retention. Novices benefit from fully guided instruction. These are not ideological preferences. They are stable results from decades of research in cognitive psychology. To pretend otherwise is to trade clarity for convenience.
Ultimately, the issue is not whether the label “science” is sometimes used simplistically. It is. The issue is whether the best available evidence supports structured, explicit teaching as the most reliable foundation for learning. It does. No amount of rhetorical doubt or selective citation will change that. The real question is whether we are prepared to act on what we know, not just for the confident and curious, but for the struggling and unseen.
18. “there are no standard, reliable, and valid measures for the main constructs of the theory….Without a measure of cognitive capacity, the predictions of CLT cannot be tested.” The first quote is from Ton de Jong, “Cognitive Load Theory, Educational Research, and Instructional Design,” Instructional Science 38 (2010), p. 114. (That article offers a useful review of the technical literature about CLT more generally.) The second quote is from Roxana Moreno, “Cognitive Load Theory,” Instructional Science 38 (2010), pp. 136, 137. The fundamental conjecture of CLT therefore must rely on “research in related domains”; it has never been tested by comparing constructivist teaching and direct instruction to see if the latter actually reduces cognitive load (Sigmund Tobias, “An Eclectic Appraisal of the Success or Failure of Constructivist Instruction,” in Tobias and Duffy, eds., op. cit., p. 340).
This critique seeks to undermine Cognitive Load Theory (CLT) by pointing to what appear to be serious empirical gaps: the absence of standard, reliable measures for core constructs, the need to borrow evidence from related domains, and the claim that CLT’s predictions have not been directly tested by comparing constructivist and explicit instruction in terms of actual cognitive load. It sounds damning only if you misunderstand how theories in psychology actually work.
CLT is not a closed, test-every-variable lab theory. It is a design theory: a framework for making sense of why some forms of instruction succeed and others do not, particularly for novices. Its strength lies not in a single definitive experiment, but in its explanatory power across hundreds of studies and learning contexts. The absence of a precise, universally accepted metric for “cognitive capacity” is hardly fatal. Most meaningful constructs in education - from motivation to engagement to understanding - are inferred, not directly measured. What matters is whether the predictions made by CLT consistently align with observed effects. And they do.
The core insight of CLT is straightforward and well-supported: working memory is limited, and instructional methods that overload it tend to impair learning. This idea is not controversial. It builds on foundational research in cognitive psychology from Miller, Sweller, Baddeley, and many others. The theory predicts that learners benefit when unnecessary cognitive load is reduced and that explicit, sequenced instruction achieves this more effectively than unguided exploration. That prediction has been borne out in dozens of domains, particularly when working with novices.
De Jong’s criticism that CLT lacks standardised measures is not a refutation but a prompt for further research. Moreno’s reminder that CLT relies on related research is equally unremarkable. All useful theories build from existing work. Tobias’s claim that CLT has not been directly tested by comparing instructional methods ignores the evidence. In fact, many studies have done just that, comparing minimally guided and fully guided approaches using behavioural proxies for cognitive load: accuracy, learning time, retention, transfer. These studies consistently show that learners using well-structured, explicit instruction perform better, especially on complex or unfamiliar tasks.
None of this makes CLT untouchable. Like any theory, it evolves. Measures are improving. Operational definitions are tightening. But to imply that its lack of a single psychometric tool invalidates its conclusions is like arguing that we cannot study gravity until we build a perfect scale. In education, we do not need perfect measures to make informed decisions. We need theories that explain how people learn and guide us toward better teaching. CLT does both.
19. CLT “is constructed in such a way that it is hard or even impossible to falsify.” de Jong, op. cit., p. 125. Also see Wolfgang Schnotz and Christian Kürschner, “A Reconsideration of Cognitive Load Theory,” Educational Psychology Review 19 (2007): 469-508; and Guy Claxton, “Cognitive Load Theory: Just Brain Gym for Traditionalists?“, blog post, August 15, 2022.
The claim that CLT is unfalsifiable deserves serious scrutiny. There is a legitimate conversation to be had about the testability of complex design theories like CLT. Its constructs - intrinsic load, extraneous load, germane load, element interactivity - are not always easy to isolate or measure with precision. This does not, however, make the theory unfalsifiable; it makes it difficult to reduce to a single lab condition. This is a challenge in almost all areas of applied cognitive psychology, from memory research to motivation. What matters is whether the theory generates predictions that are consistently supported across diverse contexts. In CLT’s case, it does. Novices benefit from guidance. Overloading working memory hinders learning. Instructional design matters. These are not vague truisms. They are replicable insights that explain why some approaches consistently outperform others.
Schnotz and Kürschner raise some thoughtful questions about how CLT models learning processes, and their work is worth reading. But using Guy Claxton as support for your argument is a sign you don’t wish to be taken seriously. Claxton’s blog post, “Cognitive Load Theory: Just Brain Gym for Traditionalists?” is poorly thought out performance art. As is typical of his oeuvre, he begins with mockery and ends with insult, scattering a few misunderstood quotations along the way. The argument, such as it is, proceeds by insinuation: CLT is popular among people who like traditional teaching, therefore it must be suspect. He offers no serious engagement with the empirical literature, no understanding of the theory’s architecture, and no alternative framework with anything close to equivalent explanatory power. What he delivers is the intellectual equivalent of standing outside a laboratory with a placard that says “I’m bored.”
Comparing CLT to Brain Gym is not merely unserious; it is dishonest. Brain Gym was pseudoscientific nonsense, rightly discredited. CLT is the product of decades of peer-reviewed research, grounded in established cognitive principles, and used to design more effective instruction for real students in real classrooms. The fact that it supports explicit teaching - and therefore threatens the aesthetic preferences of progressive mouthpieces like Claxton - is not a flaw in the theory. It is a measure of its utility.
This sort of criticism, dressed up in scorn and unburdened by evidence, says more about the anxieties of its author than the reliability of the theory he attacks. If Claxton wants to debate pedagogy, he is welcome to do so. But he might begin by reading the literature he claims to critique, and by offering a theory of learning that does something more than flatter the adult’s desire to feel less like a teacher and more like a facilitator of vibes.
20. “dissatisfaction about the explanatory and predictive value of CLT continues to grow among the scientific community” Moreno, op. cit., p. 139.
Moreno’s claim sounds damning, but we should ask who, exactly, is dissatisfied, and with what? As with many criticisms of CLT, the complaint here is not that the theory is wrong, but that it lacks precision. Its components can be difficult to isolate. Its predictions may depend on context. Its terminology has evolved over time. In other words, it shares the same challenges as every theory in applied educational psychology.
None of this amounts to a collapse in confidence. CLT continues to underpin high-impact instructional design, from worked example research to multimedia learning. It remains central to the work of researchers like Sweller, van Merriënboer, Mayer, and Paas. Its insights are widely used in teacher training, curriculum design, and digital learning platforms. If this is what growing dissatisfaction looks like, it is remarkably productive. More to the point, many of those voicing “dissatisfaction” are not rejecting the theory’s core claims. They are calling for refinements: tighter definitions, better operationalisation, clearer boundaries between types of load. These are the signs of a maturing theory, not a failing one. Scientists disagreeing about how best to apply a theory is not evidence that the theory is broken. It is evidence that it is being taken seriously.
What some critics really object to is not the explanatory value of CLT, but the conclusions it supports. Because CLT consistently highlights the importance of fully guided instruction, especially for novices, it cuts against the grain of romanticised constructivism. For those invested in discovery learning or student-led inquiry, this is deeply uncomfortable. The dissatisfaction, in many cases, is ideological rather than scientific.
In the end, the test of an educational theory is not whether it pleases everyone, but whether it helps us understand how people learn and how teaching can be improved. CLT does both. It may not be perfect, but it remains one of the most powerful and practically useful frameworks we have.
21. As cognitive scientist Guy Claxton put it, “CLT is just a fad; it is, as someone said, like Brain Gym for Traditionalists. The sooner the fad passes the better.” Claxton, op. cit.
That a supposedly serious academic could repeat such a glib line without irony tells you everything you need to know about the intellectual rigour of both Claxton’s and Kohn’s argument: all posture, no substance, and not a trace of curiosity about why CLT continues to be taken seriously by those who actually study how learning works.
22. “CLT offers no recognition of the learner as an autonomous agent.” Peter Ellerton, “On Critical Thinking and Content Knowledge,” Thinking Skills and Creativity 43 (2022), p. 10. Also see Moreno, op. cit.
Perhaps not. CLT is a theory of cognitive architecture, not a manifesto for personal development. Demanding it account for agency is like criticising a blueprint for not being a poem.
23. Many years ago, a group of researchers tried to sort out the factors that helped children to remember what they’d been reading. They found that how interested the students were in the passage was thirty times more important than how “readable” the passage was. Richard C. Anderson et al., “Interestingness of Children’s Reading Material,” in Richard E. Snow and Marshall J. Farr, eds., Aptitude, Learning, and Instruction, vol. 3: Conative and Affective Process Analyses (Erlbaum, 1987), p. 288. Many other studies have reached the same general conclusion about the disproportionate impact of interest. In one experiment, fourth graders’ comprehension turned out to be so much higher when the passages they were assigned dealt with topics that interested them — suddenly, the kids were testing well above their supposed reading level — that the researchers ended their report by asking why teachers and researchers tend to be so “concern[ed] with difficulty when interest is so obviously a factor in comprehension.” See Thomas H. Estes and Joseph L. Vaughan, Jr., “Reading Interest and Comprehension: Implications,” The Reading Teacher 27 (1973): 149-52; quotation appears on p. 152. Today, the same persistent inattention to motivation is one of many problems with what is imposed on classrooms in the name of the “science of reading.” See Seth A. Parsons and Joy Dangora Erickson, “Where Is Motivation in the Science of Reading?“, Phi Delta Kappan, February 2024: 32-36; and Nancy Bailey, “The Comprehension Problem with New Reading Programs,” blog post, June 24, 2024.
This line of argument trades on a kind of motivational romanticism: the idea that interest alone can lift comprehension, transcend decoding difficulty, and reveal the absurdity of fixed reading levels. The studies cited do offer useful insights, but they are frequently misinterpreted as evidence that interest overrides the need for foundational skills. That is not what they show.
Yes, of course interest matters, but it amplifies comprehension only when the basic mechanics of reading are in place. A child cannot comprehend a passage they can’t decode. Inflated comprehension scores on high-interest topics often reflect the activation of prior knowledge, not some mystical power of engagement. The child doesn’t suddenly leap above their “reading level” because they’re more interested — they’re drawing on what they already know, which makes the text easier to process.
Moreover, the claim that motivation is absent from the “science of reading” is a straw man. Serious reading researchers don’t deny the importance of engagement; they simply refuse to treat it as a substitute for instruction. Parsons and Erickson’s piece - while right to highlight blind spots - offers more polemic than substance. Bailey’s blog is anecdotal. The evidence base for systematic phonics, on the other hand, is deep, international, and growing.
To suggest that phonics-focused reforms are failing because they ignore interest is to confuse two distinct stages of reading development. First, children must become fluent decoders. Only then can interest act as a multiplier. Reversing that order - prioritising engagement before accuracy - is exactly what left so many children behind under the old whole language model.
In short, interest is not a magic key that unlocks comprehension for struggling readers. It is a catalyst, but only when the engine is already running. If the “science of reading” sometimes underemphasises motivation, its critics often do the opposite, overstating its role and underplaying the brutal reality that children who can’t decode cannot read.
24. Good luck finding any discussion of students’ motivation, emotions, beliefs, or agency — any acknowledgment that they “decide whether they do or do not engage and [the cognitive] resources they will invest” Schnotz and Kürschner, op. cit., p. 497.
The complaint that CLT neglects motivation, emotion, or agency badly misunderstands what CLT is for. It is not a grand unified theory of human psychology. It is a design theory, concerned with how instructional materials interact with cognitive architecture. To fault it for not modelling belief or desire is like faulting a scalpel for not performing heart surgery on its own.
CLT doesn’t ignore motivation because it thinks it doesn’t matter. It simply brackets it off, deliberately and appropriately, in order to focus on the constraints of working memory and long-term schema formation. That is its job. Other theories handle the rest. Expecting CLT to account for emotional states is like expecting load-bearing calculations to account for paint colour. Different tools for different problems.
More to the point, much of the research that draws on CLT does in fact take engagement into account, not by theorising about it in the abstract, but by measuring what works in practice. When instruction is clearly sequenced, well-modelled, and structured to reduce unnecessary cognitive effort, students are more likely to engage. Not always because they are intrinsically motivated, but because the task is manageable. Clarity breeds confidence, and confidence sustains effort.
This is the piece Kohn habitually gets backwards. He wants motivation to precede success. But in reality - and this is backed by a substantial body of research in educational psychology - motivation is often a result of success, not its precursor. Students become motivated when they feel competent. That sense of competence arises when tasks are structured so they can actually succeed at them. In other words, effective instruction isn’t a reward for motivation. It is the condition that makes motivation possible.
25. CLT flattens learning and problem-solving into a mechanical process of taking in information, holding it briefly in short-term memory, and then storing it permanently in long-term memory, with the last step described, remarkably, as “the ultimate justification for instruction.” Kirschner et al., op. cit., p. 77.
If anything flattens learning, it is this caricature of CLT. Yes, Kirschner et al. describe schema acquisition as the goal of instruction. And yes, they state that storing information in long-term memory is the “ultimate justification.” But this is not reductive unless you already assume that memory is somehow less important than, or separate from, ‘real’ learning.
What is stored in long-term memory is not inert fact. It is the very stuff of thought — concepts, procedures, strategies, patterns — and it is what makes problem-solving possible. Without it, you are not solving problems, you are floundering. Far from flattening learning, CLT explains how deep understanding is built: not by magic, but through careful, cumulative construction of knowledge over time.
To suggest otherwise is to romanticise the surface features of inquiry while ignoring the cognitive mechanisms that make it work. Learning cannot be all improvisation and insight. It must rest on something. CLT simply identifies what that “something” is and shows how to build it.
26. [CLT] “is based on a vastly oversimplified and antiquated notion of ‘working memory’ that was current in psychology in the 1970s” Claxton, op. cit. He continues: “The computer metaphor on which the original concept of WM was based is no longer widely accepted as an accurate or adequate depiction of human cognition. Brain-based theories, in which there are no separate memory ‘stores’ – no boxes in the head – underpin much current research, and they do not lead to or justify anything like John Sweller’s image of Cognitive Load.”
The claim that CLT rests on a “vastly oversimplified and antiquated” model of working memory might sound damning if it were remotely true. But the Baddeley and Hitch model that underpins CLT is not some fossil from the 1970s. It’s a foundational framework in cognitive psychology, still widely cited, tested, and refined.
Claxton’s dismissal of the “computer metaphor” and “boxes in the head” is pure straw man. No serious theorist believes the brain is literally a filing cabinet. Metaphors are explanatory tools, not ontological claims. The fact remains: human cognition is constrained by limited working memory, and long-term learning depends on what is stored and retrievable. That’s not a metaphor. That’s a well-established empirical fact.
And what does Claxton offer in return? Nothing. No theory, no evidence, just vague gestures toward “brain-based” theories that apparently justify abandoning the most robust instructional design framework we have. It’s intellectual sleight of hand — empty insinuation in place of actual critique. That Kohn finds this persuasive tells us more about his biases than about CLT.
27. [CLT] fails to justify the conclusion that learners can’t handle anything other than formal, explicit instruction. For more on CLT’s simplified view of cognition, see David Jonassen, “Reconciling a Human Cognitive Architecture,” in Tobias and Duffy, eds., op. cit. On the existence of diverse types of load, and the fact that some learning doesn’t require additional working memory, see Schnotz and Kürschner, op. cit., especially pp. 485 and 502; and Sue Gerrard, “Direct Instruction: The Evidence,” blog post, April 23, 2014; and “How Working Memory Works,” blog post, March 16, 2014. Gerrard explains why the model of working memory used by Sweller and his colleagues “appears to be oversimplified and doesn’t take into account the biological mechanisms involved in learning.” Their model is based on straightforward activities like solving algebra problems in which “new items coming into the buffer displace older items, so buffer capacity would be a limiting factor. But real-world problems tend to involve different buffers, so items in the buffers can be easily maintained while they are manipulated by the central executive…..Discovery, problem-based, experiential, and inquiry-based teaching in classrooms tends to more closely resemble real world situations than the single-buffer problems” on which CLT is based.
The claim that CLT “fails to justify the conclusion that learners can’t handle anything other than formal, explicit instruction” is a classic case of tilting at windmills. CLT does not insist that learners must only ever be taught through direct explanation; in fact it suggests the precise opposite for expert learners. What it argues is that unguided or minimally guided methods are less effective for novices, because they place a heavier burden on working memory.
Jonassen and Schnotz raise reasonable points about the complexity of cognition. Of course learning involves different types of load, of course some tasks feel more intuitive, and yes, as expertise grows, the limits of working memory become less restrictive. None of this contradicts CLT. It simply refines our understanding of when and how guidance is most helpful. The more complex and open-ended the task, the more carefully instruction must be structured to support novices without overwhelming them.
Sue Gerrard’s hobbyist critique attempts to expose what she sees as an oversimplified model of working memory. Her point about different buffers is well taken in a technical sense, but it misses the point pedagogically. CLT is not a theory of brain architecture. It is a practical framework for instructional design. To complain that it does not fully reflect neural complexity is to misunderstand its purpose. Teachers do not need a neuroscientific model of synaptic activity; they need a theory that helps them decide how much information to present and when.
Gerrard is also wrong to suggest that Sweller’s research only applies to narrow academic tasks like solving algebra problems. In fact, CLT has been tested in a wide range of contexts, including complex and realistic learning environments. The idea that real-world problem-solving is immune to working memory limits is simply untrue. If anything, the demands are greater, which makes the case for thoughtful guidance even stronger.
Kohn’s tactic here increasingly feels designed to bury the reader in citations that appear to complicate the story, while avoiding the simple truth at the heart of CLT: memory is limited, and instruction that ignores this is likely to fail. The theory is not a straitjacket. It is a safeguard against the wishful thinking that so often accompanies romanticised views of discovery learning.
28. When you’re taught the procedure, you come away with only a formula for solving similar problems. First quotation: Ellerton, op. cit., p. 8. Second quotation: Henk G. Schmidt et al., “Problem-Based Learning Is Compatible with Human Cognitive Architecture,” Educational Psychologist 42 (2007), p. 95. If we want to help students to think flexibly and be prepared for future learning, these authors add, then problem-based learning is a better bet than direct instruction.
The argument here hinges on a false dichotomy. Ellerton claims that when students are taught a procedure, all they acquire is a formula, as if procedural knowledge and conceptual understanding cannot coexist. But this is a caricature. High-quality explicit instruction does not simply hand students a recipe and walk away. It explains why the procedure works, models its logic, and offers opportunities to apply it in varied contexts. The idea that procedural teaching necessarily precludes flexible thinking is not supported by the evidence. In fact, much of the research in mathematics education shows that students who master procedures first are better able to develop conceptual insight later.
The Schmidt et al. claim, that problem-based learning better prepares students for future learning, depends entirely on context. Their 2007 paper was written as a rebuttal to earlier critiques of problem-based learning, but its conclusions are highly conditional. Even the authors acknowledge that novices benefit from guidance and that unguided exploration is only effective when learners already possess sufficient domain knowledge. In other words, problem-based learning may support future transfer after foundational knowledge has been acquired, not in place of it.
This is the crux of the matter. Flexibility and transfer do not emerge from thin air. They are built on a secure base of knowledge. Problem-based approaches may be useful for extending understanding, but they are inefficient and often ineffective for building it from scratch. To treat them as a replacement for explicit instruction is to misunderstand how learning progresses and to risk leaving many students behind.
29. “Conditions that maximize performance in the short term may not necessarily be the ones that maximize learning in the long term.” Manu Kapur and Nikol Rummel, “Productive Failure in Learning from Generation and Invention Activities,” Instructional Science 40 (2012), p. 645. Emphasis added to underscore the two separate distinctions being made here — between short term and long term, and between performance and learning. The latter distinction is critical to understanding the inadequacy of DI: Eliciting a correct answer may amount to what these authors call “unproductive success” rather than meaningful learning. Indeed, Kapur found in a pair of studies that students who had to wrestle with problems on their own before receiving instruction “significantly outperformed DI students on conceptual understanding and transfer without compromising procedural knowledge.” Did they expend greater mental effort? Yes. Did that hurt? No – if anything, it helped. See Manu Kapur, “Productive Failure in Learning Math,” Cognitive Science 38 (2014): 1008-22.
The quotation from Kapur and Rummel is often used to imply that explicit instruction prioritises short-term performance at the expense of long-term learning. It is an appealing line, and a misleading one. No serious advocate of explicit instruction confuses immediate performance with enduring understanding. On the contrary, the distinction between performance and learning has been central to cognitive psychology for decades, and is embedded in the very instructional principles that CLT and its allied models promote.
This same point is reinforced by Soderstrom and Bjork, who distinguish clearly between the conditions that support performance (what students can do right now) and those that support learning (what will stick and transfer over time). Their 2015 review lays out the danger of judging instruction by what is easily measured in the moment. Students who are able to produce the correct answer during a lesson may not retain or understand it later. But this does not mean that high initial performance is worthless, or that we should engineer failure as a pedagogical goal. It means we must be cautious in how we interpret short-term success and deliberate in how we structure learning so that it endures.
Kapur’s concept of “productive failure” is important and worth engaging with. It highlights the value of allowing students to struggle with problems before being shown solutions, and in some contexts, particularly with more advanced learners, this can lead to deeper conceptual understanding. But it does not follow that this finding undermines explicit instruction. First, in Kapur’s own studies, instruction still follows the struggle. The failure is not productive on its own; it becomes productive because it is eventually resolved through teaching. Second, these studies are carefully structured, with support systems in place and narrowly defined tasks. They are a far cry from the open-ended, minimally guided learning environments that many proponents of discovery favour.
Moreover, the fact that students exert greater cognitive effort during the struggle phase does not automatically make the experience better. As Soderstrom and Bjork point out, effortful processing can enhance retention, but only when that effort is focused in the right way. High effort is not inherently virtuous. It can aid memory when it is productive, but it can also overwhelm and demoralise when poorly timed or unsupported. The key question is not whether students are working hard, but whether that effort is directed, structured, and scaffolded.
The very idea of “productive failure” is not without its problems. It rests on a shaky optimism that failure will, through some alchemical process, lead to insight. But failure is only productive when it is framed, supported, and swiftly followed by clear teaching. Otherwise, it risks becoming merely confusing or discouraging. Worse, the concept sometimes functions as a seductive justification for instructional designs that delay explanation in the name of “discovery,” even when evidence shows that novices learn better from guidance first. Struggle is not a virtue in itself. It is only desirable when it prepares students to understand what follows. In too many classrooms, what is called productive failure amounts to aimless guessing followed by rushed clarification.
The phrase “unproductive success” is rhetorically clever but pedagogically dangerous. If students are giving correct answers and developing procedural fluency, that is not failure. It is the foundation on which conceptual understanding can be built. Productive failure, if mistimed or misused, can all too easily become just failure. The more reliable path remains well-sequenced, explicit teaching that builds knowledge systematically and provides just enough challenge to support growth without collapse.
30. “learning can be impeded…when too much help is provided.” Schnotz and Kürschner, op. cit., p. 484. Thus, presenting “worked examples” to students can be “beneficial for task performance but not for learning. In other words: Making a task easier does not necessarily result in better learning” (ibid., p. 493).
This is an important caveat, but one that’s often misapplied in criticisms of explicit instruction. Schnotz and Kürschner rightly observe that learning can be impeded when too much help is provided. They note that presenting worked examples may benefit immediate task performance without necessarily improving deeper learning. In their words, making a task easier does not automatically result in better understanding. This has been used by some to suggest that explicit instruction over-scaffolds and therefore suppresses thinking.
But this criticism fundamentally misunderstands how fully guided instruction - particularly as informed by CLT - is actually designed. Cognitive Load Theory explicitly incorporates the concept of fading: the gradual withdrawal of instructional support as competence increases. Early learning benefits from examples, modelling, and structured practice because novices lack the mental schemas to cope unaided. As fluency develops, support must taper to allow for increasing independence.
The goal of explicit instruction is not to make tasks permanently easier, but to reduce unnecessary complexity at the point of first contact. Initial success builds confidence and schema, which in turn make further complexity manageable. If instruction is never faded, that is not CLT’s fault. That is bad implementation. In this light, the Schnotz and Kürschner warning is not a strike against structured teaching but a reinforcement of what the best explicit instruction already aims to do: support students long enough to learn, then step back just in time for them to think.
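To make fading concrete, here is a minimal illustrative sequence of the sort used in worked-example research (the specific equations are invented for illustration rather than drawn from any study Kohn cites). A novice first studies a fully worked example: 2x + 6 = 10, so 2x = 4, so x = 2. Next comes a completion problem in which only the final step is left open: 3x + 5 = 20, so 3x = 15, so x = ? Finally, the student solves an unaided problem such as 4x − 2 = 10. At each stage, support is withdrawn only as the learner’s developing schema can take up the slack.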
31. “Increasing the cognitive load under certain circumstances can improve learning.” Walter Kintsch, “Learning and Constructivism,” in Tobias and Duffy, eds., op. cit., p. 229.
Walter Kintsch’s observation that “increasing the cognitive load under certain circumstances can improve learning” is both accurate and widely misinterpreted. It does not license making learning harder for its own sake, nor does it imply that cognitive effort always yields dividends. Rather, it points to the pedagogical importance of what Robert Bjork has termed desirable difficulties: conditions that induce struggle but result in success. These are tasks that may slow performance in the short term but enhance retention and transfer in the long term.
Far from contradicting Cognitive Load Theory, this principle fits squarely within it. CLT has always distinguished between different types of cognitive load. Extraneous load, caused by poor instructional design, must be reduced. Intrinsic load, which arises from the complexity of the material itself, must be managed. Germane load - the mental effort invested in processing, understanding, and building connections - should be encouraged. Desirable difficulties fall into this final category. They are challenging but tractable. They promote deeper engagement precisely because they are embedded within structured, meaningful learning.
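Put schematically (a simplifying sketch of my own, assuming the three load types combine additively and are bounded by a fixed working-memory capacity, which is how they are commonly treated in standard CLT expositions):

$$L_{\text{intrinsic}} + L_{\text{extraneous}} + L_{\text{germane}} \leq C_{\text{working memory}}$$

On this reading, reducing extraneous load and managing intrinsic load is precisely what frees capacity for the germane effort that desirable difficulties are meant to recruit.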
The idea that instruction should be effortful in order to be effective has gained traction, but effort alone is not enough. Struggle only becomes productive when it occurs within a framework that ensures eventual success. Absent that structure, cognitive difficulty simply becomes noise. Kintsch’s point is not that more load is better. It is that load, when aligned with the learner’s level of knowledge and supported by clear instructional goals, can drive learning forward.
Explicit instruction, when well designed, already incorporates this logic. It begins by reducing unnecessary difficulty to prevent overload, but then deliberately increases challenge as understanding grows.
32. “Premature efficiency is the enemy of subsequent new knowledge construction.” Schwartz et al., op. cit., p. 46.
A neat phrase, and one that invites exactly the kind of misuse Kohn puts it to. What Schwartz et al. mean is that when students are taught procedures too early, before they’ve had a chance to wrestle with the underlying ideas, they may apply those procedures mindlessly. The risk is that fluency with a method masks a shallow grasp of its rationale.
But again, this is not an argument against explicit instruction. It is an argument against bad instruction, the sort that drills answers without modelling thought, or prizes speed over understanding. The best explicit instruction is deliberately sequenced to build conceptual knowledge before procedural fluency, or at least alongside it. Efficiency is not the enemy. Efficiency without comprehension is.
More to the point, the idea that early success somehow impedes future learning contradicts the wider body of cognitive science research, including the work of Bjork, Sweller, and Rosenshine. Foundational knowledge supports transfer. Mastery of the basics creates the conditions in which flexible, conceptual thinking becomes possible. If students appear efficient too soon, the problem may not be the timing of instruction, but the superficiality of the assessment. The solution is not to delay teaching but to deepen it.
33. If the goal is for students to make meaning, to think critically and creatively and flexibly, then direct instruction is usually unwise and cognitive load shouldn’t be our chief concern. For example, see Rand J. Spiro and Michael DeSchryver, “Constructivism: When It’s the Wrong Idea and When It’s the Only Idea,” in Tobias and Duffy, eds., op. cit.
This seductive claim yet again assumes that direct instruction and meaning-making are mutually exclusive, that thinking critically or creatively is only possible when structure is removed. But this misunderstands both the nature of cognition and the purpose of instruction.
If the goal is to help students think well - flexibly, critically, and creatively - then more structure, not less, is often needed. Complex thinking depends on prior knowledge, and the mind cannot conjure meaning from a vacuum. Before one can critique an argument or generate a novel solution, one must have something to think with. That something is stored in long-term memory, which is exactly what cognitive load theory is designed to support.
Spiro and DeSchryver’s work on ill-structured domains is valuable in reminding us that not all knowledge is linear or tidy. But even in these domains, some concepts are foundational. Some things must be explained before they can be explored. To treat cognitive load as irrelevant whenever the learning goal is higher-order thinking is to ignore the basic fact that human working memory is limited. Overloading it with too much complexity too soon does not spark insight. It breeds confusion.
Well-designed explicit instruction does not suppress thought. It cultivates the conditions for it. It builds schema, reduces unnecessary demands, and allows students to focus on the ideas that matter. When cognitive load is managed well, it frees up space for meaning-making. When it is ignored, all too often only the most advantaged students, those who already know enough to cope, benefit.
34. …the case for direct instruction based on cognitive load “comes from studies on individual learning settings instead of group-based learning settings such as [problem-based learning], where different cognitive load conditions apply.” The quotation is from Schmidt et al., op. cit., p. 95. On the value — and different variants — of cooperative learning, see my “Learning Together,” which originally appeared as chapter 10 of Kohn, No Contest: The Case Against Competition, rev. ed. (Houghton Mifflin, 1992).
The point that much of the research underpinning CLT has been conducted in individual learning contexts rather than in collaborative or group-based environments is a fair one. As Schmidt et al. note, cognitive dynamics shift when learners work together. In cooperative settings such as problem-based learning, demands on individual working memory can be distributed across group members, allowing for more complex tasks to be undertaken without overloading any one person.
But this does not invalidate the insights of CLT. It simply means that the theory must be applied with attention to context. The fundamental constraint, that working memory is limited and that effective instruction must account for this, still holds. If anything, group work demands even more careful instructional design. Poorly structured collaboration can increase extraneous load through confusion, off-task behaviour or cognitive fragmentation. Productive group learning, like individual learning, benefits from clearly modelled goals, shared schema and scaffolding that supports joint problem-solving.
Moreover, the best cooperative learning environments often begin with explicit instruction. Students need a common conceptual foundation in order to contribute meaningfully and build on one another’s thinking. That foundation is not acquired through osmosis or wishful thinking. It is taught. The fact that cognitive load behaves differently in group settings is a reason to be even more precise about how and when instruction occurs, not a reason to discard what we know about how memory and learning function.
35. Even if CLT were persuasive, it wouldn’t necessitate direct instruction. In real life, there are lots of ways to avoid cognitive overload, such as doing one segment of a task at a time or writing things down so everything doesn’t have to be recalled at once. CLT research excludes such strategies, “thus creating a situation that is artificially time-critical” in order to make its theory seem correct. These studies are contrived in other ways, too, such as giving learners no choice about what they’re doing, which naturally affects their motivation, and providing very little time to study the material on which they’ll be tested. de Jong, op. cit., pp. 123, 124.
Kohn begins by conceding, for the sake of argument, that CLT might be persuasive, then immediately lists a set of real-world strategies learners use to manage cognitive load. But those strategies do not undermine CLT. They demonstrate it. Segmenting tasks, writing things down, offloading memory - these are exactly the kinds of adaptations cognitive load theory predicts and explains. The existence of coping strategies is not a refutation. It is evidence that the underlying constraint is real.
To then accuse CLT research of being “contrived” because it removes such supports is to miss the point of controlled experimentation. Of course studies isolate variables. That is what makes causal claims possible. Suggesting this is somehow dishonest is like accusing a chemist of cheating because they used a lab. Experimental settings simplify in order to reveal what matters; they do not pretend to replicate every detail of real life.
Worse still is the notion that giving learners little time or no choice invalidates the research. Classrooms often involve limited time and tasks that are not freely chosen. Any serious theory of instruction must account for these conditions. If cognitive load theory performs well under pressure, that is a feature, not a bug.
To wave away this body of research on the grounds that it doesn’t mirror idealised versions of student-led learning is to have it both ways. Either we take cognitive load seriously and design accordingly, or we abandon all pretence of rigour. What we cannot do is reject the theory while simultaneously appealing to its core premise - that overload impairs learning - and pretending this somehow vindicates constructivist methods.
36. “Most proponents of [inquiry learning] are in favor of structured guidance,” one group of researchers notes, but that guidance looks nothing like direct instruction; it “affords choice, hands-on and minds-on experiences, and rich student collaboration.” Hmelo-Silver et al., op. cit., p. 104. Wise and O’Neill, op. cit., make the same point, as do Kapur and Rummel, op. cit.
The claim that most proponents of inquiry learning endorse “structured guidance” is perfectly true — but profoundly misleading if used to suggest that they are somehow aligned with the principles of explicit instruction. As Hmelo-Silver et al. point out, the kind of guidance typically favoured in inquiry models “affords choice, hands-on and minds-on experiences, and rich student collaboration.” That is a far cry from what cognitive load theorists or advocates of fully guided instruction recommend.
This matters because the term guidance is being stretched to cover very different things. In inquiry approaches, guidance often amounts to intermittent nudges while students explore. In explicit instruction, it means systematic modelling, carefully sequenced explanations, and close monitoring of understanding. Calling both “guided” is like saying a treasure map and a satnav offer the same support.
The same distortion appears in Wise and O’Neill and in Kapur and Rummel, who argue for hybrids but in practice lean heavily toward open-ended exploration followed by feedback. This may work in some contexts, especially for older or more advanced students. But to present it as broadly equivalent to explicit instruction is, again, to have one’s cake and eat it.
37. One metaanalysis found that DI is inferior to even the sort of inquiry in which relatively little guidance is provided. More guidance can be even better — which is what that same review found. Erin Marie Furtak et al., “Experimental and Quasi-Experimental Studies of Inquiry-Based Science Teaching: A Meta-Analysis,” Review of Educational Research 82 (2012): 300-29. For another study (published after this review) that found students who received no guidance at all did better than those who received direct instruction or worked examples, see Kapur 2014, op. cit.
The claim that Furtak et al.’s meta-analysis shows direct instruction to be inferior even to minimally guided inquiry is, at best, a distortion and, at worst, a wilful misreading. What the 2012 review actually found was that inquiry-based teaching can be effective, especially when it includes elements of structured support, not that it reliably outperforms direct instruction, let alone that minimally guided inquiry does so. The review is careful to note wide variability in effect sizes and outcomes, heavily dependent on how inquiry was implemented and the extent of scaffolding provided.
To treat this as evidence against direct instruction is to confuse correlation with causation and nuance with negation. More guidance, the review concludes, leads to better results, which directly undermines the case for minimal guidance and reaffirms the principle at the heart of cognitive load theory. The idea that “relatively little” guidance outperforms fully guided instruction isn’t supported by the evidence; it’s a selective interpretation, stripped of the methodological detail and caution that the authors themselves apply.
As for Kapur’s 2014 study, while it does provide interesting support for “productive failure,” it too relies on eventual instruction. The students who struggled first still received targeted teaching afterwards. The learning gain came not from the absence of instruction, but from its careful sequencing.
Once again, we are presented with a bait-and-switch: evidence that favours structured inquiry is passed off as an indictment of direct instruction, when in fact it confirms that guidance matters - and that the more precise and deliberate the guidance, the more reliable the results.
After this exhaustive process, I feel confident in saying that Alfie Kohn’s critique of explicit instruction and CLT is not driven by a careful weighing of empirical evidence, but by a deep-seated ideological preference for constructivist, student-centred approaches. He is not so much analysing the theory of instruction as prosecuting it, with the tone and tactics of a polemicist, not those of a researcher. His motivation appears to be the defence of a particular educational vision, one in which autonomy, exploration, and intrinsic motivation are not just pedagogical tools but moral imperatives. Explicit instruction, by contrast, is painted not only as ineffective but as dehumanising, coercive, and even harmful.
This ideological impulse shapes the way Kohn handles evidence. He consistently cherry-picks studies that appear to support his thesis, while ignoring or misrepresenting the much larger and more robust body of research that contradicts it. He often conflates explicit instruction with passive lecturing, setting up a straw man only to knock it down. Meanwhile, studies showing even modest benefits for inquiry methods, often in highly structured, tightly controlled conditions, are inflated and generalised far beyond what the data warrant. In one breath, he claims that even minimal guidance outperforms direct instruction; in the next, he insists that good inquiry includes plenty of guidance. The result is an argument that wants to have its cake and eat it: claiming superiority for constructivist methods while quietly importing the very supports and structures that make instruction work.
Moreover, Kohn’s understanding of CLT is selective and, at times, deeply confused. He accuses the theory of being both reductively simplistic and unfalsifiably complex while simultaneously mischaracterising its core claims. CLT does not deny the importance of motivation, curiosity, or social context; it simply focuses on the cognitive constraints that shape how instruction should be designed. Far from prescribing explicit instruction in every case, it recognises the expertise reversal effect and the need to fade guidance as students develop fluency. Yet Kohn continues to present it as a blunt instrument, designed only for “transmission” models of teaching.
His rhetorical style compounds the problem. He relies heavily on insinuation, loaded language, and appeals to emotion. Direct instruction is described as “traditional,” “rigid,” or “teacher-dominated,” while inquiry is framed as liberating, student-centred, and humane. This is not an attempt to adjudicate between rival theories; it is an attempt to delegitimise one of them by associating it with oppression. His arguments frequently rest not on logic but on the assumption that if something looks like school, it must be bad; if it looks like play, it must be good.
In the end, Kohn’s critique tells us far more about his beliefs than about the quality of explicit instruction or the validity of CLT. What he’s written is not a contribution to the science of learning, but a manifesto against it; an argument grounded in ideology, packaged as research, and delivered with the conviction of someone who has already decided what the answer must be. That is not necessarily a bad thing in itself; ideology can be a source of insight, but it should not dress itself up as evidence. When it does, the result is not clarity but confusion, not principled disagreement but intellectual sleight of hand.
“This is the piece Kohn habitually gets backwards. He wants motivation to precede success. But in reality - and this is backed by a substantial body of research in educational psychology - motivation is often a result of success, not its precursor. Students become motivated when they feel competent.”
This is the essential rub. I started off my career being fed Kohn - and found very quickly that what actually motivated my struggling learners was a taste of legitimate success - and if I could scaffold their experience (no, not water it down - they’d know it if I did anyway) enough to give them what was often their first taste of competence, it would become the drug they’d be addicted to for the rest of their lives. AND they would have the motivation to take more and more learning “risks” with me - as their trusted guide.
It’s almost like Kohn wants to ensure that most students remain frustrated, disengaged, and demoralized by their educational experience.
Thanks for this deeply thorough dive into this deeply flawed argument. And for the lit review. (Do you sleep?)
We were exposed to quite a lot of educational research on my PGCE (and later in a pointless teaching Masters) and I was more up for engaging than some in my cohort. But it usually ended up being something like ‘we interviewed 7 teachers in Delaware in 1974’ or ‘we studied tribal initiation rites in Kenya’ or, worst of all, some meta-analysis of 25,000 studies about CPD or some such. And then in school extremely tentative or niche conclusions were presented as the ‘evidence base’ for some policy of dubious relevance. The whole experience left me rather cynical, as you can probably tell.