Why I hate 'beautiful books'

Why students' exercise books make for dangerously unstable proxies for learning and poor revision resources

Jul 03, 2026

An exercise book is so temptingly inspectable. It sits there, solid and available, full of dates, titles, diagrams, worksheets, corrections, and comments in variously coloured pens. You can pick it up, flick through it and feel you’ve seen something. Lessons have taken place, work’s been done, a teacher has intervened, students have responded. All the evidence one could ever need to feel confident that learning is or isn’t proceeding as desired, conveniently packaged between two floppy, card covers.

Inspection, in all its forms, has become curiously vulnerable to book scrutiny. We want to know whether students are learning, whether teaching’s effective and whether the curriculum’s coherent, but these are awkward things to establish. So we reach for visible proxies: full pages, neat presentation, written comments, coloured pen, student responses, evidence of challenge, evidence of progress, evidence of pride. Before long, the proxy hardens into the standard, and books are no longer being used to ask better questions but to avoid them.

Common problems with work scrutiny

The trouble starts when we treat book scrutiny as if it tells us anything meaningful about learning. Exercise books may give us traces of what’s happened, but these traces bear only a passing resemblance to the thing itself. A date tells us when work was done, not whether anything was understood. A teacher’s comment tells us feedback was given, not whether it changed anyone’s thinking. A polished paragraph tells us words ended up on the page, not whether the student can recreate the argument without help. A book offers evidence, but tells us nothing about the reliability or validity of that evidence.

If we’re only interested in what we can see in a set of books rather than in figuring out what conclusions we’re entitled to draw from what we see, we’re likely to make invalid inferences. A sample of books may tell us what kind of tasks students have been given, how much they’ve written, whether the school’s presentation policy has been followed, whether teachers are leaving written comments, and whether students are copying down what’s put in front of them. But it can’t tell us whether students understand the work, whether teachers’ explanations are effective, whether classroom interactions are improving performance, or whether the curriculum is being remembered. You might think it can, but, as I intend to show, this belief is profoundly damaging to the project of education.

Book sampling has some value as a low-stakes fact-finding activity, but it becomes much more dubious as an accountability tool. Once teachers know what’s being sampled, the sample changes. If leaders make it known they want to see written comments, written comments will appear. If they inspect student responses, student responses will multiply. And, most egregiously, if leaders fixate on presentation, presentation will, on the whole, improve, but at what cost? None of this means students are learning more, just that teachers have learned how to satisfy the system.

Work sampling pro formas, no matter how well intentioned, make all this worse.

They tell us what to notice before we’ve had a chance to think. If there’s a box to tick about marking frequency, coloured pens, written responses, presentation or evidence of progress, what matters has been pre-determined. The best we can hope to find is what we were looking for in the first place. We’re no longer looking at students’ work to understand what it reveals; we’re looking to see whether the book conforms to an official picture of what learning’s supposed to look like.

Counter-intuitively, the way to get the best out of teachers is, unless we really don’t trust them, to withhold our preferences and allow them to make professional judgements based on how well they know their students and their subjects. If, explicitly or implicitly, leaders say what they’re looking for, teachers will make perfectly rational choices, adapting to the preferences of those judging them.

Lerner and Tetlock distinguish between being accountable to an audience with known views and being accountable for the quality of one’s reasoning. When people know what their audience wants, they tend to shift towards the audience’s preferences. When they expect to justify their thinking to an audience whose views are unknown, they’re more likely to think carefully, consider alternatives and make better judgements. In book scrutiny, we’re more likely to get better outcomes if we avoid asking, “Has the teacher marked in the approved way?” or “Does the book meet my idea of pride?” and instead focus on asking, “What was the teacher trying to achieve, what evidence did they use, what happened as a result, and what might they do next?”

The injustice of beautiful books

We should never lose sight of the tendency for managerialism to cosplay as rigour. What does it matter if dates are on the “correct” side of the page, titles underlined, sheets glued in and green pen used? These are wonderfully auditable but also arbitrary. The more precise the rules become, the easier they are to inspect and the less likely they are to have much to do with learning. You may think that asking students to “take pride in their work” is a meaningful way to assess standards, but what does it actually tell you?

It might tell you that a student has fluent handwriting, good fine motor control, spare attention and a secure grasp of school routines. It might tell you they care about adult approval or that the teacher has created calm conditions in which neat work is possible.

A neat book can conceal shallow thinking, just as a scruffy book can contain effortful thought. A beautifully highlighted paragraph may have been copied from the board, while a messy answer may be a student’s first attempt to grapple with a difficult idea. If we mistake surface care for intellectual quality, we reward students for being neat rather than for working hard.

“Pride” isn’t equally easy for every student to display in the approved form. For some, neatness is cheap: they can write quickly, organise space, copy accurately and still attend to the explanation. For others, the same requirements consume the working memory we’d rather they spent on the subject. Slow handwriting, weak transcription, poor spelling, visual processing difficulties, dyspraxia, dyslexia, gaps in schooling or anxiety can all mean the effort to make the page look acceptable comes at the expense of understanding.

Beautiful-book culture takes behaviours that are easier for already fluent students and recasts them as moral virtues. Students who can produce neat work are praised for pride; students who struggle are told they should be ashamed of what might be their best efforts. We think we’re raising standards, but we may simply be increasing the cost of access to the curriculum.

Presentation matters when it serves thinking. A disorganised calculation may hide errors, a messy diagram may obscure a relationship and an illegible sentence is tough to improve. But once presentation becomes an independent standard, detached from utility, it becomes ceremonial.

Thirty copies of a single thought

Many’s the time I’ve opened a student’s book and been blown away by the quality of work. Their explanations are lucid, and their writing style seems impressively mature. Clearly, this student’s really got it. Then, on opening the next book, I find the exact same paragraph. By the time I’ve seen the fourth identical explanation, my admiration has curdled into disappointment.

Copying isn’t always useless. I can see arguments that sometimes students need accurate models, or need to rehearse language, structures, diagrams or procedures before they can use them independently. But getting a class of students to copy the same stuff into their books is, in the main, an oddly valueless task.

So why does it happen? Partly because copying is tidy. It fills pages, creates a record and ensures every student has something plausible in their book. And it prevents the terror of the blank page. In a culture where quality of work is mistaken for quality of thought, copying feels safer than either messy independent attempts or, worse still, a rich classroom discussion where nothing is recorded.

It’s also, I think, a misunderstanding of the “I do, we do, you do” model. The “I do” phase should be a space where the teacher models expert thinking, not an invitation for students to simply write down someone else’s words. A model should make the invisible visible: why this word rather than that one, why this step comes next, why this misconception is tempting, why this example matters, why this answer is better than that answer. The point is not transcription but attention to the decisions that make the model work.

If the lesson ends with everyone having copied the teacher’s answer, the most important part of the sequence has probably been skipped. The “we do” phase should involve students making decisions with support, and the “you do” phase should require them to attempt something independently. Copying a sentence prompt or a paragraph plan can help prepare students for independent work, but it can’t replace it.

So what should we do if we find evidence that students are copying? Punish the teacher? In my view, the only sensible approach is to ask what systemic pressures are causing the behaviour and, perhaps, to consider whether students can do something independently as a result of the copying. Was it building fluency? Was it preserving a model students would later use? Was it followed by a fading scaffold, a worked example with gaps, a comparison task, a retrieval attempt or an independent application? If not, the copying is more for adult convenience than for the students’ benefit.

Quality is subject-specific. A non-specialist may see long answers, difficult-looking vocabulary and tidy presentation and assume challenge. They may see short answers, repeated practice or a familiar text and assume low expectations. But quality in mathematics, English, science, history, art and music lives in different places. It takes subject knowledge to know whether the work is demanding, accurate and worth doing.

In some subjects, exercise books don’t provide even a rough map of the territory; they’re a postcard from a distant corner. What can a book tell us about a student’s singing, tackling, painting, throwing, experimenting, speaking or designing? If written records have been carefully designed by a subject specialist they may be useful but, more often, giving students exercise books in some subjects is utterly pointless.

This is also why the distinction between the quality of work and the quality of marking matters so much. A book full of weak work and compliant marking shouldn’t reassure anyone. A book full of strong work and little written marking should make us curious rather than cross. If the work’s excellent, we should want to know how that excellence was produced. If the work’s poor, the presence of coloured pen, written comments and dutiful student responses shouldn’t reassure us.

The problem with practising bad writing

Beautiful-book culture also creates a particular problem for writing. If full books are treated as evidence of rigour, then teachers and students are pushed towards producing more writing, whether or not that writing is any good.

This is especially damaging when students are asked to write at length before they know enough to write well. They fill pages with clumsy sentences, vague claims, weak evidence, bolted-on quotations and half-formed explanations. The result may look like effort, but it can easily become rehearsal in doing the wrong thing. Practice doesn’t make perfect. Practice makes permanent. If students repeatedly practise poor writing, they don’t become better writers. Instead, they become more fluent at writing badly.

There’s a strange romance in schools about extended writing, as if more words automatically mean more thought. Sometimes they do. A developed paragraph, a sustained explanation or a full essay can force students to organise ideas, make connections and think hard. But only if they have the knowledge, models, vocabulary and sentence control to make the writing purposeful. Without that, extended writing becomes a kind of fog machine. It fills the page while obscuring the problem.

The pressure to fill books makes this worse. It rewards visible quantity over invisible improvement. A student who writes three poor pages may look more industrious than one who spends ten minutes improving a single sentence, comparing two examples, correcting a misconception, rehearsing a quotation or planning the structure of an argument. But the second student may be doing much more valuable work. Good writing often improves through selection, imitation, sentence-level practice, oral rehearsal, precise feedback, redrafting and attention to examples. Much of this can look unimpressive in a book.

This is another reason book scrutiny misleads. A book full of writing may tell us that students have written a lot. It doesn’t tell us that they’ve learned to write better. It may even tell us the opposite: that they’ve been left to practise errors, misconceptions and weak habits at scale. The idea that this will somehow make students better writers is naive.

Instead, as outlined in my book, Writing Fitness, the best way to build students’ writing stamina is to do less for longer, eliminate gaps, practise to the point of automaticity and move on only when essentials have been mastered.

What are students’ exercise books for?

To be clear, I’m not arguing that standards don’t matter, or that we should encourage illegible scrawl on dog-eared paper. Students need to set out work so that teachers, and ultimately examiners, can decipher what they’ve written. Care’s worth cultivating and accuracy really matters. There’s nothing progressive, humane or clever about allowing students to produce work no one can read.

But care isn’t decoration, order isn’t understanding and pride isn’t compliance. The problem begins when presentation replaces thought. A well-set-out calculation can help a student spot an error. A labelled diagram can make a process easier to retrieve. A paragraph plan can reduce the burden of composition. These things are valuable because they make thinking easier. The trouble starts when the ritual remains after the reason’s disappeared. The problem isn’t the ruler. The problem is the worship of the ruler.

Before we argue about whether exercise books should be beautiful, we need to ask a simpler question: what are they for? The answer’s less obvious than it seems because schools often expect books to do several incompatible jobs. In a single subject they may be a place to think, a record of practice, a revision resource and evidence for accountability purposes. We shouldn’t be surprised if a single artefact does none of these jobs particularly well.

If a book’s a place to think, it should contain false starts, crossings out, half-formed sentences, abandoned plans and rough attempts. Thinking usually looks untidy before it looks fluent. A page of real working may look like the wall in a detective film: arrows, fragments, wrong turns, a suspect circled too early, a clue that later turns out to be useless.

If a book’s a record of practice, it should show what students have attempted, what they found difficult, what improved and what still needs attention. It should help teachers to see whether tasks are too easy, too fragmented or too hard. It should show whether students have had enough practice, whether errors are recurring and whether explanations have been followed by application.

If a book’s evidence for other adults, it’ll probably start to look suspiciously polished. The neater it becomes, the less certain we should be that we’re looking at learning rather than performance. A useful map simplifies the territory so we can navigate it. A bad map gives false confidence. Many exercise books are more stage prop than map; arranged for the benefit of whoever’s expected to inspect them.

Research on student notebooks, especially in science, is useful here, but not in the way beautiful-book enthusiasts might hope. When notebooks are useful, it’s because they’ve been designed to capture reasoning, not because they’re neat. A science notebook that records predictions, observations, explanations and revisions of thought is doing a different job from a book filled with copied slides and glued-in worksheets. The value lies in the thinking the book’s been designed to reveal, not in the book’s appearance.

Handmade revision guides?

The least convincing claim is that exercise books are an opportunity for students to create their own bespoke revision resources. This is, at least in my view, bollocks. Revision resources need to be accurate, complete, organised and designed by someone who understands the subject inside out. Exercise books are none of these things. They provide a record of students’ encounters with the curriculum, not reliable maps of the curriculum itself. They may contain moments of genuine insight, but they may also contain misconceptions, half-truths, fragments, hurried notes and things copied without comprehension.

To think that this could make a useful revision guide is eccentric. Why would we expect novices to revise from materials produced by novices? A student’s notes are amateurish because students are amateurs. That’s not an insult; it’s the point of schooling. Students write down what seems important at the time, not necessarily what is important. They preserve misconceptions with the same confidence as correct ideas.

Exercise books are, for the most part, full of half-copied explanations that made sense only when the teacher talked around them; worksheets with two missing answers and one wrong one; spider diagrams with one useful idea and six titles in bubble writing; definitions that were nearly right but never corrected; model paragraphs that looked persuasive at the time but depended on knowledge the student no longer has; pages of notes taken while the student was also trying to listen, think, spell, keep up and avoid making the page look a mess. Forcing students to revise for exams from such a source would be a sure-fire way to lower outcomes.

Good revision, as we know, requires retrieval, spacing, carefully chosen examples, purposeful practice and opportunities to check and correct. Students need to remember, apply, compare, explain and improve. Their exercise books may contain some raw material for that process, but they’re not the process itself.

Worse, the belief that students can or should revise from their books privileges students who can already listen, select, summarise, organise, spell, write fluently and separate the essential from the incidental. In other words, it privileges students who are already relatively expert. For everyone else, the exercise book becomes a museum of partial understanding. A museum’s a fine thing, but it’s not a workshop. You can learn from a museum if you know what you’re looking at, but you can also wander around one for hours and learn little.

If we’re serious about revision, we should give students better materials than their own amateur records of learning. They need carefully designed summaries, model answers, worked examples, glossaries, diagrams, key quotations, timelines, cumulative quizzes and practice questions. These should be shaped by the curriculum, not accidentally assembled from the residue of lessons.

Exercise books may remind students what they’ve done, preserve useful practice and show old mistakes worth correcting. But they shouldn’t be treated as the main source of revision. Asking students to revise from their own books is rather like asking patients to treat themselves from the notes they took during the consultation.

Making a novice’s notes neater doesn’t make them more expert. A misconception is still a misconception no matter how beautifully underlined. A pristine exercise book may be easier to revise from than a chaotic one, but that doesn’t make it an effective revision resource.

The cost of beautiful books

A good exercise book is a working document. It should help students think, help teachers diagnose and help departments see whether the curriculum is doing what they hoped. Once it becomes a performance for other adults, its value begins to leak away. The book stops being a map that helps us navigate and becomes a prop that helps us pretend we know where we are.

This is the opportunity cost of beautiful books. Every hour spent standardising presentation is an hour not spent asking better questions. What do students need to know? What have they misunderstood? What needs more practice? Which explanation worked? Which examples helped? Which tasks generated confusion? Which students looked successful but remembered little? Which students looked untidy but understood more than we thought?

Opportunity cost isn’t theoretical. Time spent checking compliance can’t also be spent improving explanations, planning better questions, comparing student work, discussing misconceptions, designing practice or agreeing what high-quality work looks like in the subject. The cost of book scrutiny isn’t measured only by whether it finds anything useful but by the work it displaces.

We made a similar mistake with marking. We treated marking as if it were feedback, when feedback is only feedback if it changes the student’s thinking or action. A comment in a book may be useful feedback, but it may also be a performance of feedback. There are often better ways to find out what students know: a hinge question in the middle of a lesson, a short quiz, a whole-class response on mini-whiteboards, a carefully chosen sample of independent work, or an assessment completed under known conditions. If we want to know whether students can write an essay, perhaps we should collect the essay. If we want to know whether they understand today’s concept, perhaps we should check that today rather than write the same comment in thirty books two weeks later.

People pay attention to what gets inspected. If we praise full pages, coloured pen, glued-in sheets and uniform presentation, teachers will produce those things. Not because they’re foolish, lazy or cynical, but because systems train people. Whatever gets noticed becomes important. Students learn the same lesson. They learn that schoolwork is often about making the page look right, and that the appearance of effort can matter more than the substance of thought.

None of this means we should stop looking at books. It means we should look at them differently. A book can tell us useful things. It can show what tasks students have been given, whether they’ve had enough practice, whether work’s too easy or too fragmented, whether explanations have been followed by application, whether errors are recurring, whether the curriculum has coherence and whether students are being asked to think hard enough about the right things.

But a book can’t, on its own, tell us what a student knows. A book isn’t a brain scan. To know what students know, we need to ask them. We need to listen to their explanations, look at their independent work, compare responses across classes, use assessments carefully and see whether they can use what they’ve been taught when the surface of the task changes.

The best conversations about books aren’t really about books. They’re about curriculum, instruction, practice and assessment. What was taught here? Why was it taught in this order? What were students expected to think about? What did they get wrong? What did the teacher do next? What will be revisited? What should now be easier because of this work? Those questions might actually improve teaching, which is a long way from checking whether every date has been underlined.

A sane approach to accountability should make teachers more likely to do the right thing, not more likely to comply with a senior leader’s partial understanding of quality across the curriculum. Beautiful-book culture teaches teachers to protect themselves by making books more inspectable. The safe option is no longer better thinking, better explanation or better practice. It’s showcasing pages that survive scrutiny.

Beautiful books became popular because they offered a visible solution to an invisible problem. Learning is hard to see, so we’re desperate for something easy. We want evidence of care, progress, pride and rigour, and neat books can seem to provide it. The danger of using books as a proxy is that we mistake what’s visible for what’s important.

So, yes, students should take pride in their work, but pride should mean doing justice to the content, not decorating the container. Yes, books should be usable, but usable for whom? The student thinking now? The teacher planning the next lesson? The adult looking for reassurance? Until we answer that question, beautiful books will remain an attractive distraction.

I’d rather see an ugly book full of hard thinking than a beautiful book full of compliant emptiness. I’d rather students revised from materials designed by someone who knows the subject than from their own amateur notes. And I’d rather we stopped pretending that the correct side of the page is a serious educational concern.

The exercise book is only ever a map, and rarely a good one. Sometimes it helps students think, practise and improve. Sometimes it helps teachers see what happened and decide what should happen next. But the map mustn’t be confused with the territory. The question shouldn’t be whether books look cared for but what they’re doing and whether they help students think better. If they don’t, their beauty is only evidence that we’re looking in the wrong place for the wrong things.

Leah Mermelstein

This lands hard, especially the "I do, we do, you do" section. The thing I see constantly on social media — and I think it connects directly to what you're describing — is people sharing a few unbelievable writing samples, the kind that generate a ton of oohs and ahs because they look so polished.

More often than not, when I've dug deeper, what's being celebrated is a copy of the shared piece, not an independent construction.

And that's a distinction worth sitting with, because "we do" isn't supposed to be transcription — it's supposed to be the messy, oral, collective work of building something together, again and again, until the moves become available to a child independently.

The whole value of that phase lives in the grappling, not in the artifact it produces.

When what gets shared is a polished copy, the most important instructional work — the noticing, the decision-making, the "why this word and not that one" — has usually happened invisibly, or not at all, exactly like you're describing with the model itself.

I wrote about this distinction in my book, We-Do Writing — the difference between a "we do" that's genuine shared construction (oral, iterative, revised in real time) versus a "we do" that's really much closer to an "I do" with an audience that's holding pencils.

The second version produces the samples that go viral. The first one produces writers.

Richard Trimble

I love beautiful books but you have given me much to think about. I have always wanted students to get over the fear of writing something wrong, be able to cross out and make mistakes. I love the point about an emphasis on presentation only being useful if it serves to enhance the clarity of thinking and safeguard communication. It has also made me question how we do book scrutiny without the student present. Based on what you are saying here, a book scrutiny that is a conversation with the student would generate much more insight into what is going on in the classroom. I wonder about the idea of not being explicit with staff about what you are looking for in a book look. I think they would find that unsettling, but I take the point about the danger of being performative...
