The Three Iron Laws of Education Research
A skeptical but optimistic guide to improving schools
So you want to improve education? You might hear people say, “research says…” to justify their pet idea that will help students learn more in school. Education research is messy, and is typically much less conclusive than it seems at first glance. Here are three iron laws of trying to use education research to improve schools.
Selection Effects Are Large
Some students, regardless of teaching quality, perform better than others. This is one of the most persistent phenomena in education. Differences in achievement are not random. They are correlated with a whole bunch of other stuff, like family income. They aren’t destiny, but they are the water in which we swim, creating currents that affect everything else we do in education. These effects are persistent and large.
The most basic form of this is something I’ve heard countless times in my career: school A does thing X and they get better results than we do, therefore we should do thing X.
Thing X might be good. But because selection effects are large, the most likely reason school A does better is that it has a different intake of students. In the vast majority of cases where one school outperforms another, selection explains most of the difference.
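To make that comparison concrete, here is a minimal simulation sketch. Every number in it is a made-up assumption for illustration, not an estimate from any study: school A's intake starts 0.5 standard deviations ahead, and thing X is assumed to add a genuine 0.1 standard deviations.

```python
import random
import statistics

# Hypothetical values, in standard-deviation units of test score.
INTAKE_GAP = 0.5   # assumed head start of school A's intake (selection)
X_EFFECT = 0.1     # assumed true benefit of "thing X" (intervention)

def simulate_school(n_students, intake_mean, intervention_effect, seed):
    """Return simulated scores: intake ability plus any intervention effect."""
    rng = random.Random(seed)
    return [rng.gauss(intake_mean, 1.0) + intervention_effect
            for _ in range(n_students)]

# School A does thing X and has the stronger intake; school B has neither.
school_a = simulate_school(2000, INTAKE_GAP, X_EFFECT, seed=1)
school_b = simulate_school(2000, 0.0, 0.0, seed=2)

raw_gap = statistics.mean(school_a) - statistics.mean(school_b)
share_from_x = X_EFFECT / raw_gap

print(f"Raw gap between schools: {raw_gap:.2f} SD")
print(f"Share of the gap actually due to thing X: {share_from_x:.0%}")
```

Under these assumptions, an observer who credits the whole gap to thing X overstates its effect several times over, even though thing X really does help.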
Here’s a second, more subtle example. Lots of schools encourage reading by setting aside a block of time where students choose books and read on their own. This is sometimes called SSR or “sustained silent reading.” The practice follows the commonsense observation that students who read more are stronger readers.
Unfortunately, research does not support this practice. Tim Shanahan has a nice blog post summarizing the research. The basic observation is that this is a selection effect. Strong readers read more, but that may be because they are already strong readers, not because the extra reading made them strong. Telling striving readers to read independently is not the best way to help them become strong readers. There are all sorts of possible reasons for this: students with fewer reading skills may choose books that are too easy, or engage only shallowly with what they read. Or it may simply be that teacher-led reading is better at improving reading skills for this group of students than independent reading.
This is why education research exists. We don’t just say, “this group does better, therefore they must be doing things right.” Research involves conducting experiments, controlling for different variables, and triangulating multiple sources of evidence. It’s not easy, because selection effects often muddy the waters. You have to understand selection effects to figure out which interventions a school should prioritize.
Intervention Effects Are Small
I’m defining “intervention” here as “something we did in an attempt to help students learn more.”
You name it. Everything we try, the effect is small. That study folks were talking about a few weeks ago about tracking (setting) in math class? Small effect. Growth mindset? Small. Reducing class size? Small.
There are plenty of studies that report large effects. What’s up with these? Many are correlational: there’s a correlation between something and student learning, but that is typically capturing selection effects rather than intervention effects. Some studies use researcher-designed measures. The researchers use their own assessments and produce a large result. Inevitably, when that research is reproduced with something tougher to game, the effect is smaller. Many large effect sizes come from experiments done in a lab setting rather than in real classrooms. Finally, there are all sorts of ways one can use statistical tomfoolery to make effects look larger than they actually are.
Another thing to watch out for is taking a bundle of different initiatives and calling it one big thing. The “science of reading” movement is a great example here. Have some states shown impressive improvement in literacy outcomes? Absolutely. But the most notable successes have involved a bunch of distinct changes, from teacher training to curriculum to phonics to accountability. Many states have struggled to replicate those successes, which underscores the fact that this is a bundle of different interventions stacked together and it’s not easy to pick out the most important active ingredients. This doesn’t mean the “science of reading” is fake or useless. It just means that it’s hard to do well, and isn’t going to cause miraculous improvement all at once.
When I say “intervention effects are small,” you might wonder what I mean by small. I mean that intervention effects are much smaller than selection effects. If we are interested in closing the achievement gap, or if a low-performing school wants to catch up with high-performing schools, the effect of any one intervention is small relative to those gaps.
Small effects aren’t meaningless. There’s no one thing that will transform education outcomes, but if you add up lots of small things you can get a big thing. This changes the calculus. First, let’s be realistic about what we should expect from any one intervention. Then let’s do our best to stack interventions together. Small doesn’t mean all interventions are the same; some are small, some are very small. Let’s pick the best bets, the bets that we can scale and sustain over time.
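The stacking arithmetic can be sketched in a few lines. The effect sizes and the gap below are hypothetical placeholders, not figures from any study, and the sketch assumes effects simply add together, which real interventions may not do.

```python
# Hypothetical effect sizes in standard-deviation units (illustrative only).
effects = {
    "intervention A": 0.05,
    "intervention B": 0.08,
    "intervention C": 0.03,
    "intervention D": 0.10,
    "intervention E": 0.04,
}
gap_to_close = 0.8  # assumed achievement gap, also in SD units

largest_single = max(effects.values())
stacked = sum(effects.values())  # assumes the effects are additive

print(f"Best single intervention closes {largest_single / gap_to_close:.0%} of the gap")
print(f"All five stacked close {stacked / gap_to_close:.0%} of the gap")
```

Even in this optimistic additive version, no single line in the table does much on its own, and the stack still leaves most of the gap in place.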
All Things Fade
Every intervention, everything we do in schools, fades back toward the status quo. The “southern surge” (states that have outperformed the rest of the US in 4th grade reading scores, a result attributed to the science of reading) is mostly gone by 8th grade. High-quality preschool improves outcomes for a few years, but the gains mostly slip away by about 3rd grade. All things fade, and if you want to cause sustained improvement you need a bunch of interventions throughout a student’s education.
This doesn’t mean all interventions fade to nothing. Many effects persist at a low level. But some level of fadeout is inevitable. That doesn’t mean the previous interventions were a failure. It just means that the work is endless. There is no other way.
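Here is a toy model of fadeout, as a sketch rather than a claim about real decay rates. It assumes each intervention starts at 0.1 standard deviations, that a small share of the effect persists indefinitely, and that the rest shrinks by a fixed fraction each year. The parameter names and values are all hypothetical.

```python
def remaining_effect(initial, years_since, yearly_retention=0.6, persistent_share=0.2):
    """Effect left after fadeout: a small share persists, the rest decays geometrically."""
    persistent = initial * persistent_share
    fading = initial * (1 - persistent_share) * yearly_retention ** years_since
    return persistent + fading

def cumulative_effect(start_years, current_year, initial=0.1):
    """Sum what remains of every intervention that has started by current_year."""
    return sum(remaining_effect(initial, current_year - start)
               for start in start_years if start <= current_year)

# One intervention in year 0 versus a new one every year, both measured in year 8.
one_off = cumulative_effect([0], current_year=8)
sustained = cumulative_effect(range(0, 9), current_year=8)

print(f"One-off intervention, 8 years later: {one_off:.3f} SD")
print(f"New intervention every year, in year 8: {sustained:.3f} SD")
```

In this model the one-off intervention never drops to zero, but most of what is visible in year 8 comes from the recent years of the sustained approach. That is the sense in which the work is endless.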
Optimism
I’m an optimist about education. I think there is enormous potential to improve both the conditions of schooling and outcomes for students. But it’s hard work. One of the hardest problems in education research is distinguishing predictors of achievement from causes of achievement, and figuring out which causes of achievement are highest-leverage for improvement.
If you buy these three laws, and you want to improve education, here’s where they lead:
Step one: do a rigorous analysis of the interventions that truly move the needle and improve student outcomes, throwing away all of the lazy correlations and selection effects.
Step two: compare the costs and benefits and implement the highest-leverage interventions to improve education outcomes, recognizing that you will need many of them.
Step three: add more interventions and sustain them over the entirety of the education system. Keep expectations reasonable. Avoid silver bullets. Stay humble, and persist.



Good article.
I have an accompanying new proposal: banning blogosphere substackers who have spent 0 minutes in public education systems from proposing ideas for optimizing teaching based on literature summaries.
Listen to teachers! *gasp* - we know more about our job than Freddie deBoer or Matt Yglesias (yes, I am sick of you both putting your foot into the education debate whilst having spent zero time doing the actual work).
The problems we deal with are far more intersectional, relational and complex than you think. But we are also a diverse, canny and malleable bunch. I happen to think most teachers are also pretty good at their job and the direction of travel is in the main, pretty damn good.
Love the optimism. Happy 5/12!