This is great! I thought that after grad school, I wouldn’t have to evaluate research studies anymore. Wrong! The longer I teach, the more I read, and the more research I evaluate in order to improve my pedagogical practice. Recently, I used much of what you’ve outlined here when evaluating Cognitive Mutualism research. I’m bookmarking this so that I can keep referring back to it.
Thanks Adrian!
"However, Hattie also distilled his synthesis by publishing an average effect size for over 100 different topics in education research."
After I read Visible Learning, I started to look for criticisms. There were very few. Some of the more persuasive ones were from mathematicians / statisticians / logicians, who said that the way he combined different studies was problematic (among other things). Didau summarizes those here:
https://daviddidau.substack.com/p/visible-learning-invisible-errors
Thanks for sharing! I hadn't seen that article from Didau. It's wild to me how often I hear Hattie referenced in regular PD sessions, and how clearly shallow his methodology is.
A few years ago, Ollie Lovell had Adrian Simpson on his podcast criticizing meta-analysis, which in Hattie's case is really the even less-defensible meta-meta-analysis. That was the first time I heard a criticism of Hattie in a rigorous way. Ollie had Hattie on the next month to defend his work: if anything, that episode convinced me even more that Simpson is on the right track with his critique.
http://www.ollielovell.com/errr/adriansimpson/
Interesting. I'm not a big fan of podcasts but I'm very curious how Hattie defended himself. It's tricky because on the surface meta-analysis makes some sense. For me it's the endless oversimplifications I observe in practice that convinced me this stuff is not a good way to communicate about the science.
If you’re not into listening you can also read Ollie Lovell’s reflections on those discussions here
https://www.ollielovell.com/effect-sizes/
Thanks! Read it. I agree with his conclusions. Especially having seen the effect sizes misused repeatedly, even if they can theoretically be used well they are very hard to use well in practice.
I’m not a big podcast guy either, but Ollie’s Education Research Reading Room had Dylan Wiliam on one time when I had a long drive coming up, and it got me hooked on his podcast and just a couple of others.
Man, Hattie's website is almost unreadably self-promotional. I found a nice article from some McGill researchers about some of the statistical errors he's made: https://mje.mcgill.ca/article/view/9475/7229. It's pretty unfortunate given that he has a PhD in Statistics from U of Toronto as well.
Yup. I emphasize Hattie because I've had multiple people in multiple PD contexts cite his work uncritically. One note on that article: I believe it references errors in the first edition of Visible Learning that have since been corrected. Even without the statistical errors, I don't think Hattie's basic methodology is useful because it's so easy to share those numbers out of context.
I would also like to add, particularly on the topic of teachers believing in students, that even in underperforming schools when teachers believe in students they invest more. It’s not the belief, it’s what happens because of it. They can invest more time, energy, and sometimes tangible resources. There is evidence that those things benefit learning.
Brilliant article, thank you! I took graduate courses in statistics and am often dumbfounded at the type of stuff that makes it to the press in all forms of research. So much of this comes from educators and school leaders looking for that "one thing" that will fix the problems in their school. There's never going to be one thing. It's a combination of so many interrelated things.
Great piece. I would be curious to know more about your opinion on Jo Boaler's work.
From a research perspective, it's just not rigorous. I could pick out multiple issues with any of her studies. That doesn't mean it's all wrong, but should cause us to triangulate her claims with a lot of skepticism.
One of her main claims is about neuroplasticity and student potential. This I largely agree with, but I draw a different conclusion. I would say show, don't tell -- don't tell students about growth mindset and synapses, teach effectively and show students how much they're capable of learning. (Ok fine, we can do the growth mindset lessons, but let's be humble about how much they matter and also recognize that hers are less rigorous than many others.)
Another claim is about flexibility being more important than fluency. Here I just think she's wrong. Fluency is important, practice is important. We should be cautious about timed tests, but the end goal is absolutely fluency through practice. Flexibility is important too, but she advocates flexibility as far more important than fluency in a way I just can't get behind.
The last big claim I've seen is around open-ended tasks. Again, I agree with some stuff she says. I wrote in my post "Glimpses" on how I approach these. I think here we have similar goals, but I approach those goals in very different ways. https://fivetwelvethirteen.substack.com/p/glimpses
She has plenty of other claims as well but those are the ones I see as most prominent.
Thank you.
This is an excellent post, thanks for putting it together.
Someone else already mentioned him, but if you haven't read Adrian Simpson's critiques of Visible Learning (and meta-analysis in education research more broadly), I cannot recommend him enough. I teach a college course on "inferential reasoning in data analysis", and I spend a good bit of time on his paper titled "Princesses are Bigger Than Elephants: Effect Size as a Category Error". He makes many of the same points you're making here, with lots of examples. And the introduction is just brilliant; it had me hooked after a few paragraphs.
Thanks, I will add it to my list!
Have you read any research around Positive Deviance?
No. I just did a quick google search and it looks intriguing, but I'd have to do a deeper dive to get a sense of whether it checks out.
I agree that evidence-informed teaching is vital, but it’s crucial to be cautious about how we apply research. Often, we see correlation presented as causation, which oversimplifies complex issues. For example, the link between teacher beliefs and student achievement is important, but it’s not the whole story. Real, sustainable change comes from addressing the structural factors at play. How can we ensure the evidence we use is critically examined and effectively applied in our schools?
Great post, Dylan! I would add one more thing that I have started to watch out for - intervention vs whole class teaching. It’s much easier to get a decent effect size in an intervention when you can target precisely rather than in whole class teaching when you have to cater to everyone. However, I guess around 90% or higher of what teachers do is whole class teaching. There’s only a limited budget and time for interventions so it’s really not comparing like with like.
That’s a good point! The lines are a bit blurry because sometimes research moves from one to the other; a lot of classroom research started in the special education world. But I agree, the highest leverage area for growth has to be whole-class teaching, so we really need to know what works in that context.
I saved this article for future reference. If I were a principal, I’d have all my staff read this at the beginning of the year before the professional development season kicks off. Kudos for such a clear and thorough explanation of how to make sense of research and see through the nonsense.
The part about comparing an intervention to nothing: that’s literally every single study on i-Ready. 45 extra minutes a week of math instruction is better than 0 extra minutes? No way!
What’s maddening to me is how quickly leadership accepts these studies and sales pitches without any critical thinking.
Great piece!
Thanks!
Yeah one problem I've seen repeatedly in school leadership is not thinking clearly about opportunity cost. Most things we could spend time on will contribute to learning. The question is, what's the best use of that time? (And, relatedly, what things can we stop doing to create more time that can be used well.)
Exactly. Terrance M. Scott, who approaches this from a behavioral standpoint, uses a great analogy for that here https://youtu.be/P7YB5g9oYqo?t=2707&si=NFK14IAc1uYJq7tz
The iReady reference made me think of this study, TNTP's Coherence by Design, and its point that intervention systems that lack coherence can unintentionally do harm: “The lowest performing students made less progress in Tier 3 intervention than similarly low performing peers who received no intervention at all.”
These are all so real! Great post with lots of food for thought. Another one I see is using a very select sample of students to generalize inappropriately about the population of all students. There's a great Math Medic lesson about which populations you can actually generalize results to based on how the sampling was done. There are many studies that can only reasonably be generalized to a narrow population from a particular school, state, grade level etc. that might not be representative of all students in the country, for example, but the headlines of the study will make broad claims about American students.
Interested in your take on the lurking variables behind the correlation between teacher belief in students and student achievement. I think your take is correct but possibly incomplete. I think there are also times where teacher belief in students impacts their instructional choices (in terms of scaffolding, task rigor, quantity of practice etc.) that can also impact student learning.
A little too real sometimes, a lot of this PD was pretty painful.
The generalization point is a good one. That standard actually starts in 7th grade! We look at validity of samples.
re: teacher beliefs <-> student achievement, I'm sure it's bidirectional. Both cause each other. The interesting question to me is: how can we make a change from the present equilibrium in a given school? The implicit hypothesis of the PD I've received is: "we should tell teachers it's important to believe in their students and that John Hattie reviewed studies of 3 million students and said this is the most important thing, and that will make teachers believe in their students more." I am confident that's not an effective intervention.
I would say, "show, don't tell." Pick something small but meaningful that can make a substantive change in school or classroom culture. Be really specific and clear about what the change looks like. Support teachers with the change, troubleshoot along the way, and highlight any successes. If that shift is successful, you are showing teachers that their actions can make a difference for students. Then you hope to kick off a virtuous cycle, where more actions and beliefs follow.
All that is way harder than it sounds. It's also risky: teachers who see that whole thing fall flat over and over again become jaded and are less likely to support future change.
Yes I completely agree that teachers need to see it in action and not be told about it. I actually wrote about this exact topic for my book because it’s such a hard nut to crack (!!) in terms of getting teachers to trust the process and then see the meaningful growth in their students, and like you said, it’s very easy to lose trust when it doesn’t go well. I think it takes a level of intensive coaching that most schools just aren’t down to implement.
Yup. Also your book has a website now! Exciting!
Great piece. I recently submitted a paper for a study about the effect a 45 minute visit to a mobile microscope lab had on student attitude towards science.
Used a pre/post survey. Impact was statistically significant.
Effect size was in the .14 to .2 range, which reviewers were critical of. My reply was similar to your perspective in this post: any impact at all in such a short duration seemed of note to me. Your piece further clarified this and added additional nuance.
Sounds interesting! I agree that the criticism of a small effect size feels like a lazy shortcut. It's hard to say from any one example what that means, but our attitude should be: let's add it to the literature! Let's see what other people can find, then add up all that research to draw some broader conclusions.
That attention span thing is... upsetting. It is a problem that education research and mainstream cognitive psychology research are separate spheres; the intervening vacuum leaves a lot of room for nonsense.
Not that cognitive psychologists never say ridiculous things--but that *particular* thing they would not get away with saying!
I thought I was going to combust when it happened. I had beefed with that same consultant previously and felt pretty demeaned by that response so I didn't speak up about the attention span nonsense. My experience has honestly been pretty negative trying to address a lot of this in the moment.
Considering how mad I am just reading about this, I can only imagine how it felt to be there!
What is so frustrating is that the slightest bit of critical reflection on experience with actual human beings is sufficient to falsify these kinds of claims.
This is definitely an article I wish I had encountered during my teacher training at university, which often presented apparently infallible conclusions about constructivist teaching and learning. There is a lot more to consider and many more perspectives to learn about, so thank you again for your work on this topic.
Thank you!