“Our discipline needs to be saturated with critique of ideas; and it should be welcomed. Every paradigm or set of conjectures should be tested to destruction and its authors, adherents, and users of the ideas should face public accountability.” (Hattie, 2017, p. 428).
The peer-reviewed literature is saturated with detailed critiques of Hattie’s work, but most educators do not seem to be aware of them.
My aim is to raise awareness of these critiques and to investigate Hattie’s claims in the spirit of Tom Bennett, the founder of researchED,
‘There exists a good deal of poor, misleading or simply deceptive research in the ecosystem of school debate…
Where research contradicts the prevailing experiential wisdom of the practitioner, that needs to be accounted for, to the detriment of neither but for the ultimate benefit of the student or educator’ (The School Research Lead, p. 9).
The pages of this blog reference over 50 peer reviews which detail a litany of major errors in Visible Learning (VL), e.g.,
Snook, Clark, Harker, O’Neill & O’Neill (2010) – “Potentially misleading.”
Terhart (2011) – is suspicious of Hattie’s economic interests.
Berk (2011) – “Statistical malpractice disguised as statistical razzle-dazzle.”
Higgins & Simpson (2011) – “the process by which this number (effect size) has been derived has rendered it effectively meaningless.”
O’Neill (2012) – Hattie is a Policy Entrepreneur who positions himself politically to champion, shape and benefit from school reform discourses.
Schulmeister & Loviscach (2014) – “Hattie pulls the wool over his audience’s eyes.”
Poulsen (2014) – “Do I believe in Hattie’s results? No!”
Wrigley (2015) – “Bullying by Numbers.”
O’Neill, Duffy & Fernando (2016) – Detail the undisclosed 3rd party payments to Hattie.
Wecker et al. (2016) – “A large proportion of the findings are subject to reasonable doubt.”
Bergeron & Rivard (2017) – “Pseudo-Science… House of Cards.”
Nilholm (2017) – “The Blue Sword” – a fantasy novel.
Nielsen & Klitmøller (2017) – “Neither consistent nor systematic.”
Shanahan (2017) – “potentially misleading.”
See (2017) – “Lives may be damaged and opportunities lost.”
Eacott (2017) – “A cult… a tragedy for Australian School Leadership.”
Slavin (2018) – “Hattie is wrong.”
McKnight & Whitburn (2018) – “The Visible Learning cult is not about teachers and students, but the Visible Learning brand.”
Larsen (2019) – “Blindness.”
Wiliam (2019) – “Has absolutely no role in educational policy making.”
Simpson (2011, 2017, 2018, 2019) – “using these ranked meta-meta-analyses to drive educational policy is misguided.”
Dr. Jim Thornton, Professor of Obstetrics and Gynaecology,
“To a medical researcher, it seems bonkers that Hattie combines all studies of the same intervention into a single effect size.”
Another example of the widespread critique of Hattie’s work:
Thomas Aastrup Rømer received the prestigious Nordic Educational Research Association Ahlström Award (2019) for his criticism of John Hattie’s theory of Visible Learning. The Association states,
“…the paper makes a precise and subtle critique of Hattie’s work, hence revealing several weaknesses in the methods and theoretical frameworks used by Hattie. Rømer and his critical contribution inform us that we should never take educational theories for granted; rather, educational theories should always be made subject to further research and debate.”
Hattie’s typical defense is to agree with the critiques but continue on anyway without changing the specific items that were critiqued. A good example is his June 2018 podcast with Ollie Lovell, where, regarding ranking, Hattie said,
“it worked then it got misleading so I stopped it”.
Yet in his commercial partner Corwin’s webinars (2018-2019), Hattie continues to rank.
Does Hattie faithfully represent the research?
Most people assume he does, but a brief look at Hattie’s representation of the class size research should raise some questions.
In 2005 Hattie got the attention of educational administrators by labelling ‘reducing class size’ a disaster, and later as going backwards (2005 ACER Lecture & VL, p. 250). He continued with Pearson (2015), naming ‘reducing class size’ as one of the major distractions, and then again in the TV series Revolution School, claiming that reducing class size does not make a difference to the quality of education.
The major class size study that Hattie used was by Glass & Smith (1979), who summarise their data in a graph and table:
The trend and the difference between good and poor quality research are clearly displayed. Glass & Smith conclude (p. 15),
‘The curve for the well-controlled studies then, is probably the best representation of the class-size and achievement relationship…
A clear and strong relationship between class size and achievement has emerged… There is little doubt, that other things being equal, more is learned in smaller classes.’
Hattie stated in Blatchford (2016, p. 106),
“Glass and Smith (1978) reported an average effect of 0.09 based on 77 studies…”
I analysed Glass & Smith (1978) but could not find an average of 0.09 reported anywhere!
I contacted Prof Glass to ensure I had interpreted his study correctly. He kindly replied,
‘Averaging class size reduction effects over a range of reductions makes no sense to me.
It’s the curve that counts.
Reductions from 40 to 30 bring about negligible achievement effects. From 20 to 10 is a different story.
But Teacher Workload and its relationship to class size is what counts in my book.’
Bergeron (2017) reiterates,
‘Hattie computes averages that do not make any sense.’
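To make the point about averaging over a curve concrete, here is a minimal Python sketch. The curve `hypothetical_achievement` and all the numbers are assumptions invented purely for illustration; they are not Glass & Smith's fitted model or data. The sketch only shows that when the relationship is curved, one average over several reductions describes none of them well.

```python
def hypothetical_achievement(class_size: int) -> float:
    """Illustrative convex curve only (an assumption, NOT Glass & Smith's model)."""
    return 10.0 / class_size


def reduction_effect(from_size: int, to_size: int) -> float:
    """Gain in the hypothetical achievement score when class size is reduced."""
    return hypothetical_achievement(to_size) - hypothetical_achievement(from_size)


# Three reductions of the same absolute size (10 students each).
reductions = [(40, 30), (30, 20), (20, 10)]
effects = {pair: reduction_effect(*pair) for pair in reductions}

# A single average across the whole range of reductions.
average_effect = sum(effects.values()) / len(effects)

for (start, end), effect in effects.items():
    print(f"{start} -> {end}: {effect:.2f}")
print(f"average: {average_effect:.2f}")
```

On this made-up curve the 20 to 10 reduction is six times larger than the 40 to 30 reduction, so the single average overstates one and understates the other.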
Prof Peter Blatchford, in Class Size: Eastern and Western Perspectives (2016), states,
‘Given the important influence these [Hattie & others] reports seem to be having in government and regional education policies, they need to be carefully scrutinised in order to be sure about the claims that are made’ (p. 93).
Prof Scott Eacott (2017), in School Leadership and the cult of the guru: the neo-Taylorism of Hattie, describes the uncritical worship of Visible Learning as,
‘a tragedy for Australian school leadership’ (p. 1).
Prof Adrian Simpson’s detailed analysis of the calculation of effect sizes, The misdirection of public policy: comparing and combining standardised effect sizes, states,
‘The numerical summaries used to develop the toolkit (or the alternative ‘barometer of influences’: Hattie 2009) are not a measure of educational impact because larger numbers produced from this process are not indicative of larger educational impact. Instead, areas which rank highly in Marzano (1998), Hattie (2009) and Higgins et al. (2013) are those in which researchers can design more sensitive experiments.
As such, using these ranked meta-meta-analyses to drive educational policy is misguided’ (p. 451).
Prof Dylan Wiliam, in Leadership for teacher learning, concludes that,
‘…right now meta‐analysis is simply not a suitable technique for summarizing the relative effectiveness of different approaches to improving student learning…’
Again, Prof Wiliam (2019) writes,
‘…the entire project of evidence-based education can never be successful. Any claims about “what works” are necessarily local, in that they are limited to the participants and contexts actually studied and judgement will be needed to apply them in other settings…
If educational research is to contribute to the improvement of practice, teachers, school leaders, and policymakers will have to become critical consumers of educational research.’
McKnight & Whitburn (2018), in Seven reasons to question the hegemony of Visible Learning, ask under the heading ‘Visible Learning courts fascism’,
‘Should a single doctrine ever govern teachers, classrooms and schools? If we, the authors, were to develop a challenge to Visible Learning, should we be delighted to trademark it, copyright its materials and instigate a counter army of Not-so-Visible Learning acolytes? How often do professional learning programs on Visible Learning include critiques of Visible Learning? We have sought and found no evidence of reflexivity’ (p. 15).
Prof Terry Wrigley (2015), in Bullying by Numbers, critiquing the Education Endowment Foundation (EEF) in particular but also Hattie, writes,
‘Teachers need to speak back to power, and one useful tool is to point to flaws in the use of data’ (p. 3).
‘Bullying by numbers has a restrictive effect on education, leads to superficial learning, and is seriously damaging teachers’ lives’ (p. 6).
Berk (2011),
‘One should applaud the view that public policy is to be based on evidence. However, what qualifies as evidence, let alone strong evidence, is too often left unspecified. Into this vacuum has been drawn a mix of evaluations ranging from excellent to terrible.
…the importance of meta-analysis for estimating causal effects has been grossly overrated. A conventional literature review will often do better. At the very least, readers will not be swayed by statistical malpractice disguised as statistical razzle-dazzle’ (p. 199).
Goldacre (2008) on meta-analysis in education,
‘I think you’ll find it’s a bit more complicated than that.’
In a recent interview with Ollie Lovell (June 2018), Hattie did a 180-degree turn: he admits his rankings are misleading and says he does not rank anymore (at 1 hr 21 min 45 sec),
‘it worked then it got misleading so I stopped it’.
In the same interview, Hattie does a very clever re-brand using the mantra “the story, the story, the story…”, but there are major contradictions with this.
This blog is broken up into different pages, designed so you can easily go to what interests you most.
In his interview with Hanne Knudsen (2017), John Hattie: I’m a statistician, I’m not a theoretician, Hattie states,
‘What I find fascinating is that since I first published this back in the 1990s, no one has come up with a better explanation for the data…
I am updating the meta-analysis all the time; I am up to 1400 now. I do that because I want to be the first to discover the error, the mistake’ (p. 7).
I find these comments hard to reconcile since, as you will see, many scholars have published peer reviews identifying significant problems in Hattie’s work and have called into question his entire model.
I also recommend teachers look at the section A Year’s Progress? It analyses what I think is Hattie’s most dangerous idea: that an effect size of 0.4 = 1 year’s student progress.
Contributions are welcome. Many of the controversial influences only have 1-3 meta-analyses to read.
The peer reviews have documented significant issues with Hattie’s work, ranging from flawed methodology, calculation errors, and misrepresentation to questionable inference and interpretation.
Simpson (2017) and Bergeron (2017) detail methodological differences showing the effect size for the SAME experiment can differ enormously (0 to infinity!) depending on how it is calculated. So comparing effect sizes across different studies is meaningless!
Glass (1977) and Slavin (2016) also show this, with Prof Slavin concluding,
‘These differences say nothing about the impact on children, but are completely due to differences in study design.’
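As a rough illustration of how study design alone can move an effect size, the sketch below computes a standardised mean difference (Cohen's d with a pooled standard deviation) for two made-up samples. The scores are hypothetical and the choice of mechanism (a wide-ability sample versus a narrow-ability sample) is my assumption for illustration; it is not a reproduction of Simpson's, Bergeron's, or Slavin's calculations. In both samples the intervention adds exactly 5 points to every student.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treatment: list[float], control: list[float]) -> float:
    """Standardised mean difference using the pooled sample standard deviation."""
    n_t, n_c = len(treatment), len(control)
    pooled_variance = (
        (n_t - 1) * stdev(treatment) ** 2 + (n_c - 1) * stdev(control) ** 2
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_variance)


# Hypothetical test scores: a sample with a wide spread of ability...
control_wide = [40, 50, 60, 70, 80]
treated_wide = [score + 5 for score in control_wide]

# ...and a sample with a narrow spread of ability. Same 5-point raw gain.
control_narrow = [56, 58, 60, 62, 64]
treated_narrow = [score + 5 for score in control_narrow]

d_wide = cohens_d(treated_wide, control_wide)
d_narrow = cohens_d(treated_narrow, control_narrow)

print(f"wide sample:   d = {d_wide:.2f}")
print(f"narrow sample: d = {d_narrow:.2f}")
```

The raw gain is identical, yet the narrow sample yields an effect size five times larger, simply because the divisor (the spread of scores) is smaller.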
Misrepresentation, calculation errors, questionable inference, and interpretation occur in a variety of ways. The most serious is Hattie’s use of studies that do not measure what he claims they do. This occurs in 3 ways:
Firstly, many studies do not measure achievement but something else, e.g., IQ, hyperactivity, behavior, and engagement. See Student Achievement for more details.
Secondly, most studies do not compare groups of students that control for the particular influence that Hattie claims. There is a litany of examples, e.g., self-report grades, reducing disruptive behavior, welfare, diet, Teacher Training, Mentoring, etc.
‘in addition to mixing multiple and incompatible dimensions, Hattie confounds two distinct populations:
1) factors that influence academic success and
2) studies conducted on these factors.’
Lervåg & Melby-Lervåg (2014) also raise this issue,
‘Hattie has not investigated how a concrete measure tested in school affects the students’ skills, but the connection between different relationships.’
Thirdly, Hattie used ONE average to represent each meta-analysis, yet each meta-analysis represented from 4 up to 4000 studies (Marzano).
But, apart from giving equal weight to each average, the big question is,
What does ONE average mean? (no pun intended).
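The equal-weighting issue can be sketched in a few lines of Python. The two meta-analyses and their mean effects below are hypothetical; only the study counts of 4 and 4000 echo the range mentioned above. Weighting by number of studies is used here as a simple assumption for illustration (meta-analysts more commonly weight by inverse variance).

```python
# Hypothetical meta-analyses: labels and mean effects are invented for illustration.
meta_analyses = [
    {"label": "small meta-analysis", "studies": 4, "mean_effect": 0.90},
    {"label": "large meta-analysis", "studies": 4000, "mean_effect": 0.10},
]

# Equal weight to each meta-analysis average.
unweighted_mean = sum(m["mean_effect"] for m in meta_analyses) / len(meta_analyses)

# Weight each average by the number of studies behind it.
total_studies = sum(m["studies"] for m in meta_analyses)
weighted_mean = (
    sum(m["mean_effect"] * m["studies"] for m in meta_analyses) / total_studies
)

print(f"unweighted mean of averages: {unweighted_mean:.2f}")
print(f"study-weighted mean:         {weighted_mean:.2f}")
```

With these made-up inputs the equally weighted figure is 0.50 while the study-weighted figure is about 0.10, so the headline number depends heavily on a weighting choice. Neither figure answers the deeper question of what a single average means.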
‘there is (or seems to be) an intrinsic conflict – perhaps even a logical contradiction and a paradox – in the set-up of Visible Learning for Teachers (Hattie 2012), and in the book’s backbone arguments and pedagogical advice. Proudly and stoutly as a devoted king of statistics Hattie presents his overwhelming 240 million data analyses, but a vigilant reader will notice that he is essentially a practitioner and a thinker who proclaims that each teacher must have an eye for the unique student.
Hattie does not see and does not want to know that the life and thought of this very student cannot be generalized and transformed into a best-practice-induced ideal type. Therefore the student is rather likely to disappear the more the meta-studies accumulate and pile up – and the more they get transformed into universal clues and keys for many nations’ educational political actions’ (p. 6).
Wrigley writes,
‘What now stands proxy for a breadth of evidence is statistical averaging. This mathematical abstraction neglects the contribution of the practitioner’s accumulated experience, a sense of the students’ needs and wishes, and an understanding of social and cultural context…
When ‘evidence’ is reduced to a mean effect size, the individual person or event is shut out, complexity is lost and values are erased’ (p. 360).
Wrigley goes on to quote Gene Glass,
‘Indeed, Gene Glass, who originated the idea of meta-analysis, issued this sharp warning about heterogeneity: “Our biggest challenge is to tame the wild variation in our findings not by decreeing this or that set of standard protocols but by describing and accounting for the variability in our findings. The result of a meta-analysis should never be an average; it should be a graph.” (Robinson, 2004: 29)’ (p. 367).
The next major problem is the moderating variables.
Prof Dylan Wiliam casts significant doubt on Hattie’s entire model by arguing that the age of the students and the time over which each study runs are important components contributing to the effect size.
Professor Dylan Wiliam summarises,
‘the effect sizes proposed by Hattie are, at least in the context of schooling, just plain wrong. Anyone who thinks they can generate an effect size on student learning in secondary schools above 0.5 is talking nonsense.’
The massive data collected to construct the United States Department of Education effect size benchmarks support Prof Wiliam’s contention.
These show a huge variation in effect sizes from younger to older students. This demonstrates that age is a HUGE moderating variable: in order to compare effect sizes, studies need to control for the age of the students and the time over which the study ran. Otherwise, differences in effect size can simply be due to the age of the students measured!
‘The model I will present… may well be speculative, but it aims to provide high levels of explanation for the many influences on student achievement as well as offer a platform to compare these influences in a meaningful way… I must emphasise that these ideas are clearly speculative’ (VL, p. 4).
The effect size is supposed to measure the change in student achievement, a controversial topic in and of itself (there are many totally different concepts of what achievement is).
Examples of Peer Reviews:
Blatchford et al. (2016) state that Hattie’s comparing of effect sizes,
‘is not really a fair test’ (p. 96).
Wecker et al. (2016) conclude,
‘the methodological claims arising from Hattie’s approach, and the overall appropriateness of this approach suggest a fairly clear conclusion: a large proportion of the findings are subject to reasonable doubt’ (p. 35).
Bergeron (2017) concludes,
‘When taking the necessary in-depth look at Visible Learning with the eye of an expert, we find not a mighty castle but a fragile house of cards that quickly falls apart…
To believe Hattie is to have a blind spot in one’s critical thinking when assessing scientific rigour. To promote his work is to unfortunately fall into the promotion of pseudoscience. Finally, to persist in defending Hattie after becoming aware of the serious critique of his methodology constitutes willful blindness.’
Prof Terry Wrigley (2015) in Bullying by Numbers,
‘Its method is based on stirring together hundreds of meta-analyses reporting on many thousands of pieces of research to measure the effectiveness of interventions.
This is like claiming that a hammer is the best way to crack a nut, but without distinguishing between coconuts and peanuts, or saying whether the experiment used a sledgehammer or the inflatable plastic one that you won at the fair’ (p. 5).
Dr. Neil Hooley, in his review of Hattie, talks about the complexity of classrooms and the difficulty of controlling variables,
‘Under these circumstances, the measure of effect size is highly dubious’ (p. 44).
Schulmeister & Loviscach (2014) Errors in John Hattie’s “Visible Learning”,
‘To think that didactics can be presented as a clear ranking order of effect sizes. It is a dangerous illusion. To an extreme degree, the effect of a specific intervention depends on the circumstances. Focusing on the mean effect sizes and ignoring their considerable variations and condensing the data to a seeming exact ranking order, Hattie pulls the wool over his audience’s eyes.’
Rømer (2016) in Criticism of Hattie’s theory about Visible learning,
‘On the whole, Visible Learning is not a theory of learning in its own right, nor is it an educational theory. Visible learning, on the other hand, is what happens when pedagogy and learning are exposed to a relatively unexplained evaluation theory’ (p. 1, translated from Danish).
‘The first among several flaws in Hattie’s book: what is an effect?
John Hattie never explains what the substance of an effect is. What is an effect’s ontology, its way of being in the world? Does it consist of something as simple as a correct answer on a multiple-choice task, the absence of arithmetic and spelling errors? And may all the power of teaching and learning processes (including abstract and imaginative thinking, history of ideas and concepts, historical knowledge, dedicated experiments, hands-on insights, sudden lucidity, social and language criticism, profound existential discussions, social bonding, and personal, social, and cultural challenges) all translate into an effect score without loss? Such basic and highly important philosophical and methodological questions do not seem to concern the evidence preaching practitioner and missionary Hattie.
Figures taken out of contexts say absolutely nothing, and Hattie never contextualizes his procedures’ (p. 3);
‘Teachers get identified as the primary and indispensable learning factor and thereby as a public, expensive, and untrustworthy potential enemy. This amounts to scapegoat projection par excellence… ‘ (p. 11).
‘The concluding remark must be that the advantage of John Hattie’s evidence credo is that is so banal, mundane and trivial that even educational planners and economists can understand it’ (p. 12).
Prof John O’Neill wrote a detailed letter to the New Zealand Education Minister,
‘At the very least, the problems below should give you and your officials pause for thought rather than unquestioningly accepting Professor Hattie’s research at face-value, as appears to have been the case.’
Nilholm, Claes (2017) Is John Hattie in Blue Sword? (The Blue Sword was a fantasy novel),
‘Hattie provides very scarce information about his approach. This makes it very difficult to replicate his analyses. The ability to replicate an analysis is considered by many as a crucial determinant of scientific work…
there is some evidence that his thoughts lead in many ways in the wrong direction’ (p. 3).
Dr. Mandy Lupton on Problem Based Learning,
‘The studies have different effect sizes for different contexts and different levels of schooling, thus averaging these into one metric is meaningless.’
Poulsen (2014) in John Hattie: A Revolutionary Educational Researcher?
‘Do I believe in Hattie’s results? No! I do not dare it’ (p. 6).
Schulmeister & Loviscach (2014) Errors in John Hattie’s “Visible Learning”,
‘If one corrects the errors mentioned above, list positions take big leaps up or down. Even more concerning is the absurd precision this ranking conveys. It only shows the averages of effect sizes but not their considerable variation within every group formed by Hattie and even more so within every individual meta-analysis.’
Dr. Jim Thornton, Professor of Obstetrics and Gynaecology at Nottingham University, said,
‘To a medical researcher, it seems bonkers that Hattie combines all studies of the same intervention into a single effect size. Why should “sitting in rows”, for example, have the same effect on primary children as on university students, on maths as on art teaching, on behaviour outcomes as on knowledge outcomes? In medicine it would be like combining trials of steroids to treat rheumatoid arthritis, effective, with trials of steroids to treat pneumonia, harmful, and concluding that steroids have no effect! I keep expecting someone to tell me I’ve misread Hattie.’
Why has Hattie become so popular?
In his excellent analysis in School Leadership and the cult of the guru: the neo-Taylorism of Hattie, Professor Scott Eacott says,
‘Hattie’s work has provided school leaders with data that appeal to their administrative pursuits’ (p. 3).
‘The uncritical acceptance of his work as the definitive word on what works in schooling, particularly by large professional associations such as ACEL, is highly problematic’ (p. 11).
McKnight & Whitburn (2018) concur with Eacott,
‘In speaking to teachers, we have found that many have concerns that are similar to ours, but that they are silenced by senior staff in their schools, who have hitched their own branding to particular bandwagons. There are dangers to educational freedoms and to teacher professionalism when schools have paid for pedagogy’ (p. 20).
Wrigley (2018) discusses this problem in detail,
‘It is hardly surprising that Visible learning (Hattie, 2009) and related books are international best-sellers. The prospect of having at your fingertips a summary of all you ever need to know is seductive for busy teachers, school leaders and administrators alike. The graphic device of a dial resembling a car’s speedometer adds to this seductive effect: you can see the effectiveness at a glance. The project is a synopsis of 800 meta-analyses based on over 50,000 separate research studies. Apart from the sheer hubris of the claim to have intelligently analysed such a broad field, we need to be aware of some specific problems’ (p. 368).
Wrigley then details these problems which are referenced throughout this blog.
Professor Gunn Imsen (2011) is also concerned about this,
‘The Hattie fever is held by equally keen politicians, municipal bureaucrats and leaders who strive to achieve quantitative results in their target management systems, which are part of a paper mill that is stifling school Norway. The best medicine against the fever is that Norwegian teachers take back the faith in themselves, their own judgment and trust in their own skills in the work of good teaching for the students. And that the school authorities support them in this.’
Professor Thomas Rømer (2016) concurs in the Danish context (p. 4).
The Rise of the Policy Entrepreneur:
Science begins with skepticism. However, in the hierarchical leadership structures of educational institutions, skeptical teachers are not valued, although, ironically, the skeptical skills of questioning and analysis are valued in students. This paves the way for the many ‘snake oil’ remedies and the rise of policy entrepreneurs who ‘shape and benefit from school reform discourses’.
Professor John O’Neill in analysing Hattie’s influence on New Zealand Education Policy describes the process well:
‘public policy discourse becomes problematic when the terms used are ambiguous, unclear or vague’ (p. 1).
[The] ‘discourse seeks to portray the public sector as ‘ineffective, unresponsive, sloppy, risk-averse and innovation-resistant’ yet at the same time it promotes celebration of public sector ‘heroes’ of reform and new kinds of public sector ‘excellence’.
Relatedly, Mintrom (2000) has written persuasively in the American context, of the way in which ‘policy entrepreneurs’ position themselves politically to champion, shape and benefit from school reform discourses’ (p. 2).
McKnight & Whitburn (2018), in Seven reasons to question the hegemony of Visible Learning, are also concerned about Hattie’s portrayal in the TV series Revolution School as,
‘the potential saviour of public education and redeemer of recalcitrant teachers’ (p. 2).
They also question the financial conflict of interest of Visible Learning,
‘Where are the flows of capital around Visible Learning? Where is capital and what kinds of capital are accruing for those producing “Visible Learning” as a brand? What material and financial benefits flow on to teachers and students?’ (p. 6).
Terhart (2011) notes,
‘A part of the criticism on Hattie condemns his close links to the New Zealand Government and is suspicious of his own economic interests in the spread of his assessment and training programme (asTTle).’
Professor Gene Glass, with 20 other distinguished academics, also concurs with John O’Neill in 50 Myths and Lies That Threaten America’s Public Schools: The Real Crisis in Education,
‘The mythical failure of public education has been created and perpetuated in large part by political and economic interests that stand to gain from the destruction of the traditional system. There is an intentional misrepresentation of facts through a rapidly expanding variety of organizations and media that reach deep into the psyche of the nation’s citizenry. These myths must be debunked. Our method of debunking these myths and lies is to argue against their logic, or to criticize the data supporting the myth, or to present more credible contradictory data’ (p. 4).
We need to move from evidence to QUALITY of evidence:
There must now be at least some hesitation in accepting Hattie’s work as the definitive statement on Teaching.
Beng Huat See, in her paper, Evaluating the evidence in evidence-based policy and practice: Examples from systematic reviews of literature, suggests the direction where educational research must now go,
‘This paper evaluates the quality of evidence behind some well-known education programmes… It shows that much of the evidence is weak, and fundamental flaws in research are not uncommon. This is a serious problem if teaching practices and important policy decisions are made based on such flawed evidence.
Lives may be damaged and opportunities missed.
…funders of research and research bodies need to insist on quality research and fund only those that meet the minimum quality criteria.’
The debate must now shift from Evidence to Quality of Evidence.
The US Dept of Education has done this and has developed clearly defined quality criteria in its What Works Clearinghouse.
Most of the meta-analyses that Hattie used would NOT satisfy these quality criteria.
Medical researchers decide to use Hattie’s methods for ranking influences on people’s Health. Current results are:
| Influence | Effect size |
|---|---|
| Self-report health (expectation) | 1.44 |
| Number of beds in ward | 0.30 |
| Govt versus Private Hospital | 0.03 |
A Teacher’s Lament:
Gabbie Stroud resigned from her teaching position and wrote:
‘Teaching – good teaching – is both a science and an art. Yet in Australia today [it]… is considered something purely technical and methodical that can be rationalised and weighed.
But quality teaching isn’t borne of tiered ‘professional standards’. It cannot be reduced to a formula or discrete parts. It cannot be compartmentalised into boxes and ‘checked off’. Good teaching comes from professionals who are valued. It comes from teachers who know their students, who build relationships, who meet learners at their point of need and who recognise that there’s nothing standard about the journey of learning. We cannot forget the art of teaching – without it, schools become factories, students become products and teachers: nothing more than machinery.’
And Dylan Wiliam, after watching a Bill Moyers interview (A World of Ideas, 1989) in which Sara Lawrence-Lightfoot defined effective teaching as ideas conveyed through relationships (p. 159), says,
“probably the best 4-word definition I have seen.”
Whilst it may be simpler and easier to see teaching as a set of discrete influences, the evidence shows that these influences interact in ways in which no-one, as yet, can quantify. It is the combining of influences in a complex way that defines the ‘art’ of teaching.