The lack of reliability is a huge problem in educational research, both at the individual-study level and at the toolkit or summary level. Given most schools in Victoria use evidence at the toolkit or summary level (e.g. the HITs), the focus below is on comparing the various popular organisations.
Validity is also a major problem. Researchers measure a range of things - achievement, behaviour, IQ, engagement, attitudes, etc. Hattie often combines ALL of these different measures into one number - the effect size.
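To make the statistic concrete: an effect size (Cohen's d) is simply the difference between two group means divided by a pooled standard deviation. A minimal sketch with invented test scores (the data are hypothetical, purely for illustration):

```python
import statistics

def cohens_d(treatment, control):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    s1, s2 = statistics.stdev(treatment), statistics.stdev(control)
    # Pooled standard deviation, weighting each group by its degrees of freedom
    pooled_sd = (((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)) ** 0.5
    return (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd

# Hypothetical scores: a class taught with some intervention vs. without
treatment = [72, 75, 78, 80, 83, 85, 88]
control   = [65, 70, 72, 74, 77, 79, 82]

print(round(cohens_d(treatment, control), 2))  # → 1.05
```

The same d can come from an achievement test, an attitude survey or an IQ measure, which is exactly the validity worry: one averaged number hides what was actually measured.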
Reliability - comparing organisations' summaries
Why do they differ significantly & even contradict each other?
For example, Daniel Willingham, a key researcher and advisor to the evidence organisation Deans for Impact, in his influential book "Why Don't Students Like School?", directly contradicts John Hattie's claim in Visible Learning (VL) that it is possible to meaningfully separate and compare individual educational influences.
Willingham says there is "a big gap between research and practice" and influences "cannot be separated in the classroom" as "they often interact in difficult-to-predict ways." He provides the following example,
"...laboratory studies show that repetition helps learning, but any teacher knows that you can’t take that finding and pop it into a classroom by, for example, having students repeat long-division problems until they’ve mastered the process.
Repetition is good for learning but terrible for motivation. With too much repetition, motivation plummets, students stop trying, and no learning takes place. The classroom application would not duplicate the laboratory result." (from the introduction).
Willingham, in his book When Can You Trust the Experts?, also details these warning signs about educational research:
Is the theory related to a single individual or Guru?
Do the findings contradict the collective experience of teachers?
Is the context of the evidence sufficiently similar to ours?
Is the research peer reviewed?
Gilmore et al. (2021), in their report to the Association of Mathematics Teachers regarding the misuse of research, also warn,
"Research is always complex, often contradictory and the findings may be nuanced. It is tempting to simplify these findings to provide a straightforward message. This does a disservice to both researchers and practitioners." (p. 35)
Wadhwa et al. (2023), investigating the recommendations of 12 clearinghouses, conclude,
'Clearinghouses exist to identify “evidence-based” programs, but the inconsistency in their recommendations of the same program suggests that identifying “evidence-based” interventions is still more of a policy aspiration than a reliable research practice.' (abstract)
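The inconsistency Wadhwa et al. describe can be quantified with an inter-rater agreement statistic such as Cohen's kappa across two clearinghouses' ratings of the same programs. A rough sketch with invented ratings (the programs and labels are hypothetical, not Wadhwa et al.'s data):

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    assert len(ratings_a) == len(ratings_b)
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each label's marginal proportions, summed
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    labels = set(ratings_a) | set(ratings_b)
    expected = sum((counts_a[lbl] / n) * (counts_b[lbl] / n) for lbl in labels)
    return (observed - expected) / (1 - expected)

# Hypothetical evidence ratings of the same eight programs by two clearinghouses
house_1 = ["strong", "strong", "moderate", "none", "strong", "none", "moderate", "strong"]
house_2 = ["strong", "moderate", "none", "none", "strong", "moderate", "moderate", "none"]

print(round(cohens_kappa(house_1, house_2), 2))  # → 0.27
```

A kappa near zero means the two "evidence-based" lists agree little better than chance, which is the substance of Wadhwa et al.'s complaint.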
No other organisation even lists Hattie's #1 strategy, Collective Teacher Efficacy, or his previous #1, Self-Reported Grades, let alone ranks them!
Yet Hattie claims more than three years' advanced improvement for students using these strategies.
The simple question is: if education research is so reliable, why are these summaries so different and contradictory?
Interestingly, Hattie, in an interview with Ollie Lovell (June 2018), has done a 180-degree turn with his latest mantra: "the story, the story, the story." He says just looking at the numbers and rankings is too simplistic!
"What's the story, not what's the numbers..."
"that’s why this will keep me in business to keep telling the story…" (Audio here).
Hattie then admits his rankings are misleading and does not rank anymore! (Audio here).
"it worked then it got misleading so I stopped it"
Hattie is critical of other researchers,
On Wiliam Hattie says,
"I'm not even sure there is a concept such as formative or summative assessment." (@ 30min)
On Marzano,
"Throw Bob Marzano's 450 strategies out the window, cause they are all over the place, teaching does not map to learning." (@ 38 min)
2. The What Works Clearinghouse (WWC):
The U.S. Education Department funds a research organisation called the What Works Clearinghouse (WWC), consisting of 20+ distinguished professors and PhD research scientists.
They review research with a focus on the quality of research and have stringent quality control standards, see below.
Most of Hattie's studies would fail the WWC standards!
It is significant that the WWC provide some caution in their recommendations with the 'Level of Evidence' rating. Also, one of the strong features of these practice guides is that they are subjected to rigorous external peer review.
WWC Recommendations:
Their strongest evidence is for "deep explanatory questions", which gives evidence for problem- and inquiry-based teaching. This is at odds with Hattie's rankings and Sweller's cognitive load theory.
WWC compared to Hattie:
Oliver Lovell has used strategies closely related to the U.S. recommendations to improve his Year 12 Maths class score by 1 standard deviation - quite an improvement! Details of what Ollie did here.
The U.S. Dept of Education's maths-specific recommendations:
Prof Robert Slavin also gives another important distinction regarding the WWC,
"The problem is that it is difficult to do reform one teacher at a time. In fact, it is very difficult to even do high-quality program evaluations at the teacher level, and as a result, most programs listed as effective in the What Works Clearinghouse or Evidence for ESSA are designed for use at least in whole grade levels, and often in whole schools. One reason for this is that it is more cost-effective to provide coaching to whole schools or grade levels. Most successful programs provide initial professional development to many teachers and then follow up with coaching visits to teachers using new methods, to give them feedback and encouragement. It is too expensive for most schools to provide extensive coaching to just one or a small number of teachers. Further, multiple teachers working together can support each other, ask each other questions, and visit each other’s classes. Principals and other administrative staff can support the whole school in using proven programs, but a principal responsible for many teachers is not likely to spend a lot of time learning about a method used by just one or two teachers."
3. The Finnish System:
The Director-General Pasi Sahlberg, in Finnish Lessons 2.0, outlines the priorities of the Finnish system:
Teacher training in both subject knowledge and didactics (p. 77).
Mathematics teaching is strongly embedded in curriculum design and teacher education in Finnish primary schools (p. 77).
A focus on welfare policies, e.g. a free healthy lunch for all children (p. 62).
Students have choice,
"Today, students build their own personalized learning schedules from a menu of courses offered in their school or by other education institutions. Studying in upper-secondary school is therefore flexible, and selected courses can be completed at a different pace depending on students’ abilities and life situations" (p. 87).
Systematic counselling and career guidance (p. 87).
School Autonomy,
"little interference by the central education administration in schools’ everyday lives" (p. 88).
Less instructional time and more time for play and recreation (p. 91).
Less teacher workload:
"Lower-secondary teachers’ total weekly working time in Finland was 31.6 hours; that is significantly less than in Australia (42.7 hr), the United States (44.8), England (45.9), Singapore (47.6), Alberta (48.2), or in the surveyed 34 countries on average (38.3)" (p. 91).
Time for staff collaboration,
"...teaching is a holistic profession that combines work with students in the classroom and collaboration with colleagues in the staff room" (p. 93).
Teacher-designed curricula (p. 99).
Systematic care for students with diverse special needs (p. 99).
4. The Education Endowment Foundation (EEF) & E4L:
The EEF is an independent English organisation with funding of around $250 million. It publishes a Toolkit based mostly on good-quality randomised controlled trials (RCTs) and adds a costing feature.
It uses a very different time scale to Hattie: an effect size of d = 0.1 is equivalent to one month's progress, whereas for Hattie an effect size of d = 0.4 is equivalent to 12 months' progress (why the HUGE difference?).
Even though they use the same highly controversial method as Hattie - the meta-meta-analysis (Simpson, 2019; Wrigley & McCusker, 2019; Slavin, 2018) - their rankings of influences are very different to Hattie's.
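The practical consequence of the two scales can be made concrete. This sketch assumes both conversions are linear (an assumption for illustration; neither toolkit publishes its conversion in exactly this form):

```python
def months_eef(d):
    """EEF convention: an effect size of d = 0.1 is roughly one month's progress."""
    return d / 0.1

def months_hattie(d):
    """Hattie's convention: d = 0.4 is roughly one year's (12 months') progress."""
    return d / 0.4 * 12

# The same study result, read through each toolkit's scale
d = 0.4
print(months_eef(d))     # 4 months of progress under the EEF scale
print(months_hattie(d))  # 12 months of progress under Hattie's scale
```

The same d = 0.4 reads as roughly four months of progress on the EEF scale but a full year on Hattie's, which is why "months gained" claims from the two toolkits cannot be compared directly.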
5. The Sutton Trust:
With a focus on good-quality research, the top factors with the strongest evidence of improving pupil attainment are:
1. Teachers’ content knowledge (strong evidence of impact on student outcomes) - a direct contradiction of Hattie's work, where this is ranked #125 with a small effect size of d = 0.09.
2. The quality of instruction (Strong evidence).
3. Classroom climate (Moderate evidence).
4. Classroom management (Moderate evidence).
5. Teacher beliefs (Some evidence).
6. Professional behaviours (Some evidence).
6. Evidence Based Education:
The UK based Evidence Based Education produce their Great Teaching Toolkit.
They produce their Science of Learning Principles:
They also recommend teachers check sources of evidence - Video @1:17:10 here.
8. Cognitive Scientists - Agarwal et al. (2012)
Retrieval practice improves students' grades from a C to an A!
Summary @ 57 minutes Video here.
9. Barak Rosenshine's Principles of Instruction:
Rosenshine's 2012 paper - here.
A number of teachers cite that these principles are the most useful, e.g., Nick Rose,
"One of the problems with the latest research is that the conclusions are necessarily tentative and there’s a good chance that the next researcher might identify something that contradicts it. This leaves teachers with a problem when trying to identify evidence-informed approaches to developing their teaching: is it worth embarking on something involving lots of time and effort, only to discover that researchers change their minds in a year’s time?
One way around this problem is to look for findings that have been triangulated. For example, if we find an outcome in well-controlled, but quite artificial, laboratory experiments and we find the same result in authentic classroom settings, despite all the noisy variables involved, then that likely makes it a good bet to try to implement it in your classroom.
In this paper, Barak Rosenshine reviews different bodies of research, including cognitive science and classroom studies, to identify where the science and practice appear to tell us the same thing about how we might take a research-informed approach to improving our teaching. If you have time to read only one research summary this year, I would recommend this one."
10. Professor Paul Kirschner & Mirjam Neelen:
These cognitive science researchers' recommendations:
11. Professor Jo Boaler:
Jo Boaler's focus is on visual representations of abstract ideas, which seems to directly contradict Hattie's rankings, where 'simulations' and 'visual representations' are ranked very low.
And Jo Boaler's example of simulations:
Although Boaler's use of the mindset research has been criticised - see here.
12. Professor Michael Fullan:
In ACEL Monograph 52 Fullan states,
"there was one finding that stood out as twice as powerful as any other factor in 'effect size' - principals who participated as learners working with teachers to make improvements had twice the impact on school-wide student achievement compared to any other factor" (p. 3).
Note that Fullan is a consultant with a focus on leadership.
13. The Grattan Institute:
Dylan Wiliam also provides some extra useful strategies focused on the teacher. He suggests teachers must want to improve, and teachers need to act as critical friends. The Grattan Institute analysed high-performing international education systems and concluded that one of the reforms responsible for improving student achievement across the four high-performing education systems in East Asia was teachers acting as critical friends. Note the amount of time devoted to feedback, lesson planning, etc. One caveat, though: in these systems class sizes are larger (Grattan (2012), p. 15).
The OECD Teaching and Learning International Survey (TALIS) 2018 also compares teaching time.
"Teachers in Chile, USA and Alberta (Canada) spend 28.5, 28.1 and 27.2 hours respectively in actual teaching compared to 15.8 in Norway, 16.8 in Italy and 17.4 hours in the Netherlands.
Time spent teaching increased by over an hour per week in Australia from 18.6 hours in 2013 to 19.9 hours in 2018. This was the equal 5th largest increase out of 23 OECD countries for which comparable data is available."
14. PISA 2015:
David Didau, in his excellent blog on feedback, pointed out that even though most researchers list feedback as an important teaching strategy, PISA found feedback NEGATIVELY correlated with science performance.
The Negative Influences
From PISA 2015 Volume 2 (p. 228)
15. Others:
The Institute for Effective Education.
The Nuffield Foundation, Key Understandings in Mathematics Learning. Retrieved September 2021, from https://www.nuffieldfoundation.org/project/key-understandings-in-mathematics-learning
Prof Robert Slavin - The Fabulous 20%: Programs Proven Effective in Rigorous Research.
Center for Research and Reform in Education at Johns Hopkins University - Proven Programs. Their quality evidence levels:
Strong evidence: At least one well-designed and well-implemented experimental (i.e., randomized) study.
Moderate evidence: At least one well-designed and well-implemented quasi-experimental (i.e., matched) study.
Promising evidence: At least one well-designed and well-implemented correlational study with statistical controls for selection bias.
16. Student Research on What makes a great teacher:
Azul Terronez surveyed 26,000 students and found great teachers:
1. build positive relationships.
2. are chilled.
3. are good listeners.
4. love to learn.
5. know kids have a life outside of school.
6. notice if kids struggle.
7. sing!
8. are humble and take risks.
His TEDx talk:
The Australian Government Productivity Commission.
In their latest report (2017) they make a number of recommendations regarding educational research.
Firstly, a focus on the quality of research:
"...the gold standard techniques are meta-analyses of randomised controlled trials and individual trials. Such approaches are widely used in health research, but are not routinely used in Australian education research" (p. 17).
Verifying the quality of the research (p. 19):
"A range of processes can be used to ensure the findings from completed research are robust. These include independent validation of the findings, peer review of research, publication of all outputs to enable scrutiny and debate (irrespective of findings), and the provision of project data for secondary analysis."
In assessing whether to improve the quality of existing education data, governments should examine on a case by case basis whether:
• the existing quality of the data is fit for purpose.
• data quality improvements are feasible given the context of data collection.
• other options are available.
• the benefits of improving data quality exceed the costs.
The Australian Government should request and sufficiently fund the agencies that conduct the Longitudinal Study of Australian Children to establish new cohorts at regular intervals.
U.S. Ed Dept Levels of Evidence guide:
The U.S. Education Department states, 'In classifying levels of empirical support for the effectiveness of our recommendations, we have been mindful not only to the issue of whether a study meets the “gold-standard” of a randomized trial but also to the question “Effective as compared to what?” Virtually any educational manipulation that involves exposing students to subject content, regardless of how this exposure is provided, is likely to provide some benefit when compared against no exposure at all. To recommend it, however, the question becomes “Is it more effective than the alternative it would likely replace?” In laboratory studies, the nature of instruction in the control group is usually quite well defined, but in classroom studies, it is often much less clear. In assessing classroom studies, we have placed the most value on studies that involve a baseline that seems reasonably likely to approximate what might be the ordinary practice default' (p. 3).
Full Guide here - https://www2.ed.gov/policy/elsec/leg/essa/guidanceuseseinvestment.pdf
The WWC reviews research with a focus on quality and has stringent quality control standards, e.g.,
As already stated, most of Hattie's studies would fail the WWC standards.