Effect size: d = 0.73 (Hattie's rank = 10).

Nielsen & Klitmøller (2017), in 'Blind spots in Visible Learning - Critical comments on the "Hattie revolution"', discuss in detail the many problems with Hattie's synthesis of feedback studies.

They use Hattie's definition of feedback,
'... feedback is information provided by an agent (e.g., teacher, peer, book, parent, or one’s own experience) about aspects of one’s performance or understanding. For example, a teacher or parent can provide corrective information, a peer can provide an alternative strategy, a book can provide information to clarify ideas, a parent can provide encouragement, and a learner can look up the answer to evaluate the correctness of a response. Feedback is a “consequence” of performance' (VL, p174).
'In summary, feedback is what happens second, is one of the most powerful influences on learning, occurs too rarely...' (VL, p178). 
They then detail a significant problem with Hattie's work in general, but with the influence of feedback in particular: the differing definitions of variables,
'it is our assessment that in four of the five "heaviest" surveys mentioned in connection with Hattie's coverage of feedback, it is conceptually unclear whether they operate with a feedback term identical to Hattie's' (see also Blichfeldt, 2011) (p11, translated from Danish).
Furthermore, they state,
'The breadth of the phenomenon of feedback varies clearly in the meta-analyses used. In one of the meta-analyses (i.e., Kluger & DeNisi, 1996) the term feedback intervention is used, covering a very wide range of feedback situations, while another meta-analysis (i.e., Swanson & Lussier, 2001) examines something that Hattie calls feedback in dynamic assessment (a way of helping students during a test situation)' (p10ff).
They then go into more detail,
'... we will look more closely at five of the meta-analyses that Hattie builds his calculation on... Hattie's feedback area consists of 23 meta-analyses including 67,931 people; five of them are especially heavy because they include 62,761 people, corresponding to 92 percent of the total sample' (p11).
This also shows the major issue of how to weight studies: different weightings yield very different effect sizes (see Effect Size).
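To see how much weighting matters, here is a minimal numerical sketch (the effect sizes and sample sizes are hypothetical, not Hattie's actual data) comparing an unweighted mean of effect sizes with a sample-size-weighted mean. Four small studies with large effects are combined with one very large study reporting a modest effect, loosely echoing how Kluger & DeNisi's d = 0.38 dominates the sample:

```python
# Hypothetical effect sizes (d) and sample sizes for five meta-analyses.
effect_sizes = [0.95, 0.90, 0.85, 0.80, 0.38]
sample_sizes = [500, 400, 300, 250, 10000]

# Simple average: every meta-analysis counts equally.
unweighted = sum(effect_sizes) / len(effect_sizes)

# Weighted average: each meta-analysis counts in proportion to its sample.
weighted = sum(d * n for d, n in zip(effect_sizes, sample_sizes)) / sum(sample_sizes)

print(f"Unweighted mean d = {unweighted:.2f}")       # 0.78
print(f"Sample-weighted mean d = {weighted:.2f}")    # 0.44
```

With these invented numbers the headline figure nearly halves depending on the weighting scheme, which is precisely why the choice of weights needs to be explicit and defensible.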

They define their criteria for examination of the studies (p11):

1. Are the surveys valid - do they measure what Hattie says they measure, i.e., feedback?

2. Are the meta-analyses transparent, so it is possible to examine the quality of the individual studies in the meta-analyses?

3. Do the studies use randomized control groups? Studies that use control groups are of higher quality.

Their assessment of the five "heavy" meta-analyses (Author - Validity - Transparency - Control groups?):

Lysakowski & Walberg, 1980 - Focus on reinforcement techniques, which are not clearly defined; unclear relation to feedback. Low validity. Transparency: Low. Control groups: No.

Lysakowski & Walberg, 1982 - Focus on corrective feedback; unclear relation to feedback in educational situations (feedback in connection with a test situation, but not in connection with instruction and teaching). Low validity. Transparency: High. Control groups: Yes.

Kluger & DeNisi, 1996 - Feedback intervention consistent with Hattie's definition; focus on feedback in educational situations. High validity. Transparency: High. Control groups: Yes.

Witt, Wheeless & Allen, 2006 - Feedback for the teacher, not students (does the student benefit from the teacher receiving feedback?). Low validity. Transparency: Low. Control groups: No.

Swanson & Lussier, 2001 - Focus on dynamic assessment; unclear relation to feedback in educational situations. Low validity. Transparency: High. Control groups: Not consistent.

They conclude,
'...our analysis shows that transparency is low in two out of five studies; also, only one of the five studies consistently works with a control group design.
... the study by Kluger and DeNisi (1996), which Hattie (VL, p175) denotes "the most systematic study addressing the effects of various types of feedback", has an effect size of d = 0.38 - i.e., a much lower impact assessment than the 0.73 ... moreover, 32 percent of the surveys included in Kluger and DeNisi's study show a negative effect on the learning process - which is, furthermore, contrary to Hattie's assumption that "almost everything works".
Kluger and DeNisi (1996) therefore describe feedback as a double-edged sword that can lead to the student either learning significantly more or significantly less' (p11).

One of their pertinent observations is that many of the studies produce negative effects. They quote Shute (2008): 'Within this large body of feedback research, there are many conflicting findings and no consistent pattern of results.' Professor Dylan Wiliam confirms this, saying 40% of studies show feedback has a negative effect (see video below).
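A small numerical sketch (with invented effect sizes, not data from any of the cited studies) shows how an average effect can look comfortably positive even when 40% of the underlying studies are negative, which is the pattern Wiliam describes:

```python
# Ten hypothetical study effect sizes: six positive, four negative (40%).
studies = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, -0.2, -0.3, -0.1, -0.4]

mean_d = sum(studies) / len(studies)
negative_share = sum(1 for d in studies if d < 0) / len(studies)

print(f"Mean d = {mean_d:.2f}")                       # 0.29
print(f"Share of negative studies = {negative_share:.0%}")  # 40%
```

The single averaged number conceals that, in this invented sample, feedback harmed learning in four studies out of ten - exactly the information a teacher deciding how to give feedback would need.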

Dylan Wiliam on other problems with feedback research - most of the studies are on university students, and 85% of the feedback is ONE event lasting minutes!

Schulmeister & Loviscach (2014), in Critical comments on the study "Making learning visible" (Visible Learning), also detail many problems with Hattie's analysis of feedback, in particular the problem of averaging many different studies on different target groups (teachers and students) using very different feedback mechanisms. For example, the Standley (1996) study is about the impact of music on behavioral interventions. They conclude,
'Only in a very broad sense has this study something to do with feedback; it is behavioristic reinforcement' (p9).
Prof Terry Wrigley (2015), in Bullying by Numbers, critiques the EEF in particular but also Hattie:
'Specifically, on Feedback, the Toolkit provides some more specific references to back up its very general claims, but many of these are over 20 years old and currently unobtainable. Seven more detailed references are given, each with an ‘effect size’, but these range from .97 to .20. Which is to be believed? Summaries follow, in highly technical language, mostly without indicating which stage or subject, what kind of learning, what kind of feedback, which countries the research took place in, and so on. Some of the sources are very critical of particular types of feedback... 
Meta-analyses are used in Medicine to enable researchers to complement the reading of other research, though not to substitute for it; for example, if experiments have been based on small samples, averaging the results can suggest a general trend. 
But the medical literature contains serious warnings against the misuse of meta-analysis. Statisticians are warned not to mix together different treatments, types of patient or outcome measures – the ‘apples and pears’ problem. If the original results differ strongly, they are advised to highlight the difference, not provide a misleading average. This is exactly what has not happened in the Toolkit, which should never have provided an average score for “Feedback” since the word has so many meanings' (p6).

David Didau in his excellent blog on feedback looks at the key studies used by Hattie and the EEF and confirms Wiliam's analysis and also shows feedback is complicated and nuanced.

He also pointed out that even though most researchers have feedback as an important teaching strategy, PISA has feedback NEGATIVELY correlated with Science performance.

The Negative Influences

From PISA 2015 Volume 2 (page 228).

Wiliam, in the above video, quotes from the study Examining formative feedback in the classroom context: New research perspectives (January 2013) by M.A. Ruiz-Primo and M. Li.

They reviewed over 9,000 studies on feedback (most of the studies Hattie used) and decided only 238 were of high enough quality to use (p217). But they also warn that,
'only 131 studies, or 4%, were considered appropriate for reaching some type of valid conclusion based on their selection criteria' (p218).
They detailed that the studies differed in many respects and needed to be coded for these nuances (p217):

(1) who provides the feedback (e.g., teacher, peer, self, or technology-based),

(2) the setting in which the feedback is delivered (e.g., individual student, small group, or whole class), 

(3) the role of the student in the feedback event (e.g., provider or receiver), 

(4) the focus of the feedback (e.g., product, process, self-regulation for cognitive feedback; or goal orientation, self-efficacy for affective feedback), 

(5) the artifact used as evidence to provide feedback (e.g., student product(s) or process),

(6) the type of feedback provided (e.g., evaluative, descriptive, or holistic), 

(7) how feedback is provided or presented (e.g., written, oral, or video), 

(8) reference of feedback (e.g., self, others, or mastery criteria), and 

(9) feedback occurrence in the study (e.g., one time or multiple times; or with or without pedagogical use).

They conclude,
'Clearly, the range of feedback definitions is wide. This leads to the question of how it is possible to identify patterns of results from such a small number of studies clustered around each definition. Or how is it possible to argue for feedback effects without considering the nuances and differences among the studies?' (p217).
They detail the quality issues, which we consistently see in Hattie's synthesis,
'A high percentage of papers investigating the impact of feedback did so without using a control group... 
Confounded effects, rarely mentioned in the synthesis and meta-analyses, pose another threat to validity when interpreting results of feedback studies' (p218).
'most of the studies do not provide information about the reliability and validity of the instruments used to measure the effects of feedback on the selected outcomes. The validity of feedback studies is threatened by a failure to attend to the technical characteristics of the instruments used to measure learning outcomes... Given these measures with ambiguity in technical soundness, can we fully trust results reported in synthesis and meta-analyses studies?
... there is an issue of ecological validity. For a research study to possess ecological validity and its results to be generalizable, the methods, materials, and setting of the study must sufficiently approximate the real-life situation that is under investigation. Most of the studies reported are laboratory-based or are conducted in classrooms but under artificial conditions (e.g., students were asked to identify unfamiliar uses of familiar objects)...
Furthermore, a high percentage of the studies focus on written feedback, and only a few on oral or other types of feedback, although oral feedback is more frequently observed in teachers’ daily assessment practices (see Hargreaves et al., 2000).
...we argue that formative feedback, when studied in the classroom context, is far more complex than it tends to appear in most studies, syntheses, or meta-analyses. Feedback practice is more than simply giving students feedback orally or in written form with externally or self-generated information and descriptive comments. We argue that feedback that is not used by students to move their learning forward is not formative feedback. We thus suggest that feedback needs to be examined more closely in the classroom setting, which should ultimately contribute to an expanded and more accurate and precise definition' (p219).
Some interesting findings,
'Research has made clear that students hardly read teachers’ written feedback or know how to interpret it (Cowie, 2005a, 2005b)' (p225).
'most of the publications on formative assessment and feedback include examples of strategies and techniques that teachers can use. Most of them, however, do not provide empirical evidence of the impact of these strategies on student learning; nor do they link them to contextual issues that may affect the effectiveness of the strategies...
there is a lack of studies conducted in real classrooms—the natural setting—where it would be important to see evidence that feedback strategies have substantive impact. Moreover, few studies have focused on feedback over extended periods or on factors that can moderate or mediate the effectiveness of feedback. Therefore, we cannot generalize what we know from the literature to classroom practices...
Rather than persisting with our common belief that feedback is something doable for teachers, we should strive to study formative assessment practices in the classroom, including feedback, to help teachers and students to do better. Given these unanswered questions, we need different and more trustworthy strategies of inquiry to acquire firsthand knowledge about feedback in the classroom context and to systematically study its effects on student learning' (p226).
