CLE & Other Errors

The effect size (d) is equivalent to a 'Z-score' of a standard normal distribution. 

The Common Language Effect Size (CLE) is closely related to Pr(Z > x); it is used to interpret an effect size by converting it into a probability.

Hattie borrowed the CLE from McGraw and Wong (1992, p. 361), who defined it as,
"the probability that a score sampled at random from one distribution will be greater than a score sampled from some other distribution."
Hattie used McGraw and Wong's (1992) example of the difference in height between men and women (p. 362). If a man and a woman are sampled at random, the chance that the man will be taller than the woman is Pr(Z > -1.41), or around 92%, as shown below:
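
As a quick check of the 92% figure, here is a minimal Python sketch that evaluates Pr(Z > -1.41) for a standard normal variable. The z value of -1.41 is taken from the text above; the function name is purely illustrative.

```python
from statistics import NormalDist


def prob_z_greater_than(x: float) -> float:
    """Return P(Z > x) for a standard normal variable Z."""
    return 1.0 - NormalDist().cdf(x)


# z = -1.41 is the value quoted in the text for the height example.
p_man_taller = prob_z_greater_than(-1.41)
print(f"P(Z > -1.41) = {p_man_taller:.2f}")
```

The result is roughly 0.92, in line with the 92% quoted. Because it is an area under the normal curve, a value computed this way can never fall below 0 or rise above 1.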

However, Hattie reports CLE values ranging from -49% to 219%. Higgins and Simpson (2011), Topphol (2011) and Bergeron (2017) each pointed out that such values are impossible, since a probability must lie between 0% and 100%.

Professor Bergeron states,
"To not notice the presence of negative probabilities is an enormous blunder to anyone who has taken at least one statistics course in their lives. Yet, this oversight is but the symptom of a total lack of scientific rigor, and the lesser of reasoning errors in Visible Learning."
As a result, Hattie has now admitted that he calculated all of the CLEs incorrectly, although he also says the calculation was not important.

Originally, however, he wrote that,
"in all examples in this book, the CLE is provided to assist in the interpretation of the effect size" (p. 9).
Given Hattie's mantra that interpretations are the important aspect of a synthesis, this is a significant mistake.

Eivind Solfjell also correctly points out that Hattie's explanations of particular CLEs are incorrect. For example, any positive effect size, such as d = 0.29, must correspond to a CLE of more than 50%. Excerpt from VL, p. 9:

Then again, on page 42, Hattie makes the same error, stating that d = 0.67 has a CLE of 48%. But a CLE of 48%, being below 50%, would indicate a negative d value. See Table 1 from Bergeron (2017):

TABLE 1. Correspondence between selected values of Cohen’s d and CLE equivalents
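
The correspondence between d and CLE can be sketched as follows, under the common assumption of two normal distributions with equal variance, in which case CLE = Φ(d/√2). The function names are illustrative, and this is not Bergeron's own code.

```python
import math
from statistics import NormalDist


def cle_from_d(d: float) -> float:
    """CLE for Cohen's d, assuming two equal-variance normal distributions."""
    return NormalDist().cdf(d / math.sqrt(2))


def d_from_cle(cle: float) -> float:
    """Inverse conversion: the d that would produce a given CLE (0 < cle < 1)."""
    return math.sqrt(2) * NormalDist().inv_cdf(cle)


# The two effect sizes discussed in the text, plus d = 0 as a baseline.
for d in (0.0, 0.29, 0.67):
    print(f"d = {d:.2f} -> CLE = {cle_from_d(d):.0%}")

# A CLE below 50% can only come from a negative d.
print(f"CLE = 48% -> d = {d_from_cle(0.48):.2f}")
```

Under these assumptions d = 0 gives 50%, d = 0.29 gives about 58% and d = 0.67 gives about 68%, while a CLE of 48% maps back to a slightly negative d.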

Professor Arne Kare Topphol, who showed in his paper 'Can we count on the statistics use in education research?' that Hattie had calculated the CLEs incorrectly, had a dialogue with Hattie:
"My criticism of the erroneous use of statistical methods will thus probably not affect Hattie’s scientific conclusions. However, my point is, it undermines the credibility of the calculations and it supports my conclusion and the appeal I give at the end of my article; when using statistics one should be accurate, honest, thorough in quality control and not go beyond one's qualifications.
My main concern in this article is thus to call for care and thoroughness when using statistics. The credibility of educational research relies heavily on the fact that we can trust its use of statistics. In my opinion, Hattie’s book is an example that shows that we unfortunately cannot always have this trust.
Hattie has now given a response to the criticism I made. What he writes in his comment makes me even more worried, rather than reassured."
Topphol was referring to Hattie's response that he had used a slightly different formula, which produced these strange values. Hattie stated,
"Yes, I did use a slightly different notion to McGraw and Wong. I struggled with ways of presenting the effect size data and was  compelled by their method – but … I read another updated article that was a transformation of their method – but see I did NOT say this in the text. So Topphol’s criticism is quite reasonable. I did not use CLE exactly as they described it!"
Read full dialogue here.

In a later defence of his work in 2015, Hattie contradicts the explanation he gave Topphol in 2012 (above), namely that he had used a slightly different method from McGraw and Wong's:
"at the last minute in editing I substituted the wrong column of data into the CLE column and did not pick up this error; I regret this omission."
But this does not explain Hattie's incorrect interpretations of the CLE in VL (2009, pp. 9 and 42), as noted by Solfjell above.

Hattie no longer promotes the CLE as a way of understanding his effect sizes; he now treats d = 0.40 as equivalent to one year's progress. But, as already stated, this creates other, more significant problems.

Hattie mixes up the X/Y Axes:

Hattie uses a funnel plot (p. 21) to show that publication bias does not affect his research. But Higgins and Simpson (2011) show that Hattie has mixed up the X and Y axes and that, if drawn correctly, the funnel plot does in fact show publication bias (p. 198).

Yelle et al. (2016), in 'What is visible from learning by problematization: a critical reading of John Hattie's work', say there is an implied underestimation of the publication bias in Hattie's synthesis.

Dr. Ben Goldacre on the Funnel Plot

For more information see -

Standard Error:
"In his "Effect Barometers" Hattie also gives a standard error for the determined effect size. However, their calculation is flawed (see also Pant 2014 a, p. 96, note 4, 2014b, p. 143, FN 4). As a rule, the specified value is the arithmetic mean of the specified standard errors of the individual first-stage meta-analyzes. For example, in the case of inquiry-based teaching, the given standard error of 0.092 is the arithmetic mean of the two standard errors of 0.154 and 0.030 given two of the four meta-analyzes. However, the precision of the estimate from both meta-analyzes can not be less than that of the individual meta-analyzes; in fact, the standard error of an effect magnitude estimate from these two overlap-free meta-analyzes in the primary studies is 0.029...

The reconstruction of Hattie's approach in detail using examples thus shows that the methodological standards to be applied are violated at all levels of the analysis. As some of the examples given here show, Hattie's values are sometimes many times too high or low. In order to be able to estimate the impact of these deficiencies on the analysis results, the full analyzes would have to be carried out correctly, but for which, as already stated, often necessary information is missing. However, the amount and scope of these shortcomings alone give cause for justified doubts about the resilience of Hattie's results" (Wecker et al., 2016, p. 30).
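
The figures quoted from Wecker et al. can be checked with a short Python sketch. It assumes the two meta-analyses are independent and are combined by fixed-effect inverse-variance pooling; that assumption is mine, although it does reproduce the 0.029 they report. The function names are illustrative.

```python
import math


def mean_of_standard_errors(ses):
    """Arithmetic mean of the standard errors (the approach being criticised)."""
    return sum(ses) / len(ses)


def pooled_standard_error(ses):
    """Standard error of an inverse-variance weighted mean of independent estimates."""
    return 1.0 / math.sqrt(sum(1.0 / se ** 2 for se in ses))


# The two standard errors quoted for inquiry-based teaching.
ses = [0.154, 0.030]

print(f"mean of SEs: {mean_of_standard_errors(ses):.3f}")
print(f"pooled SE:   {pooled_standard_error(ses):.3f}")
```

The arithmetic mean comes out at 0.092, whereas the pooled value is about 0.029, which is smaller than either individual standard error, as one would expect when two independent estimates are combined.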
Schulmeister & Loviscach (2014), in 'Errors in John Hattie’s “Visible Learning”', state,
"Hattie’s method to compute the standard error of the averaged effect size as the mean of the individual standard errors ‒ if these are known at all ‒ is statistical nonsense."
Robert Coe has a detailed description of d and CLE here.
