The problem with meta-analyses

What are meta-analyses?

A meta-analysis is a way of collating the outcomes of similar studies and converting the data into a common metric, then combining these in order to report an estimate which represents the impact or influence of interventions in that given area.
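To make that definition concrete, here is a minimal Python sketch. It assumes the 'common metric' is a standardised mean difference (Cohen's d) and that studies are combined using inverse-variance weights. The post does not say which metric or weighting scheme any particular meta-analysis uses, so both are assumptions, and the function names are hypothetical.

```python
import math


def cohens_d(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardised mean difference between a treatment and a control group."""
    # Pool the two groups' variances, weighting each by its degrees of freedom.
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)


def d_variance(d, n_t, n_c):
    """Common large-sample approximation to the sampling variance of d."""
    return (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))


def pooled_effect(studies):
    """Inverse-variance weighted average of effect sizes.

    `studies` is a list of (d, n_treatment, n_control) tuples.
    Larger studies have smaller variance and so receive more weight.
    """
    weights = [1.0 / d_variance(d, n_t, n_c) for d, n_t, n_c in studies]
    weighted_sum = sum(w * d for w, (d, _, _) in zip(weights, studies))
    return weighted_sum / sum(weights)
```

Note that nothing in this arithmetic checks whether the studies measured the same outcome, used comparable control groups or involved similar pupils; the calculation will happily average any set of numbers it is given.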

There are a number of advantages of meta-analyses when conducted as part of a systematic review. For example, they allow large amounts of information to be assimilated quickly.

They also help reduce the delay between research ‘discoveries’ and the implementation of effective strategies.

Meta-analyses enable the results of different studies to be compared, and in so doing highlight the reasons for any inconsistencies between similar studies.

However, meta-analyses are not without their problems…

Are meta-analyses reliable?

Firstly, it is a misconception that larger effect sizes are associated with greater educational significance.

Secondly, it is a misconception that two or more different studies on the same interventions can have their effect sizes combined to give a meaningful estimate of the intervention’s educational importance.


This is because original studies that used different types of ‘control group’ cannot be accurately combined to create an effect size (not least because what constitutes ‘business as usual’ in each control group will be different).

Likewise, unless the studies used the same range of pupils, the combined effect size is unlikely to be an accurate estimate of the ‘true’ effect size of a particular strategy.

Also, the way in which researchers measure the effect can influence the effect size. For example, if you undertake an intervention to improve pupils’ ability to, say, decode words, you could choose to use a measure specifically designed to ‘measure’ decoding or you could use a measure of general reading competence that includes an element of decoding. The former will tend to produce a larger effect size than the latter, because the narrower measure is more precisely targeted at what the intervention taught.

Put simply, for an average effect size (the heart of a meta-analysis) to be meaningful, the original effect sizes we combine must relate to the same outcomes and to similar conditions and pupils, including in the control groups.

What’s more, the number of test items can influence the effect size. If the number of questions used to measure the effectiveness of an intervention is increased, the effect size may increase significantly, because a longer test tends to measure more reliably and so adds less random noise to pupils’ scores.
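One possible mechanism behind this can be sketched in Python. The sketch assumes classical test theory: the Spearman-Brown formula predicts the reliability of a lengthened test, and measurement error shrinks an observed standardised effect size by the square root of the test's reliability. The post does not name a mechanism, so this is an illustration under those assumptions, and the function names are hypothetical.

```python
import math


def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is made `length_factor` times longer."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


def observed_effect_size(true_d, reliability):
    """Effect size as it appears on an imperfectly reliable test.

    Measurement error inflates the spread of observed scores, which
    shrinks the standardised difference by sqrt(reliability).
    """
    return true_d * math.sqrt(reliability)


def effect_after_lengthening(true_d, reliability, length_factor):
    """Observed effect size once the test has been lengthened."""
    new_reliability = spearman_brown(reliability, length_factor)
    return observed_effect_size(true_d, new_reliability)
```

Under these assumptions, the same intervention with the same underlying impact shows a larger observed effect size on the longer test, purely because the test is less noisy.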

Finally, trials are often carried out without first analysing and understanding the barriers that pupils face.

When randomised controlled trials (RCTs) are used in medicine, they only take place after intensive theorisation. In education, the process often begins with the trial and subsequent measurements. For example, if it is identified that pupils eligible for the Pupil Premium are not doing as well as their peers in literacy, then a trial is launched to test an intervention, the outcome is measured and, if the results are positive, the intervention is recommended for all to use. However, rarely is there any theorising first about precisely why some pupils are not doing as well as their peers, and rarely is there any detailed analysis of the actual barriers some of these pupils face in school. For some pupils it may be that English is an additional language; for others it may be that their attendance is low. The intervention may work for some pupils but not all, and the meta-analysis may mask the complexity of the issue and send us down the wrong path.

Should we use meta-analyses?

Should we ignore meta-analyses and effect sizes altogether and go back to gut feelings?

Of course not.

Teaching is finally becoming an evidence-informed profession that uses data to ensure we get better at what we do and, ultimately, improve pupils’ life chances. But we should always exercise caution. We should not regard the data as an oracle; rather, we should contest it and balance what the evidence suggests works with what we know from our own experiences or our own contexts.

We should also dig beneath the meta-analyses and analyse the original studies on which the effect sizes are based because the averages may hide huge variations depending on the nature of the intervention and the context in which it was used.
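One rough way to see how much variation sits beneath an average is to compute a heterogeneity statistic alongside it. The Python sketch below uses Cochran's Q and the I² statistic, which are common choices but are not mentioned in the post. The function name is hypothetical, and any numbers passed to it here would be illustrative rather than taken from real studies.

```python
def heterogeneity(effects):
    """Weighted mean effect size plus Cochran's Q and I-squared.

    `effects` is a list of (effect_size, variance) tuples.
    I-squared estimates the share of the variation between studies
    that goes beyond what chance alone would produce (0.0 to 1.0).
    """
    weights = [1.0 / variance for _, variance in effects]
    mean = sum(w * d for w, (d, _) in zip(weights, effects)) / sum(weights)
    # Q sums each study's weighted squared distance from the pooled mean.
    q = sum(w * (d - mean) ** 2 for w, (d, _) in zip(weights, effects))
    df = len(effects) - 1
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return mean, q, i_squared
```

Three hypothetical studies with effect sizes of 0.1, 0.5 and 0.9 average out at 0.5, yet a high I² would signal that the single headline figure describes none of them well.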

In conclusion, and as I am wont to say, teaching is a highly complex, nuanced art form and we would do well not to reduce it to basic statistics or league tables of ‘what works’, for that way madness lies.

We should be influenced by the evidence but informed by our own experience.

Follow me on Twitter: @mj_bromley
