Choose a not-for-profit research organisation for a lower effect size?

By Ben Styles

Wednesday 20 September 2017

As the head of NFER’s Education Trials Unit, I was interested to read Professor Stephen Gorard’s recent book on Randomised Controlled Trials (RCTs) in education.

It’s an interesting book but there are aspects that I would like to challenge. Take the passage that reads: ‘One of the main problems with [the Institute of Education Sciences (IES) and the Education Endowment Foundation (EEF)] lies in finding the capacity among traditional researchers in university departments of education to conduct and even appreciate such work… Instead, the funds have been taken up by the growing sector of not-for-profit organisations… IES (and EEF in the UK) need the capacity that these organisations offer in order to conduct evaluations, and the organisations themselves need the external funding maintained in order to pay the salary of staff employed to do the evaluations. This might make the organisations more likely to provide what they feel the funder wants…’

Like NFER, Gorard and his team have conducted several trials for EEF, so let’s do some ‘meta-analysis by evaluator’. Without a statistical analysis plan, but with a commitment to report whatever we found, we collated all the effect sizes listed in the executive summaries of the published EEF trials reported by both research groups to date. The results were: median effect size from NFER (8 trials): 0.11; median effect size from Gorard’s team (7 trials): 0.17. If the book’s assertion is that not-for-profits return higher effect sizes, this analysis does not bear it out.
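For readers who want to see the shape of the calculation, here is a minimal sketch: gather each group’s headline effect sizes and take the median. The individual effect sizes below are illustrative placeholders, not the actual figures from the published executive summaries; only the medians are chosen to match those reported above.

```python
# Sketch of the 'meta-analysis by evaluator'. The individual effect sizes
# are HYPOTHETICAL placeholders; only the medians match the blog's figures.
from statistics import median

nfer_effect_sizes = [0.02, 0.05, 0.08, 0.10, 0.12, 0.15, 0.18, 0.22]   # 8 trials (illustrative)
gorard_effect_sizes = [0.05, 0.10, 0.14, 0.17, 0.20, 0.24, 0.30]       # 7 trials (illustrative)

print(f"NFER median effect size:   {median(nfer_effect_sizes):.2f}")    # 0.11
print(f"Gorard median effect size: {median(gorard_effect_sizes):.2f}")  # 0.17
```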

It takes considerable research infrastructure to run a successful trial, and it seems many university research departments lack it. As well as running its own trials, NFER is often asked to use its expertise to recruit schools, collect data and administer tests for other evaluators running RCTs. In his book, Gorard quite correctly cites the importance of minimising attrition during an education RCT. To do this during a large effectiveness trial requiring the independent administration of tests across many schools, it helps to have our 300-strong army of ex-teacher test administrators living across England to draw on.

Furthermore, a not-for-profit research organisation such as NFER offers other advantages for this kind of work. The ‘publish or perish’ culture is absent, so there is less conflict when working on a trial with university developers, who invariably have a publication agenda.

And where can statisticians and methodologists flourish in the education trials community? University clinical trials units abound in healthcare, yet only York Trials Unit regularly runs education trials. A not-for-profit research environment can foster healthy attention to issues of design and analysis that would not normally be valued in a university education research department.

Where I believe Gorard is correct in his criticism of the trials status quo is his critique of the pipeline of interventions that go forward to trial. This is a critique of the entire education research funding system rather than EEF per se. The IES funds the complete process of intervention evaluation, from feasibility studies to effectiveness trials, reflecting the various phases of healthcare trials. EEF have recently ceased to fund pilot trials, instead leaving this stage to other funders. Whilst the Nuffield Foundation has stepped in to cover this stage for younger pupils, where do developers seek funding for pilots in the older years?

Gorard’s assertion that there is almost no EEF funding for replication studies of the kind promoted by 3ie is also worrying, given the replication crisis unfolding across all of science. And NFER’s evaluation of UCL Institute of Education’s Best Practice in Setting is the ‘exception that proves the rule’ for Gorard’s assertion that EEF developer funding misses fundamental practices already in place in English schools. These practices are often not suitable for randomisation, for example selective schooling or different models of school funding. However, selection at scale, for example in Kent and Buckinghamshire, might well be amenable to a Regression Discontinuity analysis under conditions of open data (a minimal sketch of the idea follows below).

Until we move away from funding, as he puts it, ‘the product of a company, or the idea of an academic’, we are missing a trick in terms of helping children in English schools. This mirrors the problem in healthcare: if an intervention can be bought, an entire industry is devoted to trialling it; if it cannot, restricted state funding must step in where pharmaceutical companies are unwilling to tread. EEF should become more like the National Institute for Health Research, which often evaluates healthcare practices already in place and, incidentally, also has quite a reputation for funding quality trials.
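To make the Regression Discontinuity suggestion concrete, here is a minimal sketch of a sharp design of the kind that open admissions data might support. Everything here is simulated and assumed for illustration: the entrance-test cutoff, the bandwidth, the outcome measure and the size of the jump are not drawn from any real grammar-school data.

```python
# Minimal sharp regression discontinuity sketch on SIMULATED data.
# Cutoff, bandwidth, outcome and effect size are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 5000
entrance_score = rng.normal(100, 15, n)         # running variable (entrance test)
cutoff = 110.0                                  # hypothetical pass mark
selective = (entrance_score >= cutoff).astype(float)

# Simulated later outcome: smooth in the running variable, plus a jump of
# 0.2 at the cutoff standing in for the effect of selective schooling.
outcome = 0.01 * entrance_score + 0.2 * selective + rng.normal(0, 1, n)

# Local linear regression within a bandwidth either side of the cutoff:
# outcome ~ intercept + treatment + centred score + interaction.
bandwidth = 10.0
near = np.abs(entrance_score - cutoff) <= bandwidth
x = entrance_score[near] - cutoff
d = selective[near]
X = np.column_stack([np.ones(near.sum()), d, x, d * x])
beta, *_ = np.linalg.lstsq(X, outcome[near], rcond=None)
print(f"Estimated jump at the cutoff: {beta[1]:.3f}")   # recovers ~0.2
```

The design exploits the fact that pupils just either side of the pass mark are effectively comparable, so the jump in outcomes at the cutoff estimates the effect of selection without randomisation.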

Lastly, just in case you were worried about the title of this blog, NFER follows EEF analysis guidance in all its trials and always produces a detailed statistical analysis plan, so there is no risk of us producing systematically biased results…in either direction. We recognise the importance of identifying what doesn’t work, as well as what does.