The importance of evaluation

By Marian Sainsbury

Wednesday 12 March 2014

Today we launch our Evaluation Policy, setting out the principles and practices we apply to evaluation projects. As NFER’s Head of the Centre for Evidence and Evaluation from 2010 to 2013, I was in the fortunate position of having an overview of our evaluations during this period. This has led me to a few reflections on the nature of evaluation and its importance within the education system.

Evaluation is arguably the type of research that is most likely to have a positive impact on policy or practice, as it sets out to examine the effectiveness of improvements. Yet it is not the evaluator who takes the initiative in developing and implementing these improvements. It may be central or local government, introducing new policies; or a professional association developing a new training course for teachers; or a publisher with a new teaching scheme for pupils. The evaluator has no control over the type of initiative being introduced, but, by providing independent evidence of how and how well it works, can influence its future. And gradually, a body of evidence builds up that can close the loop and have an impact of its own on the shape of further new initiatives.

Designing a good evaluation, as with most things, is more complex than it might seem. It is generally true that, if the aim of the evaluation is to measure the impacts of an initiative, and it is the kind of initiative that lends itself to such measurement, a randomised controlled trial (RCT) is the most robust design to adopt. There are far too few RCTs in education, and a growing number of opportunities to use this robust evaluation approach when new initiatives are at an appropriate stage of development. The work of the Education Endowment Foundation has been exemplary in operationalising this principle, and NFER’s Education Trials Unit is undertaking several current RCTs within this framework. It is always better to have objective evidence that an intervention does have a positive impact on the outcomes for young people, rather than just knowing that they enjoyed doing it or that their teachers believe it has had a positive impact!

But there are other circumstances in which the nature of the initiative and the aims of the evaluation may not be appropriate to an RCT approach. For example, if we want to get an indication of whether an innovative enrichment activity has changed school students’ views of the nature of engineering as a career, some focused interviews following the enrichment day may be the best way to find out. It is for this reason that NFER’s repertoire of evaluation methodologies is broad and inclusive, and we pick a combination of methods suitable for each individual project: from standardised testing to focus groups, from online surveys to life grids to cost-benefit analyses.

Another recurring theme across a range of projects is the question of what impacts can reasonably be expected from an initiative, and within what period of time. There is currently an entirely understandable emphasis on raising standards, so that improvements in GCSE and key stage 2 results have come to be seen as the gold standard for measuring the success of an initiative in schools. But it is worth asking: what chain of events has to take place to lead to such an improvement? For example, if you work to develop leadership skills within a school, or to improve the behaviour of disruptive pupils, how and when can that be expected to result in measurable improvements in exam performance? Even a programme to improve teaching skills has to change the teacher’s practice and the pupils’ experiences before it can lead to improved attainment; and any effects on pupils in other classes, taught by other teachers, are likely to take even longer. Where improvements in standards may take several years to appear, careful and realistic research design can capture the ‘intermediate’ results of an initiative.

The fact that there are no easy answers is not surprising, and nor should it be a deterrent. It remains the case that well-conducted evaluations can make a huge contribution to evidence-based education and offer a secure foundation for real, cumulative improvements.