Opening the Hinges
We explore ways to ask more open hinge-point questions
by Bradford Research School
Teachers using ChatGPT, alongside a guide to support them to use it effectively, can reduce their lesson planning time by 31 per cent. This is according to the findings from a new trial published by the Education Endowment Foundation (EEF) today.
This is the opening paragraph announcing the results of the Teacher Choices trial, ChatGPT in lesson preparation. Teacher Choices trials explore some of the most common questions teachers ask about their practice and the everyday choices they make when planning lessons and supporting pupils. While the primary measure of this trial was teacher workload, the evaluators (NFER) looked at the quality of resources, and shared this conclusion:
We found no evidence to suggest that the quality of the lesson resources used by the two groups differed, from an expert panel review (who did not know how the resources had been created) of lesson resources sampled from each of the two groups.
So does this mean that the quality of planning from AI is as good as the quality of planning from teachers? Should we outsource our planning to Wall‑E?
Judging ‘quality’
It helps to start with the context of this trial and how the evaluators judged the quality of resources.
The trial involved 259 Year 7 and 8 science teachers, with half allocated to a ChatGPT group and the rest asked to refrain from using generative AI. The trial lasted ten weeks: the first five involved engaging with an online teacher guide, and the outcome on teacher workload was measured in weeks six to ten.
Interestingly, the judgement of ‘quality’ was based on ‘the quality of the final resources, as used in the lesson, which included teachers’ checks and adaptations of the ChatGPT outputs’, so we are looking at the quality of the final resources, not necessarily the ChatGPT generated resources themselves. Therefore, the effectiveness of AI is moderated by the teacher.
The independent panel reviewed lesson and resource materials: 15 from each arm of the trial, ranked from 1 to 30. The panel comprised five science teachers, three with over 20 years of science teaching experience and two earlier in their careers, including an ECT. They did not know which resources had been created with ChatGPT.
The rubric provided for the panel looked at:
The panel was also asked not to extrapolate about how they would deliver the lesson, but to judge only the resources provided. Where the 'quality and accuracy of science content' was challenged, the AI transcripts were triangulated; in all three such cases, there were no inaccuracies in the ChatGPT transcripts.
And of the 69 teachers who completed the endpoint survey, 52% said the lesson resources were similar in quality, 26% said they were of 'higher quality', and 22% 'slightly lower'.
What part does AI play in producing resources?
In this study, ChatGPT was not ‘planning lessons’. These six best practice ‘use cases’ were shared in the ChatGPT guide:
The most common uses of ChatGPT were creating questions or quizzes, finding activity ideas and helping to prepare cover work. Also, many schools had existing resources and often centralised planning, so were often engaged in adapting lessons rather than planning from scratch:
Based on interview data, seven case study schools used centralised planning and two used individual planning, although one of these was in the process of moving to centralised planning. In schools with centralised planning, leaders and teachers reported that teachers had access to pre-prepared lesson slides and other resources (for example, pupil worksheets) for every lesson.
Reflections
As with any piece of evidence, we have to reflect on what it can and can't tell us. There are not many other sources of evidence on this topic, so we have to weigh that up. The size, duration, and context of the trial all matter.
This trial is welcome, because it adds to our understanding of potential benefits of AI in reducing teacher workload, while protecting lesson quality. It does not unequivocally ‘prove’ that AI will always save time and produce high-quality resources, but it does help us to make further decisions, even if we acknowledge the limitations. It certainly suggests that it might be promising to explore this further.
The message for me isn't 'AI-generated lessons were just as good'; it is that AI played a part in aspects of lesson planning which, moderated through the teacher, resulted in lessons that were perceived as just as good.
As a school leader, I would ask what has been gained and lost in this process. We have gained time, which might result in better lesson preparation. We have lost some engagement with the material that might better prepare us for teaching the lesson: if ChatGPT has generated a quiz, for example, are we well enough prepared to respond to pupils' answers to it? (But we might make a similar point about other centrally prepared resources.)
The trial used the free ChatGPT version 3.5, so results may have been even better with a paid version.
So, should we use AI to plan our lessons? My personal opinion is that we might produce ‘good enough’ resources with the help of AI, but we may not be able to use them effectively if we have outsourced all the thinking. But as teachers source lessons from many contexts, learning as much as we can about the benefits and challenges of using AI tools effectively may only benefit the profession.