One-To-One Tutoring Works: AI Tutoring At Scale Could Mean Every Pupil Gets It
When the DfE committed £23m to bring AI tutoring to up to 450,000 disadvantaged pupils by 2027, the criticism came quickly and from several directions.
But two debates are being treated as one. The first is about AI tutoring as a category: what it is, what works, and what doesn’t. The second is about the DfE pilot specifically: whether this is the right intervention, for the right pupils, delivered in the right way.
Both debates are worth having. Collapsing them into one lets weak arguments hide behind strong ones, and lets the equity question – whether pupils who have never had access to one-to-one maths support might finally get it – get lost in the process.
Key takeaways
- The criticisms split into two debates: two are about AI tutoring as a category (safety, replacing teachers), and one is about the DfE pilot specifically (equity).
- “AI tutoring” covers everything from pupils using ChatGPT for homework to structured, curriculum-aligned programmes with teacher oversight. Most criticism applies to the weak end of that spectrum.
- One-to-one tutoring accelerates learning by an average of five additional months’ progress (EEF). Until now, who gets access has been determined by who can afford it.
- Holding AI tutoring to a perfection standard while accepting the status quo does not protect disadvantaged pupils. It preserves the gap.
Critics are making three different arguments
When the DfE announced its AI tutoring pilot, the criticism came from several directions at once. It is worth separating them, because collapsing different concerns into one lets everyone off the hook – defenders and critics alike. Two of the three arguments below are aimed at AI tutoring as a category. The third is aimed specifically at the DfE pilot. They need different answers.
1. Safety and the evidence that backs this up
The first argument is about safety, and it is well-founded.
We know from research that students using ChatGPT to study performed around 17% worse in exams than those who studied without it (Bastani et al., 2025). The most common prompt was “give me the answer.” Stanford’s SCALE Initiative reviewed 20 causal studies and found that AI tools improve performance while pupils have access to them, but those gains weaken or disappear when pupils are assessed independently. Open AI tools in classrooms with no educational design behind them are unsafe for pupils. I don’t think anyone serious disputes this.
2. Replacing teachers: a serious problem
The second argument is about replacing teachers. This is also fair. Dan Meyer, who led the development of Amplify’s first classroom AI feature and holds a Stanford doctorate in maths education, makes this case well on the Mr Barton Maths Podcast. Skilled tutoring involves knowing when to step in, when to hold back, and when a pupil is wrong but on a path to the right answer. These are human judgements that draw on social knowledge no AI has – what happened yesterday in class, who is struggling at home, which pupil needs pushing and which needs leaving alone. Pupils perform for their peers and their teacher in ways they will not for a screen.
There is, though, a conflation worth naming. Meyer slides between “teacher” and “tutor” as if they were the same role. They aren’t. Structured AI maths tutoring isn’t a substitute for a teacher who knows a pupil, sees them every day and reads the room. It’s a substitute for a one-to-one tutor – the kind a wealthier family pays £40 to £45 an hour for, twice a week. The comparison that matters is with that tutor, not with the classroom teacher. Nobody serious is proposing the latter. Any system that sidelines teachers rather than supports them is solving the wrong problem.
3. Equity: the argument with a hidden assumption
The third argument is about equity – that disadvantaged pupils are being used as test subjects while wealthier pupils are not. Molly Kingsley at SafeScreens calls the pilot a “false economy set to experiment on disadvantaged children.” Pepe Di’Iasio at the ASCL says the government has acknowledged the benefits of tutoring while showing no appetite to fund a national tutoring programme to deliver it. These are reasonable positions, and they’re aimed squarely at the DfE pilot rather than at AI tutoring as a category. But they contain a hidden assumption I want to come back to.
Three things worth saying outright
- Generic AI in classrooms is not safe.
- AI will never replace teachers, and shouldn’t.
- The pupils who can least afford bad technology deserve the most scrutiny of what gets built for them, not the least.
These criticisms stand. But the doubt starts when you ask the next question: what exactly are we comparing AI tutoring to?
Not all AI tutoring is the same
Most of the criticism applies to the weak end of a spectrum that covers fundamentally different products. Until we pull the term apart, the debate is just people arguing about different things using the same word.
James Radburn, in an AI tutoring episode of the Thinking Deeply about AI for Schools podcast, lays out this spectrum clearly. At one end, a pupil using ChatGPT for homework help – no educational design, no curriculum alignment, no guard rails. At the other end, intelligent tutoring systems: structured systems with curriculum-mapped content, diagnostic assessment, teacher direction, and materials that the AI delivers but cannot alter. In between, a range of purpose-built tools with varying levels of pedagogical design. Neil Almond, his co-host and a primary school leader and teacher, puts it well: “You have to look at the difference between what we mean when we say AI tutors to get that nuance of what’s happening in these studies.”
What the research tells us
The research is studying different products under the same label. That is why it looks contradictory.
The Stanford evidence makes this concrete. AI harms learning when it removes productive struggle – when the pupil asks, “give me the answer” and the tool complies. Process-mining research shows the steps that disappear when generic AI is present: re-reading, self-evaluation, and planning. These are the steps that build understanding and retention. But this is an argument against bad design, not against AI tutoring as a category. When AI tutoring is built around the same principles that make human tutoring effective – curriculum alignment, structured conversation, scaffolded reasoning, teacher oversight – the emerging evidence shows it can produce real learning gains.
AI tutoring products that work
There are consistent markers that separate AI tutoring products that produce learning from those that produce the appearance of it. The products that work have a clear role for teachers – the AI sorts and delivers, rather than generating content on the fly. Intervention logic goes beyond “respond whenever the pupil asks.” And efficacy data includes the whole population, not just the self-selecting, motivated minority.
The Eedi and Google DeepMind RCT – one of the most rigorous studies to date – found students were 5.5 percentage points more likely to solve novel problems using carefully designed AI tutoring than with human tutors alone. That study was built on exactly these principles: human oversight, curriculum-grounded design, and active learning rather than answer delivery. The AI had a 0.1% error rate.
The design question at the centre of the debate
Does the AI generate mathematical content on the fly, or deliver pre-created, expert-designed materials? This determines whether hallucinations are even possible. It determines whether the system shortcuts productive struggle or scaffolds it. Most commentary skips over this question entirely.
We have written separately about the full evidence base for AI tutoring and what it means for schools. But the short version: the markers of quality are consistent. Curriculum alignment. Structured pedagogy. Teacher direction. Content that cannot drift. These are what separate AI tutoring that produces learning from AI that produces the appearance of it.
In defence of the DfE pilot
The criticisms above split into two. Most are about AI tutoring as a category, and the evidence on that category is genuinely improving. The remaining criticism, the one aimed at the DfE pilot itself, needs answering directly. Here’s why the pilot is the right thing to be doing.
It’s worth being precise about what this pilot actually is. The £23m is funding initial research and co-design with teachers – aligning what the EEF tells us about effective tutoring with what AI tutoring can do. The DfE isn’t proposing to scale this across the country; the point is to learn what works first. Much of the criticism treats the pilot as a national roll-out rather than the research that should inform one.
The equity argument cuts both ways
The strongest criticism of the DfE’s AI tutoring pilot is that it experiments on disadvantaged pupils. But this criticism only holds if the alternative those pupils currently have is something better. For most of them, it is not – and it has not been for decades.
What the alternative actually looks like
Think about the reality for a Year 8 pupil who is two years behind in maths. They sit in a class of 30. They may get a teaching assistant shared across several pupils. They may get a catch-up group once a week. They are not getting what the pupil whose parents pay £40–45 an hour for a private maths tutor twice a week is getting.

Craig Barton makes a version of this argument on the Mr Barton Maths Podcast: compare AI tutoring against the reality of under-resourced classrooms, not against a perfectly skilled teacher who knows every pupil. The reality, in too many schools, is non-specialist teachers covering maths, high staff turnover, and classes where individual attention is physically impossible. The EEF identifies one-to-one and small-group tutoring as one of the most effective interventions available, with an average impact of five additional months’ progress. That evidence is settled. What has never been settled is who gets access.
Governments want affordable ways to get good results – and they should
Some critics say the DfE is doing tutoring on the cheap. But finding more cost-effective ways to deliver proven interventions is exactly what governments should be doing. Every government has had to balance the quality of provision against the cost of delivering it at scale. AI is an obvious lever because one-to-one tutoring at the dosage the evidence says works has never been financially possible for every pupil who needs it. Schools need to decide whether the tool works for the pupils in front of them.
The double standard in how we measure AI
There is also a double standard worth naming – carefully, because I do not want it misread. We hold AI tutoring to standards we have never applied to the status quo. Every AI error is logged, screenshotted, and shared on social media within hours. Neil Almond, speaking as a teacher himself, makes the point that if you sent observers into a representative sample of classrooms and counted errors, the number would almost certainly exceed anything an AI produces. His point is not that teachers are bad – it is that we have never measured the baseline we are comparing against.
Holding AI to a perfection standard while the alternative is an unmeasured, imperfect status quo does not protect anyone. It preserves a system that already fails the pupils the criticism claims to care about.
Political solutions are right, but they are not the whole answer
Dan Meyer raises a harder question, and it is one I take seriously. The problems AI tutoring claims to address – under-resourced classrooms, teacher shortages, the attainment gap – have known political solutions. Pay teachers more. Reduce class sizes. Support families. He is right. Those solutions would help, and they should happen.
But two things can be true at once. Political investment in teaching is necessary. And structured AI tutoring, built on the same pedagogical principles that make human tutoring effective, can reach pupils that the current system does not β whether or not that investment comes. We do not see AI tutoring as a stopgap while we wait for something better. We see it as a different kind of intervention that works alongside everything else schools are doing, one that can be evaluated, measured, and held accountable in real time in a way that most classroom practice cannot.
None of this justifies bad AI. The evidence is clear that generic tools make things worse. But holding structured AI tutoring to a perfection standard – while the alternative for disadvantaged pupils is nothing resembling one-to-one support – is a choice. And it preserves the gap.
What structured AI maths tutoring actually looks like
At Third Space Learning, we have built Skye, our AI maths tutor, around the same principles that separate effective AI tutoring from the kind that the evidence shows harms learning. As the founder, I am not a neutral observer of this debate. But the design choices we have made are not a response to today’s critics. They come from twelve years of one-to-one tutoring, 2.1 million human-led sessions, and the principles that the EEF, Stanford SCALE and the emerging AI research all point to. They happen to answer the criticisms above, which is why they’re worth explaining.

Teacher-directed, not teacher-replacing
Skye is teacher-directed. Teachers choose programmes, reorder lessons to match their classroom teaching, monitor sessions in real time, and review detailed progress reports. The teacher provides the human expertise: the social intelligence and contextual knowledge that no AI has. Skye delivers the instruction they direct. It is not a chatbot that decides what to teach – it is a delivery mechanism for structured, one-to-one maths tutoring that the teacher controls.

Content that cannot hallucinate
Every lesson, question, and explanation is pre-created by qualified teachers and maths experts, drawing on insights from 2.1 million human tutoring sessions. Skye cannot generate its own mathematical content, access external content, or deviate from the lesson. The system cannot give the answer because it is designed not to. That is a structural answer to the hallucination and cognitive outsourcing problem.

Scaffolding that preserves productive struggle
Lessons follow the I Do, We Do, You Do model of explicit instruction. Every lesson begins with a diagnostic Skill Check-In that adapts the pathway in real time. If a pupil demonstrates mastery, they skip to independent practice. If not, they work through scaffolded teaching – hints, then more explicit support, then step-by-step modelling. The productive struggle the research calls for is preserved by design.
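To make the branching described above concrete, here is a minimal sketch of that kind of pathway logic in Python. It is an illustration only, not Skye's actual implementation: the function names, the 80% mastery threshold, and the rule of escalating one level per unsuccessful attempt are all assumptions made for this example.

```python
# Illustrative sketch only. Names, threshold and escalation rule are
# hypothetical; they are not taken from any real product.

# Support levels, ordered from least to most explicit (as listed in the text).
SUPPORT_LADDER = ["hint", "more explicit support", "step-by-step modelling"]

# Assumed mastery threshold: the article does not state one.
ASSUMED_MASTERY_THRESHOLD = 0.8


def choose_pathway(correct: int, total: int,
                   threshold: float = ASSUMED_MASTERY_THRESHOLD) -> str:
    """Route a pupil after the diagnostic check-in."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")
    score = correct / total
    if score >= threshold:
        return "independent practice"
    return "scaffolded teaching"


def next_support(failed_attempts: int) -> str:
    """Escalate support one level per failed attempt, capped at the top level.

    No level returns the answer itself; the most explicit support is modelling.
    """
    if failed_attempts < 1:
        return "no support yet"
    index = min(failed_attempts, len(SUPPORT_LADDER)) - 1
    return SUPPORT_LADDER[index]


if __name__ == "__main__":
    print(choose_pathway(correct=5, total=5))
    print(choose_pathway(correct=2, total=5))
    for attempts in range(0, 5):
        print(attempts, next_support(attempts))
```

The design point the sketch tries to capture is that support escalates gradually and there is no branch that simply hands over the answer, so the pupil still has to do the reasoning at every level.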
Accessible at scale – and honest about limitations
One fixed annual cost – starting from £3,500 for primary schools – gives unlimited sessions for unlimited pupils. No tutor recruitment, training, or management. Sessions available every five minutes. For the first time, https://thirdspacelearning.com/tutoring/pupil-premium/ at a dosage that the evidence says makes a difference.
We are honest about what Skye cannot do. It does not have access to a pupil’s social or pastoral context. It cannot replicate the social dimension of a classroom. For pupils with complex needs, implementation quality matters – quiet spaces, suitable headsets, and adult supervision. The Sutton Trust has flagged that schools serving more disadvantaged pupils are less likely to have received formal AI training. Those schools need implementation support, not just a licence.
Victoria Hedlund, an AI bias researcher, has written for us about how schools can use AI to promote equity in education – including what genuine equity requires beyond equal access to a tool. Her piece is worth reading alongside this one.
Educate Ventures Research independently evaluated 9,320 Skye sessions. They found pupils improved from 34% accuracy on diagnostic check-in questions to 92% on check-out assessments within a single session. 63.8% of pupils ended sessions more confident than they began. Educate Ventures Research is clear that this is not yet an estimate of standardised attainment impact – that would require a controlled design – and so are we. But it is consistent, independently verified evidence of within-session learning and consolidation, and it aligns with the research principles the EEF calls for.
These pupils cannot wait
The campaigners are right that disadvantaged pupils deserve better. They deserve better than bad AI dropped into classrooms without thought. And they deserve better than the status quo.
The real experiment is not the DfE’s pilot but rather the thirty-year experiment this country has already run – one in which one-to-one tutoring was treated as a privilege of the few, and the pupils who needed it most got least. We know the results of this experiment – the attainment gap remains stubbornly wide.
We welcome scrutiny of ourselves, the DfE pilot and the companies bidding for it, but we need to remember what we’re comparing it to.
Emerging evidence says that structured, evidence-based AI maths tutoring – built by teachers, directed by teachers, held to measurable standards – can reach the pupils who currently get nothing resembling one-to-one support. So we support any pilot that looks into this further.
Refusing to find out, on behalf of the pupils who have the most to gain, preserves the gap it claims to close.
References
Bastani et al. (2025). Generative AI Can Harm Learning. The Wharton School, University of Pennsylvania.
Stanford SCALE Initiative (2026). The Evidence Base on AI in K-12 Education.
Education Endowment Foundation. Teaching and Learning Toolkit: One-to-one Tuition.
Eedi and Google DeepMind (2025). Human-in-the-Loop AI Tutoring RCT.
Luckin, R. et al. (2025). Educate Ventures Research: High-Impact AI Maths Tutoring Case Study.
Dan Meyer on the Mr Barton Maths Podcast, Episode #220 (AI in Ed series).
James Radburn and Neil Almond, Thinking Deeply about Primary Education podcast.
Department for Education. AI tutoring pilot announcement (2025).
The Sutton Trust (2025). Artificial Advantage?
Daily Mail (2026). “AI teachers set to be unleashed in UK classrooms.” Elizabeth Ivens.