In yesterday's post, I signalled that one of the factors that convinced me to develop Harriet, our ECONS101 AI tutor, was this 2024 paper by Gregory Kestin (Harvard University) and co-authors. They created an AI tutor that would walk students through some lessons in the Harvard physics class Physical Sciences 2 (PS2), and compared the learning gains with those for students who experienced the traditional (active learning) approach. As Kestin et al. explain:
Through content-rich prompt engineering, we developed an online tutor that uses GAI and best practices from pedagogy and educational psychology to promote learning in undergraduate science education. We conducted a randomized controlled experiment in a large undergraduate physics course (N = 194) at Harvard University to measure the difference between 1) how much students learn and 2) students’ perceptions of the learning experience when identical material is presented through an AI tutor compared with an active learning classroom.
So, this wasn't a case of comparing an AI tutor with nothing, or even comparing an AI tutor with a traditional static learning experience, but comparing an AI tutor with a teaching approach that has demonstrated high efficacy (active learning). Kestin et al. separated the class into two groups, and conducted their experiment over two weeks (two lessons):
The first week, group 1 engaged with an AI-supported lesson at home while group 2 participated in an instructor-guided active learning lecture. The conditions were reversed the following week. To establish baseline knowledge, students from both groups completed a pre-test prior to each lesson—focusing on surface tension in the first week and fluid flow in the second. Following the lessons, students completed post-tests to measure content mastery and answered four questions aimed at gauging their learning experience, including engagement, enjoyment, motivation, and growth mindset.
After controlling for the pre-test score, midterm exam score, and a measure of prior proficiency in physics, Kestin et al. found that:
...controlling for all these factors, the students in the AI group performed substantially better on the post-test compared with those in the active lecture group. We show this to be a highly significant... result with a large effect size.
The effect size was a 0.63 standard deviation greater improvement in knowledge for the AI tutored group, compared with the active learning group. That is a substantial difference! And on top of that, students reported feeling more engaged in the lesson and more motivated to learn.
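For readers curious about what "controlling for" those covariates looks like in practice, here is a minimal sketch of a covariate-adjusted regression of the kind described above, using simulated stand-in data. The column names (post, pre, midterm, proficiency, ai_tutor) and the convention for standardizing the effect (treatment coefficient over the outcome's standard deviation) are my assumptions for illustration, not taken from Kestin et al.'s replication materials.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 194  # sample size reported in the paper

# Simulated stand-in data: ai_tutor = 1 for the AI-tutored condition.
# All column names here are hypothetical, not from the paper's data.
df = pd.DataFrame({
    "ai_tutor": rng.integers(0, 2, n),
    "pre": rng.normal(50, 10, n),
    "midterm": rng.normal(70, 12, n),
    "proficiency": rng.normal(0, 1, n),
})
# Build a post-test score with a built-in treatment effect, for illustration only.
df["post"] = (0.5 * df["pre"] + 0.2 * df["midterm"]
              + 5 * df["ai_tutor"] + rng.normal(0, 8, n))

# Regression adjusting for pre-test score, midterm score, and prior proficiency.
model = smf.ols("post ~ ai_tutor + pre + midterm + proficiency", data=df).fit()
print(model.summary().tables[1])

# One common standardization: treatment coefficient divided by the outcome's
# standard deviation (the paper's 0.63 may be defined somewhat differently).
effect_size = model.params["ai_tutor"] / df["post"].std(ddof=1)
print(f"Standardized effect size: {effect_size:.2f}")

The key point is that the coefficient on the treatment indicator captures the AI-versus-active-learning difference after netting out students' baseline knowledge and ability, which is what makes the 0.63 standard deviation figure interpretable as a learning gain rather than a selection effect.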
All in all, this was quite a convincing endorsement of AI tutoring. However, Kestin et al. then conclude that:
As in a “flipped classroom” approach, an AI tutor should not replace in-person teaching—rather, it should be used to bring all students up to a level where they can achieve the maximum benefit from their time in class.
If I had one gripe about the paper, it is that Kestin et al. didn't look at differences in effect size between the top students and the bottom students. Having said that, this was a Harvard physics class, so it's not clear what an analysis of heterogeneity might tell us (because even the average Harvard student is much better than most). My worry is that, like blended learning and a lot of other initiatives that we could put in place as teachers, an AI tutor has the potential to increase the divide between the most engaged students and the least engaged students. And that's part of the reason why Harriet, our ECONS101 tutor, is being used as a complement to other ways that my ECONS101 students can improve their learning. I don't see the AI tutor replacing the human tutor entirely just yet.
[HT: Marginal Revolution]