Sunday, 5 January 2025

Robert Bray's insights on using generative AI in teaching

I think we are barely scratching the surface of the ways that we can use generative AI for innovative teaching. However, new and exciting applications are appearing all the time. For example, I really enjoyed this new and reflective paper by Robert Bray (Northwestern University). Bray discusses how he has used large language models in teaching data analytics, but I think his examples can be adapted to work well in many other disciplines.

I recommend reading Bray's paper in its entirety (skipping over the specifics of the R programming if that isn't your thing). However, there are a few key bits that I want to highlight here. Most importantly, generative AI is not something that teachers can ignore. This shouldn't need to be said, but I suspect that too many of my colleagues are trying to avoid engaging with it, engaging with it only superficially, or playing 'whack-a-mole' with their approaches to assessment in order to limit students' opportunities to cheat using generative AI. Incorporating AI into teaching is also going to change the way that teaching and learning need to be framed. Bray notes that:

The most important new work introduced by AI is learning how to use AI.

I explain to my students that they can think of my R instruction as a pretext for the real lesson of the course, which is learning how to leverage AI. If you’re a regular ChatGPT user, you may wonder what there is to learn, as conversing with ChatGPT is so natural. Well, some people are natural in front of a camera, some are natural in front of an audience, and some are natural in front of an LLM. If you’re in this third category, count yourself lucky, because most people are not. Most people must explicitly learn how to use LLMs.

Using generative AI effectively is an important transferable skill. Our students will benefit in the labour market if we can get them interacting with these tools and learning key skills in prompting and collaboration. Bray gives many examples of how courses can include meaningful interactions with generative AI, but I particularly liked two of them: (1) turning homework into an AI tutoring session using an 'AI assistant'; and (2) using AI to engage in learning by teaching.

On turning homework into an AI tutoring session, Bray notes that:

LLMs give rise to a new homework modality: the AI tutoring session. Rather than save a homework as a PDF or a Canvas assignment, you can embed it in an AI assistant that walks students through the assignment, like a tutor would... For example, I asked students to collaborate with a custom-made GPT on a set of study questions before each class in 2024. Students would “submit” these assignments by sending the grader a link to the chat transcripts.
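Bray implements this with a custom GPT, but the underlying idea is simple: a system prompt that embeds the assignment alongside some tutoring rules. As a rough illustration (my own sketch, not Bray's actual setup: the model name, prompt wording, and study questions are all hypothetical, and I'm assuming the httr and jsonlite packages with an API key in the OPENAI_API_KEY environment variable), the same thing can be done from R:

```r
# A minimal sketch of a homework-as-tutor assistant via the OpenAI chat
# completions API. This is my illustration, not Bray's setup: the model
# name, prompt wording, and study questions are hypothetical. Requires the
# httr and jsonlite packages, and an API key in OPENAI_API_KEY.
library(httr)
library(jsonlite)

# Send a conversation (a list of role/content pairs) and return the reply.
ask_tutor <- function(messages, model = "gpt-4o") {
  resp <- POST(
    "https://api.openai.com/v1/chat/completions",
    add_headers(Authorization = paste("Bearer", Sys.getenv("OPENAI_API_KEY"))),
    content_type_json(),
    body = toJSON(list(model = model, messages = messages), auto_unbox = TRUE)
  )
  content(resp)$choices[[1]]$message$content
}

# The system prompt embeds the assignment itself, plus the tutoring rules:
# guide the student with hints and questions rather than revealing answers.
tutor_prompt <- paste(
  "You are a tutor for an introductory R data-analytics course.",
  "Walk the student through the study questions below, one at a time.",
  "Give hints and ask guiding questions; only reveal a full solution",
  "after the student has made a genuine attempt.",
  "Q1: Using the built-in mtcars data, compute the mean mpg for each",
  "number of cylinders. Q2: Briefly interpret the pattern you find."
)

chat <- list(
  list(role = "system", content = tutor_prompt),
  list(role = "user", content = "Hi! I'm ready to start on question 1.")
)
cat(ask_tutor(chat))
```

In a real deployment the full conversation, not just the last reply, would be kept and appended to, so that the transcript can be submitted for grading as Bray describes.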

This is definitely something that I have been considering for my classes, particularly for students who cannot attend on-campus tutorials. Bray gives fairly detailed instructions on how to set up ChatGPT as an AI assistant, and notes that students actually preferred the AI tutoring sessions to regular homework. However, there is a downside here, not for the lecturer but for the human tutoring team. Bray notes that:

I hired four tutors for my class in 2022 to help students work through the labs and master the R syntax. I didn’t hire any tutors in 2023, however, because I wanted my students to practice querying ChatGPT.

Bray also notes that:

ChatGPT is an ideal tutor: It provides immediate, thoughtful, and voluminous feedback on any topic; it has infinite patience and is incapable of scrutiny; and it’s superior at parsing and correcting sloppy code. In fact, ChatGPT’s one-on-one instruction is so good that my office-hours attendance fell from around six per week in 2022 to about three per quarter in 2023. And the textbook I wrote is even more obsolete: hardly any students download their free copy.

The days of humans tutoring other humans may well be numbered (and that number is rather small). I will be sorry to see the end of human tutors, not least because tutoring provided a key means for top students to signal their combined technical and interpersonal skills to employers. The loss of that signal will clearly make those students worse off.

On learning by teaching, Bray notes that:

ChatGPT gives you a superpower: the ability to turn students into teachers... The best way to learn something is to teach it to someone else, but before now, there was no easy way to flip the roles during class and cast students as teachers. AI gives us three new techniques for doing so. First, you can use the chatbot to role play as a student: Give a lesson to the class and have students teach the AI what they learned. Second, you can use the chatbot to parallelize assessment: Have all students propose solutions to a problem and use AI to identify the students who should share their answers with the class. And third, you can use the chatbot to parallelize instruction: Have students learn different material with different chatbots and then reconvene to teach each other what they learned.
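The second of those techniques is the most straightforward to prototype in code. Here is a hedged sketch (again mine, not Bray's: the helper function, prompt, and student answers are all invented for illustration, under the same package and API-key assumptions as above) of pooling students' short answers and asking the model which ones the class should discuss:

```r
# A sketch of "parallelizing assessment": pool anonymised student answers
# and ask the model which are worth discussing in class. My illustration,
# not Bray's code; the answers below are invented and the model name is an
# assumption. Requires httr, jsonlite, and an OPENAI_API_KEY.
library(httr)
library(jsonlite)

ask <- function(messages, model = "gpt-4o") {
  resp <- POST(
    "https://api.openai.com/v1/chat/completions",
    add_headers(Authorization = paste("Bearer", Sys.getenv("OPENAI_API_KEY"))),
    content_type_json(),
    body = toJSON(list(model = model, messages = messages), auto_unbox = TRUE)
  )
  content(resp)$choices[[1]]$message$content
}

# Hypothetical answers to "Why might mean mpg fall as cylinder count rises?"
answers <- c(
  A = "Bigger engines burn more fuel, so mpg drops.",
  B = "It may be confounded: heavier cars have more cylinders and worse mpg.",
  C = "Because mtcars is from 1974 and old cars are inefficient."
)

prompt <- paste0(
  "Below are anonymised student answers, one per line, labelled by ID.\n",
  paste(names(answers), answers, sep = ": ", collapse = "\n"),
  "\nPick the two most worth discussing in class (one strong, one ",
  "instructively flawed), and explain why in one sentence each."
)

cat(ask(list(list(role = "user", content = prompt))))
```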

Sadly, this again highlights what is lost when we give up human tutoring: the tutors themselves benefit from improving their skills. However, all is not lost, since students may still benefit from learning by teaching if it is incorporated into classes.

Finally, we may worry that generative AI will make our papers too 'easy', and reduce our ability to distinguish between good students and not-so-good students. Bray notes early in the paper that introducing generative AI into his courses didn't much affect the grade distribution. He offers a number of potential explanations:

Several factors muted ChatGPT’s effect on grades. First, the AI didn’t convert incorrect answers into correct answers as frequently as it converted egregiously incorrect answers into slightly incorrect answers. And since all wrong answers yielded zero points, most ChatGPT improvements didn’t increase student scores...

Second, correct AI code wouldn’t always translate into correct answers...

Third, whereas my pre-AI students learned to code, my post-AI students learned to code with ChatGPT, an entirely different proposition. Like second-generation immigrants who understand but can’t speak the mother tongue, my post-AI students could read but not write R code unassisted. Accordingly, most of my students were at the chatbot’s mercy...

Fourth, offloading the low-level details to a chatbot may have compromised the students’ high-level understanding... Several students expressed regret, in their course evaluations, for outsourcing so much of the thinking to the chatbot:

Since ChatGPT did most of the heavy lifting, I feel like I didn’t learn as much as I wanted. Especially in data analytics.

Because we relied so heavily on ChatGPT—I truly don’t know what a lot of R even means or what I would use to complete tasks. As well, it was hard to stay engaged.

It was occasionally the case that I would mindlessly complete the quiz without fully knowing what I was doing due to the time constraint, but I got away with it since ChatGPT is so good at coding. If there is a way to effectively force students to think about how to use ChatGPT rather than simply pasting prompts, then that could prove more impactful...

Fifth, students often developed tunnel vision because crafting GPT prompts would command their undivided attention. Indeed, the students largely ignored the template solutions we covered in class, opting to spend their limited quiz time conversing with the chatbot rather than perusing their notes...

Sixth, echoing the Peltzman Effect, students used ChatGPT to improve their performance and to decrease their study time. The reported weekly study time in the compulsory and elective sections fell from an average of 3.88 and 4.85 hours in 2022 to an average of 2.62 and 3.57 hours in 2023 (the former drop is statistically insignificant, but the latter is statistically significant at the p = 0.01 level). Furthermore, 22% of students reported not studying for quizzes, which would have been inconceivable in 2022.

Those explanations are quite convincing, but the last one in particular (the Peltzman effect) is one that I have noticed over many years. Whenever I introduce some new innovation into the teaching of one of my papers, which I expect to make students' studying experience easier and more effective, the result is that many students spend less time on my paper and redistribute their scarce study time to other papers that they now find more difficult in relative terms. It is quite a dispiriting experience, but entirely understandable from the students' perspective. They have limited time available for study, so at the margin their next hour of study time is best spent where it will generate the greatest gains. That tends to be in the paper that is more difficult, rather than the one that is easier.

What we can take away from Bray's grade distributions is that we can't necessarily expect big learning gains from incorporating AI into teaching. So, why should we do it? Aside from wanting to set our students up with important transferable skills (as noted above), Bray notes in the conclusion to the paper that:

Simply put, ChatGPT made investing in my class fun again. AI allowed me to do things that had never before been done in the classroom. I got hooked on finding AI-empowered teaching innovations.

I want my teaching to be fun. Not just for the students, but for me.

[HT: Marginal Revolution]
