Back in December last year, I briefly discussed why we eliminated Moodle quizzes from my ECONS101 assessment (we still have quizzes, but they are not for credit, and they happen every day - more on that in a future post). The problem with Moodle quizzes is that there are browser extensions that will automatically answer Moodle quizzes using generative AI. That makes Moodle quizzes largely a waste of time as an assessment tool (although I believe that they still have value as a learning tool).
Of course, Moodle quizzes are not the only assessments that have been rendered obsolete by generative AI. As Justin Wolfers notes, any high-stakes at-home assessment is now essentially worthless. But it's not just high-stakes assessment. Problem sets and homework assignments are also affected. This new article by Rachel Faerber-Ovaska (Youngstown State University) and co-authors, published in the journal Bulletin of Economic Research (open access), asks the question, "Has ChatGPT made economics homework questions obsolete?"
Faerber-Ovaska et al. test the ability of ChatGPT to answer 1112 multiple-choice and 186 long answer questions (which they call essay questions) from the question banks of the 2nd edition of Principles of Economics by Greenlaw and Shapiro. They find that:
The bot answered 67.63% of the 1112 multiple-choice questions correctly.
Faerber-Ovaska et al. then looked at the characteristics of the questions that ChatGPT got wrong, and report that:
The inclusion of tables or figures, as well as higher levels of difficulty, were found to significantly decrease the odds of ChatGPT answering correctly. For example, the model estimated odds of ChatGPT answering a question with a table correctly were only 0.45, corresponding to an 80% lower probability compared to a question with no table. Overall, the bot struggled with material from chapters requiring visual interpretation, such as supply and demand, elasticity, theory of the firm, and financial economics.
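As a quick aside on reading numbers like these: odds are not the same thing as a probability. Odds are p / (1 - p), so converting back means dividing the odds by one plus the odds. Here is a minimal sketch in Python of that standard conversion. The function names are my own, chosen for illustration, and are not taken from the paper.

```python
def odds_to_probability(odds: float) -> float:
    """Convert odds, defined as p / (1 - p), into the probability p."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1.0 + odds)


def probability_to_odds(p: float) -> float:
    """Convert a probability p into odds p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must be in the interval [0, 1)")
    return p / (1.0 - p)


# The odds of 0.45 come from the passage quoted above.
p_table = odds_to_probability(0.45)
print(f"{p_table:.3f}")  # 0.310

# The overall multiple-choice accuracy of 67.63%, expressed as odds.
overall_odds = probability_to_odds(0.6763)
print(f"{overall_odds:.2f}")  # 2.09
```

So odds of 0.45 translate to a probability of about 0.31 of a correct answer, while the overall accuracy of 67.63% is equivalent to odds of roughly 2.09 to 1.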
As for the long answer (essay) questions:
...we found that the bot scored higher for clarity than for content. The bot score for content was an A on 72.0% of questions, whereas for clarity, the bot scored an A on 93.5% of the questions. Overall, for essay questions, the bot’s responses earned an A in 72.0% of the questions and a B in 18.3% of the questions.
It is worth noting that Faerber-Ovaska et al. were testing ChatGPT 3.5, and more recent versions of ChatGPT (and other large language models) are likely to perform even better. Nevertheless, in answering the question they pose, Faerber-Ovaska et al. conclude:
Have the economics homework and test questions we currently rely on been rendered obsolete by ChatGPT? The answer is: yes, as they are used now.
The "as they are used now" is important in that sentence. Economics teachers (and teachers in other disciplines) need to change the way that we do things. Homework may still be effective as a learning tool, but it will not be effective if the approach is simply to have students turn in (or complete online) homework problems that are copy-pasted from a large language model. For the moment, these models are not great at drawing accurate diagrams in economics, but that just means that economics has only a little more time to adapt than some other disciplines. Homework needs to be structured in a way that ensures that students engage, even if they are using generative AI.
In my ECONS101 and ECONS102 classes, our low-tech solution is to have students complete homework in handwritten form only. The homework is not graded, but it is built on in person in tutorials (and completion of the tutorials is worth a small number of marks). This ensures that even if students are using generative AI, they need to engage in class as well. This is reinforced by an approach to assessment that is heavily weighted towards in-person invigilated assessment, where the use of generative AI is unlikely (for now!).
Homework isn't dead (in economics, or in general). But its role in student learning and assessment needs to change.