Schools, universities, and teachers at all levels are having to grapple with the challenges of student use of generative AI. This new article by Thomas Corbin and colleagues (all from Deakin University), published in the journal Assessment and Evaluation in Higher Education (open access), describes it as a 'wicked problem':
Wicked problems, as originally conceptualised by Rittel and Webber (1973), describe challenges that defy simple solutions... Unlike their counterpart ‘tame’ problems, which have clear definitions and measurable solutions, wicked problems lack definitive formulations, and their solutions are not true or false but rather better or worse, requiring judgment, compromise, and adaptation. This distinction is key because it disrupts the assumption that there is a ‘correct’ policy, assessment method, or institutional response waiting to be discovered. Instead, every approach carries trade-offs, is shaped by context, and must be continually reassessed in response to evolving conditions. For those tasked with navigating wicked problems, this reality has a significant personal toll; every decision feels provisional, every choice open to criticism, and the pressure to find the ‘right’ solution persists even when no such solution exists...
I'm sure that description resonates with many teachers when they think about generative AI and assessment. Corbin et al. back up their assertion that this is a wicked problem with qualitative research, based on interviews with 20 'Unit Chairs', each responsible for running a subject. It would have been interesting if they had also interviewed lecturers, since they are on the front lines in dealing with students' use of generative AI, but I suspect the results would not have differed much.
The results make for interesting reading. Corbin et al. work their way through all of the criteria that Rittel and Webber used to define a 'wicked problem' in their 1973 article (ungated version here). I don't agree with them on all criteria, so I'm going to use this post to push back on a few things. However, I think that their paper does provide some good talking points, starting with:
The first defining feature of wicked problems is that they cannot be clearly or conclusively defined. Unlike technical problems where stakeholders can in theory agree on what needs fixing, wicked problems mean different things to different people and these varying definitions pull solutions in contradictory directions. Without agreement on what the problem is, a singular, cohesive response becomes impossible.
This pretty much captures things, I think:
Consider for example the frustration of the teacher who stated: ‘I’ve spent so much fucking time on developing this stuff. They’re really good as units, things that I’m proud of. Now I’m looking at what AI can do, and I’m like, what the fuck do I do? I’m really at a loss, to be honest’. (T10).
We are all just trying to find our way in the era of generative AI. But no one agrees on what should be done, or even what the problem is (see yesterday's post as one example!). Second:
The second defining characteristic of a wicked problem is that it has no stopping rule – that is, there are no clear criteria for knowing when you have reached ‘the solution’...
When asked about determining success, one teacher responded: ‘How do we actually tell? You can’t’ (T15).
I guess we just do what we can in the moment. However, all of us are looking around at what other people are trying, and constantly wondering if we can do better. I have a solution for my papers. I don't think it is the solution, and it certainly isn't a one-size-fits-all solution for every paper. It seems to be working all right for now, at least. But the benefits of any solution come at the cost of other things that we have to give up. And that is the third characteristic of a wicked problem:
Technical problems have correct answers that can be verified. Wicked problems, on the other hand, have only trade-offs, where every response sacrifices something valuable...
Another unit chair worried: ‘We can make assessments more AI-proof, but if we make them too rigid, we just test compliance rather than creativity’ (T3).
These types of statements illustrate how moves toward assessment security sacrifice something else, be it authenticity, creativity, or real-world relevance.
In my case, we assess knowledge, comprehension, and application (which are low on Bloom's taxonomy), but by adopting in-person tests we forgo the ability to authentically assess higher-level skills such as analysis, synthesis, and evaluation (which, to be fair, shouldn't necessarily be assessed in a first-year paper anyway!).
On the fourth criterion, Corbin et al. note that:
...wicked problems lack clear metrics for testing whether solutions have succeeded...
Several unit chairs expressed uncertainty about whether their assessment adaptations were effective. When asked about determining success, one stated simply: ‘If a student uses AI appropriately for brainstorming, we might never know. If they use it inappropriately, we also might never know’ (T18).
Again, this one definitely depends on assessment style, and in some cases you can tell whether your approach has succeeded. In my case, I am fairly confident that I am able to assess my students' learning in the test environment, and that the use of AI tutors is, if anything, improving that learning (more to come on that point, as I will be reporting on the actual evaluation next month). And that means that I also disagree with Corbin et al.'s next point, which is:
The fifth characteristic of wicked problems is that solutions cannot be found through experimenting with solutions because every attempt has real consequences.
I think you can still try things and see if they work (and if I didn't think that, then I probably wouldn't try things in the first place!). Yes, there are consequences. But there are also consequences to not experimenting at all. The era of generative AI is not going to pause so that we can keep doing what we have always done. We have to embrace the uncertainty! And that links to the next point that Corbin et al. raise, which is that:
...wicked problems present limitless possible approaches with no way to determine if all options have been considered.
Yes, but to be fair that was probably true before generative AI as well. If there were a single silver-bullet solution to teaching and learning, we would all have been using it already. All teachers have their own pedagogical approaches, which hopefully leverage their strengths as teachers and academics while mitigating their weaknesses. And that means that there isn't one approach that will work in all circumstances for all teachers. In fact, I adopt different approaches in different papers, based on what I hope will work (experimenting as I go, and testing whether each approach is successful). And that links to the next point:
The appeal of standardized solutions - whether "best practice" templates or institutional mandates - assumes that similar-looking problems can be solved with similar approaches. But wicked problems resist this logic because each instance emerges from an irreducibly specific context.
Yes, but not necessarily for the reasons that Corbin et al. outline (or, not only for the reasons that they outline). As I noted above, every teacher has different strengths and weaknesses, and so what is best practice for one teacher need not be best practice for everyone else.
The next criterion is:
Wicked problems do not exist in isolation but instead emerge from and reveal deeper structural issues.
Several participants saw AI vulnerabilities as symptoms of institutional business models. One teacher argued: ‘A university like [the one in which I work], which is based on a business model, which is online-based, where you cannot incentivize students to come in person, and all the assessments are based on tasks you ask students to do at home in their own time, this model is the most vulnerable to fraud in an age of AI’ (T9).
Generative AI is not operating in a vacuum, so of course it intersects with other issues. Online assessment was already a problem before generative AI came on the scene. How quickly we have all forgotten Chegg, the bane of online assessment during the lockdowns! Moving on:
The ninth characteristic of a wicked problem is that the way the problem is framed shapes which solutions become possible. This relies on the claim that how we define a problem constrains what kinds of responses can be imagined or pursued. In other words, how we frame the AI and assessment challenge predetermines which solutions appear reasonable and which remain invisible...
When teachers framed AI as a threat to academic integrity, they favoured control-based solutions. One stated: ‘I know I would still prefer exams to come back on campus because it would be the only piece of assessment that we can truly say this is their own work’ (T4)... Those who framed AI as a professional necessity proposed integration: ‘I think GenAI is going to stay, right? It’s already part of the workforce, like us as well. Students need to be able to use it efficiently. The part of their skills they will need to learn would be to use GenAI efficiently’ (T17).
This is definitely an issue. I know colleagues at both ends of this spectrum. The worst part is that I have sympathy for both views (as regular readers of this blog will probably recognise)! But again, there need not be a one-size-fits-all solution here: while AI might be a threat in some papers, it might be an integral part of teaching, learning, and assessment in others. Both of those things can be true at the same time. Finally:
The tenth characteristic of wicked problems is that decision-makers bear full responsibility for the consequences of their choices. Unlike theoretical problems where errors have no real-world impact, those addressing wicked problems are, as Rittel and Webber (1973, 167) note, ‘liable for the consequences of the solutions they generate’...
One teacher worried about graduating unprepared professionals: ‘How many are we missing? Are we in fact sending students out into the workforce who can get through an interview, but when they start doing the job, they can’t?’ (T11). The personal vulnerability this created was articulated starkly: ‘I feel very, very vulnerable within the university running assessments like this because I know that there are pockets of the university management who would really like to just see us do traditional, detached, academic assessments that don’t threaten to push students’ (T6).
As teachers, we do bear some responsibility. The problem, highlighted in the second quote above, arises when the university creates an environment in which institutional practices undermine teachers' ability to ensure that students have met the learning objectives. And too often, teachers are finding themselves in that position. As noted in yesterday's post, Simas Kucinskas made the point that "take-home assignments are obsolete". Our assessments need to reflect that fact, and universities shouldn't be putting teaching staff in a position where they are forced to adopt assessment practices that are no longer fit for purpose. Of course, this would still be an issue even if generative AI and assessment weren't a wicked problem.
While I'm not convinced by all elements of Corbin et al.'s argument, I do agree that generative AI and assessment is a wicked problem. That doesn't mean that we should give up. There are solutions out there, but there is unlikely to be one solution that works for all teachers in all circumstances. We need to keep experimenting and sharing what we learn. That is the only way we will move forward in ensuring that student learning is still assessed in a meaningful way.
Read more: