Friday, 4 January 2019

Gender differences in multiple choice answering

In a post in 2017, I discussed multiple choice and constructed response questions in exams. One of the key points of that post was to highlight the gender difference in performance in multiple choice questions (female students do worse, but they do better on constructed response questions). It turns out that result is fairly common in the academic literature, but the reason why female students don't perform as well as male students on multiple choice questions is less clear. Maybe it is that female students don't respond well to high pressure or competitive situations (and multiple choice questions are reasonably high pressure). Women are more risk averse than men, so maybe it is related to that? Female students might be more likely to skip questions in order to avoid the risk of losing marks [*]. Also, men are more overconfident than women, so maybe it is related to that? Male students might be less likely to skip questions, because they are more likely to be sure they have the right answer.

So, I was quite interested to read this 2017 article by Gerhard Riener and Valentin Wagner (both Düsseldorf Institute for Competition Economics), published in the journal Economics of Education Review (sorry, I don't see an ungated version, but it looks like it might be open access anyway). Riener and Wagner conduct an experiment with 2060 German secondary school students across 89 classes in 25 schools, who each sat a maths test (based on the Math Kangaroo questions).

There were three levels of questions (easy, worth 3 marks; medium, worth 4 marks; and difficult, worth 5 marks). Students lost one mark for every question they got wrong. So, there is an incentive not to guess the easy questions if you don't know (because there is a 1/5 chance of getting three marks, and a 4/5 chance of losing a mark, so the expected value of guessing is -0.2 [0.2 * 3 + 0.8 * -1]). There is no incentive either way for the medium questions (the expected value is 0 [0.2 * 4 + 0.8 * -1]), and a positive incentive to guess on the difficult questions (the expected value is 0.2 [0.2 * 5 + 0.8 * -1]). So, Riener and Wagner expect students to skip more of the easy questions, and fewer of the difficult questions. It turns out that wasn't the case, since they:
...find that the number of skipped questions is increasing in difficulty.
I'm not surprised by this at all. Students don't understand expected value, so it wouldn't surprise me that they didn't realise that there was a positive expected value for guessing on the difficult questions. And even then, a positive expected value is only decisive for a risk neutral decision-maker. Since there were only 14 questions in the test (and only 4 difficult questions), it is relatively risky to guess on one of them, in terms of the impact on overall score.
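
The expected values quoted above, and how risky a single guess is, can be checked with a short sketch. It assumes a blind guess among five equally likely options (implied by the 1/5 chance mentioned earlier) and a one-mark penalty for a wrong answer. The function and variable names are my own, for illustration only, and are not taken from Riener and Wagner's paper.

```python
from fractions import Fraction

N_OPTIONS = 5   # assumption: five answer options, implied by the 1/5 chance of a correct guess
PENALTY = -1    # marks lost for a wrong answer, as described in the post
MARKS = {"easy": 3, "medium": 4, "difficult": 5}


def guess_stats(marks, n_options=N_OPTIONS, penalty=PENALTY):
    """Return (expected value, variance) of the marks from one blind guess."""
    p = Fraction(1, n_options)  # probability of guessing correctly
    ev = p * marks + (1 - p) * penalty
    var = p * (marks - ev) ** 2 + (1 - p) * (penalty - ev) ** 2
    return ev, var


for level, marks in MARKS.items():
    ev, var = guess_stats(marks)
    sd = float(var) ** 0.5
    print(f"{level}: expected value = {float(ev):+.1f}, standard deviation = {sd:.2f}")
```

Under these assumptions, the standard deviation of a single guess (1.6, 2.0, and 2.4 marks for easy, medium, and difficult questions) is many times larger than the expected gain or loss of at most 0.2 marks. That is consistent with a risk averse student choosing to skip, even when the expected value of guessing is positive.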

Those weren't the main results from the paper though, which concerned the gender differences and the experiment. The experiment was that students in some classes were rewarded if their test score was better than their earlier mid-term result (they didn't know about the experiment until after the mid-term, so there is no risk of the students engaging in strategic behaviour). The reward (according to Riener and Wagner, a source of extrinsic incentive to do well in the test) was one of: (1) a medal, awarded in front of the rest of the class; (2) a letter to their parents, praising their good performance; (3) a "no homework" voucher, which entitled them to take a day off homework; or (4) a "surprise gift" (which was actually a combination of (1) and (2)).

They find that:
Females always tend to skip more questions than males regardless of whether they are incentivized or not. However, incentivized pupils tend to skip fewer questions than non-incentivized pupils...
...girls in our low stakes baseline treatment skip significantly more questions than boys... However, the gender gap depends on item difficulty. While girls skip as many questions as boys when items are easy... they skip significantly more questions for medium... and difficult questions... Interestingly, providing extrinsic incentives for performance, and hence increasing the stakes, closes the gender gap in skipping test items. 
So, providing an incentive closed the gender gap. They also find that the gender gap is only present in academic high schools (Gymnasium) and not in vocational high schools (Gesamtschule, Realschule, and Hauptschule). However, I'm not convinced by those results, as the vocational students sat an easier test, with fewer questions used in the analysis, which renders the results not comparable with the academic high school students.

Riener and Wagner argue that their results are:
...suggestive evidence that the gender gap could be explained by a stereotype threat. Girls in high school only skip significantly more questions if the questions are difficult, although the attractiveness of answering is higher for difficult questions than for easy questions. Further support for a stereotype threat explanation is the fact that the gender gap vanishes if the difficulty of the task is made less salient (shifting the focus of pupils to winning an extrinsic reward).
It's possible that this is stereotype threat. However, it is also possible that the types of rewards that they were offering were the types of rewards that girls value more than boys (especially given that the students got to choose their preferred reward). Perhaps the reward increased the stakes of the test for girls, but not for boys? In any case, it's hard for me to see how the offering of the reward reduces stereotype threat. So, although they were able to eliminate the gender difference in performance, I don't think this paper really helps us get to the bottom of why female students usually perform worse than male students in multiple choice questions.


[*] This applies if there are negative marks for a wrong answer. It is a bit harder to argue this when there is no penalty for a wrong answer (or at least, no difference in the penalty between skipping and getting the answer wrong).
