If you are being judged as part of a competition, is it better to be judged first, or last, or somewhere in the middle? In theory, it shouldn't matter. If judges were impartial and purely rational decision-makers, then your ranking shouldn't depend on where you are in the order of judging. However, judges are human, and humans are typically not purely rational. So, does the order of judging matter?
That is the question addressed in this recent article by Maira Reimão, Rachel Sabbadini, and Eric Rego (all Villanova University), published in the journal Kyklos (sorry, I don't see an ungated version online). Their main analysis uses hand-collected data from 14 seasons of The Great British Bake Off (which really means that they watched every episode of the show and recorded the details they needed for their dataset). They limit themselves to the 'technical challenge' portion of the show. Importantly:
In the technical challenge of each episode, all contestants are required to bake the same dish from a pared-down recipe provided by the judges. When the time allotted is over, the contestants place their dish on their randomly assigned spot on the judging table... The judges blind taste each bake, moving in order from their right to left on the table. The footage is also shown in this order, as judges move from dish to dish, and we are confident from the verbal cues that this is the order in which they try the dishes...
The random allocation of the order in which dishes are tasted (and rated) by the judges is important, because it means that tasting position is unrelated to the quality of the dish, which allows Reimão et al. to test whether the order in which a dish is judged matters for its ranking. However, not content with looking at The Great British Bake Off (GBBO), they also:
...complement this analysis with data from international versions of the GBBO, including The Great Canadian Baking Show and The Great Kiwi Bake Off. Data collection for this portion was limited by accessibility, but we have information from thirteen seasons across six additional English-speaking countries: Australia (4 seasons), Canada (4) Kenya (1), New Zealand (2), South Africa (1), and the United States (1).
That's a lot of baking shows to watch! The judges rank every dish from worst to best, which means that the analysis can take advantage of the full ranking (rather than just relying on whatever was selected as the best). This makes the analysis a bit more statistically efficient, because every dish contributes information and not just the winner. Overall, using the GBBO data they find that:
... dishes tasted first are statistically significantly more likely to be ranked higher by GBBO judges than those tasted later... This primacy effect is quite large—in an episode with 10 dishes, the magnitude represents an advantage of one spot in the ranking. More broadly, dishes tasted first are 14–15 percentage points more likely to be ranked in the top half of all dishes in the technical challenge than those tasted later...
Reimão et al. then look at whether it is just the first dish that receives a boost, or whether order effects are apparent all through the order. They find that there is:
...no significant relationship between order and ranking beyond the first dish, and dishes tasted in the first third are no less likely to be ranked in the top half than dishes tasted later.
In other words, it is the first dish that tends to be ranked higher than the other dishes, rather than there being an effect that runs all the way through the tasting order. Reimão et al. then go on to show similar effects in the international editions of the show, which suggests that this effect might be generalisable to other settings.
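To see why random tasting order is what makes this comparison informative, here is a minimal simulation sketch. It is not the authors' regression model: the normally distributed dish quality and the `first_bonus` parameter (a hypothetical boost to how judges perceive the first dish) are assumptions made purely for illustration.

```python
import random
from statistics import mean


def simulate_episode(n_dishes, first_bonus, rng):
    """Return the rank (1 = best) of the dish tasted first in one simulated episode."""
    # Tasting positions are random, so true quality is unrelated to position.
    quality = [rng.gauss(0, 1) for _ in range(n_dishes)]
    # Hypothetical primacy effect: the first dish is perceived as a bit better.
    perceived = list(quality)
    perceived[0] += first_bonus
    # Rank 1 goes to the dish with the highest perceived score.
    ranking = sorted(range(n_dishes), key=lambda i: perceived[i], reverse=True)
    return ranking.index(0) + 1


def mean_first_rank(n_episodes, n_dishes, first_bonus, seed=0):
    """Average rank of the first-tasted dish over many simulated episodes."""
    rng = random.Random(seed)
    return mean(
        simulate_episode(n_dishes, first_bonus, rng) for _ in range(n_episodes)
    )


no_bias = mean_first_rank(20000, 10, first_bonus=0.0)
with_bias = mean_first_rank(20000, 10, first_bonus=0.5)
print(f"Mean rank of first dish, no primacy effect: {no_bias:.2f}")
print(f"Mean rank of first dish, with primacy effect: {with_bias:.2f}")
```

With no bonus, the first dish's average rank should sit near the middle of the field (5.5 out of 10), because position carries no information about quality. Any systematic departure from that in real data can then be attributed to the order of judging rather than to better bakers happening to go first.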
Some further evidence is provided in this recent article by Real Arai (Ryukoku University) and Ryosuke Okazawa (Osaka Metropolitan University), published in the Journal of Economic Behavior and Organization (open access). They look at the effects of order in the judging of a Japanese comedy television show. As they explain:
We analyze data from ‘‘Bakusho On-Air Battle’’ and ‘‘Onbato-Plus’’, competition-style comedy shows held by NHK (Nihon Housou Kyokai, Japan Broadcasting Corporation) once a week from April 1999 to March 2014.
Similar to GBBO, there is some randomisation involved:
In every contest, ten comedy groups, possibly including solo performers, give their performances in turn. Except for the time limit of approximately five to six minutes, there are no restrictions on the performance style and number of group members.
The performance orders are randomly determined using a lottery before each contest begins. Comedians participating in the contest retrieve a ball from a box containing ten balls with numbers on them indicating the order of performance, and perform in the order of their selection...
The contests adopt a step-by-step procedure as its judging system. One hundred amateur program viewers serve as judges and evaluate each performance. Immediately after each performance, the judges are required to choose whether it is worth broadcasting. If the judge approves of the performance, he or she casts a favorable vote. If not, he or she does not vote. Given that there are no quotas, judges can approve as many groups as they want, although the program airs only the top-five performances... Voting takes place after each performance, but the results are announced only after all performances have been completed.
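The two outcomes that Arai and Okazawa analyse (vote share and selection into the top five) follow directly from this judging procedure. A minimal sketch, using made-up approval counts and assuming that ties are broken by performance order (the excerpt does not say how the show handles ties):

```python
def vote_shares(approvals, n_judges=100):
    """Map each act to the share of judges who cast a favourable vote."""
    return {act: votes / n_judges for act, votes in approvals.items()}


def top_five(approvals):
    """Return the five acts with the most approvals.

    Ties keep performance order (an assumption, not a rule from the show):
    Python's sort is stable, so equal counts stay in dictionary order.
    """
    ranked = sorted(approvals, key=lambda act: approvals[act], reverse=True)
    return ranked[:5]


# Hypothetical approval counts for ten acts, listed in performance order.
approvals = {
    "act01": 81, "act02": 54, "act03": 62, "act04": 47, "act05": 70,
    "act06": 39, "act07": 58, "act08": 66, "act09": 51, "act10": 73,
}

shares = vote_shares(approvals)
aired = top_five(approvals)
print(aired)
```

Because judges vote immediately after each performance and there is no quota, each act's vote share is recorded before the judges have seen the later acts, which is why performance order can matter.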
Arai and Okazawa look at how the order of performance affects the average vote share and the probability of being selected in the top five performances, using data from 4774 contests. They find that:
...vote share increases by approximately 9.3% for the first position and 3.1% for the last position compared with the second through ninth middle positions. Both the estimates are statistically significant at the 1% level...
...the probability of a comedian group earning a slot in the top five increases by approximately 20% if assigned to the first position. The first-position effect on the probability of being in the top five is statistically significant and robust, even if we control for the additional covariates. This also shows that being assigned to the last position increases the probability of earning a slot in the top five by approximately 5%...
Arai and Okazawa then look to explain why the first position earns a better ranking:
A promising explanation for the first-position advantage may be the calibration effect. The judges tend to give default evaluations of early performances to preserve the freedom of the judging process. This may give an advantage to the first competitors in a contest with numerous successful applicants. The implication is that an advantageous position in the sequential evaluation of a contest will depend on how competitive it is.
Perhaps. When judges are unsure how the rest of the competitors will rate, they may tend to rate the first competitor close to the middle of the range (a default). They then rate subsequent competitors relative to the first one they rated. This might also apply when judges are only asked to provide their rating (or ranking) after considering all of the competitors. I think there would need to be more research in order to unpick these effects further.
These results have important implications outside of the context of television competitions. Selection panels for awards or job interviews must consider applicants sequentially. Teachers grading assessments must consider their students' work sequentially. In both of these cases, it is worth considering whether there is a bias in favour of those who are considered first. Do the first interviewees have an advantage? Does the first student whose work is marked have an advantage? If so, then we need to consider some way of mitigating that bias. When it comes to grading, I often come to the end of an assessment, and go back and look over the first ones that I marked, in order to assure myself that I haven't been overly easy (or hard) on the first ones. Perhaps that approach needs to be adopted more widely.
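For grading, that check can be built into a simple marking plan. The sketch below is only an illustration: the function name `marking_plan`, the `recheck_count` parameter, and the idea of shuffling the marking order (so that the same students are not always marked first) are my assumptions, not something tested in either paper.

```python
import random


def marking_plan(submissions, recheck_count=3, seed=None):
    """Return a shuffled marking order and the scripts to revisit at the end."""
    rng = random.Random(seed)
    order = list(submissions)
    # Shuffle so that no student is systematically marked first.
    rng.shuffle(order)
    # The first few scripts marked are the ones to look over again.
    recheck = order[:recheck_count]
    return order, recheck


order, recheck = marking_plan(["s1", "s2", "s3", "s4", "s5"], recheck_count=2, seed=1)
print("Marking order:", order)
print("Re-check at the end:", recheck)
```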
Judges are not impartial and purely rational. These results tell us that judges are affected by a 'serial position bias', where the order in which they consider competitors affects their ranking (most positively for the competitor who is considered first). With this in mind, the next time you are a contestant on The Great Kiwi Bake Off, you should hope that your technical challenge bake is judged first.