A commonly used test for dishonesty or cheating in lab experiments is to ask research participants to privately (unobserved by the researcher) roll a number of dice and report the number of sixes they roll, or to privately toss a number of coins and report the number of heads. If research participants are paid for each six they roll, or each head they toss, then they have an incentive to cheat, by reporting more sixes or more heads than they actually obtained. Now, this doesn't give an individual measure of dishonesty, since a research participant might really have tossed six heads in a row (on average, one in every 64 research participants should have that outcome), but it does provide a measure of the extent of cheating across the whole sample (for example, if half of research participants report tossing five or more heads out of six, you can be pretty sure there is a lot of cheating going on).
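The arithmetic behind that reasoning is just the binomial distribution: with fair coins, each toss lands heads with probability one-half, so the tail probabilities are easy to compute. A minimal sketch (using only the standard library; the specific function name is mine, not from the article):

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Probability of at least k heads in n fair tosses (binomial upper tail)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# A single honest participant tosses 6 heads in a row about 1 time in 64.
print(prob_at_least(6, 6))  # 0.015625 (= 1/64)

# Only about 11% of honest participants should report 5 or more heads
# out of 6, so half the sample reporting 5+ heads signals widespread cheating.
print(prob_at_least(5, 6))  # 0.109375 (= 7/64)
```

This is why the task works at the sample level but not the individual level: no single report can be proven dishonest, but the distribution of reports can be compared against the binomial benchmark.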
The real question, though, is whether this measure of dishonesty replicates outside of the lab environment. That is essentially the question addressed in this 2018 article by Alain Cohn (University of Michigan) and Michel André Maréchal (University of Zurich), published in the Economic Journal (ungated earlier version here). Using a sample of 162 public middle and high school students in Switzerland, Cohn and Maréchal started with the lab measure of dishonesty:
Subjects first opened an envelope containing 10 coins, each worth 0.5 Swiss francs (about US $0.55). Then, they were instructed to toss each coin in private and report their outcomes on paper. For every coin toss for which subjects reported the outcome ‘heads’ they were allowed to keep the coin; they had to put the coin back into the envelope otherwise. Participants thus faced a financial incentive to cheat by misreporting the outcomes of their coin flips without any risk of getting caught... The stakes were considerable as the maximum possible payoff in this task corresponds roughly to half the amount students of similar age receive in pocket money every week.
Cohn and Maréchal then looked at the relationship between the number of heads reported and three measures of school misconduct (reported by the students' teachers): (1) disruptiveness in class; (2) non-completion of homework; and (3) absenteeism. Since the three measures of misconduct were highly correlated, they created a single index from them (but their results also appear to hold for each measure individually). They found that:
On average, the students took 62.8% of the coins in the envelopes (95% confidence interval: 60.0%, 65.7%)...
Given that, on average, honest students should have taken 50 percent of the coins, there was a substantial amount of cheating. Who cheated most? Cohn and Maréchal report that:
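To see just how far 62.8 percent is from honest reporting, a back-of-envelope test helps. Assuming the pooled setup from the article (162 students with 10 coins each, so 1,620 tosses in total, an aggregation I am making for illustration), a one-sided test of the reported heads rate against 50 percent gives an astronomically small p-value:

```python
from math import sqrt, erfc

# Assumed pooled setup: 162 students x 10 coins = 1,620 tosses,
# with 62.8% reported heads versus the 50% expected under honesty.
n = 162 * 10
p_hat = 0.628
p0 = 0.5

# One-sided z-test for a proportion (normal approximation to the binomial).
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = 0.5 * erfc(z / sqrt(2))  # upper-tail probability

print(f"z = {z:.1f}")            # roughly z = 10.3
print(f"p-value = {p_value:.1e}")
```

A z-statistic above 10 puts the result far beyond any conventional significance threshold, which is why the sample-level conclusion of substantial cheating is so secure even though no individual student can be identified as a cheater.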
...female students behaved more honestly than male students as they took significantly less coins (p < 0.000, t-test)... Moreover, we found that high school students took significantly less coins than those from middle school after controlling for age (p = 0.011, t-test), which could be explained by less deviant students selecting into higher education. Earnings in the coin tossing task and the two measures of cognitive ability are negatively correlated. However, the correlations do not reach statistical significance, neither for crystallised nor for fluid intelligence (p = 0.599 and p = 0.744, t-tests).
What about school misconduct (cheating outside the lab)? Cohn and Maréchal report that:
...behaviour in the coin tossing task is significantly related to school misbehaviour when controlling for age, gender, nationality, education level and parental education. A higher number of coins taken is associated with increased behavioural problems in school (p = 0.015)... The coefficient estimate implies that the difference in school misbehaviour between students who took 10 coins (presumably cheaters) and those who took five coins (presumably honest individuals) is more than 0.7 points (or 0.53 standard deviations) on average. For comparison, it would require students to differ by 2.7 standard deviations in cognitive ability (i.e. crystallised intelligence) to produce the same difference in school misbehaviour.
So, the important takeaway from this paper is that there is support for the external validity of the lab measure of cheating or dishonesty. At least, to the extent that misbehaviour by school students is a good proxy for dishonesty (which, of course, it is only imperfectly). So, while this should give experimental economists and others a little more comfort in using the lab measures of dishonesty, more studies of the external validity of these measures, in other contexts, are sorely needed.