Sex, Drugs and Economics: Hardly the final words on student evaluations of teaching

Tuesday, 15 February 2022

Hardly the final words on student evaluations of teaching

I've written a few posts this week about student evaluations of teaching (see here and here), and a few others in previous years. One of those earlier posts asked whether student evaluations of teaching are even measuring teaching quality. The research I cited there was a meta-analysis that suggested that there was no correlation between teaching quality (as measured by final grade, final exam mark, or similar) and student evaluations of teaching. However, measuring teaching quality objectively using grades could be problematic if there is reverse causation (for example, if teachers give higher grades in the hope of receiving better teaching evaluations). A better approach may be to use some measure of teacher value-added, such as grades in subsequent classes (with different teachers), or grades in standardised tests (that the teacher doesn't grade themselves).

The former approach, based on grades in subsequent classes, is the one adopted in this 2014 article by Michela Braga (Bocconi University), Marco Paccagnella (Bank of Italy), and Michele Pellizzari (University of Geneva), published in the journal Economics of Education Review (ungated earlier version here). They use data from students in the 1998/99 incoming cohort at Bocconi University, where the students were randomly allocated to teaching classes in all of their compulsory courses (which eliminates problems of selection bias). Looking at the relationship between teacher effectiveness (measured by students' future performance) and current teaching evaluations, Braga et al. find that:

Our benchmark class effects are negatively associated with all the items that we consider, suggesting that teachers who are more effective in promoting future performance receive worse evaluations from their students. This relationship is statistically significant for all items (but logistics), and is of sizable magnitude. For example, a one-standard deviation increase in teacher effectiveness reduces the students’ evaluations of overall teaching quality by about 50% of a standard deviation. Such an effect could move a teacher who would otherwise receive a median evaluation down to the 31st percentile of the distribution.
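
The percentile figure in that quote is easy to check with a little arithmetic. Here is a minimal sketch, assuming that evaluation scores are roughly normally distributed (that is my assumption for illustration, and the function name is hypothetical rather than anything from the paper):

```python
from statistics import NormalDist


def percentile_after_shift(shift_in_sd, start_percentile=50.0):
    """Percentile reached after a score moves by `shift_in_sd` standard deviations.

    Assumes evaluation scores follow a normal distribution (illustrative only).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    start_z = std_normal.inv_cdf(start_percentile / 100)
    return 100 * std_normal.cdf(start_z + shift_in_sd)


# A median teacher whose evaluation falls by half a standard deviation
print(round(percentile_after_shift(-0.5)))  # 31
```

Under that assumption, a fall of half a standard deviation from the median lands at about the 31st percentile, which matches the figure that Braga et al. report.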

Those results are broadly consistent with the meta-analysis results, in that evaluations do not appear to track teaching quality: teachers who do a better job of preparing students for their future studies receive worse teaching evaluations. However, when looking at exam performance in the current class, Braga et al. find that:

...the estimated coefficients turn positive and highly significant for all items (but workload). In other words, the teachers of classes that are associated with higher grades in their own exam receive better evaluations from their students. The magnitudes of these effects is smaller than those estimated for our benchmark measures: one standard deviation change in the contemporaneous teacher effect increases the evaluation of overall teaching quality by 24% of a standard deviation and the evaluation of lecturing clarity by 11%.

They interpret those results as showing that teachers who 'teach to the test' for the current semester receive better teaching evaluations. Braga et al. conclude, unsurprisingly, that:

Overall, our results cast serious doubts on the validity of students’ evaluations of professors as measures of teaching quality or effort.

Even if they are a poor measure of teaching quality or effort, perhaps student evaluations of teaching provide useful information that teachers can use to improve? This 2020 article by Margaretha Buurman (Free University Amsterdam) and co-authors, published in the journal Labour Economics (ungated earlier version here), addresses that question using a field experiment. Specifically, from 2011 to 2013 Buurman et al.:

...set up a field experiment at a large Dutch school for intermediate vocational education. Student evaluations were introduced for all teachers in the form of an electronic questionnaire consisting of 19 items. We implemented a feedback treatment where a randomly chosen group of teachers received the outcomes of their students’ evaluations. The other group of teachers was evaluated as well but did not receive any personal feedback. We examine the effect of receiving feedback on student evaluations a year later...

They find that:

...receiving feedback has on average no effect on feedback scores of teachers a year later. We find a precisely estimated zero average treatment effect of 0.04 on a 5-point scale with a standard error of 0.05...
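
To see what 'precisely estimated' means here, we can back out an approximate 95% confidence interval from the two numbers in that quote. This is a rough sketch, assuming a normal approximation for the estimate (the helper name is mine, not the authors'):

```python
from statistics import NormalDist


def confidence_interval(estimate, std_error, level=0.95):
    """Approximate confidence interval under a normal approximation (an assumption)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return estimate - z * std_error, estimate + z * std_error


# Treatment effect and standard error as quoted by Buurman et al.
low, high = confidence_interval(0.04, 0.05)
print(f"{low:.2f} to {high:.2f}")  # -0.06 to 0.14
```

On a 5-point scale, an interval of roughly -0.06 to 0.14 leaves little room for anything but a very small effect, which is presumably why the authors describe the zero as precise.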

Buurman et al. suggest that this may be because they estimate the effect a year later, and that the evaluation feedback may have shorter-run effects. I don't find that convincing. However, there were differences by gender:

Whereas male teachers hardly respond to feedback independent of the content, we find that female teachers’ student evaluation scores increase significantly after learning that their student evaluation score falls below their self-assessment score as well as when they learn their score is worse than that of their team. Moreover, in contrast to male teachers, female teachers adjust their self-assessment downwards after learning that students rate them less favorably than they rated themselves.

That should perhaps worry us, given the gender bias in evaluations. If it causes female teachers to expend additional effort in trying to improve their teaching evaluations to match those of male teachers, then they will be expending more effort on teaching than the male teachers will for the same outcome. In a university context, that would likely have a negative impact on female teachers' research productivity, with negative consequences for their career. This might be an intervention that is best avoided, unless the gender bias in student evaluations of teaching is first addressed (see yesterday's post for one idea).

As the title of this post suggests, these are hardly the final words on student evaluations of teaching. We still need to understand what works best, what avoids (or minimises) biases against female or minority teachers, and how teachers can best use the outcomes of evaluations to improve their teaching.
