Saturday, 25 June 2022

More on teaching evaluations and grade inflation

My Study Leave period is now over, and I've been preparing for my B Trimester teaching. At this time, many teachers would go back over past student evaluations, to determine what they need to change or update in their teaching. Instead, I tend to keep a detailed list of changes from the end of the previous teaching period. For example, my current list of changes for ECON102 has about 90 minor tweaks to various topics, often incorporating new examples or new ways of organising or explaining the ideas. Still, student evaluations could in theory be a useful tool for improving teaching. However, as I've outlined in many past posts (for example, see this post and the links at the end of it), student evaluations of teaching (SETs) have a number of serious problems, particularly gender bias.

However, student evaluations also create incentive problems. Students tend to evaluate courses higher if they get a better grade. Teachers know that student evaluations affect their chance of promotion or advancement (even if research performance is considered more important by universities). So, teachers have an incentive to give students higher grades, and in turn receive better teaching evaluations as a result. I've posted on this incentive effect before.

Some further evidence of the relationship between grades and student evaluations is provided in this 2008 article by Laura Langbein (American University), published in the journal Economics of Education Review (ungated earlier version here). Langbein uses data from over 7600 courses taught at American University over the period from 2000 to 2003. She notes that even over that short period, grade inflation is apparent in the data, as:

...the mean grade in 100-level courses increased from 3.1 to 3.2; the percent who earn less than a B in these courses dropped from 28% to 25%. For 200-level courses, the mean grade increased from 3.1 in Fall 2000 to 3.2 in Spring 2003, and the percent earning less than a B dropped from 29% to 24%. For 300- level courses, the mean grade remained unchanged at 3.3, but the percent below B dropped from 22% to 18%, and the median grade increased from B+ to A-. Among higher level classes, no such clear pattern of aggregate grade inflation is apparent.

Langbein then shows that students' actual grades and expected grades are both correlated with the students' evaluation of teaching, finding that:

...that the impact of a unit increase in the expected grade (say, from B to A, which contains most of the observations) would raise the instructor’s rating by an average of nearly 0.6 on a 6-point scale...

...a one-point increase in the average actual grade (say, from B to A, which is also where the observations lie) raises the SET by about 0.1 point on a 6-point scale...

Those results in themselves aren't necessarily a cause for concern. If the teaching is good, students should learn more (better actual and expected grades), and should evaluate the teaching higher. On the other hand, the same correlation could arise because students who are receiving and expecting a higher grade 'reward' the teacher with a higher evaluation of their teaching, regardless of the actual quality of the teaching.

To try and disentangle those two competing explanations, Langbein applies a Hausman endogeneity test. Essentially, she tests whether there is reverse causality between the actual grade in a course and the SET - that is, whether the higher SETs cause higher grades (as well as higher grades causing higher SETs). She finds that:

While the results... clearly uphold the conclusion that faculty are rewarded with higher SETs if they reward students with higher grades, the sign of the residual variable depends on the specification of the endogeneity test. Under one specification, the sign of the residual is negative; under the other, it is positive. Consequently, the results give no clear indication about whether some component of the SET is a measure of ‘‘good’’ teaching and more learning, or easy class content and less learning.
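To make the logic of this kind of test a bit more concrete, here is a minimal sketch of the residual-inclusion (Durbin-Wu-Hausman) form of an endogeneity test, run on simulated data. To be clear, this is not Langbein's specification or her data: the instrument, the variable names, and all of the coefficients below are assumptions that I have made up purely for illustration. The idea is to regress the grade on an instrument, keep the residual from that first stage, and then add that residual to the SET regression. If the coefficient on the residual is clearly different from zero, the grade is behaving as an endogenous variable.

```python
import numpy as np


def ols(X, y):
    """Ordinary least squares: coefficients, conventional standard errors, residuals."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = (resid @ resid) / dof
    se = np.sqrt(sigma2 * np.diag(XtX_inv))
    return beta, se, resid


def residual_inclusion_test(outcome, regressor, instrument):
    """Return the coefficient and t-statistic on the first-stage residual."""
    const = np.ones(len(outcome))
    # First stage: explain the suspect regressor using the instrument.
    _, _, first_stage_resid = ols(np.column_stack([const, instrument]), regressor)
    # Second stage: add the first-stage residual as an extra regressor.
    X = np.column_stack([const, regressor, first_stage_resid])
    beta, se, _ = ols(X, outcome)
    return beta[2], beta[2] / se[2]


# Simulated data only. Every number here is an illustrative assumption.
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)  # hypothetical instrument that shifts grades
u = rng.normal(size=n)  # unobserved factor (e.g. course leniency)
grade = 3.0 + 0.5 * z + 0.5 * u + rng.normal(scale=0.3, size=n)

# Case 1: the unobserved factor also drives the SET, so grade is endogenous.
set_endog = 4.0 + 0.1 * grade + 0.5 * u + rng.normal(scale=0.3, size=n)
# Case 2: the SET depends on grade only, so grade is exogenous.
set_exog = 4.0 + 0.1 * grade + rng.normal(scale=0.3, size=n)

coef_endog, t_endog = residual_inclusion_test(set_endog, grade, z)
coef_exog, t_exog = residual_inclusion_test(set_exog, grade, z)

print(f"endogenous case: residual coef = {coef_endog:.3f}, t = {t_endog:.1f}")
print(f"exogenous case:  residual coef = {coef_exog:.3f}, t = {t_exog:.1f}")
```

In the first simulated case the residual term should come out strongly significant, and in the second it should not. Note that the sign of the residual coefficient depends on how the unobserved factor enters the model, which gives some intuition for why a sign that flips between specifications leaves the interpretation open.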

So, while higher grades do lead to higher SETs, Langbein isn't able to definitively tell us why, or whether SETs are a good measure of teaching quality. However, from other research we already have good reason to doubt SETs, due to bias. Add this research to the (growing) list of papers that suggest that student evaluations can't tell us anything meaningful about teaching quality, and instead simply create incentives for grade inflation.
