Monday, 2 January 2023

The effect of computer-based testing on measured student achievement

The coronavirus pandemic meant an immediate shift to online assessment, including online tests. So, I initially thought that this 2019 article by Ben Backes and James Cowan (both American Institutes for Research), published in the journal Economics of Education Review (ungated earlier version here), would be rather interesting. However, it turns out that Backes and Cowan are interested in something slightly different: whether computer-based testing (rather than online testing per se) biases measures of student achievement downwards. Their setting is Massachusetts schools, where:

In 2015, some districts began transitioning to the PARCC [Partnership for Assessment of Readiness for College and Careers] assessment. These districts had the choice of using the paper or online version of the test, and nearly half administered the online format in 2015 or 2016.

Backes and Cowan essentially compare average student performance when the test is conducted in a computer-based mode with average student performance when it is conducted on paper. Their sample of students:

...includes about half of all students enrolled in Grades 3 through 8 between 2011 and 2016 and 88 percent of students in schools administering the PARCC in 2015 and 2016...

That's about 1.1 million student-year observations in some of their analyses. They find that:

...students administered an online exam score systematically lower than if they had taken the test on paper. In particular, students taking the online version of PARCC scored about 0.10 standard deviations lower in math and about 0.25 standard deviations lower in English language arts (ELA) than students taking the paper version of the test...

Our estimates of mode effects in math and ELA represent extremely large changes in measured student learning: up to 5.3 months of learning in math and 11.0 months of learning in ELA in a 9 month school year...
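The "months of learning" conversion is worth unpacking. As a back-of-envelope check (my own arithmetic, not from the paper), the conversion only needs an assumption about how many standard deviations of learning students gain over a nine-month school year, and the annual gains implied by the quoted figures can be backed out directly:

```python
# Back-of-envelope check (my arithmetic, not the paper's): converting an effect
# measured in standard deviations (SD) into 'months of learning' requires an
# assumption about how many SDs students gain over a 9-month school year.
def months_of_learning(effect_sd, annual_gain_sd, school_year_months=9):
    """Express an effect size (in SDs) as months of a typical school year's learning."""
    return effect_sd / annual_gain_sd * school_year_months

# Annual learning gains implied by the figures quoted above:
implied_gain_math = 0.10 * 9 / 5.3    # ~0.17 SD of learning per school year
implied_gain_ela = 0.25 * 9 / 11.0    # ~0.20 SD of learning per school year

print(months_of_learning(0.10, implied_gain_math))   # 5.3 months (math)
print(months_of_learning(0.25, implied_gain_ela))    # 11.0 months (ELA)
```

In other words, if students in these grades gain something like 0.17 to 0.20 standard deviations of achievement per year, then a mode effect of 0.10 to 0.25 standard deviations swallows a large share of a year's measured learning, which is why the quoted figures sound so dramatic.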

And there is some evidence for heterogeneity of the effects:

While we find little systematic evidence of variation in treatment effects by student demographic group in math, we find that ELA mode effects are stronger for students at the bottom of the achievement distribution, for English language learners, and for special education students.

Overall, there is a robust difference in student performance between the computer-based and paper-based versions of the exam, with students doing much better on the paper-based test. However, Backes and Cowan aren't able to explain why. There is some evidence that the effect shrinks as students become familiar with the computer-based mode, although it remains rather large in the second year. That suggests student unfamiliarity with computers might be at play, but the evidence for that explanation is quite speculative.
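To make the comparison concrete, here is a minimal sketch of how a mode effect like this is typically estimated: regress standardised test scores on an indicator for taking the computer-based version, plus controls. The data below are simulated and the variable names are my own; this is not the authors' actual specification, which uses administrative data and much richer controls.

```python
# Toy illustration (simulated data, not Backes and Cowan's specification):
# estimating a 'mode effect' by regressing standardised scores on an indicator
# for taking the computer-based test, with a control for prior achievement.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 10_000
df = pd.DataFrame({
    "computer_based": rng.integers(0, 2, n),   # 1 = online PARCC, 0 = paper
    "prior_score": rng.normal(0, 1, n),        # stand-in for achievement controls
})
# Build in a -0.25 SD mode effect (the ELA estimate quoted above)
df["score"] = 0.7 * df["prior_score"] - 0.25 * df["computer_based"] + rng.normal(0, 0.7, n)

# The coefficient on computer_based is the estimated mode effect in SD units
model = smf.ols("score ~ computer_based + prior_score", data=df).fit()
print(model.params["computer_based"])  # should be close to -0.25
```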

I had hoped this study (based on its title) would tell us a bit about whether online testing was biased in some way (in either direction). However, all it really tells us is that combining the results of tests conducted using different modes is pretty fraught, and we should only do so with great caution. That is an important result, especially as tests such as those that Backes and Cowan study are often used to rank school districts, schools, and even teachers in terms of performance. Still, despite their importance, the results are not particularly surprising.
