Assessment & Evaluation in Higher Education
We examine the effects of computer-based versus paper-based assessment of critical thinking skills, adapted from English (in the U.S.) to Chinese. Using data collected through random assignment to the two modes across multiple Chinese colleges, we investigate mode effects from multiple perspectives: mean scores, measurement precision, item functioning (i.e. item difficulty and discrimination), response behavior (i.e. test completion and item omission), and user perceptions. Our findings shed light on assessment and item properties that may be sources of mode effects. At the test level, we find that the computer-based test is more difficult and more speeded than the paper-based test. We speculate that these differences are attributable to the test's structure, its high demands on reading, and the test-taking flexibility afforded by the paper mode. Item-level evaluation allows us to identify item characteristics that are prone to mode effects, including the targeted cognitive skill, response type, and the extent of adaptation between modes. Implications for test design are discussed, and actionable design suggestions are offered with the goal of minimizing mode effects.