Applying machine learning in science assessment: a systematic review

Title: Applying machine learning in science assessment: a systematic review
Publication Type: Journal Article
Year of Publication: 2020
Authors: Zhai, X, Yue, Y, Pellegrino, JW, Haudek, KC, Shi, L
Journal: Studies in Science Education
Start Page: 111
Keywords: Assessment, Machine learning, pedagogy, automatic scoring, science education, technology
Abstract: Machine learning (ML) is an emergent computerised technology that relies on algorithms built by ‘learning’ from training data rather than ‘instruction’, which holds great potential to revolutionise science assessment. This study systematically reviewed 49 articles regarding ML-based science assessment through a triangle framework with technical, validity, and pedagogical features on three vertices. We found that a majority of the studies focused on the validity vertex, as compared to the other two vertices. The existing studies primarily involve text recognition, classification, and scoring with an emphasis on constructing scientific explanations, with a vast range of human-machine agreement measures. To achieve the agreement measures, most of the studies employed a cross-validation method, rather than self- or split-validation. ML allows complex assessments to be used by teachers without the burden of human scoring, saving both time and cost. Most studies used supervised ML, which relies on extraction of attributes from student work that was first coded by humans to achieve automaticity, rather than semi- or unsupervised ML. We found that 24 studies were explicitly embedded in science learning activities, such as scientific inquiry and argumentation, to provide feedback or learning guidance. This study identifies existing research gaps and suggests that all three vertices of the ML triangle should be addressed in future assessment studies, with an emphasis on the pedagogy and technology features.
Refereed Designation: Refereed

This material is based upon work supported by the National Science Foundation (DUE grants: 1438739, 1323162, 1347740, 0736952 and 1022653). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.