An iterative approach to developing, refining and validating machine-scored constructed response assessments

Title: An iterative approach to developing, refining and validating machine-scored constructed response assessments
Publication Type: Conference Paper
Year of Publication: 2016
Authors: Prevost, L, Bierema, AM-K, Kaplan, D, Knight, J, Lemons, PP, Lira, CT, Merrill, JE, Moscarella, R, Nehm, RH, Sydlik, MAnne, Urban-Lurain, M
Conference Name: AAAS Envisioning the Future of Undergraduate STEM Education
Date Published: 04/2016
Publisher: American Association for the Advancement of Science
Conference Location: Washington, DC
Keywords: AACR, automated scoring, Lexical analysis
Abstract:
Need: The Automated Analysis of Constructed Response (AACR) project seeks to develop a community of faculty who use evidence-based practices to improve instruction by presenting faculty with novel assessment platforms for written assessment. Written assessments give faculty in-depth evidence of student learning because they capture student understanding in students’ own words. However, written assessments are used infrequently in undergraduate biology courses, particularly courses with high student enrollment, because of the time and effort necessary to read responses and provide feedback.
Goals: The primary AACR goals are to (1) provide the means for faculty to gather evidence on student learning using formative written assessments and computerized analysis tools and (2) facilitate widespread use of these written assessments. The goal of the question development group within AACR is twofold: (1) to develop a suite of formative written assessments in biology, chemistry and statistics that uncover student conceptual difficulties and (2) to develop text analysis and machine learning models that automatically analyze student writing, providing faculty with immediate feedback.
Approach: Our approach is to use pre-existing concept inventories, the science education literature, and interviews with faculty to identify areas of biology, chemistry and statistics where students have persistent conceptual difficulties. We then develop questions that target these conceptual difficulties. Questions are refined based on input from faculty and data from student interviews. Questions are piloted and revised so that answers can be analyzed by computer. After we have developed a question, we use two approaches to analyze student answers: text analysis and machine learning. Both methods identify and extract words and phrases from student writing that are used to build models of human scoring. The models classify the key concepts or correctness of a response and do so in high agreement with human scoring. Finally, models are piloted in the classrooms of members of our faculty learning communities (FLCs) at six different institutions.
Outcomes: We have developed 53 questions in biology, chemistry, chemical engineering, and statistics. We have collected responses from 7854 students and provided 123 reports to faculty. We have also improved our process of question development through the use of clustering and multinomial logistic regression analyses. We have also created more interactive and user-friendly feedback reports for faculty.
Broader Impacts: Currently, 31 faculty are using AACR assessments and participating in our faculty learning communities. We have also recruited 12 new faculty members across our institutions to join our FLCs and use AACR assessments and resources. Additionally, we have expanded to collaborate with faculty in physics at Michigan State University and Stony Brook University and in statistics at Grand Valley State University. To date, we have disseminated our findings through 37 presentations and 12 journal articles. AACR products are currently available to faculty via 2 websites.
Refereed Designation: Refereed



This material is based upon work supported by the National Science Foundation (DUE grants: 1438739, 1323162, 1347740, 0736952 and 1022653). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.