ProcData: An R package for process data analysis

Abstract

Process data refer to data recorded in the log files of computer-based items. These data, represented as timestamped action sequences, keep track of the processes by which respondents solve the items. Process data analysis aims to improve the accuracy of educational assessment and to serve other assessment purposes by utilizing the rich information contained in response processes. The R package ProcData presented in this article is designed to provide tools for processing, describing, and analyzing process data. We define an S3 class ‘proc’ for organizing process data and extend the generic methods summary and print for ‘proc’. Two feature extraction methods for process data are implemented in the package; they compress the information in irregular response processes into regular numeric vectors. ProcData also provides functions for fitting a neural-network-based sequence model and making predictions from it. These functions call the relevant functions in the package keras to construct and train neural networks. In addition, the package includes several response process generators and a real dataset of response processes from the climate control item in the 2012 Programme for International Student Assessment.

Publication
Journal of Statistical Software