Polish Information Processing Society

Annals of Computer Science and Information Systems, Volume 21

Proceedings of the 2020 Federated Conference on Computer Science and Information Systems

Feasibility of computerized adaptive testing evaluated by Monte-Carlo and post-hoc simulations


DOI: http://dx.doi.org/10.15439/2020F197

Citation: Proceedings of the 2020 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 21, pages 359–367


Abstract. Computerized adaptive testing (CAT) is a modern alternative to classical paper-and-pencil testing. CAT is based on the automated selection of the optimal item corresponding to the current estimate of the test-taker's ability, in contrast to the fixed, predefined items administered in a linear test. Advantages of CAT include lowered test anxiety, shortened test length, increased precision of the estimates of test-takers' abilities, and a lower level of item exposure and thus better security. Challenges include high technical demands on the whole test workflow and the need for large item banks.
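The selection loop the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a two-parameter logistic (2PL) IRT model, picks at each step the unused item with maximal Fisher information at the current ability estimate, and re-estimates ability by a crude grid-based maximum likelihood. All function and parameter names here are illustrative.

```python
import math
import random

def p_correct(theta, a, b):
    # 2PL item response function: probability of a correct answer
    # given ability theta, discrimination a, and difficulty b.
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def fisher_info(theta, a, b):
    # Fisher information of a 2PL item at ability theta.
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def estimate_theta(responses, grid=None):
    # Crude maximum-likelihood ability estimate over a fixed grid;
    # responses is a list of ((a, b), answered_correctly) pairs.
    grid = grid or [x / 10 for x in range(-40, 41)]
    def loglik(theta):
        ll = 0.0
        for (a, b), correct in responses:
            p = p_correct(theta, a, b)
            ll += math.log(p) if correct else math.log(1.0 - p)
        return ll
    return max(grid, key=loglik)

def run_cat(bank, true_theta, test_length=10, seed=42):
    # bank: list of (a, b) item parameters; true_theta drives the
    # simulated test-taker, as in a Monte-Carlo evaluation of CAT.
    rng = random.Random(seed)
    theta = 0.0
    responses = []
    available = list(range(len(bank)))
    for _ in range(test_length):
        # Adaptive step: administer the most informative unused item
        # at the current ability estimate.
        best = max(available, key=lambda i: fisher_info(theta, *bank[i]))
        available.remove(best)
        a, b = bank[best]
        correct = rng.random() < p_correct(true_theta, a, b)
        responses.append(((a, b), correct))
        theta = estimate_theta(responses)
    return theta
```

A real CAT engine would add a proper stopping rule (e.g. a standard-error threshold), exposure control, and a more robust ability estimator, but the item-selection/re-estimation cycle above is the core of the procedure.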
