Abstract
This paper proposes a novel collapsed Gibbs sampling algorithm for diagnostic classification models that marginalizes over the model's item parameters and directly samples latent attribute mastery patterns. Because item parameters are never estimated, the method avoids the boundary problems that can arise when estimating them. A first simulation study showed that the collapsed Gibbs sampling algorithm accurately recovers true attribute mastery status under a variety of conditions. A second simulation showed that it is computationally more efficient than an alternative MCMC sampling algorithm implemented in JAGS. In an analysis of real data, the collapsed Gibbs sampling algorithm showed good classification agreement with results from a previous study.
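To make the collapsing idea concrete, below is a minimal R sketch of a collapsed Gibbs sampler for the DINA model, assuming Beta priors on the guessing and non-slip probabilities and a symmetric Dirichlet prior on class proportions. The function name collapsed_gibbs_dina and the hyperparameters a_g, b_g, a_s, b_s, and delta are illustrative; the sketch omits refinements such as monotonicity constraints and is not the authors' exact implementation.

collapsed_gibbs_dina <- function(X, Q, n_iter = 1000,
                                 a_g = 1, b_g = 1,  # Beta prior on guessing g_j
                                 a_s = 1, b_s = 1,  # Beta prior on non-slip 1 - s_j
                                 delta = 1) {       # symmetric Dirichlet weight
  N <- nrow(X); J <- ncol(X); K <- ncol(Q); C <- 2^K
  patterns <- as.matrix(expand.grid(rep(list(0:1), K)))  # all 2^K attribute profiles
  # Ideal response: eta[c, j] = 1 iff profile c masters every attribute item j requires.
  eta <- 1 * (patterns %*% t(Q) == matrix(rowSums(Q), C, J, byrow = TRUE))

  z  <- sample.int(C, N, replace = TRUE)   # initial class assignments
  m  <- tabulate(z, C)                     # class sizes
  n1 <- colSums(eta[z, ]); n0 <- N - n1    # per-item group sizes (eta = 1 / eta = 0)
  r1 <- colSums(X * eta[z, ])              # correct responses in the eta = 1 group
  r0 <- colSums(X * (1 - eta[z, ]))        # correct responses in the eta = 0 group

  draws <- matrix(NA_integer_, n_iter, N)
  for (t in seq_len(n_iter)) {
    for (i in seq_len(N)) {
      e <- eta[z[i], ]                     # remove examinee i from all counts
      m[z[i]] <- m[z[i]] - 1
      n1 <- n1 - e;        r1 <- r1 - X[i, ] * e
      n0 <- n0 - (1 - e);  r0 <- r0 - X[i, ] * (1 - e)

      # Collapsed (Beta-Binomial posterior-predictive) success probabilities per item,
      # computed from the remaining examinees' counts: item parameters never appear.
      p1 <- (r1 + a_s) / (n1 + a_s + b_s)
      p0 <- (r0 + a_g) / (n0 + a_g + b_g)
      ll <- drop(eta %*% (X[i, ] * log(p1) + (1 - X[i, ]) * log1p(-p1)) +
                 (1 - eta) %*% (X[i, ] * log(p0) + (1 - X[i, ]) * log1p(-p0)))
      logp <- log(m + delta) + ll          # collapsed Dirichlet prior term
      z[i] <- sample.int(C, 1, prob = exp(logp - max(logp)))

      e <- eta[z[i], ]                     # add examinee i back under the new class
      m[z[i]] <- m[z[i]] + 1
      n1 <- n1 + e;        r1 <- r1 + X[i, ] * e
      n0 <- n0 + (1 - e);  r0 <- r0 + X[i, ] * (1 - e)
    }
    draws[t, ] <- z
  }
  list(draws = draws, patterns = patterns)
}

After discarding burn-in iterations, the modal sampled class for examinee i, mapped through patterns, estimates that examinee's attribute mastery profile. Note that because each latent class is anchored to a specific attribute pattern through the Q-matrix, this sampler is far less exposed to the label-switching indeterminacy of unrestricted mixture models.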
References
Bishop, C. M. (2006). Pattern recognition and machine learning. Springer.
Blei, D. M., & Jordan, M. I. (2006). Variational inference for Dirichlet process mixtures. Bayesian Analysis, 1(1A), 121–144. https://doi.org/10.1214/06-BA104
Brooks, S. P., & Gelman, A. (1998). General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics, 7, 434–455. https://doi.org/10.2307/1390675
Chen, Y., Culpepper, S. A., Chen, Y., & Douglas, J. (2018). Bayesian estimation of the DINA Q matrix. Psychometrika, 83, 89–108. https://doi.org/10.1007/s11336-017-9579-4
Chen, Y., Culpepper, S., & Liang, F. (2020). A sparse latent class model for cognitive diagnosis. Psychometrika, 85(1), 121–153. https://doi.org/10.1007/s11336-019-09693-2
Chiu, C. Y., & Douglas, J. (2013). A nonparametric approach to cognitive diagnosis by proximity to ideal response patterns. Journal of Classification, 30, 225–250. https://doi.org/10.1007/s00357-013-9132-9
Chung, M. (2019). A Gibbs sampling algorithm that estimates the Q-matrix for the DINA model. Journal of Mathematical Psychology, 93, 102275. https://doi.org/10.1016/j.jmp.2019.07.002
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37–46. https://doi.org/10.1177/001316446002000104
Cowles, M. K., & Carlin, B. P. (1996). Markov chain Monte Carlo convergence diagnostics: A comparative review. Journal of the American Statistical Association, 91(434), 883–904. https://doi.org/10.1080/01621459.1996.10476956
Culpepper, S. A. (2015). Bayesian estimation of the DINA model with Gibbs sampling. Journal of Educational and Behavioral Statistics, 40, 454–476. https://doi.org/10.3102/1076998615595403
Culpepper, S. A. (2019). An exploratory diagnostic model for ordinal responses with binary attributes: Identifiability and estimation. Psychometrika, 84(3), 921–940. https://doi.org/10.1007/s11336-019-09683-4
Culpepper, S. A. (2019). Estimating the cognitive diagnosis Q matrix with expert knowledge: Application to the fraction-subtraction dataset. Psychometrika, 84(2), 333–357. https://doi.org/10.1007/s11336-018-9643-8
Culpepper, S. A., & Hudson, A. (2018). An improved strategy for Bayesian estimation of the reduced reparameterized unified model. Applied Psychological Measurement, 42, 99–115. https://doi.org/10.1177/0146621617707511
de la Torre, J. (2011). The generalized DINA model framework. Psychometrika, 76, 179–199. https://doi.org/10.1007/S11336-011-9207-7
de la Torre, J., & Douglas, J. A. (2004). Higher-order latent trait models for cognitive diagnosis. Psychometrika, 69, 333–353. https://doi.org/10.1007/BF02295640
DeCarlo, L. T. (2011). On the analysis of fraction subtraction data: The DINA model, classification, latent class sizes, and the Q-matrix. Applied Psychological Measurement, 35, 8–26. https://doi.org/10.1177/0146621610377081
DeCarlo, L. T. (2012). Recognizing uncertainty in the Q-Matrix via a Bayesian extension of the DINA Model. Applied Psychological Measurement, 36, 447–468. https://doi.org/10.1177/0146621612449069
Galindo-Garre, F., & Vermunt, J. K. (2006). Avoiding boundary estimates in latent class analysis by Bayesian posterior mode estimation. Behaviormetrika, 33(1), 43–59. https://doi.org/10.2333/bhmk.33.43
George, A. C., Robitzsch, A., Kiefer, T., Groß, J., & Ünlü, A. (2016). The R package CDM for cognitive diagnosis models. Journal of Statistical Software, 74(2), 1–24. https://doi.org/10.18637/jss.v074.i02
Geweke, J. (1992). Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments. In J. M. Bernardo, J. O. Berger, A. P. Dawid, & A. F. M. Smith (Eds.), Bayesian statistics 4 (pp. 169–193). Oxford University Press.
Gu, Y., & Xu, G. (2020). Partial identifiability of restricted latent class models. Annals of Statistics, 48(3), 2082–2107. https://doi.org/10.1214/19-AOS1878
Henson, R. A., Templin, J. L., & Willse, J. T. (2009). Defining a family of cognitive diagnosis models using log-linear models with latent variables. Psychometrika, 74, 191–210. https://doi.org/10.1007/s11336-008-9089-5
Jiang, Z., & Carter, R. (2019). Using Hamiltonian Monte Carlo to estimate the log-linear cognitive diagnosis model via Stan. Behavior Research Methods, 51, 651–662. https://doi.org/10.3758/s13428-018-1069-9
Lee, Y.-S., Park, Y. S., & Taylan, D. (2011). A cognitive diagnostic modeling of attribute mastery in Massachusetts, Minnesota, and the U.S. national sample using the TIMSS 2007. International Journal of Testing, 11, 144–177. https://doi.org/10.1080/15305058.2010.534571
Li, F., Cohen, A., Bottge, B., & Templin, J. (2016). A latent transition analysis model for assessing change in cognitive skills. Educational and Psychological Measurement, 76, 181–204. https://doi.org/10.1177/0013164415588946
Liu, J. S. (1994). The collapsed Gibbs sampler in Bayesian computations with applications to a gene regulation problem. Journal of the American Statistical Association, 89(427), 958–966. https://doi.org/10.1080/01621459.1994.10476829
Ma, W., & de la Torre, J. (2020). GDINA: An R package for cognitive diagnosis modeling. Journal of Statistical Software, 93(12), 1–26. https://doi.org/10.18637/jss.v093.i14
Ma, W., & Jiang, Z. (2021). Estimating cognitive diagnosis models in small samples: Bayes modal estimation and monotonic constraints. Applied Psychological Measurement, 45(2), 95–111. https://doi.org/10.1177/0146621620977681
Maris, E. (1999). Estimating multiple classification latent class models. Psychometrika, 64, 187–212. https://doi.org/10.1007/BF02294535
McLachlan, G. J., Lee, S. X., & Rathnayake, S. I. (2019). Finite mixture models. Annual Review of Statistics and Its Application, 6, 355–375. https://doi.org/10.1146/annurev-statistics-031017-100325
McLachlan, G., & Peel, D. (2000). Finite mixture models. Wiley.
Muthén, L. K., & Muthén, B. O. (1998–2017). Mplus user’s guide (8th ed.). Muthén & Muthén.
Nakajima, S., Watanabe, K., & Sugiyama, M. (2019). Variational Bayesian learning theory. Cambridge University Press. https://doi.org/10.1017/9781139879354
Papastamoulis, P. (2015). label.switching: An R package for dealing with the label switching problem in MCMC outputs. Journal of Statistical Software, 69(1), 1–24. https://doi.org/10.18637/jss.v069.c01
Philipp, M., Strobl, C., de la Torre, J., & Zeileis, A. (2018). On the estimation of standard errors in cognitive diagnosis models. Journal of Educational and Behavioral Statistics, 43, 88–115. https://doi.org/10.3102/1076998617719728
Plummer, M. (2003). JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling. In Proceedings of the 3rd international workshop on distributed statistical computing (Vol. 124, pp. 1–8). http://www.ci.tuwien.ac.at/Conferences/DSC-2003/
Plummer, M., Best, N., Cowles, K., & Vines, K. (2006). CODA: Convergence diagnosis and output analysis for MCMC. R News, 6(1), 7–11. https://journal.r-project.org/archive/
Porteous, I., Newman, D., Ihler, A., Asuncion, A., Smyth, P., & Welling, M. (2008). Fast collapsed Gibbs sampling for latent Dirichlet allocation. In Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 569–577). ACM.
R Core Team. (2019). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/
Revelle, W. (2020). psych: Procedures for personality and psychological research (Version 2.1.3) [Computer software]. CRAN. https://CRAN.R-project.org/package=psych
Rupp, A. A., & Templin, J. (2008). Unique characteristics of diagnostic classification models: A comprehensive review of the current state-of-the-art. Measurement: Interdisciplinary Research & Perspective, 6, 219–262. https://doi.org/10.1080/15366360802490866
Rupp, A. A., Templin, J. L., & Henson, R. A. (2010). Diagnostic measurement: Theory, methods and applications. Guilford Press.
Sato, I. (2016). Bayesian Nonparametrics. Kodansha.
Stephens, M. (2000). Dealing with label switching in mixture models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 62(3), 795–809.
Suyama, A., & Sugiyama, M. (2017). Introduction to machine learning by Bayesian inference. Kodansha.
Templin, J., & Bradshaw, L. (2014). Hierarchical diagnostic classification models: A family of models for estimating and testing attribute hierarchies. Psychometrika, 79, 317–339. https://doi.org/10.1007/s11336-013-9362-0
Templin, J. L., & Henson, R. A. (2006). Measurement of psychological disorders using cognitive diagnosis models. Psychological Methods, 11(2), 287–305.
Templin, J., & Hoffman, L. (2013). Obtaining diagnostic classification model estimates using Mplus. Educational Measurement: Issues and Practice, 32, 37–50. https://doi.org/10.1111/emip.12010
von Davier, M. (2008). A general diagnostic model applied to language testing data. The British Journal of Mathematical and Statistical Psychology, 61(Pt 2), 287–307. https://doi.org/10.1348/000711007X193957
Wang, S., & Douglas, J. (2015). Consistency of nonparametric classification in cognitive diagnosis. Psychometrika, 80(1), 85–100. https://doi.org/10.1007/s11336-013-9372-y
Xu, G., & Shang, Z. (2018). Identifying latent structures in restricted latent class models. Journal of the American Statistical Association, 113(523), 1284–1295. https://doi.org/10.1080/01621459.2017.1340889
Yamaguchi, K. (2020). Variational Bayesian inference for the multiple-choice DINA model. Behaviormetrika, 47, 159–187. https://doi.org/10.1007/s41237-020-00104-w
Yamaguchi, K., & Okada, K. (2020). Variational Bayes inference for the DINA model. Journal of Educational and Behavioral Statistics, 45(4), 569–597. https://doi.org/10.3102/1076998620911934
Yamaguchi, K., & Okada, K. (2020). Variational Bayes inference algorithm for the saturated diagnostic classification model. Psychometrika, 85(4), 973–995. https://doi.org/10.1007/s11336-020-09739-w
Yamaguchi, K., & Templin, J. (2022). A Gibbs sampling algorithm with monotonicity constraints for diagnostic classification models. Journal of Classification, 39, 24–54. https://doi.org/10.1007/s00357-021-09392-7
Zhan, P., Jiao, H., Man, K., & Wang, L. (2019). Using JAGS for Bayesian cognitive diagnosis modeling: A tutorial. Journal of Educational and Behavioral Statistics, 44(3), 473–503. https://doi.org/10.3102/1076998619826040
Zheng, Y., Chiu, C.-Y., & Douglas, J. A. (2019). Package 'NPCD' (Version 1.0-11). https://cran.r-project.org/web/packages/NPCD/index.html
Acknowledgements
This work was supported by JSPS Grant-in-Aid for JSPS Research Fellow 18J01312 and JSPS KAKENHI Grants 19H00616, 20H01720, and 21H00936.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Data analysis syntax is available on the Open Science Framework (https://osf.io/bk75q). We have no conflicts of interest to declare. Jonathan Templin was supported by Grant DRL-1813760 from the National Science Foundation and Grant R305A190079 from the Institute of Education Sciences.
Supplementary Information
Electronic supplementary material is available in the online version of this article.
Cite this article
Yamaguchi, K., Templin, J. Direct Estimation of Diagnostic Classification Model Attribute Mastery Profiles via a Collapsed Gibbs Sampling Algorithm. Psychometrika 87, 1390–1421 (2022). https://doi.org/10.1007/s11336-022-09857-7