A Comparison of Multivariate Mixed Models and Generalized Estimation Equations Models for Discrimination in Multivariate Longitudinal Data

Tolulope Sajobi

Conferences CIMPA, 18th International Federation of Classification Societies

Last modified: 2024-05-14

Abstract

Discriminant analysis procedures have been developed for classification in multivariate longitudinal data, but the development of such procedures for count, binary, or mixed types of outcome variables has not received much attention. Researchers have proposed novel longitudinal discriminant analysis (LoDA) methods using multivariate generalized linear mixed effects models (GLMM) and generalized estimating equations (GEE) to address challenges posed by such data. However, a comprehensive comparison of their predictive accuracy in multivariate longitudinal data is lacking. This study evaluates the predictive accuracy of these model-based classification procedures via a Monte Carlo simulation study under a variety of data analytic conditions, including sample size, between-variable and within-variable correlation, number of measurement occasions, and number and distribution of outcome variables. Simulation results show that LoDA classifiers based on multivariate GEE and GLMM exhibited similar overall accuracy in multivariate longitudinal data with normal or binary outcome variables. However, the GEE procedure resulted in higher average classification accuracy (between 3% and 23% higher) than the GLMM procedure in multivariate longitudinal data with count or mixed types of outcome variables. We provide recommendations to guide the choice between these two procedures for classification in multivariate longitudinal data.

Keywords

discriminant analysis; generalized estimating equations; longitudinal designs; mixed models