Conferences CIMPA, 18th International Federation of Classification Societies

Model Selection for Linear Regression Under Data Aggregation
Pieter C. Schoonees

Last modified: 2024-04-17

Abstract


Aggregating over individuals belonging to different groups is sometimes unavoidable, such as when data from different views are merged. In linear regression, aggregation is known to induce a so-called aggregation bias in the ordinary least squares (OLS) coefficient estimates relative to those obtained without aggregation. The effect of this aggregation bias on common model selection procedures is, however, poorly understood. Using simulations based on the matrix-variate normal distribution, we discuss the properties of common selection procedures under aggregation, as measured by a variety of metrics.
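
As a rough illustration of how aggregation can distort OLS estimates, the sketch below pools two groups and ignores group membership. This is only one simple reading of aggregation, chosen here for illustration; it is not the matrix-variate normal simulation design used in the paper. The function names, group parameters, and sample size are all hypothetical.

```python
import numpy as np


def ols_slope(x, y):
    """Return the OLS slope of y on x, with an intercept in the model."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]


def simulate(n=5000, seed=0):
    """Compare within-group OLS slopes with the slope from pooled data.

    Assumed setup (illustrative only): both groups share a true slope
    of 1.0 but differ in intercept and in the mean of the predictor.
    """
    rng = np.random.default_rng(seed)

    # Group A: predictor centred at 0, intercept 0.
    x_a = rng.normal(0.0, 1.0, n)
    y_a = 0.0 + 1.0 * x_a + rng.normal(0.0, 0.5, n)

    # Group B: predictor centred at 3, intercept -6.
    x_b = rng.normal(3.0, 1.0, n)
    y_b = -6.0 + 1.0 * x_b + rng.normal(0.0, 0.5, n)

    # Separate fits recover a slope close to the true value in each group.
    slope_a = ols_slope(x_a, y_a)
    slope_b = ols_slope(x_b, y_b)

    # Pooling the groups and discarding the group label mixes within-group
    # and between-group variation, which shifts the estimated slope.
    slope_pooled = ols_slope(np.concatenate([x_a, x_b]),
                             np.concatenate([y_a, y_b]))
    return slope_a, slope_b, slope_pooled


if __name__ == "__main__":
    a, b, pooled = simulate()
    print(f"group A slope: {a:.3f}")
    print(f"group B slope: {b:.3f}")
    print(f"pooled slope:  {pooled:.3f}")
```

Under these assumed parameters the pooled slope differs markedly from the common within-group slope, because the group means of the predictor and the response move in opposite directions. A model selection criterion computed on the pooled fit therefore works from biased coefficient estimates.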

Keywords


aggregation bias, matrix-variate normal, ordinary least squares, model selection