Optimal penalized sparse PCA
Last modified: 2024-05-03
Abstract
In Machine Learning and Data Science there is a growing interest in algorithms that yield sparse solutions, owing to their interpretability and computational efficiency. Penalized methods are commonly employed to achieve sparsity in Principal Component Analysis (PCA) problems. However, a key critique of these methods is that their performance is assessed only via numerical experiments, without theoretical assurances of optimality. Our research explores the theoretical properties that determine whether one of these algorithms attains optimality.

Specifically, we concentrate on penalized PCA formulations that use cardinality as the sparsity-inducing penalty. We introduce a Minorization-Maximization scheme to tackle the problem and theoretically demonstrate that the resulting solution constitutes a local optimum. We also establish conditions essential for optimality: we require the minimum eigenvalue of the sample covariance matrix to be greater than one, which rules out any feasible ascent direction at the solution. However, this condition may not hold in practice, especially in high-dimensional scenarios.

To address this challenge, we propose a straightforward procedure that ensures adherence to this condition across diverse datasets. We then conduct numerical experiments on both synthetic and empirical datasets to elucidate the practical implications of this condition.
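To make the Minorization-Maximization idea concrete, the following is a minimal sketch of one standard way such a scheme can be set up for the cardinality-penalized problem max_x x'Ax − ρ‖x‖₀ subject to ‖x‖₂ = 1, where A is the (positive semidefinite) sample covariance matrix. This is an illustration under stated assumptions, not the authors' exact algorithm: since x'Ax is convex, it is minorized at the current iterate x_k by the linear function 2(Ax_k)'x − x_k'Ax_k, and the resulting surrogate subproblem on the sphere admits a closed-form, greedily selected sparse solution. The function name `mm_sparse_pca` and the greedy support-selection step are this sketch's own constructions.

```python
import numpy as np

def mm_sparse_pca(A, rho, x0=None, max_iter=100, tol=1e-8):
    """MM sketch for:  max_x  x'Ax - rho*||x||_0  s.t.  ||x||_2 = 1.

    A is assumed symmetric positive semidefinite, so x'Ax is convex and
    is minorized at x_k by the linearization 2 a'x - x_k'Ax_k, a = A x_k.
    Maximizing the surrogate 2 a'x - rho*||x||_0 over the unit sphere
    reduces to choosing a support S and setting x = a_S / ||a_S||, with
    surrogate value 2||a_S|| - rho*|S|; because the gains diminish, the
    support can be built greedily over coordinates sorted by |a_i|.
    """
    n = A.shape[0]
    x = x0 if x0 is not None else np.ones(n) / np.sqrt(n)
    for _ in range(max_iter):
        a = A @ x
        # Greedy support selection: add coordinates by decreasing |a_i|
        # while the marginal gain in 2*sqrt(c + a_i^2) - 2*sqrt(c)
        # still exceeds the per-coordinate penalty rho.
        order = np.argsort(-np.abs(a))
        c = 0.0          # running sum of a_i^2 on the chosen support
        support = []
        for i in order:
            gain = 2.0 * (np.sqrt(c + a[i] ** 2) - np.sqrt(c))
            if gain <= rho:
                break
            support.append(i)
            c += a[i] ** 2
        x_new = np.zeros(n)
        if support:      # assume rho is small enough that S is nonempty
            s = np.array(support)
            x_new[s] = a[s] / np.sqrt(c)
        if np.linalg.norm(x_new - x) < tol:
            x = x_new
            break
        x = x_new
    return x
```

Each iteration maximizes the minorizing surrogate exactly, so the penalized objective is nondecreasing along the iterates; on a diagonal covariance with one dominant eigenvalue the iterates collapse onto the corresponding coordinate axis, giving a fully sparse local optimum.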
Keywords
Dimension reduction, Sparse solutions, Principal component analysis.