Conferences CIMPA, 18th International Federation of Classification Societies

Model-based bi-clustering using multivariate Poisson-lognormal with general block-diagonal covariance matrix and its applications
Submitted to IFCS 2024 Book of Abstracts
Caitlin Kral

Last modified: 2024-05-14

Abstract


Bi-clustering is a technique that simultaneously clusters observations and features (i.e., variables) in a dataset. It is used in bioinformatics; for example, bi-clustering gene expression data can simultaneously identify clusters of diseased and non-diseased patients and networks of genes with distinct correlation patterns based on their expression values. While several Gaussian mixture model-based bi-clustering approaches exist in the literature for continuous data, approaches for discrete data have not been well researched. Extending bi-clustering approaches to discrete data is imperative, as such data are common in real-world applications such as bioinformatics. Recently, multivariate Poisson-lognormal (MPLN) models have emerged as an efficient approach for modelling multivariate count data. The MPLN distribution arises from a hierarchical Poisson structure that allows for over-dispersion and for both positive and negative correlation. Here, we propose an MPLN model-based bi-clustering approach that uses a block-diagonal covariance structure to allow for a more flexible covariance matrix. We demonstrate the performance of the proposed model in clustering both observations and features using simulated and real-world data.
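
As a minimal illustration of the hierarchical structure mentioned above, the sketch below simulates counts from a single MPLN component: a latent vector is drawn from a multivariate normal, and each count is Poisson with mean equal to the exponential of its latent value. It is not the proposed bi-clustering method. The helper names (`block_diag`, `simulate_mpln`), the two 2x2 blocks, the mean vector, and the sample size are all assumptions chosen for illustration.

```python
import numpy as np

def block_diag(blocks):
    """Assemble square blocks into one block-diagonal matrix."""
    d = sum(b.shape[0] for b in blocks)
    out = np.zeros((d, d))
    i = 0
    for b in blocks:
        k = b.shape[0]
        out[i:i + k, i:i + k] = b
        i += k
    return out

def simulate_mpln(n, mu, sigma, seed=0):
    """Draw n count vectors: theta ~ N(mu, sigma), Y_j | theta_j ~ Poisson(exp(theta_j))."""
    rng = np.random.default_rng(seed)
    theta = rng.multivariate_normal(mu, sigma, size=n)  # latent log-means
    return rng.poisson(np.exp(theta))                   # observed counts

# Hypothetical parameter values, not taken from the abstract.
block_a = np.array([[0.5, 0.4], [0.4, 0.5]])    # positively correlated feature pair
block_b = np.array([[0.5, -0.4], [-0.4, 0.5]])  # negatively correlated feature pair
sigma = block_diag([block_a, block_b])          # features in different blocks are uncorrelated
mu = np.full(4, 2.0)

counts = simulate_mpln(5000, mu, sigma)
corr = np.corrcoef(counts, rowvar=False)
print("mean:", counts.mean(axis=0))
print("variance:", counts.var(axis=0))
print("within-block correlations:", corr[0, 1], corr[2, 3])
```

Under these assumed parameters, the sample variances exceed the sample means (over-dispersion), the first feature pair is positively correlated, and the second is negatively correlated. A plain multivariate Poisson built from shared additive components could not produce the negative correlation.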

Keywords


Bi-clustering, Multivariate Poisson-lognormal, Bioinformatics