Last modified: 2024-05-14
Abstract
R-mode hierarchical clustering (HC) identifies interrelationships between variables that are useful for variable selection and dimension reduction. The application of R-mode HC to Compositional Data (CoDa) must be consistent with the fundamental properties of compositional geometry, also known as the Aitchison geometry. A composition is a multivariate quantitative description of the parts or components of a whole; it conveys relative information and is commonly expressed as a vector of proportions. The critical element of the Aitchison geometry is the inner product, defined via the log-ratio coordinates of the compositions. This geometry allows a composition to be expressed as coordinates in an orthonormal basis formed by log-ratios; these are called olr-coordinates.
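As a minimal illustration of these definitions, the Python sketch below computes the centered log-ratio (clr) of a composition, the Aitchison inner product as the ordinary dot product of clr vectors, and olr-coordinates (balances) for one sequential binary partition. The function names, the three-part composition, and the chosen partition are hypothetical and serve only as an example.

```python
import math

def clr(x):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [l - mean_log for l in logs]

def aitchison_inner(x, y):
    """Aitchison inner product, computed as the dot product of clr vectors."""
    return sum(a * b for a, b in zip(clr(x), clr(y)))

def balance(x, num, den):
    """olr-coordinate contrasting the parts indexed by num against those in den."""
    r, s = len(num), len(den)
    mean_log_num = sum(math.log(x[i]) for i in num) / r
    mean_log_den = sum(math.log(x[i]) for i in den) / s
    return math.sqrt(r * s / (r + s)) * (mean_log_num - mean_log_den)

# Hypothetical three-part composition (illustrative values only).
x = [0.2, 0.3, 0.5]

# Assumed sequential binary partition: {0, 1} vs {2}, then {0} vs {1}.
coords = [balance(x, [0, 1], [2]), balance(x, [0], [1])]
```

Because the balances form an orthonormal basis, the sum of squares of `coords` equals `aitchison_inner(x, x)`, and rescaling `x` by any positive constant leaves both the clr vector and the coordinates unchanged.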
Recent publications introduce R-mode agglomerative HC methods in CoDa for creating orthonormal log-ratio bases. These HC methods form hierarchical groups of mutually exclusive subsets of parts, which can be associated with a sequential binary partition of the parts. In this talk, we explore the basic concepts of the R-mode clustering algorithms and the connections between concepts such as distance between parts, cluster representative of a group of parts, and compositional biplot. Practical examples will be presented to visually illustrate the proposed approach.
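The link between agglomerative merging and a sequential binary partition can be sketched in Python as follows. This is not the algorithm of the publications mentioned above; it is a simplified stand-in under stated assumptions: the representative of a group of parts is taken to be its geometric mean, the dissimilarity between two groups is the variance of the log-ratio of their representatives, and the closest pair is merged at each step. The function names and the toy data set are hypothetical.

```python
import math
from statistics import pvariance

def log_representative(data, group):
    """Per-sample log of the geometric mean of the parts in group (assumed representative)."""
    return [sum(math.log(row[j]) for j in group) / len(group) for row in data]

def dissimilarity(data, g1, g2):
    """Variance of the log-ratio between two group representatives (assumed distance)."""
    a = log_representative(data, g1)
    b = log_representative(data, g2)
    return pvariance([u - v for u, v in zip(a, b)])

def r_mode_merges(data):
    """Agglomerate parts; each recorded merge is one split of a binary partition."""
    clusters = [(j,) for j in range(len(data[0]))]
    merges = []
    while len(clusters) > 1:
        # Find the pair of current clusters with the smallest dissimilarity.
        _, a, b = min(
            (dissimilarity(data, clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        g1, g2 = clusters[a], clusters[b]
        merges.append((g1, g2))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [g1 + g2]
    return merges

# Toy data (made up): rows are samples, columns are parts. Parts 0 and 1 keep
# a constant ratio, as do parts 2 and 3. Rows are not closed to a unit sum
# because log-ratios are unaffected by rescaling.
data = [
    [1, 2, 3, 6],
    [2, 4, 1, 2],
    [4, 8, 5, 10],
    [3, 6, 2, 4],
]
merges = r_mode_merges(data)
```

Read in reverse order, the D - 1 recorded merges give the successive splits of a sequential binary partition of the D parts, and each pair of merged groups defines one balance of the associated orthonormal basis.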