Using polynomials to explain classification outputs from neural networks

Pablo Morala

Conferences CIMPA, 18th International Federation of Classification Societies

Last modified: 2024-05-14

Abstract

Neural networks, especially with the advent of deep learning, have shown outstanding performance in a wide variety of tasks, particularly in classification. However, understanding their inner mechanisms, and in particular the interpretability or explainability of their outputs, remains a challenging area of research. Within the field of eXplainable Artificial Intelligence (XAI), there have been proposals to represent neural networks by means of other models.

In this context, the NN2Poly algorithm [1] provides a way of explaining fully-connected feed-forward artificial neural networks, commonly known as multilayer perceptrons (MLPs), using an explicit polynomial representation that relies only on the trained weights of the network and its activation functions.
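To give some intuition for how a polynomial can be built from weights and activation functions alone, the following is a minimal sketch and not the NN2Poly algorithm itself. It assumes a single hidden layer, a softplus activation, and a degree-2 Taylor expansion of that activation around zero; the function name mlp_to_quadratic and the random weights are hypothetical and serve only as an illustration.

```python
import numpy as np

# Degree-2 Taylor coefficients of softplus(z) = log(1 + exp(z)) around z = 0.
# The choice of softplus and of degree 2 is an assumption made for this sketch.
A0, A1, A2 = np.log(2.0), 0.5, 0.125


def mlp_to_quadratic(W, b, v, c):
    """Expand c + sum_j v[j] * g(b[j] + W[j] @ x), with g replaced by its
    degree-2 Taylor polynomial, into const + linear @ x + x @ quad @ x."""
    const = c + np.sum(v * (A0 + A1 * b + A2 * b**2))
    linear = (v * (A1 + 2.0 * A2 * b)) @ W
    quad = (W.T * (v * A2)) @ W
    return const, linear, quad


def softplus(z):
    return np.log1p(np.exp(z))


# Hypothetical small weights standing in for a trained network.
rng = np.random.default_rng(0)
W = 0.1 * rng.normal(size=(4, 3))   # hidden weights: 4 neurons, 3 inputs
b = 0.1 * rng.normal(size=4)        # hidden biases
v = rng.normal(size=4)              # output weights
c = 0.2                             # output bias
x = rng.uniform(-1, 1, size=3)      # one input point

const, linear, quad = mlp_to_quadratic(W, b, v, c)
poly_out = const + linear @ x + x @ quad @ x

# Reference values: the network with the Taylor-approximated activation
# (should match the polynomial exactly) and with the true activation.
z = b + W @ x
taylor_net_out = c + v @ (A0 + A1 * z + A2 * z**2)
net_out = c + v @ softplus(z)

print("polynomial:", poly_out)
print("network:   ", net_out)
```

In this sketch the polynomial only tracks the true network while the pre-activations stay close to zero, where the Taylor expansion is accurate, which hints at why the weights may need to be kept under control during training.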

In this work we show how this method can be used to explain classification outputs on tabular datasets by means of the obtained polynomial coefficients. One polynomial representation is obtained for each output neuron, i.e., for each class. We also show the trade-off between training the neural network under the constraints that NN2Poly requires and the computational cost.
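As a rough illustration of reading an explanation off one polynomial per class, the sketch below ranks the non-constant terms of each class polynomial by the absolute value of their coefficients. The coefficient values, the class labels and the helper top_terms are hypothetical and chosen only for this example; they are not results of this work.

```python
# Hypothetical coefficients, one polynomial per class (illustration only).
polys = {
    "class_A": {"1": 0.10, "x1": 1.20, "x2": -0.30, "x1*x2": 0.80},
    "class_B": {"1": -0.20, "x1": -0.90, "x2": 0.40, "x1*x2": -0.05},
}


def top_terms(coeffs, k=2):
    """Return the k non-constant terms with the largest absolute coefficient."""
    terms = [(term, value) for term, value in coeffs.items() if term != "1"]
    return sorted(terms, key=lambda tv: abs(tv[1]), reverse=True)[:k]


ranking = {label: top_terms(coeffs) for label, coeffs in polys.items()}

for label, terms in ranking.items():
    print(label, terms)
```

Comparing coefficient magnitudes in this way is only meaningful when the input variables are on comparable scales, for instance after standardization.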

References:

[1] Morala, P., Cifuentes, J. A., Lillo, R. E., Ucar, I.: NN2Poly: A polynomial representation for deep feed-forward artificial neural networks. IEEE Transactions on Neural Networks and Learning Systems. Early Access (2023)

Keywords

XAI, neural networks, interpretability, classification