Comparative evaluation of persistence diagram vectorisation methods in classification tasks

Topological Data Analysis (TDA) enables the analysis of the geometric structure of data using tools from algebraic topology. A central technique in TDA is persistent homology, whose results are represented by persistence diagrams (PDs) describing the lifespan of topological features. Since PDs lack a natural vector-space representation, their direct use in machine learning (ML) classifiers is challenging. Several vectorisation methods have therefore been proposed, including Persistence Images (PI), Persistence Landscapes (PL), Betti Curves (BC), and Persistence Silhouettes (PS). This study presents a comparative analysis of these vectorisation methods in classification tasks involving both synthetic and real-world datasets, using three classifiers: Logistic Regression (LR), XGBoost (XGB), and Multilayer Perceptron (MLP). Hyperparameter tuning and cross-validation were applied, and model performance was evaluated using accuracy, precision, recall, and F1-score. The results show that PI and PL consistently achieve the highest classification performance across different data types and classifiers. For synthetic datasets, these methods reached scores above 0.98, while for the ECG dataset, they outperformed alternative approaches by up to 30%. In contrast, all methods exhibited limited effectiveness on the MNIST dataset due to high geometric complexity and noise in pixel-based point cloud representations. For the ModelNet10 dataset, PI clearly outperformed the other techniques, achieving scores of approximately 0.75. Overall, the results indicate that PI provides a robust and versatile topological representation for classification tasks, while PL stands out for its stability and interpretability in complex data analysis.
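
To illustrate what vectorising a persistence diagram means in practice, the following is a minimal sketch of the simplest of the four methods, the Betti curve: at each value t of a fixed grid, it counts the features that are alive, i.e. those with birth <= t < death. The function name `betti_curve`, the toy diagram, and the grid are assumptions made for illustration only; they are not taken from the study, which may rely on a dedicated TDA library and different sampling choices.

```python
import numpy as np


def betti_curve(diagram, grid):
    """Return, for each grid value t, the number of (birth, death)
    pairs with birth <= t < death. Hypothetical helper for illustration."""
    # Force an (n, 2) shape so that an empty diagram is handled as well.
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    births = diagram[:, 0][:, None]   # shape (n, 1)
    deaths = diagram[:, 1][:, None]   # shape (n, 1)
    t = np.asarray(grid, dtype=float)[None, :]  # shape (1, m)
    alive = (births <= t) & (t < deaths)        # shape (n, m), by broadcasting
    return alive.sum(axis=0)


# Toy diagram (assumed values): three features as (birth, death) pairs.
diagram = [(0.0, 1.0), (0.2, 0.5), (0.4, 2.0)]
grid = np.linspace(0.0, 2.0, 5)  # t = 0.0, 0.5, 1.0, 1.5, 2.0

vector = betti_curve(diagram, grid)
print(vector)  # [1 2 1 1 0]
```

The resulting fixed-length vector can be passed to any standard classifier, such as the LR, XGB, or MLP models named above. PI, PL, and PS follow the same pattern of sampling a function derived from the diagram on a fixed grid, but they additionally weight or rank features by their persistence.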
