A CNN-driven model with adaptive feature fusion for Polish national dance music recognition
 
 
1
Lublin University of Technology, Faculty of Electrical Engineering and Computer Science, Department of Computer Science, Nadbystrzycka 38D, 20-618 Lublin, Poland
 
These authors had equal contribution to this work
 
 
Publication date: 2025-08-29
 
 
Corresponding author
Kinga Chwaleba   

Lublin University of Technology, Faculty of Electrical Engineering and Computer Science, Department of Computer Science, Nadbystrzycka 38D, 20-618 Lublin, Poland
 
 
Adv. Sci. Technol. Res. J. 2025;
 
ABSTRACT
Mel spectrograms are widely used in music identification and often yield good results when combined with well-known pre-trained classifiers such as VGG16, DenseNet121, or ResNet50. However, performance can still be improved by applying fusion techniques and by building a dataset with more samples, both of which generally yield better results. This study therefore introduces a novel approach that combines these methods with the aforementioned pre-trained classifiers. The core innovation is feature fusion of Mel spectrograms, spectrograms, scalograms, and Mel-Frequency Cepstral Coefficient (MFCC) plots, all generated from audio recordings in a newly created dataset of Polish national dance music. An adaptive model is proposed as a mechanism that adjusts the features most relevant to Polish national dance music identification. Furthermore, SHapley Additive exPlanations (SHAP) make it possible to visualize which parts of the input feature maps are crucial to the fusion model's decisions. The most common classification metrics, namely accuracy, precision, recall, and F1-score, were used to compare the obtained results with the state of the art. The proposed method yields highly satisfactory results, exceeding 94% accuracy. Consequently, this study not only sets a new benchmark for Polish national dance recognition but also underscores the broader potential of multi-representation fusion as a general blueprint for next-generation audio classification systems.
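
The abstract does not specify how the adaptive fusion is computed, so the following is only an illustrative sketch of one common way to fuse several representations: each branch (Mel spectrogram, spectrogram, scalogram, MFCC plot) is assumed to have already been reduced to a feature vector of equal length, and a softmax over per-branch scores gives the fusion weights. The function names, the vector length, and the score values are hypothetical and are not taken from the paper; in a trained model the scores would be learned rather than fixed.

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over a 1-D array of scores."""
    shifted = scores - np.max(scores)
    exp = np.exp(shifted)
    return exp / exp.sum()


def adaptive_fusion(features, branch_scores):
    """Fuse per-representation feature vectors with softmax weights.

    features      : dict mapping representation name -> 1-D array, all the same length
    branch_scores : 1-D array with one (hypothetical, normally learned) score per branch
    Returns the fused vector and the weights that were applied.
    """
    stacked = np.stack(list(features.values()))  # shape (n_branches, dim)
    weights = softmax(np.asarray(branch_scores, dtype=float))  # shape (n_branches,)
    fused = weights @ stacked  # weighted sum over branches, shape (dim,)
    return fused, weights


# Stand-in feature vectors; in practice these would come from CNN backbones
# such as VGG16, DenseNet121, or ResNet50 applied to each input image.
rng = np.random.default_rng(0)
names = ["mel_spectrogram", "spectrogram", "scalogram", "mfcc"]
features = {name: rng.normal(size=8) for name in names}

# Hypothetical branch scores (assumption: a higher score means a more relevant branch).
branch_scores = np.array([1.2, 0.3, 0.5, 0.9])

fused, weights = adaptive_fusion(features, branch_scores)
for name, w in zip(names, weights):
    print(f"{name}: weight {w:.3f}")
```

With equal scores the sketch reduces to a plain average of the four branches; unequal scores shift the fused vector toward the representations judged more relevant, which is the general idea behind adaptive weighting.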