Computer Science & Electrical

Text Classification of National Anthem using Agglomerative Hierarchical Clustering

Volume: 146, Issue: 1, April
Published Date: 05 April 2024
Publisher Name: IJRP
Views: 521, Downloads: 314, Pages: 221-230
DOI: 10.47119/IJRP1001461420246269

Authors

# Author Name
1 Prajwal Rai
2 Nirdosh Bista
3 Kumar Prasun
4 Gajendra Sharma

Abstract

Text clustering groups documents according to their similarity. The topic has attracted sustained attention from researchers over the years, and many approaches and procedures have emerged as a result. Existing work, however, focuses primarily on English and other languages with ample resources. This paper assesses clustering methods applied to the national anthems of 190 countries. Categorizing anthems by concept is difficult because the texts are short. In the present study, the anthem texts were preprocessed with stop-word removal, stemming, corpus tokenization, and noise removal, and TF-IDF features were then extracted from them. The Agglomerative Hierarchical Clustering technique is used for the clustering step. The results indicate that combining the Agglomerative Hierarchical Clustering algorithm with TF-IDF features is highly beneficial for this task.
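The pipeline described in the abstract (preprocessing, TF-IDF features, agglomerative clustering) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not specify the linkage criterion, the distance measure, or the IDF formula, so average linkage, cosine distance, and a smoothed IDF are assumed here. The stop-word list and the four short texts are made up for the example and are not real anthem data, and stemming is omitted to keep the sketch dependency-free.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the paper's list).
STOP_WORDS = {"the", "of", "and", "to", "our", "we", "in", "a", "is", "for"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens only (noise removal), drop stop-words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return one L2-normalised sparse TF-IDF vector (a dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed IDF (assumed variant): ln((1 + n) / (1 + df)) + 1.
        vec = {
            t: (c / len(toks)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine_distance(u, v):
    """Cosine distance between two L2-normalised sparse vectors."""
    return 1.0 - sum(w * v.get(t, 0.0) for t, w in u.items())


def agglomerative(vectors, n_clusters):
    """Bottom-up clustering with average linkage (assumed); returns labels."""
    clusters = [[i] for i in range(len(vectors))]  # start: one cluster per doc
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between members of the two clusters.
                total = sum(
                    cosine_distance(vectors[i], vectors[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])  # merge the closest pair
        del clusters[b]
    labels = [0] * len(vectors)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Made-up anthem-like snippets, for illustration only.
docs = [
    "God save the king and bless the king",
    "Long live the king, god bless our king",
    "Mountains and rivers of our free land",
    "Free land of rivers, mountains and valleys",
]
labels = agglomerative(tfidf_vectors(docs), n_clusters=2)
print(labels)
```

With these toy inputs, the two "king" texts end up in one cluster and the two "land" texts in the other, because the groups share no vocabulary. On a real corpus, a library implementation with a dendrogram would normally be preferable to this quadratic-memory-free but slow pairwise loop.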