Text Classification of National Anthem using Agglomerative Hierarchical Clustering

Prajwal Rai; Nirdosh Bista; Kumar Prasun; Gajendra Sharma

doi:https://doi.org/10.47119/IJRP1001461420246269

Open Access

Journal Type: Research Article

Subject: Computer Science & Electrical

Subject Field: Machine Learning Research

Volume: 146, Issue: 1, April 2024

Publish Date: April 4, 2024 8:00 pm

Pages: 221-230

Abstract

Text clustering groups documents by their similarity. Over the years, the topic has attracted significant attention from scholars, producing many approaches and procedures. However, existing work focuses primarily on English and other well-resourced languages. This paper presents a comprehensive assessment of clustering methods applied to the national anthems of 190 countries. Categorizing anthems conceptually is difficult because the texts are short. The anthems were preprocessed with stop-word removal, stemming, tokenization, and noise removal, and TF-IDF features were then extracted. The Agglomerative Hierarchical Clustering technique is used for the clustering process. The results indicate that Agglomerative Hierarchical Clustering combined with TF-IDF features is highly beneficial for this task.
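The pipeline described in the abstract (preprocessing, TF-IDF weighting, agglomerative clustering) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the toy corpus, the tiny stop-word list, the cosine distance, and the average-linkage rule are all assumptions made for illustration, and stemming is omitted for brevity.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; these are NOT texts from the paper's dataset.
DOCS = [
    "freedom and liberty for our land",
    "our land of liberty and freedom",
    "glory to the king and the crown",
    "long live the king and his crown",
]

# Tiny illustrative stop-word list (assumption; a real list is much longer).
STOP_WORDS = {"and", "the", "of", "to", "for", "our", "his"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens only (noise removal), drop stop-words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        vectors.append(
            {t: (c / total) * (math.log(n_docs / df[t]) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine_distance(a, b):
    """1 - cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def agglomerative(vectors, n_clusters):
    """Bottom-up clustering: start with singletons, repeatedly merge the
    closest pair of clusters (average linkage, an assumption) until
    n_clusters remain. Returns lists of document indices."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pair_sum = sum(
                    cosine_distance(vectors[a], vectors[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                dist = pair_sum / (len(clusters[i]) * len(clusters[j]))
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    print(agglomerative(tfidf_vectors(DOCS), n_clusters=2))
```

On this toy corpus the two liberty-themed texts and the two monarchy-themed texts end up in separate clusters. The pairwise search is quadratic per merge, which is acceptable for a corpus of roughly 190 short documents but would need a library implementation for larger collections.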
