Computer Science & Electrical

Text Classification using KNN with different Feature Selection Methods

Volume: 9, Issue: 1, July
Published Date: 07 August 2018
Publisher Name: IJRP
Views: 917, Downloads: 790, Pages: 51-58

Authors

# Author Name
1 Rajshree Jodha
2 Gaur Sanjay B.C
3 K.R Chowdhary

Abstract

This paper presents a fast and efficient approach to text classification using KNN with different feature selection methods. Specifically, it evaluates the performance of the system with the minimum number of features required to classify text documents. The 20 Newsgroups dataset, collected by Ken Lang, is used to evaluate the KNN classifier. The dataset is separated into two parts: a training set (60%) and a test set (40%).
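The 60/40 split can be illustrated with a minimal sketch. The abstract does not say how documents were assigned to each part, so the seeded random shuffle, the function name `split_train_test`, and the integer document IDs below are assumptions made only for illustration.

```python
import random

def split_train_test(items, train_percent=60, seed=0):
    """Shuffle items reproducibly and cut them into train and test parts.

    Assumption: a plain random split; the abstract does not state whether
    the split was random, chronological, or balanced per newsgroup.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding at the cut point.
    cut = len(shuffled) * train_percent // 100
    return shuffled[:cut], shuffled[cut:]

# Hypothetical document IDs standing in for the newsgroup posts.
train_ids, test_ids = split_train_test(range(10))
```

With ten items, six go to training and four to testing, and every item appears in exactly one part.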

The KNN classifier is evaluated with varying numbers of stemmed and unstemmed features selected by CHI (chi-squared statistic), IG (information gain) and MI (mutual information). Accuracy, precision, recall and F1-score are used to evaluate the system.
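The pipeline described above (score terms, keep the top-scoring ones, classify with KNN, then compute the metrics) can be sketched in plain Python. This is not the authors' implementation: the toy corpus, the choice of the maximum chi-squared score over classes, cosine similarity on raw term counts, k = 3, and all function names are assumptions chosen to keep the example small.

```python
import math
from collections import Counter

def chi2_scores(docs, labels):
    """Chi-squared score per term (max over classes), from document counts."""
    n = len(docs)
    class_count = Counter(labels)
    doc_sets = [set(d) for d in docs]
    term_count = Counter(t for s in doc_sets for t in s)
    joint = Counter((t, c) for s, c in zip(doc_sets, labels) for t in s)
    scores = {}
    for term, n_t in term_count.items():
        best = 0.0
        for cls, n_c in class_count.items():
            a = joint[(term, cls)]   # docs in cls containing term
            b = n_t - a              # docs outside cls containing term
            c = n_c - a              # docs in cls without term
            d = n - n_t - c          # docs outside cls without term
            denom = (a + c) * (b + d) * (a + b) * (c + d)
            if denom:
                best = max(best, n * (a * d - c * b) ** 2 / denom)
        scores[term] = best
    return scores

def vectorize(doc, features):
    """Term-count vector restricted to the selected features."""
    return Counter(t for t in doc if t in features)

def cosine(u, v):
    dot = sum(w * v[t] for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def knn_predict(train_vecs, train_labels, vec, k=3):
    """Majority vote among the k training vectors most similar to vec."""
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: cosine(pair[0], vec), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def precision_recall_f1(y_true, y_pred, positive):
    """Per-class precision, recall and F1 for the class named positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Toy corpus (hypothetical; the paper uses the 20 Newsgroups dataset).
train_docs = [["goal", "match", "team"], ["team", "score", "goal"],
              ["cpu", "memory", "chip"], ["chip", "cpu", "code"]]
train_labels = ["sport", "sport", "tech", "tech"]
test_docs = [["goal", "team", "win"], ["cpu", "code", "bug"]]
test_labels = ["sport", "tech"]

scores = chi2_scores(train_docs, train_labels)
selected = set(sorted(scores, key=scores.get, reverse=True)[:4])  # top 4 terms
train_vecs = [vectorize(d, selected) for d in train_docs]
predictions = [knn_predict(train_vecs, train_labels, vectorize(d, selected))
               for d in test_docs]
accuracy = sum(t == p for t, p in zip(test_labels, predictions)) / len(test_labels)
```

Varying the number of retained terms (4 here) and swapping `chi2_scores` for an information gain or mutual information scorer would reproduce the kind of comparison the abstract describes. Stemming would be applied to the tokens before scoring.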

Keywords

  • KNN
  • Text Classification
  • feature extraction
  • stemmed data