2014 IEEE International Symposium on Innovations in Intelligent Systems and Applications

The assessment of feature selection methods on agglutinative language for spam email detection: A special case for Turkish


Abstract

This study assesses three feature selection methods, Information Gain (IG), Gini Index (GI), and chi-square (CHI2), using two popular pattern classifiers, an Artificial Neural Network (ANN) and a Decision Tree (DT), for the classification of Turkish e-mails. The feature vectors are constructed with the bag-of-words feature extraction method. The paper focuses on Turkish because it is one of the widely used agglutinative languages worldwide. The results indicate that the CHI2 and GI feature selection methods are more effective than the IG method for Turkish.
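
To make the three scoring criteria concrete, the following is a minimal sketch of how IG, GI, and CHI2 scores could be computed per term from binary bag-of-words document counts. It is not the authors' implementation: the toy corpus, the function names, and the specific Gini formulation (a variant commonly used for text feature selection, the sum over classes of P(t|c)^2 * P(c|t)^2) are all assumptions, since the abstract does not give the exact formulas or any preprocessing details.

```python
import math
from collections import Counter

# Toy, made-up corpus for illustration only (not data from the paper).
# Each entry is (tokens, label); tokens are unstemmed, ASCII-folded words.
DOCS = [
    ("bedava kampanya hemen tikla".split(), "spam"),
    ("bedava hediye kazan hemen".split(), "spam"),
    ("toplanti yarin saat onda".split(), "ham"),
    ("rapor yarin hazir olacak".split(), "ham"),
]


def term_class_counts(docs):
    """Count documents per (term, label) pair and documents per label."""
    df = Counter()
    n_class = Counter()
    for tokens, label in docs:
        n_class[label] += 1
        for term in set(tokens):  # binary presence, not term frequency
            df[(term, label)] += 1
    return df, n_class


def chi2(term, label, df, n_class):
    """Chi-square statistic of the 2x2 term/class contingency table."""
    n = sum(n_class.values())
    a = df[(term, label)]                                      # in class, has term
    b = sum(df[(term, k)] for k in n_class if k != label)      # other class, has term
    c = n_class[label] - a                                     # in class, no term
    d = n - a - b - c                                          # other class, no term
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def entropy(counts):
    """Shannon entropy (base 2) of a list of non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((k / total) * math.log2(k / total) for k in counts if k)


def info_gain(term, df, n_class):
    """Reduction in class entropy from knowing whether the term is present."""
    n = sum(n_class.values())
    with_t = [df[(term, k)] for k in n_class]
    without_t = [n_class[k] - df[(term, k)] for k in n_class]
    return (entropy(list(n_class.values()))
            - (sum(with_t) / n) * entropy(with_t)
            - (sum(without_t) / n) * entropy(without_t))


def gini_index(term, df, n_class):
    """Assumed Gini variant: sum over classes of P(t|c)^2 * P(c|t)^2."""
    n_t = sum(df[(term, k)] for k in n_class)
    if n_t == 0:
        return 0.0
    return sum((df[(term, k)] / n_class[k]) ** 2 * (df[(term, k)] / n_t) ** 2
               for k in n_class)


if __name__ == "__main__":
    df, n_class = term_class_counts(DOCS)
    for term in ("bedava", "tikla", "yarin"):
        print(term,
              round(chi2(term, "spam", df, n_class), 3),
              round(info_gain(term, df, n_class), 3),
              round(gini_index(term, df, n_class), 3))
```

In a pipeline like the one the abstract describes, terms would be ranked by one of these scores and only the top-ranked terms kept as features before training the classifier. For an agglutinative language such as Turkish, the raw vocabulary tends to be large because a single stem appears with many suffixes, which is one reason the choice of selection method matters.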
