Conference proceedings: Machine learning and data mining in pattern recognition

Text Categorization Using an Ensemble Classifier Based on a Mean Co-association Matrix

Abstract

Text Categorization (TC) has attracted the attention of the research community over the last decade. Algorithms such as Support Vector Machines, Naive Bayes, and k-Nearest Neighbors have been used with good performance, as confirmed by several comparative studies. Recently, several ensemble classifiers have also been introduced in TC. However, many of those can only provide a category for a given new sample. In this paper, we instead propose a methodology, MECAC, to build an ensemble of classifiers that has two advantages over other ensemble methods: 1) it can be run using parallel computing, saving processing time, and 2) it can extract important statistics from the obtained clusters. It uses the mean co-association matrix to solve binary TC problems. Our experiments revealed that our framework performed, on average, 2.04% better than the best individual classifier on the tested datasets. These results were statistically validated at a significance level of 0.05 using the Friedman test.
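
The abstract does not describe how the mean co-association matrix is built. As a rough illustration of the general idea from consensus clustering (not necessarily the MECAC procedure itself), the sketch below averages pairwise "same label" indicators over the outputs of several base classifiers. The function names and the toy labels are hypothetical and chosen only for this example.

```python
from typing import List, Sequence


def co_association(labels: Sequence[int]) -> List[List[float]]:
    """Return an n x n matrix with 1.0 where two samples share a label, else 0.0."""
    n = len(labels)
    return [[1.0 if labels[i] == labels[j] else 0.0 for j in range(n)]
            for i in range(n)]


def mean_co_association(partitions: Sequence[Sequence[int]]) -> List[List[float]]:
    """Average the co-association matrices of several labelings of the same samples."""
    if not partitions:
        raise ValueError("need at least one partition")
    n = len(partitions[0])
    if any(len(p) != n for p in partitions):
        raise ValueError("all partitions must label the same number of samples")

    total = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        matrix = co_association(labels)
        for i in range(n):
            for j in range(n):
                total[i][j] += matrix[i][j]

    k = len(partitions)
    return [[value / k for value in row] for row in total]


# Hypothetical binary outputs of three base classifiers on four documents.
partitions = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
]
mean_matrix = mean_co_association(partitions)
```

Each entry of `mean_matrix` is the fraction of base classifiers that placed the two documents in the same category. Because every per-classifier matrix is computed independently, that step could in principle be distributed across workers, which is in line with the parallelism the abstract mentions.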

Bibliographic details

  • Conference location: Berlin (DE)
  • Author affiliations

    Departamento de Engenharia Informatica, Faculdade de Engenharia, Universidade do Porto, Rua Dr. Roberto Frias, s 4200-465 Porto, Portugal; LIAAD-INESC Porto L.A., Rua de Ceuta, 118, 6°, 4050-190 Porto, Portugal

    Departamento de Engenharia Informatica, Faculdade de Engenharia, Universidade do Porto, Rua Dr. Roberto Frias, s 4200-465 Porto, Portugal; LIAAD-INESC Porto L.A., Rua de Ceuta, 118, 6°, 4050-190 Porto, Portugal

    LIAAD-INESC Porto L.A., Rua de Ceuta, 118, 6°, 4050-190 Porto, Portugal; Faculdade de Economia, Universidade do Porto, Rua Dr. Roberto Frias, s 4200-465 Porto, Portugal

    LIAAD-INESC Porto L.A., Rua de Ceuta, 118, 6°, 4050-190 Porto, Portugal; Faculdade de Economia, Universidade do Porto, Rua Dr. Roberto Frias, s 4200-465 Porto, Portugal

  • Original format: PDF
  • Language: English (eng)
  • Keywords

    Text Categorization; Ensemble Classification; Consensus Clustering; Text Mining;
