Weighted Vote-Based Classifier Ensemble for Named Entity Recognition: A Genetic Algorithm-Based Approach

ASIF EKBAL; SRIPARNA SAHA

Abstract

In this article, we report on using the search capability of a genetic algorithm (GA) to construct a weighted vote-based classifier ensemble for Named Entity Recognition (NER). Our underlying assumption is that the reliability of each classifier's predictions differs among the various named entity (NE) classes. Thus, it is necessary to quantify the voting weight of a particular classifier for a particular output class. Here, an attempt is made to determine the appropriate voting weights for each class in each classifier using the GA. The proposed technique is evaluated for four leading Indian languages, namely Bengali, Hindi, Telugu, and Oriya, which are all resource-poor. Evaluation results yield recall, precision, and F-measure values of 92.08%, 92.22%, and 92.15%, respectively, for Bengali; 96.07%, 88.63%, and 92.20%, respectively, for Hindi; 78.82%, 91.26%, and 84.59%, respectively, for Telugu; and 88.56%, 89.98%, and 89.26%, respectively, for Oriya. Finally, we evaluate our proposed approach on the benchmark dataset of the CoNLL-2003 shared task, which yields overall recall, precision, and F-measure values of 88.72%, 88.64%, and 88.68%, respectively. Results also show that the vote-based classifier ensemble identified by the GA-based approach outperforms all the individual classifiers, three conventional baseline ensembles, and some other existing ensemble techniques. In one part of the article, we formulate the problem of feature selection in any classifier under a single-objective optimization framework and show that our proposed classifier ensemble outperforms it.
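The idea of per-class, per-classifier voting weights tuned by a GA can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the chromosome encoding, the genetic operators, or the fitness function, so everything below (real-valued weights in [0, 1), uniform crossover, random-reset mutation, elitist selection, token-level accuracy as fitness, and the names weighted_vote, accuracy, and evolve) is an assumption chosen for illustration.

```python
import random


def weighted_vote(predictions, weights):
    """Combine one token's predictions by weighted voting.

    predictions[m] is the class index predicted by classifier m;
    weights[m][c] is the vote weight of classifier m for class c.
    Ties go to the lowest class index.
    """
    scores = [0.0] * len(weights[0])
    for clf, cls in enumerate(predictions):
        scores[cls] += weights[clf][cls]
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(weights, data):
    """Fitness (an assumption): fraction of tokens labeled correctly."""
    hits = sum(weighted_vote(preds, weights) == gold for preds, gold in data)
    return hits / len(data)


def evolve(data, n_classifiers, n_classes, pop_size=20, generations=30, seed=0):
    """Toy elitist GA over weight matrices (hypothetical operators)."""
    rng = random.Random(seed)

    def random_individual():
        return [[rng.random() for _ in range(n_classes)]
                for _ in range(n_classifiers)]

    def crossover(a, b):
        # Uniform crossover: each weight comes from either parent.
        return [[wa if rng.random() < 0.5 else wb for wa, wb in zip(ra, rb)]
                for ra, rb in zip(a, b)]

    def mutate(ind, rate=0.1):
        # Random-reset mutation: occasionally redraw a weight.
        return [[rng.random() if rng.random() < rate else w for w in row]
                for row in ind]

    # Seed the population with uniform weights (plain majority voting),
    # so the result is never worse than that baseline on `data`.
    uniform = [[1.0] * n_classes for _ in range(n_classifiers)]
    population = [uniform] + [random_individual() for _ in range(pop_size - 1)]

    for _ in range(generations):
        population.sort(key=lambda ind: accuracy(ind, data), reverse=True)
        elite = population[: pop_size // 2]  # the best half survives
        children = [mutate(crossover(rng.choice(elite), rng.choice(elite)))
                    for _ in range(pop_size - len(elite))]
        population = elite + children
    return max(population, key=lambda ind: accuracy(ind, data))


if __name__ == "__main__":
    # Made-up toy data: (predictions of 3 classifiers, gold class index).
    toy = [([0, 1, 1], 0), ([0, 1, 2], 0), ([1, 1, 0], 1),
           ([2, 0, 2], 2), ([1, 1, 2], 1)]
    best = evolve(toy, n_classifiers=3, n_classes=3)
    print(accuracy(best, toy))
```

Because a classifier's vote for class c is scaled by its own weight for c, a classifier that is reliable on one NE class but not on another can be trusted selectively, which is the assumption the abstract starts from. A real system would compute fitness on held-out development data rather than on the data used for the search.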

Bibliographic details

Source: ACM Transactions on Asian Language Information Processing | 2011, Issue 2 | p.9-9.37 | 37 pages
Authors: ASIF EKBAL; SRIPARNA SAHA
Author affiliation (listed for both authors): Department of Computer Science and Engineering, Indian Institute of Technology, Patna, India, 800013
Original format: PDF
Language: English
Keywords: algorithms; design; experimentation
