Conference: 2013 International Conference on Current Trends in Information Technology

Automatic keywords extraction from the domain texts: Implementation of the algorithm based on the MapReduce model


Abstract

Automatic keyword extraction is used in almost all tasks related to natural language processing, such as annotation, indexing, classification, machine translation, and knowledge extraction. Many effective methods have been developed to solve this problem, and the simplest and most robust of them are based on word statistics. In this paper we describe a statistical method based on the chi-square test. The traditional algorithm implementing this method is inefficient and time-consuming. The aim of the paper is to develop an algorithm for this method based on a distributed computing model. We therefore describe an implementation of the algorithm based on the MapReduce model of distributed computing and present experimental results showing the benefits of distributed computing.
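
The abstract does not spell out the chi-square formulation or the MapReduce job layout, so the following is only a minimal sketch under stated assumptions, not the paper's algorithm. It assumes one common variant: each word's frequency in the domain texts is compared with its frequency in a reference corpus through a 2x2 contingency table, and the counting step is written as map, shuffle, and reduce functions that run in a single process. All function names, the whitespace tokenizer, and the reference-corpus setup are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_phase(text, label):
    """Map step: emit ((word, label), 1) for every token in one document.

    label is "domain" or "reference" (an assumption of this sketch).
    """
    for word in text.lower().split():
        yield (word, label), 1


def shuffle(pairs):
    """Group emitted values by key, as a MapReduce framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: sum the counts collected for one (word, label) key."""
    return key, sum(values)


def chi_square(a, b, c, d):
    """Chi-square statistic for the 2x2 table [[a, b], [c, d]].

    a: word count in domain texts      b: word count in reference texts
    c: other tokens in domain texts    d: other tokens in reference texts
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def extract_keywords(domain_docs, reference_docs, top_k=5):
    """Rank words that are over-represented in the domain texts."""
    pairs = chain(
        (kv for text in domain_docs for kv in map_phase(text, "domain")),
        (kv for text in reference_docs for kv in map_phase(text, "reference")),
    )
    counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
    total_dom = sum(v for (_, lab), v in counts.items() if lab == "domain")
    total_ref = sum(v for (_, lab), v in counts.items() if lab == "reference")

    scores = {}
    for (word, label), a in counts.items():
        if label != "domain":
            continue
        b = counts.get((word, "reference"), 0)
        # Skip words that are not relatively more frequent in the domain texts.
        if a * total_ref <= b * total_dom:
            continue
        scores[word] = chi_square(a, b, total_dom - a, total_ref - b)
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


if __name__ == "__main__":
    domain = ["mapreduce cluster mapreduce job", "the mapreduce job"]
    reference = ["the cat sat on the mat", "the dog"]
    for word, score in extract_keywords(domain, reference):
        print(f"{word}\t{score:.3f}")
```

In a real cluster the map and reduce functions would run on separate workers and the framework would perform the shuffle; the scoring step needs only the summed counts, which is what makes the counting part straightforward to distribute.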
