Journal: Procedia - Social and Behavioral Sciences

A Methodology for Building Simple but Robust Stemmers without Language Knowledge: Stemmer Configuration

Abstract

This work is part of a project that aims to define a methodology for building simple but robust stemmers without knowledge of the stemmer's target language. The methodology starts with a very simple primary stemmer that is applied to a collection of words and returns the corresponding stems. The primary stemmer always removes the longest suffix that matches the ending of the examined word. Next, Information Retrieval (IR) experts express their arguments against the results of the primary stemmer. The methodology then allows the creation of a series of consecutive trial stemmers that conform increasingly to the arguments expressed by the IR experts. Here, we focus on the attributes and the adjustable characteristics/options available to the person responsible for building the consecutive trial stemmers and, finally, for creating the best trial: the stemmer that respects the arguments against the primary stemmer as much as possible.
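The longest-suffix rule of the primary stemmer can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `primary_stem`, the sample list `SUFFIXES`, and the `min_stem_len` parameter are all assumptions introduced here for illustration, since the abstract does not say where the suffix list comes from or which options the trial stemmers expose.

```python
def primary_stem(word, suffixes, min_stem_len=1):
    """Strip the longest suffix in `suffixes` that matches the end of `word`.

    `min_stem_len` is a hypothetical option (not taken from the paper): a
    suffix is removed only if at least this many characters remain.
    """
    # Try longer suffixes first, so the first match found is the longest one.
    for suffix in sorted(suffixes, key=len, reverse=True):
        remaining = len(word) - len(suffix)
        if suffix and word.endswith(suffix) and remaining >= min_stem_len:
            return word[:remaining]
    # No suffix matched: the word is returned unchanged.
    return word


# Hypothetical suffix list, used only to exercise the function.
SUFFIXES = ["s", "es", "ed", "ing", "ings"]

for w in ["stemmers", "buildings", "matched", "word"]:
    print(w, "->", primary_stem(w, SUFFIXES))
```

A parameter such as `min_stem_len` is one plausible example of the kind of adjustable option a trial stemmer could vary in response to the IR experts' objections; the options actually studied in the paper are not listed in the abstract.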
