Conference: International Conference on Asian Language Processing

The refined MI: A significant improvement to mutual information


Abstract

Distributional lexical semantics is an empirical approach that models word meaning using distribution statistics gathered from very large corpora. It builds on the Distributional Hypothesis of Zellig Harris (1970), which states that differences in words' meanings are associated with differences in their distribution in text. These differences in meaning originate from two kinds of relations between words: syntagmatic and paradigmatic. Syntagmatic relations are linear, combinatorial relations between words that co-occur in sequential text, whereas paradigmatic relations are substitutional relations between words that occur in the same contexts and share neighboring words but do not co-occur in the same text. In this paper, we present a new association measure, the Refined MI, for measuring syntagmatic relations between words, together with an experimental study evaluating its performance. The measure showed outstanding results in identifying significant co-occurrences in Classical Arabic text.
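For orientation, the kind of baseline that an MI-based association measure starts from can be sketched as follows. This is standard pointwise mutual information over adjacent word pairs, not the Refined MI itself, whose formula the abstract does not give. The function name `pmi_scores`, the `min_count` parameter, and the choice of adjacent pairs as the co-occurrence window are assumptions made only for illustration.

```python
import math
from collections import Counter


def pmi_scores(tokens, min_count=1):
    """Score adjacent word pairs with standard pointwise mutual information.

    Illustrative baseline only: this is plain PMI, not the paper's Refined MI.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # skip pairs seen fewer than min_count times
        p_xy = count / n_bi            # joint probability of the pair
        p_x = unigrams[w1] / n_uni     # marginal probability of the first word
        p_y = unigrams[w2] / n_uni     # marginal probability of the second word
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


if __name__ == "__main__":
    toy = "a b a b c d".split()
    for pair, score in sorted(pmi_scores(toy).items(), key=lambda kv: -kv[1]):
        print(pair, round(score, 3))
```

On the toy sequence, the pair seen once ("c", "d") scores higher than the pair seen twice ("a", "b"). Plain PMI is known to favor rare pairs in this way; the abstract does not say whether or how the Refined MI treats this.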
