Conference: Annual Meeting of the Association for Computational Linguistics

Learning Word Sense Distributions, Detecting Unattested Senses and Identifying Novel Senses Using Topic Models

Abstract

Unsupervised word sense disambiguation (WSD) methods are an attractive approach to all-words WSD because they do not rely on expensive annotated data. Unsupervised estimates of sense frequency have been shown to be very useful for WSD because word sense distributions are skewed. This paper presents a fully unsupervised, topic modelling-based approach to sense frequency estimation. The approach is highly portable across corpora and sense inventories: it applies to any part of speech and does not require a hierarchical sense inventory, parsing, or parallel text. We demonstrate the effectiveness of the method on the tasks of predominant sense learning and sense distribution acquisition, and on the novel tasks of detecting senses that are not attested in the corpus and identifying novel senses in the corpus that are not captured in the sense inventory.
