Conference on Empirical Methods in Natural Language Processing

Learning Semantic Composition to Detect Non-compositionality of Multiword Expressions

Abstract

Non-compositionality of multiword expressions is an intriguing problem that can be the source of error in a variety of NLP tasks such as language generation, machine translation and word sense disambiguation. We present methods of non-compositionality detection for English noun compounds using the unsupervised learning of a semantic composition function. Compounds which are not well modeled by the learned semantic composition function are considered non-compositional. We explore a range of distributional vector-space models for semantic composition, empirically evaluate these models, and propose additional methods which improve results further. We show that a complex function such as polynomial projection can learn semantic composition and identify non-compositionality in an unsupervised way, beating all other baselines ranging from simple to complex. We show that enforcing sparsity is a useful regularizer in learning complex composition functions. We show further improvements by training a decomposition function in addition to the composition function. Finally, we propose an EM algorithm over latent compositionality annotations that also improves the performance.
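The core idea in the abstract, scoring a compound as non-compositional when a learned composition function reconstructs its vector poorly, can be illustrated with a small sketch. This is not the authors' implementation. It assumes a simple degree-2 feature expansion as a stand-in for polynomial projection, swaps the sparsity regularizer for a closed-form L2 (ridge) penalty, and runs on synthetic random vectors rather than real distributional embeddings. All function names (poly_features, fit_composition, noncompositionality_scores, toy_data) are hypothetical.

```python
import numpy as np


def poly_features(U, V):
    """Stack [u; v] with its element-wise square (an assumed degree-2 expansion)."""
    X = np.concatenate([U, V], axis=-1)
    return np.concatenate([X, X ** 2], axis=-1)


def fit_composition(U, V, P, l2=1e-2):
    """Fit W by ridge regression so that poly_features(U, V) @ W approximates P.

    Note: the abstract reports that sparsity helps; an L2 penalty is used here
    only because it has a closed-form solution.
    """
    X = poly_features(U, V)
    A = X.T @ X + l2 * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ P)


def noncompositionality_scores(U, V, P, W):
    """Return 1 - cosine(predicted, observed) per compound; higher means less compositional."""
    pred = poly_features(U, V) @ W
    num = np.sum(pred * P, axis=1)
    den = np.linalg.norm(pred, axis=1) * np.linalg.norm(P, axis=1) + 1e-12
    return 1.0 - num / den


def toy_data(n_comp=200, n_noncomp=20, dim=10, seed=0):
    """Synthetic stand-in for word and compound vectors (not real data)."""
    rng = np.random.default_rng(seed)
    n = n_comp + n_noncomp
    U = rng.normal(size=(n, dim))  # first-constituent vectors
    V = rng.normal(size=(n, dim))  # second-constituent vectors
    # "Compositional" compounds: noisy average of the constituents.
    P = 0.5 * (U + V) + 0.05 * rng.normal(size=(n, dim))
    # "Non-compositional" compounds: vectors unrelated to the constituents.
    P[n_comp:] = rng.normal(size=(n_noncomp, dim))
    is_noncomp = np.arange(n) >= n_comp
    return U, V, P, is_noncomp


if __name__ == "__main__":
    U, V, P, is_noncomp = toy_data()
    W = fit_composition(U, V, P)  # trained without compositionality labels
    scores = noncompositionality_scores(U, V, P, W)
    print("mean score, compositional:    ", scores[~is_noncomp].mean())
    print("mean score, non-compositional:", scores[is_noncomp].mean())
```

The composition function is fit on all compounds without labels, and the labels in the toy data are used only to inspect the scores afterwards. The decomposition function and the EM procedure mentioned in the abstract are not covered by this sketch.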