Learning Semantic Composition to Detect Non-compositionality of Multiword Expressions

Abstract

Non-compositionality of multiword expressions is an intriguing problem that can be the source of error in a variety of NLP tasks such as language generation, machine translation and word sense disambiguation. We present methods of non-compositionality detection for English noun compounds using the unsupervised learning of a semantic composition function. Compounds which are not well modeled by the learned semantic composition function are considered non-compositional. We explore a range of distributional vector-space models for semantic composition, empirically evaluate these models, and propose additional methods which improve results further. We show that a complex function such as polynomial projection can learn semantic composition and identify non-compositionality in an unsupervised way, beating all other baselines ranging from simple to complex. We show that enforcing sparsity is a useful regularizer in learning complex composition functions. We show further improvements by training a decomposition function in addition to the composition function. Finally, we propose an EM algorithm over latent compositionality annotations that also improves the performance.
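
The core idea in the abstract (fit a composition function on all compounds, then treat a poor fit as a sign of non-compositionality) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the degree-2 feature map, and the synthetic vectors are assumptions made for illustration. It also uses a simple ridge (L2) penalty with a closed-form solution where the abstract reports a sparsity regularizer, and it omits the decomposition function and the EM step.

```python
import numpy as np

def poly_features(u, v):
    """Degree-2 polynomial expansion of the concatenated constituent vectors."""
    x = np.concatenate([u, v], axis=1)       # (n, 2d)
    i, j = np.triu_indices(x.shape[1])       # index pairs with i <= j
    quad = x[:, i] * x[:, j]                 # all pairwise products
    ones = np.ones((x.shape[0], 1))          # bias term
    return np.hstack([ones, x, quad])

def fit_composition(u, v, c, l2=1e-2):
    """Ridge-regularized least squares from constituent features to compound vectors."""
    phi = poly_features(u, v)
    a = phi.T @ phi + l2 * np.eye(phi.shape[1])
    return np.linalg.solve(a, phi.T @ c)

def noncompositionality_scores(u, v, c, w):
    """Reconstruction error per compound; a larger error suggests less compositional."""
    pred = poly_features(u, v) @ w
    return np.linalg.norm(c - pred, axis=1)

# Synthetic stand-ins for distributional vectors (assumed, not real data).
rng = np.random.default_rng(0)
n, d, n_idiom = 300, 4, 20
u = rng.normal(size=(n, d))                  # first constituent word vectors
v = rng.normal(size=(n, d))                  # second constituent word vectors
c = 0.5 * (u + v) + 0.1 * u * v + 0.01 * rng.normal(size=(n, d))
c[:n_idiom] = rng.normal(size=(n_idiom, d))  # "idiomatic" compounds: unrelated vectors

# Unsupervised: the function is fit on all compounds, without compositionality labels.
w = fit_composition(u, v, c)
scores = noncompositionality_scores(u, v, c, w)
print("mean score, idiomatic:", scores[:n_idiom].mean())
print("mean score, compositional:", scores[n_idiom:].mean())
```

Ranking compounds by `scores` gives an ordering from most to least likely non-compositional. Replacing the L2 penalty with an L1 penalty would be one way to move this sketch toward the sparsity regularization the abstract describes.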

Bibliographic details

Source
Conference on Empirical Methods in Natural Language Processing | 2015 | pp. 1733-1742 | 10 pages
Authors
Majid Yazdani; Meghdad Farahmand; James Henderson
Original format: PDF
