Venue: Workshop on Lexical and Grammatical Resources for Language Processing

Annotation and Classification of Light Verbs and Light Verb Variations in Mandarin Chinese


Abstract

Light verbs pose a challenge in linguistics because of their syntactic and semantic versatility and because their distribution differs from that of regular verbs with higher semantic content and selectional restrictions. Because of their light grammatical content, earlier natural language processing (NLP) studies typically put light verbs in a stop word list and ignored them. Recently, however, the classification and identification of light verbs and light verb constructions have become a focus of study in computational linguistics, especially in the context of multi-word expressions, information retrieval, disambiguation, and parsing. Past linguistic and computational studies on light verbs have had very different foci. Linguistic studies tend to focus on the status of light verbs and their various selectional constraints, while NLP studies have treated light verbs as part of either a multi-word expression (MWE) or a construction to be identified, classified, or translated, trying to overcome the apparent poverty of semantic content of light verbs. There has been almost no work attempting to bridge these two lines of research. This paper takes up this challenge by proposing a corpus-based study that classifies light verbs and captures the syntactic-semantic differences among them. In this study, we first incorporate results from past linguistic studies to create annotated light verb corpora with syntactic-semantic features. We then adopt a statistical method for the automatic identification of light verbs based on these annotated corpora. Our results show that a language-resource-based methodology that optimally incorporates linguistic information can resolve the challenges posed by light verbs in NLP.
