Extracting Nested Collocations

Abstract

This paper provides an approach to the semi-automatic extraction of collocations from corpora using statistics. The growing availability of large textual corpora and the increasing number of applications of collocation extraction have given rise to various approaches to the topic. In this paper, we address the problem of nested collocations, that is, collocations that are part of longer collocations. Most approaches until now have treated substrings of collocations as collocations only if they appeared frequently enough by themselves in the corpus. These techniques left many collocations unextracted. In this paper, we propose an algorithm for the semi-automatic extraction of uninterrupted and interrupted collocations, paying particular attention to nested collocations.
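The nesting problem described in the abstract can be illustrated with a small Python sketch. This is not the algorithm proposed in the paper, which the abstract does not specify. It only shows, on an invented toy corpus, how the occurrences of a short phrase split into those nested inside a longer phrase and those that stand alone. The function names, the corpus, and the example phrases are all assumptions made for illustration.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous n-gram (as a tuple) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def nested_vs_standalone(tokens, short, long):
    """Split occurrences of `short` into those inside `long` and the rest."""
    n_s, n_l = len(short), len(long)
    # Start positions of every occurrence of the longer phrase.
    long_starts = [i for i in range(len(tokens) - n_l + 1)
                   if tuple(tokens[i:i + n_l]) == long]
    nested = standalone = 0
    for i in range(len(tokens) - n_s + 1):
        if tuple(tokens[i:i + n_s]) != short:
            continue
        # Nested if some occurrence of `long` fully covers this span.
        if any(s <= i and i + n_s <= s + n_l for s in long_starts):
            nested += 1
        else:
            standalone += 1
    return nested, standalone


# Toy corpus (hypothetical): "stock exchange" appears three times overall,
# twice inside "new york stock exchange" and once by itself.
tokens = ("the new york stock exchange opened . "
          "the new york stock exchange closed . "
          "a stock exchange exists").split()
short = ("stock", "exchange")
long = ("new", "york", "stock", "exchange")

total = ngram_counts(tokens, len(short))[short]
nested, standalone = nested_vs_standalone(tokens, short, long)
print(total, nested, standalone)
```

With a standalone-frequency threshold of 2, "stock exchange" would be discarded in this toy corpus even though it occurs three times in total, which is the kind of missed collocation the abstract refers to.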
