International conference on computational linguistics
Collocation Extraction Using Parallel Corpus

Abstract

This paper presents a novel method for extracting Persian collocations using a parallel corpus. The method is applicable whenever a parallel corpus exists between the target language and a high-resource language. Without requiring an accurate parser for the target side, it aims to parse the sentences in order to capture long-distance collocations and generate more precise results. Training data built by bootstrapping is also used to rank the candidates with a log-linear model. Compared with the window-based statistical method, the method improves the precision and recall of collocation extraction by 5 and 3 percent, respectively, when candidates are evaluated on whether they are Persian multi-word expressions.
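
For orientation, the kind of window-based statistical baseline the abstract compares against can be sketched as follows. This is only an illustrative sketch, not the paper's implementation: the abstract does not say which association measure, window size, or frequency threshold the baseline uses, so pointwise mutual information (PMI), the function name `window_pmi`, and the parameters `window` and `min_count` are all assumptions made here.

```python
import math
from collections import Counter


def window_pmi(sentences, window=3, min_count=2):
    """Rank ordered word pairs that co-occur within `window` tokens by PMI.

    Assumption: PMI stands in for whatever association measure the
    window-based baseline in the paper actually uses.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        word_counts.update(tokens)
        for i, left in enumerate(tokens):
            # Pair each token with the next `window` tokens to its right.
            for right in tokens[i + 1 : i + 1 + window]:
                pair_counts[(left, right)] += 1

    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())
    scores = {}
    for (left, right), count in pair_counts.items():
        if count < min_count:
            continue  # drop rare pairs, whose PMI estimates are unreliable
        p_pair = count / total_pairs
        p_left = word_counts[left] / total_words
        p_right = word_counts[right] / total_words
        scores[(left, right)] = math.log2(p_pair / (p_left * p_right))
    # Highest-scoring candidates first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Calling `window_pmi(tokenized_sentences)` on a list of token lists returns candidate pairs ordered by score. Because a fixed window only sees nearby tokens, collocates separated by more than `window` words are missed, which is the limitation the parse-based approach described in the abstract is meant to address.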
