Journal: Journal of computer sciences

An Automatic Collocation Extraction from Arabic Corpus | Science Publications


Abstract

Problem statement: Identifying collocations is an important part of natural language processing applications that require some degree of semantic interpretation, such as machine translation, information retrieval, and text summarization. Because of the complexity of Arabic, collocations undergo morphological, graphical, and syntactic variation, which makes them difficult to identify. Approach: We used a hybrid method, based on linguistic information and association measures, to extract collocations from an Arabic corpus. Results: The method extracted bigram candidates for Arabic collocations from the corpus, and we evaluated the association measures using the n-best evaluation method. We report the precision of each association measure for each n-best list. Conclusion: The experimental results showed that the log-likelihood ratio was the best association measure, achieving the highest precision.
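
To make the two technical ingredients of the abstract concrete, the following is a minimal sketch of scoring bigram candidates with the log-likelihood ratio and then computing precision over an n-best list. It is not the authors' implementation: the function names (`llr`, `score_bigrams`, `n_best_precision`), the toy tokens, and the gold set are all assumptions made for illustration, and the linguistic filtering step of the hybrid method is omitted entirely. The log-likelihood ratio here is the standard G² statistic over a 2x2 contingency table.

```python
import math
from collections import Counter


def _xlogx(x):
    """Return x * ln(x), with the usual convention that 0 * ln(0) = 0."""
    return x * math.log(x) if x > 0 else 0.0


def llr(k11, k12, k21, k22):
    """G^2 log-likelihood ratio for a 2x2 contingency table.

    k11: count of (w1, w2)       k12: count of (w1, not w2)
    k21: count of (not w1, w2)   k22: count of (not w1, not w2)
    """
    n = k11 + k12 + k21 + k22
    cells = _xlogx(k11) + _xlogx(k12) + _xlogx(k21) + _xlogx(k22)
    rows = _xlogx(k11 + k12) + _xlogx(k21 + k22)
    cols = _xlogx(k11 + k21) + _xlogx(k12 + k22)
    return 2.0 * (cells + _xlogx(n) - rows - cols)


def score_bigrams(tokens):
    """Score every adjacent bigram in a token list (hypothetical helper)."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    left = Counter(tokens[:-1])   # how often a word is the first element
    right = Counter(tokens[1:])   # how often a word is the second element
    n = sum(bigrams.values())
    scores = {}
    for (w1, w2), k11 in bigrams.items():
        k12 = left[w1] - k11      # w1 followed by some other word
        k21 = right[w2] - k11     # w2 preceded by some other word
        k22 = n - k11 - k12 - k21
        scores[(w1, w2)] = llr(k11, k12, k21, k22)
    return scores


def n_best_precision(ranked, gold, n):
    """Fraction of the top-n ranked candidates that appear in the gold set."""
    top = ranked[:n]
    if not top:
        return 0.0
    return sum(1 for cand in top if cand in gold) / len(top)


# Toy data (assumed): placeholder tokens stand in for a real Arabic corpus.
tokens = "a b c a b d a b e".split()
scores = score_bigrams(tokens)
ranked = sorted(scores, key=scores.get, reverse=True)
gold = {("a", "b")}  # hypothetical manually validated collocations
for n in (1, 3, 5):
    print(n, n_best_precision(ranked, gold, n))
```

In an evaluation like the one the abstract describes, each association measure would produce its own ranking, and the precision values at each n would be compared across measures. Note that G² is high for any strong departure from independence, so a practical pipeline would usually also check that the observed bigram count exceeds the expected count.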
