Arabian Journal for Science and Engineering

Multi-corpus-Based Model for Measuring the Semantic Relatedness in Short Texts (SRST)

Abstract

Semantic Relatedness (SR) defines a relation between linguistic items, which may be words, phrases, or documents. It has many applications, such as information extraction, word sense disambiguation, text summarization, and text clustering. Quantifying SR is natural and intuitive for humans, but it is difficult to automate because computational methods lack the background experience and external domain concepts that humans draw on. This paper focuses on Semantic Relatedness in Short Texts (SRST). A Vector Space Model based on multiple corpora is proposed to measure SRST. Word synonyms and anaphoric information are used to improve the semantic representation of each document. Because the verses of the Holy Quran are a valuable sample of short texts, they are used as the main case study, in which the degree of relatedness between verses is measured. The experimental results indicate that the proposed model improves SR measurement, raising recall to 60% compared with 11.3% in the best previous studies.
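To make the general idea concrete, the following is a minimal sketch of a vector space relatedness score with synonym normalization, plus the recall measure the abstract reports. It is not the paper's model: the abstract does not give the weighting scheme, similarity function, corpora, or lexical resources, so term-frequency vectors, cosine similarity, the `SYNONYMS` table, and all function names here are illustrative assumptions. The multi-corpus and anaphora components are not modeled.

```python
import math
from collections import Counter

# Hypothetical synonym table (assumption): maps a word to one canonical term.
SYNONYMS = {"car": "automobile", "quick": "fast"}


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (w.strip(".,;:!?").lower() for w in text.split())
    return [t for t in tokens if t]


def to_vector(text, synonyms=SYNONYMS):
    """Term-frequency vector in which synonyms share a single dimension."""
    return Counter(synonyms.get(tok, tok) for tok in tokenize(text))


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def relatedness(a, b, synonyms=SYNONYMS):
    """Relatedness of two short texts under the assumptions above."""
    return cosine(to_vector(a, synonyms), to_vector(b, synonyms))


def recall(retrieved, relevant):
    """Fraction of the relevant items that were retrieved."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(retrieved) & relevant) / len(relevant)
```

In this sketch, mapping synonyms onto one dimension lets two short texts with little literal word overlap still score as related, which is the kind of effect the abstract attributes to using word synonyms.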
