Fine-Tuning an Algorithm for Semantic Search Using a Similarity Graph

Lubomir Stanchev

Abstract

Given a set of documents and an input query expressed in a natural language, the problem of document search is to retrieve the most relevant documents. Unlike most existing systems, which perform document search based on keyword matching, we propose a method that considers the meaning of the words in the queries and documents. As a result, our algorithm can return documents that have no words in common with the input query, as long as the documents are relevant. For example, a document that contains the words "Ford", "Chrysler" and "General Motors" multiple times is surely relevant for the query "car", even if the word "car" never appears in the document. Our information retrieval algorithm is based on a similarity graph that contains the degree of semantic closeness between terms, where a term can be a word or a phrase. Since the algorithm that constructs the similarity graph takes a myriad of parameters as input, in this paper we fine-tune the part of the algorithm that constructs the Wikipedia part of the graph. Specifically, we experimentally fine-tune the algorithm on the Miller and Charles study benchmark, which contains 30 pairs of terms and their similarity scores as determined by human users. We then evaluate the performance of the fine-tuned algorithm on the Cranfield benchmark, which contains 1400 documents and 225 natural language queries. The benchmark also contains the relevant documents for every query, as determined by human judgment. The results show that the fine-tuned algorithm produces a higher mean average precision (MAP) score than traditional keyword-based search algorithms because our algorithm considers not only the words and phrases in the query and documents, but also their meaning.
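
To make the evaluation metric concrete, the following is a minimal sketch of how a mean average precision (MAP) score is typically computed from ranked results and human relevance judgments. It is not the paper's implementation: the function names, the dictionary layout, and the toy document and query identifiers are assumptions made for illustration, and each ranking is assumed to contain no duplicate documents.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked result list against a set of relevant documents."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank: relevant documents seen so far / rank.
            precision_sum += hits / rank
    # Dividing by the total number of relevant documents penalizes
    # relevant documents that were never retrieved.
    return precision_sum / len(relevant)


def mean_average_precision(
    rankings: Dict[str, List[str]],
    judgments: Dict[str, Set[str]],
) -> float:
    """Mean of the per-query average precision over all queries in `rankings`."""
    if not rankings:
        return 0.0
    total = sum(
        average_precision(ranked, judgments.get(query_id, set()))
        for query_id, ranked in rankings.items()
    )
    return total / len(rankings)


# Hypothetical toy data (not taken from the Cranfield benchmark).
toy_rankings = {
    "q1": ["d3", "d1", "d7"],  # both relevant documents ranked first
    "q2": ["d2", "d5"],        # one relevant document at rank 2, one missed
}
toy_judgments = {
    "q1": {"d1", "d3"},
    "q2": {"d5", "d9"},
}

map_score = mean_average_precision(toy_rankings, toy_judgments)
print(f"MAP = {map_score:.3f}")
```

In the toy data, query `q1` reaches an average precision of 1.0 and query `q2` reaches 0.25, so the mean is 0.625. A search algorithm that ranks relevant documents earlier, including ones that share no keywords with the query, raises this score.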

Bibliographic Details

Source: International Journal of Semantic Computing, 2015, Issue 3, 24 pages
Author: Lubomir Stanchev
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords: Semantic search; similarity graph; WordNet; Wikipedia
