Conference: International Conference on Information Technology and Electrical Engineering

An Information Retrieval Approach to Finding Similar Questions in Question-Answering of Indonesian Government e-Procurement Services using TF*IDF and LSI Model


Abstract

In the implementation of the e-procurement application in the government of the Republic of Indonesia, one important issue is how to find questions in the archive of a question-answering (QA) service that are relevant to the question asked by the user. A common method for finding relevant documents is to represent text documents in a vector space model (VSM). Relevance between a query and the documents can be calculated using document similarity theory, by comparing the angle between each document vector and the query vector (derived from the user's question), where the query is represented as the same kind of vector as the documents. The most widely used vector space model algorithms are Term Frequency * Inverse Document Frequency (TF*IDF) and Latent Semantic Indexing (LSI); however, both models have their respective limitations. To address this problem, the paper proposes a hybrid model that combines TF*IDF and LSI to mitigate some limitations of both. The experimental results show that the proposed model (P@1=0.67) outperforms the stand-alone TF*IDF model (P@1=0.27) and the stand-alone LSI model (P@1=0.4).
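As a rough illustration of the vector space idea described in the abstract, the sketch below ranks archived questions against a user question by TF*IDF-weighted cosine similarity and computes P@1. It is not the paper's hybrid model: the abstract does not say how TF*IDF and LSI are combined, so the LSI step (a truncated SVD projection of the term-document matrix) is left out. The function names, the whitespace tokenizer, the smoothed IDF formula, and the tiny English sample archive are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption; a real system would use
    # language-specific tokenization and stemming).
    return text.lower().split()


def build_idf(docs_tokens):
    # Smoothed inverse document frequency (assumed weighting variant).
    n_docs = len(docs_tokens)
    doc_freq = Counter()
    for tokens in docs_tokens:
        doc_freq.update(set(tokens))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(tokens, idf):
    # Sparse TF*IDF vector as a dict; terms unseen in the archive are dropped.
    counts = Counter(tokens)
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine(u, v):
    # Cosine of the angle between two sparse vectors.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, archive):
    # Return archive indices ordered from most to least similar to the query.
    docs_tokens = [tokenize(d) for d in archive]
    idf = build_idf(docs_tokens)
    doc_vecs = [tfidf_vector(t, idf) for t in docs_tokens]
    q_vec = tfidf_vector(tokenize(query), idf)
    scores = [cosine(q_vec, d) for d in doc_vecs]
    return sorted(range(len(archive)), key=lambda i: scores[i], reverse=True)


def precision_at_1(queries, relevant_sets, archive):
    # Fraction of queries whose top-ranked archived question is relevant.
    hits = sum(1 for q, rel in zip(queries, relevant_sets) if rank(q, archive)[0] in rel)
    return hits / len(queries)


# Hypothetical archive of previously answered questions.
ARCHIVE = [
    "how to register as a vendor",
    "how to reset my account password",
    "where to download tender documents",
]

if __name__ == "__main__":
    print(rank("forgot password reset", ARCHIVE))
```

P@1 here is the share of test questions for which the single top-ranked archived question is judged relevant, which is the metric the abstract reports for the three models.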
