International Journal of Information Retrieval Research

Efficient Textual Web Retrieval using Wavelet Tree

Abstract

Web search is one of the most progressive and rapidly expanding fields today. The large amount of information available on the World Wide Web motivates the need for efficient text indexing methods that support fast text retrieval. Two main indexing techniques have been proposed in the past: signature files and inverted files. Signature files require much more space to store the index and are more expensive to construct and update than inverted files. Inverted files have been implemented efficiently using structures such as sorted arrays and B-trees. However, a sorted array is expensive to update when a new keyword is appended, and the B-tree method breaks down when many words share the same prefix. This paper presents a modified index structure for text retrieval that aims to reduce both the space needed to store the index and the time needed to search documents. The proposed index is built on the Wavelet Tree (WT), which was originally designed as a wavelet transform for images. Experimental results show that, as the query length increases, the WT-based index performs better than the others.
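
The abstract gives no implementation details, so the following is only a minimal sketch of how a wavelet tree over a sequence of term IDs can answer the two basic queries an index needs. It assumes the succinct-data-structure form of the wavelet tree (a balanced tree that splits the symbol range in half at each level), which may differ from the paper's actual index. The class name `WaveletTree`, the methods `access` and `rank`, and the toy vocabulary are all hypothetical, and plain prefix-count lists stand in for the compressed bit vectors a real implementation would use.

```python
class WaveletTree:
    """Pointer-based wavelet tree over a non-empty sequence of integer symbols.

    Illustrative only: each node stores prefix counts of "went left" bits
    instead of a succinct bit vector with rank support.
    """

    def __init__(self, seq, lo=None, hi=None):
        if lo is None:  # top-level call: derive the symbol range
            lo, hi = min(seq), max(seq)
        self.lo, self.hi = lo, hi
        self.left = self.right = None
        self.prefix = [0]  # prefix[i] = how many of seq[:i] are <= mid
        if lo == hi or not seq:
            return  # leaf, or empty subtree
        mid = (lo + hi) // 2
        for x in seq:
            self.prefix.append(self.prefix[-1] + (x <= mid))
        self.left = WaveletTree([x for x in seq if x <= mid], lo, mid)
        self.right = WaveletTree([x for x in seq if x > mid], mid + 1, hi)

    def access(self, i):
        """Return the symbol at position i of the original sequence."""
        node = self
        while node.lo != node.hi:
            before = node.prefix[i]
            if node.prefix[i + 1] - before:  # symbol i went to the left child
                i, node = before, node.left
            else:
                i, node = i - before, node.right
        return node.lo

    def rank(self, c, i):
        """Return the number of occurrences of symbol c in seq[:i]."""
        if c < self.lo or c > self.hi:
            return 0
        node = self
        while node.lo != node.hi:
            if node.left is None:  # empty subtree: c never occurs here
                return 0
            mid = (node.lo + node.hi) // 2
            before = node.prefix[i]
            if c <= mid:
                i, node = before, node.left
            else:
                i, node = i - before, node.right
        return i


# Toy text mapped to integer term IDs (vocabulary sorted alphabetically).
words = "to be or not to be".split()
ids = {w: k for k, w in enumerate(sorted(set(words)))}
seq = [ids[w] for w in words]  # [3, 0, 2, 1, 3, 0]

wt = WaveletTree(seq)
print(wt.rank(ids["to"], len(seq)))  # occurrences of "to" in the whole text: 2
print(wt.access(2) == ids["or"])     # position 2 holds "or": True
```

Each query walks one root-to-leaf path, so its cost grows with the logarithm of the vocabulary size rather than with the length of the text. That property is what makes the structure a candidate for compact text indexes in general; the sketch makes no claim about the space or time figures reported in the paper.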
