Journal: Aslib Proceedings

Parallel methods for the update of partitioned inverted files




Purpose - An issue that tends to be ignored in information retrieval is the issue of updating inverted files. This is largely because inverted files were devised to provide fast query service, and much work has been done with the emphasis strongly on queries. This paper aims to study the effect of using parallel methods for the update of inverted files in order to reduce costs, by looking at two types of partitioning for inverted files: document identifier and term identifier.

Design/methodology/approach - Raw update service and update with query service are studied with these partitioning schemes using an incremental update strategy. The paper uses standard measures used in parallel computing such as speedup to examine the computing results and also the costs of reorganising indexes while servicing transactions.

Findings - Empirical results show that for both transaction processing and index reorganisation the document identifier method is superior. However, there is evidence that the term identifier partitioning method could be useful in a concurrent transaction processing context.

Practical implications - There is an increasing need to service updates, which is now becoming a requirement of inverted files (for dynamic collections such as the web), demonstrating that a shift in requirements of inverted file maintenance is needed from the past.

Originality/value - The paper is of value to database administrators who manage large-scale and dynamic text collections, and who need to use parallel computing to implement their text retrieval services.
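To make the two partitioning schemes and the speedup measure concrete, here is a minimal sketch in Python. It is not the paper's implementation: the toy `postings` data, the modulo assignment rule, and the function names `partition_by_document`, `partition_by_term`, and `speedup` are all assumptions chosen for illustration.

```python
from collections import defaultdict


def partition_by_document(postings, num_partitions):
    """Document identifier partitioning (illustrative): each partition holds
    an inverted file covering only its own subset of documents."""
    parts = [defaultdict(list) for _ in range(num_partitions)]
    for term, doc_ids in postings.items():
        for doc_id in doc_ids:
            # Assumed assignment rule: document id modulo partition count.
            parts[doc_id % num_partitions][term].append(doc_id)
    return [dict(p) for p in parts]


def partition_by_term(postings, num_partitions):
    """Term identifier partitioning (illustrative): each partition holds the
    complete posting lists for its own subset of terms."""
    parts = [{} for _ in range(num_partitions)]
    # Assumed assignment rule: rank of the term in sorted order, modulo count.
    for term_id, term in enumerate(sorted(postings)):
        parts[term_id % num_partitions][term] = list(postings[term])
    return parts


def speedup(sequential_time, parallel_time):
    """Standard parallel-computing speedup: sequential time / parallel time."""
    return sequential_time / parallel_time


# Toy inverted file: term -> sorted list of document identifiers.
postings = {"index": [1, 2, 4], "parallel": [2, 3], "update": [1, 3, 4]}

by_doc = partition_by_document(postings, 2)
by_term = partition_by_term(postings, 2)
```

Under these assumptions, adding a new document touches a single partition in the document identifier scheme, whereas in the term identifier scheme it touches every partition that owns one of the document's terms.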


