Incorporating quality metrics in agent-based centralized/distributed information retrieval on the World Wide Web.


Abstract

With the development of the World Wide Web, almost anyone can publish documents that reach a global audience. As a result, the quality of available information varies widely. However, most information retrieval systems developed to help people search the World Wide Web rely primarily on relevance ranking algorithms based solely on term frequency statistics. Information quality is usually ignored, so low-quality documents are retrieved. In this research, I present an alternative approach that combines traditional relevance ranking based on term statistics with quality ranking to retrieve information in centralized and distributed environments. The results of a series of experiments are presented to examine how information quality affects system effectiveness.

Six quality metrics were investigated: currency, availability, information-to-noise ratio, authority, popularity, and cohesiveness. Generally, all the listed quality metrics, with the exception of site cohesiveness, were found to improve both centralized and distributed search. The improvement in search effectiveness from incorporating the currency, availability, information-to-noise ratio, and page cohesiveness metrics in centralized search is significant. The improvement from incorporating the availability, information-to-noise ratio, and popularity metrics in site selection is significant, as is the improvement from incorporating the popularity metric in information fusion. Site cohesiveness was not found to affect either site selection or information fusion, possibly because of inaccuracies in the method used to calculate it. A further investigation using a modified site cohesiveness formula showed that site cohesiveness improved search effectiveness when incorporated in site selection.

The authority metric was not found to have a significant impact in either centralized or distributed search, which may indicate that the authority ratings are not reliable. In summary, however, the results show that incorporating quality metrics can significantly improve search effectiveness in both centralized and distributed search.
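The abstract does not state how relevance ranking and quality ranking are combined. As a rough illustration only, the sketch below assumes a simple linear blend of a term-statistics relevance score with a weighted mean of per-document quality scores. The names `Document`, `combined_score`, `rank`, and `quality_weight`, the [0, 1] score ranges, and the blending formula itself are all hypothetical and are not taken from the dissertation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Document:
    doc_id: str
    # Relevance from term statistics, assumed normalized to [0, 1].
    relevance: float
    # Quality metric name -> score, each assumed normalized to [0, 1].
    quality: Dict[str, float] = field(default_factory=dict)


def combined_score(doc: Document, weights: Dict[str, float],
                   quality_weight: float = 0.5) -> float:
    """Blend relevance with a weighted mean of the quality metrics.

    Assumed formula (not from the source):
        (1 - quality_weight) * relevance + quality_weight * mean_quality
    Metrics missing from doc.quality count as 0.
    """
    total = sum(weights.values())
    if total == 0:
        # No quality metrics selected: fall back to relevance alone.
        return doc.relevance
    mean_quality = sum(
        w * doc.quality.get(name, 0.0) for name, w in weights.items()
    ) / total
    return (1 - quality_weight) * doc.relevance + quality_weight * mean_quality


def rank(docs: List[Document], weights: Dict[str, float],
         quality_weight: float = 0.5) -> List[Document]:
    """Return documents sorted by combined score, best first."""
    return sorted(
        docs,
        key=lambda d: combined_score(d, weights, quality_weight),
        reverse=True,
    )


# Illustrative data only: a highly relevant but stale, often unreachable page
# versus a somewhat less relevant page that is current and available.
docs = [
    Document("a", 0.9, {"currency": 0.1, "availability": 0.2}),
    Document("b", 0.7, {"currency": 0.9, "availability": 1.0}),
]
weights = {"currency": 1.0, "availability": 1.0}

by_relevance_only = [d.doc_id for d in rank(docs, weights, quality_weight=0.0)]
with_quality = [d.doc_id for d in rank(docs, weights, quality_weight=0.5)]
```

With `quality_weight=0.0` the ordering reduces to plain relevance ranking; raising it lets the quality scores reorder results. Using a single blending parameter keeps the trade-off between relevance and quality explicit, which is one plausible design among many.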

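Bibliographic record

  • Author

    Zhu, Xiaolan

  • Author affiliation

    University of Kansas

  • Degree-granting institution: University of Kansas
  • Discipline: Computer Science
  • Degree: Ph.D.
  • Year: 1999
  • Pages: 144 p.
  • Total pages: 144
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Automation technology, computer technology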
