Journal: Journal of the American Society for Information Science and Technology

Multilingual Web Retrieval: An Experiment in English-Chinese Business Intelligence

Abstract

As increasing numbers of non-English resources have become available on the Web, the interesting and important issue of how Web users can retrieve documents in different languages has arisen. Cross-language information retrieval (CLIR), the study of retrieving information in one language by queries expressed in another language, is a promising approach to the problem. Cross-language information retrieval has attracted much attention in recent years. Most research systems have achieved satisfactory performance on standard Text REtrieval Conference (TREC) collections such as news articles, but CLIR techniques have not been widely studied and evaluated for applications such as Web portals. In this article, the authors present their research in developing and evaluating a multilingual English-Chinese Web portal that incorporates various CLIR techniques for use in the business domain. A dictionary-based approach was adopted that combines phrasal translation, co-occurrence analysis, and pre- and posttranslation query expansion. The portal was evaluated by domain experts, using a set of queries in both English and Chinese. The experimental results showed that co-occurrence-based phrasal translation achieved a 74.6% improvement in precision over simple word-by-word translation. When used together, pre- and posttranslation query expansion improved the performance slightly, achieving a 78.0% improvement over the baseline word-by-word translation approach. In general, applying CLIR techniques in Web applications shows promise.
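
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea behind co-occurrence-based disambiguation in dictionary-based query translation. The toy dictionary, the co-occurrence counts, and the function names (`word_by_word`, `translate_query`) are all hypothetical illustrations, not the authors' system or data.

```python
from itertools import product

# Hypothetical toy English-to-Chinese dictionary (not from the article).
# Each source term maps to a list of candidate translations.
DICTIONARY = {
    "bank": ["河岸", "银行"],  # "riverbank", "financial bank"
    "loan": ["贷款"],
}

# Hypothetical counts of how often two target-language terms appear
# together in some target-language corpus (values are made up).
COOCCURRENCE = {
    frozenset(["银行", "贷款"]): 120,
    frozenset(["河岸", "贷款"]): 2,
}


def word_by_word(terms):
    """Baseline: take the first dictionary entry for each term."""
    return [DICTIONARY.get(term, [term])[0] for term in terms]


def cooccurrence_score(candidates):
    """Sum the pairwise co-occurrence counts of a candidate translation."""
    total = 0
    for i in range(len(candidates)):
        for j in range(i + 1, len(candidates)):
            pair = frozenset([candidates[i], candidates[j]])
            total += COOCCURRENCE.get(pair, 0)
    return total


def translate_query(terms):
    """Pick the combination of candidates that co-occurs most often."""
    options = [DICTIONARY.get(term, [term]) for term in terms]
    best = max(product(*options), key=cooccurrence_score)
    return list(best)


if __name__ == "__main__":
    query = ["bank", "loan"]
    print("word-by-word: ", word_by_word(query))
    print("co-occurrence:", translate_query(query))
```

In this toy setup the baseline picks the first listed sense of "bank" regardless of context, while the co-occurrence variant prefers the sense that appears most often alongside the translation of "loan". The exhaustive search over combinations is only practical for short queries; a real system would need a more scalable selection strategy.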