Cost-benefit approach to automatically composing answers to questions by extracting information from large unstructured corpora

Abstract

The present invention relates to a system and methodology to facilitate extraction of information from large unstructured corpora, such as the World Wide Web and/or other unstructured sources. Information in the form of answers to questions can be automatically composed from such sources via probabilistic models and cost-benefit analyses that guide the resource-intensive information-extraction procedures employed by a knowledge-based question answering system. The analyses can leverage predictions, provided by Bayesian or other statistical models, of the ultimate quality of the answers generated by the system. Such predictions, when coupled with a utility model, can give the system the ability to decide how many queries to issue to a search engine (or engines), given the cost of queries and the expected value of query results in refining an ultimate answer. Given a preference model, the information-extraction actions with the highest expected utility can be taken. In this manner, the accuracy of answers to questions can be balanced against the cost of the information extraction and analysis needed to compose them.
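The query-count decision described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the saturating accuracy curve stands in for the Bayesian quality predictor, and every name and number here (`toy_predicted_accuracy`, `value_of_correct_answer`, `cost_per_query`, the 0.7 ceiling) is a hypothetical chosen for illustration only.

```python
from typing import Callable


def expected_utility(
    n_queries: int,
    predicted_accuracy: Callable[[int], float],
    value_of_correct_answer: float,
    cost_per_query: float,
) -> float:
    """Expected benefit of a correct answer minus the total cost of the queries."""
    benefit = value_of_correct_answer * predicted_accuracy(n_queries)
    return benefit - cost_per_query * n_queries


def best_query_count(
    max_queries: int,
    predicted_accuracy: Callable[[int], float],
    value_of_correct_answer: float,
    cost_per_query: float,
) -> int:
    """Return the number of queries (0..max_queries) with the highest expected utility."""
    return max(
        range(max_queries + 1),
        key=lambda n: expected_utility(
            n, predicted_accuracy, value_of_correct_answer, cost_per_query
        ),
    )


def toy_predicted_accuracy(n_queries: int) -> float:
    """Hypothetical stand-in for a learned quality model.

    Accuracy rises with each query but saturates at 0.7 (assumed values).
    """
    return 0.7 * (1.0 - 0.5 ** n_queries)


if __name__ == "__main__":
    for cost in (1.0, 10.0):
        n = best_query_count(20, toy_predicted_accuracy, 100.0, cost)
        print(f"cost per query = {cost}: issue {n} queries")
```

Under these assumed numbers, raising the per-query cost lowers the number of queries worth issuing, which is the accuracy-versus-cost trade-off the abstract describes.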
