Humanitarian crisis analysis using secondary information gathered by a focused web crawler

Abstract

A network is crawled using a trained learning model to identify a set of secondary-source documents related to an event. A hub page from the set of secondary-source documents is identified that includes a link predicted to link to a new relevant secondary-source document. The new document is added to the set of secondary-source documents. Information is extracted from the set of secondary-source documents. Feedback is received indicative of a relevancy level for the extracted information as applied to the event. Each document is classified into one or more categories related to the event, based on the extracted information and the received feedback information. A learning model is trained based on the received feedback.
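The crawl-and-feedback loop described above can be sketched roughly as follows. This is only an illustrative toy, not the patented method: the in-memory `WEB` graph, the `KeywordModel` class, the `crawl` function, and the `threshold` and `lr` parameters are all hypothetical names introduced here. A per-word weight table stands in for the trained learning model, and any collected page with outgoing links is treated as a candidate hub whose links are worth following. The classification step is not modeled.

```python
from collections import deque

# Hypothetical in-memory "web": url -> (page text, outgoing links).
# These pages are invented for illustration only.
WEB = {
    "seed": ("flood relief report", ["hub", "sports"]),
    "hub": ("flood situation links", ["ngo", "news"]),
    "ngo": ("flood shelter needs assessment", []),
    "news": ("celebrity gossip", []),
    "sports": ("football scores", ["news"]),
}


class KeywordModel:
    """Toy stand-in for the learning model: one weight per word."""

    def __init__(self, weights):
        self.weights = dict(weights)

    def score(self, text):
        # Average word weight; unknown words count as 0.
        words = text.split()
        return sum(self.weights.get(w, 0.0) for w in words) / max(len(words), 1)

    def train(self, text, relevant, lr=0.5):
        # Feedback nudges the weight of each distinct word up or down.
        delta = lr if relevant else -lr
        for w in set(text.split()):
            self.weights[w] = self.weights.get(w, 0.0) + delta


def crawl(web, seeds, model, threshold=0.2):
    """Breadth-first crawl that keeps pages scoring at or above the
    threshold and follows links only from pages it keeps."""
    collected, seen, queue = [], set(seeds), deque(seeds)
    while queue:
        url = queue.popleft()
        text, links = web[url]
        if model.score(text) < threshold:
            continue  # irrelevant page: do not collect, do not expand
        collected.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return collected


model = KeywordModel({"flood": 1.0})
print(crawl(WEB, ["seed"], model))  # ['seed', 'hub', 'ngo']

# Simulated analyst feedback: the "ngo" page was judged relevant.
model.train(WEB["ngo"][0], relevant=True)
```

After the feedback call, words such as "shelter" carry positive weight, so a later crawl would also admit pages that mention shelter needs without mentioning floods. A real system would presumably use a proper text classifier and score individual links rather than whole pages.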
