Conference: European Conference on Artificial Intelligence

From bursty patterns to bursty facts: The effectiveness of temporal text mining for news



Abstract

Many document collections are by nature dynamic, evolving as the topics or events they describe change. The goal of temporal text mining is to discover bursty patterns and to identify and highlight these changes so that readers can better track stories. Here, we focus on the news domain, where the changes revolve around novel, previously unpublished "facts" that affect how a story develops. However, despite intense research activity on bursty patterns, the lack of common procedures makes it impossible to compare methods in a principled way. To close this gap, we (a) investigate how different temporal text mining methods discover novel facts and (b) present an evaluation framework for method assessment, consisting of a set of procedures and metrics for cross-evaluating models. Bursty patterns are transformed into queries for sentence retrieval, either with or without taking into account internal pattern structure, and the retrieved sentences are compared with a set of editor-selected ground-truth reference sentences. Our experiments on different classes of temporal text mining methods show that the methods perform at similar levels overall but offer distinctive advantages in some settings. The experiments also demonstrate the benefits of using patterns' internal structure for query generation.
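The evaluation idea in the abstract (turn a bursty pattern into a query, retrieve sentences, compare them with reference sentences) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the term-overlap scoring, the use of per-term weights as a stand-in for "internal pattern structure", the exact-match comparison against reference sentences, and all function names.

```python
import re


def tokenize(text):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def pattern_to_query(pattern, use_structure=False):
    """Turn a bursty pattern into a weighted term query.

    `pattern` is a dict mapping term -> weight (hypothetical format).
    Without structure, every term counts equally; with structure,
    the weights are kept as a crude proxy for internal structure.
    """
    if use_structure:
        return {term.lower(): float(weight) for term, weight in pattern.items()}
    return {term.lower(): 1.0 for term in pattern}


def score(query, sentence):
    """Sum the weights of the query terms that occur in the sentence."""
    tokens = set(tokenize(sentence))
    return sum(weight for term, weight in query.items() if term in tokens)


def retrieve(query, sentences, k):
    """Return up to k sentences with a positive score, best first."""
    ranked = sorted(sentences, key=lambda s: score(query, s), reverse=True)
    return [s for s in ranked[:k] if score(query, s) > 0]


def precision_recall(retrieved, reference):
    """Compare retrieved sentences with reference sentences by exact match."""
    hits = len(set(retrieved) & set(reference))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall


if __name__ == "__main__":
    sentences = [
        "The minister resigned on Monday after the report was published.",
        "Weather was mild across the region.",
        "Opposition leaders demanded that the minister resign.",
    ]
    reference = [sentences[0]]  # stand-in for editor-selected sentences
    pattern = {"minister": 0.5, "resigned": 2.0, "report": 1.0}

    for use_structure in (False, True):
        query = pattern_to_query(pattern, use_structure=use_structure)
        retrieved = retrieve(query, sentences, k=2)
        print(use_structure, precision_recall(retrieved, reference))
```

Running the same patterns through both query variants and comparing the resulting scores mirrors, in a very simplified way, the with-structure versus without-structure comparison the abstract describes. A realistic setup would likely need a softer sentence-matching criterion than exact string equality.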
