Conference: IEEE International Congress on Big Data

Pairwise Topic Model via relation extraction


Abstract

Topic modeling is a powerful tool for uncovering the underlying topics of documents. However, the unstructured nature of raw text makes it hard to model the semantic relationships between text units (words, phrases, or sentences), and thus even harder to model their corresponding underlying topics. In this work, we examine the pairwise relationships between underlying topics through relation extraction. We first extract from the raw text the entity pairs that occur within a single relation tuple. We then model the relationship between the entities in each pair by adding dependencies between the entities and their corresponding topics. We propose six versions of the Pairwise Topic Model (PTM) that simultaneously discover the latent topics and their pairwise relationships. Experiments on four data sets (AP news articles, DUC 2004 Task 2, Clinical Notes, and Neuroscience Papers) show that the PTMs are better-structured language models than the traditional topic model Latent Dirichlet Allocation (LDA). Empirical results also show that the proposed PTMs can explicitly explain how two topics are related.
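The abstract does not spell out the six PTM variants, so the following is only a toy sketch of the general idea of tying together the topics of two entities in a relation tuple. It assumes, purely for illustration, that the first entity's topic is drawn from a document-level topic distribution and that the second entity's topic is drawn conditioned on the first. The function name `sample_document` and the parameters `theta`, `pair_trans`, and `phi` are hypothetical and are not taken from the paper.

```python
import random


def sample_document(num_pairs, theta, pair_trans, phi, rng):
    """Sample (z1, w1, z2, w2) tuples for one toy document.

    theta      : document-level topic probabilities, length K (assumed)
    pair_trans : K x K table; row z1 is the distribution of z2 given z1
                 (an assumed form of topic dependency, not the paper's)
    phi        : list of K dicts mapping word -> probability under that topic
    rng        : a random.Random instance, so runs are reproducible
    """
    topics = list(range(len(theta)))
    pairs = []
    for _ in range(num_pairs):
        # Topic of the first entity comes from the document's topic mixture.
        z1 = rng.choices(topics, weights=theta)[0]
        # Topic of the second entity depends on the first entity's topic.
        z2 = rng.choices(topics, weights=pair_trans[z1])[0]
        # Each entity word is drawn from its own topic's word distribution.
        w1 = rng.choices(list(phi[z1]), weights=list(phi[z1].values()))[0]
        w2 = rng.choices(list(phi[z2]), weights=list(phi[z2].values()))[0]
        pairs.append((z1, w1, z2, w2))
    return pairs
```

In a sketch like this, a row of `pair_trans` with most of its mass on one column would be read as "topic A tends to be paired with topic B", which is the kind of pairwise topic relationship the abstract describes. Inference, which is the substantive part of any such model, is not shown here.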
