Sentence Clustering using PageRank Topic Model


Abstract

Clusters of review sentences, grouped by the viewpoints from which products are evaluated, can be put to various uses. Topic models such as Unigram Mixture (UM) can be used for this task, but they have two problems. First, topic models depend on randomly initialized parameters, so their results are not consistent across runs. Second, the number of topics has to be set in advance as a parameter. To solve these problems, we introduce the PageRank Topic Model (PRTM), which approximately estimates multinomial distributions over topics and over words in a vocabulary by applying network structure analysis methods to Word Co-occurrence Graphs. In PRTM, an appropriate number of topics is estimated from a Word Co-occurrence Graph using the Newman method. PRTM also achieves consistent results, because the multinomial distributions over words in a vocabulary are estimated using PageRank and the multinomial distribution over topics is estimated by solving a convex quadratic programming problem. Using two review datasets, one about hotels and one about cars, we show that PRTM achieves consistent results in sentence clustering and estimates an appropriate number of topics for extracting the viewpoints from which products are evaluated.
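As a rough illustration of the graph-based idea in the abstract, the following minimal sketch builds a word co-occurrence graph from a few sentences and scores words with PageRank by power iteration. It is not the authors' implementation: the abstract does not give PRTM's exact formulation, so the sentence-level co-occurrence weighting, the function names `cooccurrence_graph` and `pagerank`, the damping factor, and the toy sentences are all assumptions made here. The Newman step for estimating the number of topics and the quadratic program for the topic distribution are omitted.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(sentences):
    """Build an undirected weighted graph (hypothetical helper).

    weight[u][v] counts the sentences in which words u and v both appear.
    """
    weight = defaultdict(lambda: defaultdict(float))
    for sentence in sentences:
        words = sorted(set(sentence.lower().split()))
        for u, v in combinations(words, 2):
            weight[u][v] += 1.0
            weight[v][u] += 1.0
    return weight


def pagerank(weight, damping=0.85, iterations=100):
    """Weighted PageRank by power iteration; scores sum to 1.

    The start vector is uniform rather than random, so repeated runs
    on the same graph return the same scores.
    """
    nodes = sorted(weight)
    n = len(nodes)
    if n == 0:
        return {}
    rank = {u: 1.0 / n for u in nodes}
    # Every node has at least one neighbour, so out_strength is never zero.
    out_strength = {u: sum(weight[u].values()) for u in nodes}
    for _ in range(iterations):
        new_rank = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            for v, w in weight[u].items():
                new_rank[v] += damping * rank[u] * w / out_strength[u]
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Toy review sentences, invented for illustration only.
    reviews = [
        "the room was clean",
        "the room was quiet",
        "the engine was loud",
    ]
    scores = pagerank(cooccurrence_graph(reviews))
    for word, score in sorted(scores.items(), key=lambda item: -item[1]):
        print(f"{word}\t{score:.4f}")
```

Because the scores are non-negative and sum to 1, they can be read as a distribution over the vocabulary, which is the sense in which the abstract speaks of estimating multinomial distributions over words with PageRank. The fixed uniform start is what makes this sketch repeatable from run to run.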

Bibliographic record

Source
Pacific Asia Conference on Language, Information and Computation | 2016 | x 553 p. | 9 pages
Authors
Kenshin Ikegami; Yukio Ohsawa
Original format
PDF
Chinese Library Classification
Computer networks
