Clusters of review sentences, grouped by the viewpoints from which products are evaluated, have a variety of applications. Topic models, such as the Unigram Mixture (UM), can be used for this task. However, they have two problems. First, topic models depend on randomly initialized parameters, so their results are not consistent across runs. Second, the number of topics has to be set in advance as a parameter. To solve these problems, we introduce the PageRank Topic Model (PRTM), which approximately estimates multinomial distributions over topics and over words in a vocabulary by applying network structure analysis methods to Word Co-occurrence Graphs. In PRTM, an appropriate number of topics is estimated from a Word Co-occurrence Graph using the Newman method. PRTM also achieves consistent results, because the multinomial distributions over words in a vocabulary are estimated using PageRank and the multinomial distribution over topics is estimated by solving a convex quadratic programming problem. Using two review datasets about hotels and cars, we show that PRTM achieves consistent results in sentence clustering and estimates an appropriate number of topics for extracting the viewpoints from product evaluations.
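As a rough illustration of the PageRank step only, the following Python sketch builds a word co-occurrence graph from a few review sentences and scores the words by power iteration. It is not the authors' implementation. The function names `cooccurrence_graph` and `pagerank`, the whitespace tokenization, the sentence-level co-occurrence window, and the damping factor of 0.85 are all assumptions made for the example. The Newman community detection and the quadratic programming step are omitted.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(sentences):
    """Build an undirected weighted graph: words are nodes, and an edge
    weight counts how many sentences contain both words (assumed window)."""
    graph = defaultdict(lambda: defaultdict(float))
    for sentence in sentences:
        words = sorted(set(sentence.lower().split()))
        for u, v in combinations(words, 2):
            graph[u][v] += 1.0
            graph[v][u] += 1.0
    return graph


def pagerank(graph, damping=0.85, iterations=100):
    """Weighted PageRank by power iteration from a uniform start.
    The returned scores sum to 1, so they can be read as a
    distribution over words. damping=0.85 is an assumed default."""
    nodes = sorted(graph)
    n = len(nodes)
    rank = {w: 1.0 / n for w in nodes}
    # Every node has at least one edge, so out_weight is never zero.
    out_weight = {w: sum(graph[w].values()) for w in nodes}
    for _ in range(iterations):
        new_rank = {w: (1.0 - damping) / n for w in nodes}
        for u in nodes:
            for v, weight in graph[u].items():
                new_rank[v] += damping * rank[u] * weight / out_weight[u]
        rank = new_rank
    return rank


# Toy review sentences, invented for this example.
reviews = [
    "the room was clean",
    "the room was quiet",
    "the engine was loud",
]
scores = pagerank(cooccurrence_graph(reviews))
for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{word}\t{score:.3f}")
```

Because the iteration starts from a uniform vector rather than random values, repeated runs on the same sentences give identical scores, which is the kind of consistency the abstract attributes to the PageRank-based estimation.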