Agreeing to Disagree: Choosing Among Eight Topic-Modeling Methods

Fu Qiang; Zhuang Yufan; Gu Jiaxin; Zhu Yushu; Guo Xin

Abstract

Topic modeling is a key research area in natural language processing and has inspired innovative studies in a wide array of social-science disciplines. Yet, the use of topic modeling in computational social science has been hampered by two critical issues. First, social scientists tend to focus on a few standard ways of topic modeling. Our understanding of semantic patterns has not been informed by rapid methodological advances in topic modeling. Moreover, a systematic comparison of the performance of different methods in this field is warranted. Second, the choice of the optimal number of topics remains a challenging task. A comparison of topic-modeling techniques has rarely been situated in a social-science context and the choice appears to be arbitrary for most social scientists. Based on about 120,000 Canadian newspaper articles since 1977, we review and compare eight traditional, generative, and neural methods for topic modeling (Latent Semantic Analysis, Principal Component Analysis, Factor Analysis, Non-negative Matrix Factorization, Latent Dirichlet Allocation, Neural Autoregressive Topic Model, Neural Variational Document Model, and Hierarchical Dirichlet Process). Three measures (coherence statistics, held-out likelihood, and graph-based dimensionality selection) are then used to assess the performance of these methods. Findings are presented and discussed to guide the choice of topic-modeling methods, especially in social science research. (C) 2020 Elsevier Inc. All rights reserved.
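As an illustration of the first of the three measures, the sketch below computes one common coherence statistic, the UMass score, for a topic's top words. The abstract does not say which coherence statistic the authors used, so the choice of UMass, the function name `umass_coherence`, and the toy corpus are all assumptions made for illustration, not the paper's procedure.

```python
import math


def umass_coherence(top_words, documents):
    """UMass coherence of one topic (assumed variant, for illustration only).

    top_words: the topic's words, ordered from most to least probable.
    documents: an iterable of tokenized documents (lists of strings).
    Higher (less negative) scores mean the top words co-occur more often.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every word in `words`.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            d_l = doc_freq(top_words[l])
            if d_l == 0:
                continue  # a word absent from the corpus contributes nothing
            co = doc_freq(top_words[m], top_words[l])
            score += math.log((co + 1) / d_l)  # +1 smoothing avoids log(0)
    return score


# Hypothetical toy corpus standing in for tokenized newspaper articles.
docs = [
    ["election", "vote", "party"],
    ["election", "vote", "minister"],
    ["hockey", "game", "goal"],
    ["hockey", "game", "vote"],
]

coherent_topic = umass_coherence(["election", "vote"], docs)
mixed_topic = umass_coherence(["election", "hockey"], docs)
print(coherent_topic > mixed_topic)
```

In practice, a score like this would be averaged over all topics of a fitted model and compared across candidate numbers of topics, alongside held-out likelihood.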

Bibliographic record

Source: Big Data Research, 2021, Issue 1, 9 pages
Authors: Fu Qiang; Zhuang Yufan; Gu Jiaxin; Zhu Yushu; Guo Xin
Affiliations:
Univ British Columbia, Dept Sociol, Vancouver, BC, Canada;
IBM Res, Yorktown Hts, NY, USA;
Univ British Columbia, Dept Sociol, Vancouver, BC, Canada;
Simon Fraser Univ, Urban Studies Program, Vancouver, BC, Canada;
Hong Kong Polytech Univ, Dept Appl Math, Hong Kong, Peoples R China

Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords: Topic modeling; Natural language processing; Computational social science; Optimal number of topics

