iLDA: An interactive latent Dirichlet allocation model to improve topic quality

Yezheng Liu; Fei Du; Jianshan Sun; Yuanchun Jiang

Abstract

User-generated content has become an increasingly important data source for analysing user interests in both industry and academic research. Since the basic latent Dirichlet allocation (LDA) model was proposed, many LDA variants have been developed to learn knowledge from unstructured user-generated content. An intractable limitation of LDA and its variants is that they may generate low-quality topics whose meanings are confusing. To handle this problem, this article proposes an interactive strategy that generates high-quality topics with clear meanings by integrating subjective knowledge derived from human experts with objective knowledge learned by LDA. The proposed interactive latent Dirichlet allocation (iLDA) model develops deterministic and stochastic approaches to obtain a subjective topic-word distribution from human experts, combines the subjective and objective topic-word distributions by a linear weighted-sum method, and provides the inference process to draw topics and words from the resulting comprehensive topic-word distribution. The proposed model is a significant effort to integrate human knowledge with LDA-based models through an interactive strategy. Experiments on two real-world corpora show that the proposed iLDA model can draw high-quality topics with the assistance of subjective knowledge from human experts. It is robust under various conditions and offers fundamental support for applications of LDA-based topic modelling.
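
The linear weighted-sum step mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formula, so everything here is an assumption for illustration: the function name `combine_topic_word`, the single scalar `weight` in [0, 1] applied to the subjective distribution, and the toy numbers. It shows only that a convex combination of two topic-word distributions is again a valid distribution per topic; it is not the authors' inference procedure.

```python
def combine_topic_word(objective, subjective, weight):
    """Convex combination of two topic-word distributions.

    objective, subjective: lists of rows, one row per topic; each row is a
    probability distribution over the same vocabulary.
    weight: share given to the subjective (expert) distribution.
    The parameter name and its [0, 1] range are assumptions of this sketch.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(objective) != len(subjective):
        raise ValueError("both inputs must have the same number of topics")
    combined = []
    for obj_row, subj_row in zip(objective, subjective):
        if len(obj_row) != len(subj_row):
            raise ValueError("rows must cover the same vocabulary")
        # Word-by-word weighted sum of expert and learned probabilities.
        combined.append([
            weight * s + (1.0 - weight) * o
            for o, s in zip(obj_row, subj_row)
        ])
    return combined


# Toy data (made up): 2 topics over a 3-word vocabulary.
phi_objective = [[0.5, 0.3, 0.2], [0.1, 0.1, 0.8]]
phi_subjective = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]]
phi_combined = combine_topic_word(phi_objective, phi_subjective, weight=0.5)
```

Because each input row sums to 1 and the weights `weight` and `1 - weight` sum to 1, every combined row also sums to 1, so no renormalisation is needed in this sketch.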

Bibliographic record

Source: Journal of Information Science, 2020, Issue 1, pp. 23-40 (18 pages)
Authors: Yezheng Liu; Fei Du; Jianshan Sun; Yuanchun Jiang
Author affiliations: School of Management, Hefei University of Technology, China; Key Laboratory of Process Optimization and Intelligent Decision-Making, Ministry of Education, China
Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Language of text: English
Keywords: Expert knowledge; interactive strategy; latent Dirichlet allocation; subjective topic-word distribution

Similar literature

1. Latent Dirichlet Allocation (LDA) for improving the topic modeling of the official bulletin of the Spanish state (BOE) [J]. J.C. Bailón-Elvira, M.J. Cobo, E. Herrera-Viedma. Procedia Computer Science, 2019, Issue 1.
2. Analysis of Health Research Topics in Indonesia Using the LDA (Latent Dirichlet Allocation) Topic Modeling Method [J]. Yoga Sahria, Dhomas Hatta Fudholi. Jurnal RESTI: Rekayasa Sistem dan Teknologi Informasi, 2020, Issue 2.
3. A comparative analysis of Latent Semantic analysis and Latent Dirichlet allocation topic modeling methods using Bible data [J]. Vasantha Kumari Garbhapu, Prajna Bodapati. Indian Journal of Science and Technology, 2020, Issue 44.
4. An Improved Latent Dirichlet Allocation Model for Hot Topic Extraction [C]. Guolong Liu, Xiaofei Xu, Ying Zhu. 2014 IEEE Fourth International Conference on Big Data and Cloud Computing, 2014.
5. Performance of Latent Dirichlet Allocation with Different Topic and Document Structures [D]. Feng, Haotian. 2019.
6. Public discourse and sentiment during the COVID 19 pandemic: Using Latent Dirichlet Allocation for topic modeling on Twitter [O]. Jia Xue, Junxiang Chen, Chen Chen. 2020.
7. Latent Dirichlet Allocation (LDA) for improving the topic modeling of the official bulletin of the Spanish state (BOE) [O]. J.C. Bailón-Elvira, M.J. Cobo, E. Herrera-Viedma. 2019.
