Journal of Information Science

iLDA: An interactive latent Dirichlet allocation model to improve topic quality



Abstract

User-generated content has become an increasingly important data source for analysing user interests in both industry and academic research. Since the basic latent Dirichlet allocation (LDA) model was proposed, many LDA variants have been developed to learn knowledge from unstructured user-generated content. An intractable limitation of LDA and its variants is that they may generate low-quality topics whose meanings are confusing. To handle this problem, this article proposes an interactive strategy that generates high-quality topics with clear meanings by integrating subjective knowledge derived from human experts with objective knowledge learned by LDA. The proposed interactive latent Dirichlet allocation (iLDA) model develops deterministic and stochastic approaches to obtain a subjective topic-word distribution from human experts, combines the subjective and objective topic-word distributions by a linear weighted-sum method, and provides the inference process to draw topics and words from the resulting comprehensive topic-word distribution. The proposed model is a significant effort to integrate human knowledge with LDA-based models through an interactive strategy. Experiments on two real-world corpora show that the proposed iLDA model can draw high-quality topics with the assistance of subjective knowledge from human experts. It is robust under various conditions and offers fundamental support for applications of LDA-based topic modelling.
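
The abstract mentions a linear weighted-sum combination of the subjective and objective topic-word distributions but gives no formula. The Python sketch below shows one plausible reading of that step, not the authors' implementation. The function name `combine_topic_word`, the parameter `weight`, and the final renormalisation are all assumptions made for illustration.

```python
from typing import List

Matrix = List[List[float]]  # one row per topic, one column per vocabulary word


def combine_topic_word(objective: Matrix, subjective: Matrix, weight: float) -> Matrix:
    """Blend two topic-word matrices with a linear weighted sum.

    `weight` (a hypothetical name) is the share given to the subjective,
    expert-derived distribution; the objective, LDA-learned distribution
    receives 1 - weight. Each row is renormalised to guard against rounding.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(objective) != len(subjective):
        raise ValueError("both matrices need the same number of topics")

    combined: Matrix = []
    for obj_row, subj_row in zip(objective, subjective):
        if len(obj_row) != len(subj_row):
            raise ValueError("both matrices need the same vocabulary size")
        row = [weight * s + (1.0 - weight) * o for o, s in zip(obj_row, subj_row)]
        total = sum(row)
        if total <= 0.0:
            raise ValueError("a combined row must have positive mass")
        combined.append([p / total for p in row])
    return combined


# Toy example: one topic over a three-word vocabulary (made-up numbers).
objective = [[0.5, 0.3, 0.2]]   # stands in for what LDA learned
subjective = [[0.1, 0.1, 0.8]]  # stands in for an expert's judgement
blended = combine_topic_word(objective, subjective, weight=0.5)
print([round(p, 3) for p in blended[0]])
```

With `weight=0.0` the function returns the objective distribution unchanged, and with `weight=1.0` it returns the subjective one, so the parameter controls how strongly expert knowledge overrides what the model learned from the corpus.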
