
Bayesian feature selection for sparse topic model



Abstract

This paper presents a new Bayesian sparse learning approach that selects salient lexical features and builds a sparse topic model (sTM). Bayesian learning is performed by incorporating spike-and-slab priors, so that words with spiky distributions are filtered out and words with slab distributions are selected as features for estimating the topic model (TM) based on latent Dirichlet allocation. A variational inference procedure is developed to train the sTM parameters. In document-modeling experiments comparing TM and sTM, the proposed sTM not only reduces model perplexity but also reduces memory and computation costs. The Bayesian feature selection method effectively identifies the representative topic words for building a sparse learning model.
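To make the spike-and-slab idea concrete, here is a minimal Python sketch of a generic spike-and-slab prior over one topic's word weights. It is not the paper's model or its variational inference: the function name `sample_sparse_topic`, the Bernoulli selection probability `pi`, and the symmetric Dirichlet concentration `beta` are all assumptions chosen for illustration. Words whose indicator falls on the "spike" get zero weight (filtered out), while words on the "slab" share a Dirichlet-distributed weight vector.

```python
import random


def sample_sparse_topic(vocab, pi=0.3, beta=1.0, seed=0):
    """Draw one topic's word distribution under a generic spike-and-slab prior.

    Illustrative assumptions (not taken from the paper):
      - each word has an independent Bernoulli(pi) selection indicator;
      - indicator 0 is the "spike": the word's weight is fixed at zero;
      - indicator 1 is the "slab": the word's weight comes from a symmetric
        Dirichlet(beta) over the selected words only.
    """
    rng = random.Random(seed)

    # Spike-or-slab decision for every word in the vocabulary.
    selected = [w for w in vocab if rng.random() < pi]
    if not selected:
        # Keep at least one word so the topic is a valid distribution.
        selected = [rng.choice(vocab)]

    # A Dirichlet draw is a set of independent Gamma draws, normalized.
    gammas = [rng.gammavariate(beta, 1.0) for _ in selected]
    total = sum(gammas)

    topic = {w: 0.0 for w in vocab}  # filtered words stay at zero weight
    for w, g in zip(selected, gammas):
        topic[w] = g / total
    return topic


if __name__ == "__main__":
    vocab = ["model", "topic", "the", "of", "sparse", "prior", "and", "word"]
    topic = sample_sparse_topic(vocab)
    kept = [w for w, p in topic.items() if p > 0.0]
    print("selected words:", kept)
    print("weights sum to:", sum(topic.values()))
```

Because filtered words carry exactly zero weight, only the selected entries need to be stored and updated, which is one plausible way a sparse model can save memory and computation.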
