Conference: 2008 Beijing-Tianjin Young Scholars Symposium on Probability and Statistics (2008年京津地区青年概率统计研讨会)

Themes Discovery with a Generalized Dictionary Model



Abstract

Discovering patterns and functional modules in observed data sets is one of the most important problems in data mining and bioinformatics. In this paper, we propose an approach for discovering patterns and functional modules in categorical and text data sets. The potential patterns hidden in the data are regarded as themes. Using a probabilistic model, we build a dictionary of these themes and then identify the themes based on the likelihood. To evaluate the approach, we run simulations and then apply it to traditional Chinese medicine, Chinese text mining, and genome data. Compared with other approaches, the proposed approach has two advantages: it can find smaller and weaker modules, which may overlap heavily, with a very low false positive rate, and it can represent more complex relationships among variables.
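The abstract does not spell out the model, so the following is only a toy sketch of what "a dictionary of themes scored by likelihood" could look like, not the authors' formulation. It assumes (hypothetically) that each theme is a set of items with its own inclusion probability, that themes are included independently, and that an observed record is the union of the included themes. The names `record_likelihood` and `log_likelihood` and the example dictionaries are illustrative only.

```python
from itertools import combinations
from math import log


def record_likelihood(record, dictionary):
    """Probability of observing `record` under the assumed toy model.

    `dictionary` is a list of (theme, prob) pairs, where `theme` is a
    frozenset of items and `prob` is its inclusion probability.
    Assumption (not from the paper): themes are included independently
    and the record is the union of all included themes.
    """
    record = frozenset(record)
    # Only themes fully contained in the record can have been included.
    usable = [(t, p) for t, p in dictionary if t <= record]
    # Every other theme must have been left out.
    excluded = 1.0
    for t, p in dictionary:
        if not t <= record:
            excluded *= 1.0 - p

    total = 0.0
    # Brute-force enumeration of theme subsets; fine for tiny dictionaries.
    for r in range(len(usable) + 1):
        for chosen in combinations(range(len(usable)), r):
            covered = frozenset().union(*(usable[i][0] for i in chosen))
            if covered != record:
                continue
            prob = excluded
            for i, (_, p) in enumerate(usable):
                prob *= p if i in chosen else 1.0 - p
            total += prob
    return total


def log_likelihood(records, dictionary):
    """Sum of log-probabilities of the records under one dictionary."""
    return sum(log(record_likelihood(rec, dictionary)) for rec in records)


# Two candidate dictionaries: single items only, or with a joint theme {a, b}.
singles = [(frozenset("a"), 0.5), (frozenset("b"), 0.5)]
with_pair = singles + [(frozenset("ab"), 0.2)]

# Data in which "a" and "b" tend to occur together.
data = [{"a", "b"}, {"a", "b"}, {"a", "b"}, {"a"}, set()]

print(log_likelihood(data, singles))
print(log_likelihood(data, with_pair))
```

On this toy data the dictionary containing the joint theme scores a higher log-likelihood, which is the kind of comparison a likelihood-based search over candidate themes would rely on. The exhaustive enumeration is exponential in the number of usable themes and is meant only to make the scoring idea concrete.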
