Journal: BMC Bioinformatics

Hidden Markov Model Variants and their Application


Abstract

Markov statistical methods may make it possible to develop an unsupervised learning process that automatically and comprehensively identifies genomic structure in prokaryotes. The approach is based on mutual information, probabilistic measures, hidden Markov models (HMMs), and other purely statistical inputs, and it also provides a uniquely common ground for comparative prokaryotic genomics. By its nature the approach is an ongoing effort: it is a multi-pass learning process in which each round is more informed than the last, which allows a shift at each iteration to the more powerful methods available for supervised learning. It is envisaged that this "bootstrap" learning process will also be useful as a knowledge discovery tool. For such an ab initio prokaryotic gene-finder to work, however, it needs a mechanism to identify critical motif structure, such as the motifs around the start of coding or the start of transcription (and then, hopefully, more).

For eukaryotes, even with better start-of-coding identification, parsing of coding regions by the HMM is still limited by the HMM's single-gene assumption, as evidenced by its poor performance in alternatively spliced regions. To address these complications, an approach is described that expands the states in a eukaryotic gene-predictor HMM so that it operates with two layers of DNA parsing. This extension from the single-layer gene-prediction parse is indicated after preliminary analysis of the C. elegans alt-splice statistics. State profiles make use of a novel hash-interpolating Markov model (hIMM) method. A new implementation of an HMM-with-Duration is also described, with far-reaching application to gene-structure identification and to the analysis of channel current blockade data.
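To make the mutual-information input mentioned above more concrete, here is a minimal sketch of one common way to measure statistical dependence between bases a fixed distance apart in a DNA sequence. It is an illustration only and not the authors' implementation: the function name `mutual_information`, the `gap` parameter, and the plug-in (empirical frequency) estimator are all assumptions made for this example.

```python
import math
from collections import Counter


def mutual_information(seq: str, gap: int) -> float:
    """Empirical mutual information (in bits) between bases `gap` positions apart.

    Hypothetical helper for illustration; uses raw frequency counts with no
    smoothing or finite-sample correction.
    """
    # Collect all (base at i, base at i + gap) pairs in the sequence.
    pairs = [(seq[i], seq[i + gap]) for i in range(len(seq) - gap)]
    n = len(pairs)
    if n == 0:
        return 0.0

    joint = Counter(pairs)                 # counts of (a, b) pairs
    left = Counter(a for a, _ in pairs)    # marginal counts of the first base
    right = Counter(b for _, b in pairs)   # marginal counts of the second base

    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) simplifies to count * n / (left[a] * right[b]).
        mi += p_ab * math.log2(count * n / (left[a] * right[b]))
    return mi


if __name__ == "__main__":
    periodic = "ACGT" * 50
    for gap in (1, 2, 3, 4):
        print(gap, round(mutual_information(periodic, gap), 3))
```

Scanning such a statistic over a range of gaps is one way to expose periodic or otherwise structured regions without labeled training data; the sequence and gap values in the usage loop are arbitrary placeholders.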
