Conference: Annual International Conference of the IEEE Engineering in Medicine and Biology Society

An improved classification scheme for chromosomes with missing data

Abstract

Karyotyping, or the automatic classification of human chromosomes, is mostly based on analysis of the chromosome-specific banding pattern. Unfortunately, the most informative phases of the cell division cycle feature long chromosomes that easily overlap: the banding pattern information in the overlapping regions is corrupted, resulting in a drastic increase in classification error. Assuming a probabilistic classifier is available, improving the classification of chromosomes with corrupted data would require additionally estimating the joint probability density of the observed and missing data for each chromosome class. Given the number of classes, the possible positions and extents of the corrupted data within a chromosome, and the dimensionality of the feature space, a reliable estimate would require an impractically large number of training samples. We chose to circumvent the estimation problem by developing a statistical generative model of the pattern of each class, so that the corrupted part can be replaced with a partial pattern synthetically generated from the model. This makes it possible to obtain a Monte Carlo estimate of the maximum a posteriori probability for the class given the observation and the missing data, which reduces to a simple voting scheme if all classes have equal prior probability. Moreover, this Monte Carlo classification outperforms a voting scheme based on simply imputing the class mean for the missing data.
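
The voting scheme described in the abstract can be illustrated with a minimal sketch. It is one plausible reading of the idea, not the authors' implementation: the abstract does not specify the generative model or the classifier, so the sketch assumes independent Gaussians per profile position for both. The names `models`, `classify`, and `monte_carlo_vote`, and the toy numbers, are hypothetical.

```python
import math
import random

def log_likelihood(x, means, stds):
    """Log density of x under independent Gaussians (assumed model)."""
    return sum(
        -0.5 * ((xi - m) / s) ** 2 - math.log(s * math.sqrt(2 * math.pi))
        for xi, m, s in zip(x, means, stds)
    )

def classify(x, models):
    """Stand-in probabilistic classifier: most likely class, equal priors."""
    return max(models, key=lambda c: log_likelihood(x, *models[c]))

def monte_carlo_vote(observed, models, draws_per_class=200, rng=None):
    """Classify a profile whose corrupted positions are marked None.

    For every class, the corrupted positions are filled with values drawn
    from that class's generative model, the completed profile is
    classified, and the predicted label receives one vote.
    """
    rng = rng or random.Random(0)
    votes = {c: 0 for c in models}
    for means, stds in models.values():
        for _ in range(draws_per_class):
            completed = [
                rng.gauss(m, s) if v is None else v
                for v, m, s in zip(observed, means, stds)
            ]
            votes[classify(completed, models)] += 1
    return max(votes, key=votes.get), votes

# Toy models (hypothetical): class -> (per-position means, std deviations).
models = {
    "A": ([0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]),
    "B": ([3.0, 3.0, 3.0, 3.0], [1.0, 1.0, 1.0, 1.0]),
}
observed = [0.2, -0.1, None, None]  # last two positions are corrupted
label, votes = monte_carlo_vote(observed, models)
```

Here the observed half of the profile lies close to class A, so completions drawn from either class model mostly vote for A. With unequal priors, each class's draws would presumably need to be weighted by its prior rather than counted equally.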
