Microarray missing data imputation based on a set theoretic framework and biological knowledge

Gan XC; Liew AWC; Yan H

Abstract

Gene expression data measured with microarrays usually suffer from missing values, yet many data analysis methods require a complete data matrix. Existing missing value imputation algorithms handle missing values well, but they have limitations. For example, some algorithms perform well only when strong local correlation exists in the data, while others provide the best estimate when the data are dominated by global structure. In addition, these algorithms do not take any biological constraint into account in their imputation. In this paper, we propose a set theoretic framework based on projection onto convex sets (POCS) for missing data imputation. POCS allows us to incorporate different types of a priori knowledge about missing values into the estimation process. The main idea of POCS is to formulate every piece of prior knowledge as a corresponding convex set and then use a convergence-guaranteed iterative procedure to obtain a solution in the intersection of all these sets. In this work, we design several convex sets, taking into consideration the biological characteristics of the data: the first set mainly exploits the local correlation structure among genes in microarray data, while the second set captures the global correlation structure among arrays. The third set (actually a series of sets) exploits the biological phenomenon of synchronization loss in microarray experiments. Synchronization loss is a common phenomenon in cyclic systems, and we construct a series of sets based on it for our POCS imputation algorithm. Experiments show that our algorithm achieves a significant reduction in error compared with the KNNimpute, SVDimpute and LSimpute methods.
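The iterative procedure described in the abstract can be illustrated with a minimal sketch of generic POCS by alternating projections. The abstract does not specify the paper's convex sets, so the two sets below are assumptions chosen only for illustration: a data-consistency set (matrices that agree with the observed entries) and a linear subspace set (matrices whose rows lie in the span of a fixed orthonormal `basis`). The function names, the `basis` argument and the toy data are hypothetical and are not the authors' method.

```python
import numpy as np

def project_observed(x, y, mask):
    """Project onto {X : X[mask] == y[mask]} by restoring the observed entries."""
    out = x.copy()
    out[mask] = y[mask]
    return out

def project_row_space(x, basis):
    """Project every row of x onto the span of the orthonormal rows of `basis`."""
    return x @ basis.T @ basis

def pocs_impute(y, mask, basis, n_iter=200, tol=1e-10):
    """Alternate the two projections until the iterate stops changing."""
    x = np.where(mask, y, 0.0)  # start with missing entries set to zero
    for _ in range(n_iter):
        x_new = project_observed(project_row_space(x, basis), y, mask)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Toy example (illustrative data, not from the paper):
# two orthonormal basis rows and two "genes" with one missing value each.
basis = np.array([[0.5, 0.5, 0.5, 0.5],
                  [0.5, -0.5, 0.5, -0.5]])
y = np.array([[np.nan, -1.0, 3.0, -1.0],
              [-0.5, 1.5, -0.5, np.nan]])
mask = ~np.isnan(y)  # True where a value was observed
imputed = pocs_impute(y, mask, basis)
```

Both sets are convex (one affine, one a linear subspace), so the alternating projections converge to a point in their intersection whenever that intersection is non-empty. Additional prior knowledge would enter the loop as further projection operators applied in sequence.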

Bibliographic details

Source: Nucleic Acids Research, 2006, Issue 5, 12 pages
Authors: Gan XC; Liew AWC; Yan H
Original format: PDF
Language of text: English
Keywords: GENE-EXPRESSION DATA; ARTIFACT REDUCTION; CLUSTER-ANALYSIS; CONSTRAINT SET; ALGORITHM; IDENTIFICATION; PROJECTION; PROFILE; CELLS

Similar literature

1. Microarray missing data imputation based on a set theoretic framework and biological knowledge [J]. Gan XC, Liew AWC, Yan H. Nucleic Acids Research. 2006, Issue 5.
2. A cluster-directed framework for neighbour based imputation of missing value in microarray data [J]. Keerin Phimmarin, Kurutach Werasak, Boongoen Tossapon. International Journal of Data Mining and Bioinformatics. 2016, Issue 2.
3. Missing value imputation in DNA microarray gene expression data: a comparative study of an improved collaborative filtering method with decision tree based approach [J]. Sujay Saha, Anupam Ghosh, Saikat Bandopadhyay. International Journal of Computational Science and Engineering. 2019, Issue 2.
4. Microarray Missing Data Imputation based on a Set Theoretic Framework and Biological Constraints [C]. Xiangchao Gan, Liew A.W.-C., Hong Yan. International Conference on Pattern Recognition. 2006.
5. A knowledge-based framework for high-content screening of multi-modality biological imaging data [D]. Ahmed, Wamiq Manzoor. 2007.
6. Microarray missing data imputation based on a set theoretic framework and biological knowledge [O]. Xiangchao Gan, Alan Wee-Chung Liew, Hong Yan. 2006.
7. Microarray missing data imputation based on a set theoretic framework and biological knowledge [O]. Gan, Xiangchao, Liew, Alan Wee-Chung, Yan, Hong. 2006.
