Random Sample Consensus for the Robust Identification of Outliers in Cancer Data

Abstract

Random sample consensus (Ransac) is a technique widely used for modeling data with a large amount of noise. Although it has been successfully employed in areas such as computer vision, extensive testing and application to clinical data, particularly in oncology, are still lacking. We applied this technique to synthetic and biomedical datasets, publicly available at The Cancer Genome Atlas (TCGA) and the UC Irvine Machine Learning Repository, to identify outliers in the classification of tumor samples. The results obtained by combining Ransac with logistic regression were compared against a baseline classical logistic model. To evaluate the robustness of this method, the original datasets were then perturbed by generating noisy data and by artificially switching the labels. The flagged outlier observations were compared against the misclassifications of the baseline logistic model, and the overall accuracy of both strategies was evaluated. Ransac showed high precision in classifying a subset of core (inlier) observations in the evaluated datasets while simultaneously identifying the outlier observations, and it remained robust to increasingly perturbed data.
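The idea of combining Ransac with logistic regression and comparing it against a baseline logistic model can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy grid dataset, the helper names (`fit_logistic`, `ransac_logistic`, `make_grid_data`, `misclassified`), the inlier rule (an observation is an inlier when the candidate model puts it on the correct side of 0.5), and all parameter values are hypothetical choices made for the example. The label switching mimics the perturbation described in the abstract.

```python
import math
import random


def predict_proba(w, x):
    """P(y = 1 | x) for weights w = [bias, w1, w2, ...] (numerically stable sigmoid)."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Plain batch gradient descent on the logistic loss (assumed settings)."""
    if not X:
        raise ValueError("cannot fit a model on an empty set")
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def misclassified(w, X, y):
    """Indices whose label lies on the wrong side of the 0.5 threshold."""
    return [i for i in range(len(X)) if abs(y[i] - predict_proba(w, X[i])) >= 0.5]


def ransac_logistic(X, y, n_trials=30, sample_size=10, seed=0):
    """Ransac-style loop: fit on random subsets, keep the largest consensus set."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    best_inliers = []
    for _ in range(n_trials):
        sample = rng.sample(idx, sample_size)
        if len({y[i] for i in sample}) < 2:
            continue  # a one-class sample cannot define a useful classifier
        w = fit_logistic([X[i] for i in sample], [y[i] for i in sample])
        wrong = set(misclassified(w, X, y))
        inliers = [i for i in idx if i not in wrong]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on the consensus set only; everything else is flagged as an outlier.
    w = fit_logistic([X[i] for i in best_inliers], [y[i] for i in best_inliers])
    outliers = sorted(set(idx) - set(best_inliers))
    return w, best_inliers, outliers


def make_grid_data():
    """Toy data: two 5x5 grids of points; the label of each grid centre is switched."""
    offsets = [-1.0, -0.5, 0.0, 0.5, 1.0]
    X, y = [], []
    for label, center in ((0, -2.0), (1, 2.0)):
        for dx in offsets:
            for dy in offsets:
                X.append([center + dx, center + dy])
                y.append(label)
    flipped = [12, 37]  # indices of the two grid centres
    for i in flipped:
        y[i] = 1 - y[i]
    return X, y, flipped


X, y, flipped = make_grid_data()
w_ransac, inliers, outliers = ransac_logistic(X, y)
baseline_errors = misclassified(fit_logistic(X, y), X, y)

print("switched labels:       ", flipped)
print("Ransac outliers:       ", outliers)
print("baseline misclassified:", baseline_errors)
```

On real tumor data the two lists would not necessarily coincide; comparing the flagged outliers with the baseline misclassifications is the kind of check the abstract describes.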

Bibliographic details

Source: International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics | 2020 | 350p | 11 pages
Authors: Andre Verissimo; Marta B. Lopes; Eunice Carrasquinha; Susana Vinga
Original format: PDF
Chinese Library Classification: TP301-53
Keywords: Ransac; Machine learning; Outlier detection; Oncological data
