Enhancing gene expression signatures in cancer prediction models: Understanding and managing classification complexity.

Abstract

Cancer can develop through a series of genetic events in combination with external influential factors that alter the progression of the disease. Gene expression studies are designed to provide an enhanced understanding of the progression of cancer and to develop clinically relevant biomarkers of disease, prognosis and response to treatment. One of the main aims of microarray gene expression analyses is to develop signatures that are highly predictive of specific biological states, such as the molecular stage of cancer. This dissertation analyzes the classification complexity inherent in gene expression studies, proposing both techniques for measuring complexity and algorithms for reducing this complexity.

Classifier algorithms that generate predictive signatures of cancer models must generalize to independent datasets for successful translation to clinical practice. The predictive performance of classifier models is shown to be dependent on the inherent complexity of the gene expression data. Three specific quantitative measures of classification complexity are proposed and one measure (φ) is shown to correlate highly (R² = 0.82) with classifier accuracy in experimental data.

Three quantization methods are proposed to enhance contrast in gene expression data and reduce classification complexity. The accuracy for cancer prognosis prediction is shown to improve using quantization in two datasets studied: from 67% to 90% in lung cancer and from 56% to 68% in colorectal cancer. A corresponding reduction in classification complexity is also observed.

A random subspace based multivariable feature selection approach using cost-sensitive analysis is proposed to model the underlying heterogeneous cancer biology and address complexity due to multiple molecular pathways and unbalanced distribution of samples into classes. The technique is shown to be more accurate than the univariate t-test method. The classifier accuracy improves from 56% to 68% for colorectal cancer prognosis prediction.

A published gene expression signature to predict radiosensitivity of tumor cells is augmented with clinical indicators to enhance modeling of the data and represent the underlying biology more closely. Statistical tests and experiments indicate that the improvement in the model fit is a result of modeling the underlying biology rather than statistical over-fitting of the data, thereby accommodating classification complexity through the use of additional variables.
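The abstract does not say which three quantization methods the dissertation proposes. As a purely illustrative sketch of the general idea (mapping continuous expression values to a few discrete levels to sharpen contrast), the snippet below quantizes each gene by its own quantile cut points. The function names `quantize_gene` and `quantize_matrix`, the choice of quantile-based cut points, and the default of three levels are all assumptions made for this example and are not taken from the dissertation.

```python
from statistics import quantiles


def quantize_gene(values, levels=3):
    """Map one gene's expression values to integer levels 0..levels-1.

    Hypothetical scheme: cut points are the gene's own quantiles, so each
    level holds roughly the same number of samples. This is an assumed
    illustration, not one of the dissertation's three methods.
    """
    if levels < 2:
        raise ValueError("levels must be at least 2")
    # statistics.quantiles returns levels-1 cut points for n=levels.
    cuts = quantiles(values, n=levels)
    # A value's level is the number of cut points it exceeds.
    return [sum(v > c for c in cuts) for v in values]


def quantize_matrix(rows, levels=3):
    """Quantize a genes-by-samples matrix (list of lists), one gene per row."""
    return [quantize_gene(row, levels) for row in rows]


if __name__ == "__main__":
    # Toy expression values for two genes across six samples (made-up data).
    expression = [
        [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
        [6.0, 1.0, 4.0, 2.0, 5.0, 3.0],
    ]
    for row in quantize_matrix(expression):
        print(row)
```

Quantizing per gene rather than over the whole matrix keeps genes with different dynamic ranges comparable; whether the dissertation does this is not stated in the abstract.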

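Bibliographic record

  • Author

    Kamath, Vidya P.

  • Author affiliation

    University of South Florida

  • Degree-granting institution: University of South Florida
  • Subject: Engineering Biomedical; Biology Bioinformatics
  • Degree: Ph.D.
  • Year: 2010
  • Pages: 133 p.
  • Total pages: 133
  • Original format: PDF
  • Language of text: English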