Journal: Cancer Medicine

Predicting biomarkers from classifier for liver metastasis of colorectal adenocarcinomas using machine learning models


Abstract

Background: Early diagnosis of liver metastasis is of great importance for improving the survival of colorectal adenocarcinoma (CAD) patients, and combining single biomarkers in a classifier model has substantially improved metastasis prediction for several types of cancer. However, this approach has rarely been reported for CAD. This study therefore aimed to identify an optimal classifier model for CAD with liver metastasis and to explore the metastatic mechanisms of the genes used by that classifier.

Methods: Differentially expressed genes between primary CAD samples and metastatic CAD samples were screened from the Moffitt Cancer Center (MCC) dataset GSE131418. Six algorithms, namely LR, RF, SVM, GBDT, NN, and CatBoost, were compared on the MCC dataset GSE131418 by their test accuracy in classifying CAD samples with liver metastasis. In addition, the consortium datasets of GSE131418 and GSE81558 were used as internal and external validation sets to select the optimal method. Functional analyses and construction of a drug-target network were then carried out for the feature genes of the optimal method.

Results: The CatBoost model, built on 33 feature genes, performed best, with the highest accuracy (99%) and an area under the curve of 1. Functional analysis showed that the feature genes were closely associated with "steroid metabolic process" and "lipoprotein particle receptor binding" (e.g., APOB and APOC3). The feature genes were also significantly enriched in the "complement and coagulation cascade" pathways (e.g., FGA, F2, and F9). In the drug-target interaction network, F2 and F9 were predicted to be targets of menadione.

Conclusion: The CatBoost model constructed using 33 feature genes showed the best classification performance for identifying CAD with liver metastasis.
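The abstract ranks candidate classifiers by test accuracy and reports an area under the curve (AUC) for the best one. As a minimal sketch of how those two metrics could be computed and used to pick a model, the Python below uses only the standard library. The labels, scores, model names (`model_a`, `model_b`), and the 0.5 decision threshold are all hypothetical toy values chosen for illustration; they are not the study's data, models, or pipeline.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive sample outscores
    a random negative sample; ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def evaluate(y_true, scores, threshold=0.5):
    """Return (accuracy, AUC) for one model's predicted scores.
    The 0.5 threshold is an assumption, not taken from the abstract."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    return accuracy(y_true, y_pred), roc_auc(y_true, scores)


# Hypothetical held-out labels: 1 = liver metastasis, 0 = primary tumor.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]

# Hypothetical predicted probabilities from two candidate classifiers.
model_scores = {
    "model_a": [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9],
    "model_b": [0.1, 0.6, 0.3, 0.4, 0.45, 0.7, 0.8, 0.9],
}

results = {name: evaluate(y_true, s) for name, s in model_scores.items()}

# Rank by accuracy first, then AUC as a tie-breaker.
best = max(results, key=lambda name: results[name])

for name, (acc, auc) in results.items():
    print(f"{name}: accuracy={acc:.4f}, AUC={auc:.4f}")
print("best:", best)
```

On this toy data `model_a` separates the classes perfectly (accuracy 1.0, AUC 1.0), while `model_b` reaches accuracy 0.75 and AUC 0.9375. In practice the scores would come from fitted classifiers evaluated on held-out internal and external validation sets, as the abstract describes.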
