Journal: BMC Bioinformatics

Drug candidate identification based on gene expression of treated cells using tensor decomposition-based unsupervised feature extraction for large-scale data

Abstract

Although in silico drug discovery is necessary for drug development, the two major strategies, structure-based and ligand-based approaches, have not been completely successful. A third approach, inferring drug candidates from gene expression profiles of cells treated with the compounds under study, currently requires a training dataset. Here, the aim was to develop a new approach that requires no pre-existing knowledge of drug-protein interactions; instead, these interactions are inferred by integrating gene expression profiles of cells treated with the analysed compounds with existing data describing gene-gene interactions. In the present study, tensor decomposition-based unsupervised feature extraction, an extension of the recently proposed principal component analysis-based feature extraction, was used to screen gene sets and compounds with significant dose-dependent activity without any training dataset. Next, these results were combined with data showing perturbations in single-gene expression profiles to infer the genes targeted by the analysed compounds. The set of target genes thus identified overlapped significantly with the known target genes of the compounds under study. The method is designed specifically for large-scale datasets (including hundreds of treatments with compounds), not for conventional small-scale datasets. The results indicate that two compounds that have not been extensively studied, WZ-3105 and CGP-60474, are promising drug candidates targeting multiple cancers, including melanoma, adenocarcinoma, liver carcinoma, and breast, colon, and prostate cancers, all of which were analysed in this in silico study.
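The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a tensor decomposition-based unsupervised screening step might look. It rests on several assumptions that are not taken from the abstract: the data are arranged as a genes x compounds x doses tensor, the decomposition is a higher-order SVD (HOSVD), outlying genes and compounds are called with a chi-squared test on singular vector components followed by Benjamini-Hochberg correction, and the cutoff is 0.01. The data are synthetic, and every function and variable name is hypothetical. The later step of inferring target genes from single-gene perturbation data is not covered.

```python
import math
import numpy as np

def unfold(tensor, mode):
    """Matricize a tensor so that the chosen mode indexes the rows."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def hosvd(tensor):
    """HOSVD of a 3-way tensor: one orthonormal factor per mode plus a core tensor."""
    factors = [np.linalg.svd(unfold(tensor, m), full_matrices=False)[0]
               for m in range(tensor.ndim)]
    core = np.einsum("ijk,ia,jb,kc->abc", tensor, *factors, optimize=True)
    return core, factors

def outlier_pvalues(u):
    """Assumed test: (u / sd)^2 follows chi-squared with 1 degree of freedom under the null."""
    z2 = (u / u.std()) ** 2
    # Upper tail of chi-squared(1) at x equals erfc(sqrt(x / 2)).
    return np.array([math.erfc(math.sqrt(v / 2.0)) for v in z2])

def bh_adjust(p):
    """Benjamini-Hochberg adjusted p-values."""
    p = np.asarray(p, dtype=float)
    n = p.size
    order = np.argsort(p)
    ranked = p[order] * n / np.arange(1, n + 1)
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]  # enforce monotonicity
    adjusted = np.empty(n)
    adjusted[order] = np.minimum(ranked, 1.0)
    return adjusted

# Synthetic stand-in data (hypothetical): 200 genes, 40 compounds, 6 dose levels.
rng = np.random.default_rng(0)
n_genes, n_compounds, n_doses = 200, 40, 6
x = 0.1 * rng.standard_normal((n_genes, n_compounds, n_doses))
# Plant a linear dose response in the first 10 genes for the first 3 compounds.
x[:10, :3, :] += np.arange(n_doses)

core, (u_gene, u_comp, u_dose) = hosvd(x)

# Pick the dose singular vector that varies most monotonically with dose level.
dose_levels = np.arange(n_doses)
scores = [abs(np.corrcoef(u_dose[:, l], dose_levels)[0, 1])
          for l in range(u_dose.shape[1])]
l_dose = int(np.nanargmax(scores))

# Pick the gene and compound singular vectors most strongly tied to it in the core.
slab = np.abs(core[:, :, l_dose])
l_gene, l_comp = np.unravel_index(np.argmax(slab), slab.shape)

# Call genes and compounds whose components are outliers, with no training labels used.
selected_genes = np.flatnonzero(bh_adjust(outlier_pvalues(u_gene[:, l_gene])) < 0.01)
selected_compounds = np.flatnonzero(bh_adjust(outlier_pvalues(u_comp[:, l_comp])) < 0.01)
```

In this sketch, selection relies only on the structure of the decomposition, which is what allows it to run without a training dataset. The outlier test needs enough rows per mode to estimate a null spread, which is one way to read the abstract's remark that the method targets large-scale datasets.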