Journal: BMC Medical Informatics and Decision Making
Identification of genomic features in the classification of loss- and gain-of-function mutation

Abstract

Background: Alterations of a genome can lead to changes in protein function. Through such mutations, a protein can lose its native function (loss-of-function, LoF) or acquire a new one (gain-of-function, GoF). When a mutation occurs, however, it is difficult to determine whether it will result in LoF or GoF. In this paper, we therefore analyze the genomic features of LoF and GoF instances to find features that can be used to classify LoF and GoF mutations.

Methods: To collect experimentally verified LoF and GoF mutations, we obtained 816 LoF mutations and 474 GoF mutations through literature text mining. After data preprocessing, 258 LoF and 129 GoF mutations remained for further analysis. We analyzed the properties of these mutations, selected the features that showed different tendencies between the two groups, and trained support vector machine, random forest, and linear logistic regression classifiers to test whether these features can distinguish LoF from GoF mutations.

Results: We identified six features with discriminative power between LoF and GoF mutations: the reference allele, the substituted allele, mutation type, mutation impact, subcellular location, and protein domain. Using these six features, the random forest, support vector machine, and linear logistic regression classifiers reached accuracies of 72.23%, 71.28%, and 70.19%, respectively.

Conclusions: We analyzed LoF and GoF mutations and selected several properties that differed between the two classes. Classification with the selected features demonstrates that they have good discriminative power.
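The abstract does not say how the six categorical features were encoded, nor how the accuracy figures were measured. As a minimal sketch only, the snippet below shows one common way to turn such features into a numeric matrix (one-hot encoding). It also computes the accuracy that a trivial "always predict the larger class" rule would reach on the 258 LoF / 129 GoF counts quoted above. The feature key names, the category values in the toy records, and both helper functions are hypothetical illustrations, not the authors' pipeline.

```python
# Hypothetical key names for the six features listed in the abstract.
FEATURES = [
    "ref_allele",
    "alt_allele",
    "mutation_type",
    "mutation_impact",
    "subcellular_location",
    "protein_domain",
]


def one_hot_encode(records, features=FEATURES):
    """One-hot encode categorical records.

    Returns (matrix, columns), where columns is a list of
    (feature, value) pairs and matrix[i][j] is 1 when record i
    has that value for that feature, else 0.
    """
    columns = []
    for feature in features:
        # Sort the observed values so the column order is deterministic.
        for value in sorted({record[feature] for record in records}):
            columns.append((feature, value))
    matrix = [
        [1 if record[feature] == value else 0 for feature, value in columns]
        for record in records
    ]
    return matrix, columns


def majority_baseline_accuracy(class_counts):
    """Accuracy of always predicting the most frequent class."""
    return max(class_counts.values()) / sum(class_counts.values())


# Toy records with made-up category values, for illustration only.
records = [
    {"ref_allele": "C", "alt_allele": "T", "mutation_type": "missense",
     "mutation_impact": "moderate", "subcellular_location": "nucleus",
     "protein_domain": "kinase"},
    {"ref_allele": "G", "alt_allele": "A", "mutation_type": "nonsense",
     "mutation_impact": "high", "subcellular_location": "cytoplasm",
     "protein_domain": "none"},
]

matrix, columns = one_hot_encode(records)

# Class counts after preprocessing, as quoted in the abstract.
baseline = majority_baseline_accuracy({"LoF": 258, "GoF": 129})
print(f"{len(columns)} one-hot columns, majority baseline {baseline:.2%}")
```

If the reported accuracies were measured on the unbalanced 258/129 set (an assumption; the abstract does not state this), the baseline of about two thirds is the natural reference point for the 70-72% figures.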