BMC Genomics

Properly defining the targets of a transcription factor significantly improves the computational identification of cooperative transcription factor pairs in yeast


Abstract

Background: Transcriptional regulation of gene expression in eukaryotes is usually accomplished by cooperative transcription factors (TFs). Computational identification of cooperative TF pairs has become an active research topic, and many algorithms have been proposed in the literature. A typical algorithm for predicting cooperative TF pairs has two steps. (Step 1) Define the targets of each TF under study. (Step 2) Design a measure for calculating the cooperativity of a TF pair based on the targets of the two TFs. While different algorithms use distinct, sophisticated cooperativity measures, the targets of a TF are usually defined using ChIP-chip data. However, using ChIP-chip data to define the targets of a TF has an inherent weakness: ChIP-chip analysis can identify the binding targets of a TF, but it cannot distinguish the true regulatory targets from the targets that are bound but not regulated.

Results: This is the first study to investigate whether the computational identification of cooperative TF pairs can be improved by using a more biologically relevant way to define the targets of a TF. For this purpose, we propose four simple algorithms, each consisting of two steps. (Step 1) Define the targets of a TF using (i) ChIP-chip data in the first algorithm, (ii) TF binding data in the second algorithm, (iii) TF perturbation data in the third algorithm, and (iv) the intersection of TF binding and TF perturbation data in the fourth algorithm. Compared with the first three algorithms, the fourth algorithm uses a more biologically relevant way to define the targets of a TF. (Step 2) Measure the cooperativity of a TF pair by the statistical significance of the overlap between the targets of the two TFs, using the hypergeometric test. By adopting four existing performance indices, we show that the fourth proposed algorithm (PA4) significantly outperforms the other three proposed algorithms. This suggests that the computational identification of cooperative TF pairs is indeed improved when a more biologically relevant way is used to define the targets of a TF. Strikingly, the prediction results of our simple PA4 are more biologically meaningful than those of the 12 existing sophisticated algorithms in the literature, all of which used ChIP-chip data to define the targets of a TF. This suggests that properly defining the targets of a TF may be more important than designing sophisticated cooperativity measures. In addition, PA4 can predict several experimentally validated cooperative TF pairs that have not been successfully predicted by any existing algorithm in the literature.

Conclusions: This study shows that the computational identification of cooperative TF pairs can be improved by using a more biologically relevant way to define the targets of a TF. The main contribution of this study is not to propose yet another algorithm but to offer a new perspective on the computational identification of cooperative TF pairs. Researchers should put more effort into properly defining the targets of a TF (i.e. Step 1) rather than focusing entirely on designing sophisticated cooperativity measures (i.e. Step 2). The lists of TF target genes, the Matlab codes and the prediction results of the four proposed algorithms can be downloaded from our companion website http://cosbi3.ee.ncku.edu.tw/TFI/
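The two steps of the fourth algorithm (PA4) can be illustrated with a minimal Python sketch. This is not the authors' Matlab code: the function names, the toy gene and TF identifiers, and the one-sided form of the hypergeometric test (probability of an overlap at least as large as the one observed) are assumptions made for illustration, and the abstract does not state a significance threshold or a multiple-testing correction.

```python
from itertools import combinations
from math import comb


def define_targets(binding, perturbation):
    """Step 1 (PA4-style): keep only genes that appear in both the binding
    set and the perturbation set of each TF."""
    shared_tfs = binding.keys() & perturbation.keys()
    return {tf: binding[tf] & perturbation[tf] for tf in shared_tfs}


def hypergeom_pvalue(n_genes, n_a, n_b, n_overlap):
    """Step 2: one-sided hypergeometric p-value P(X >= n_overlap), where X is
    the overlap expected by chance when n_b genes are drawn without
    replacement from n_genes genes, n_a of which are targets of the first TF."""
    upper = min(n_a, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_genes - n_a, n_b - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / comb(n_genes, n_b)


def score_pairs(targets, n_genes):
    """Score every TF pair by the significance of its target overlap,
    most significant (smallest p-value) first."""
    results = []
    for tf1, tf2 in combinations(sorted(targets), 2):
        a, b = targets[tf1], targets[tf2]
        p = hypergeom_pvalue(n_genes, len(a), len(b), len(a & b))
        results.append((tf1, tf2, p))
    return sorted(results, key=lambda r: r[2])


# Hypothetical toy data: a genome of 10 genes and three made-up TFs.
binding = {
    "TF_A": {"g1", "g2", "g3", "g4"},
    "TF_B": {"g1", "g2", "g5", "g6"},
    "TF_C": {"g7", "g8", "g9"},
}
perturbation = {
    "TF_A": {"g1", "g2", "g3", "g9"},
    "TF_B": {"g1", "g2", "g5", "g10"},
    "TF_C": {"g7", "g8", "g1"},
}

targets = define_targets(binding, perturbation)
for tf1, tf2, p in score_pairs(targets, n_genes=10):
    print(tf1, tf2, round(p, 4))
```

Swapping `define_targets` for a function that returns only the binding sets or only the perturbation sets would give analogues of the second and third algorithms while leaving Step 2 unchanged, which mirrors the comparison described in the abstract.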