Journal: G3: Genes, Genomes, Genetics

Comparison of ChIP-Seq Data and a Reference Motif Set for Human KRAB C2H2 Zinc Finger Proteins

Abstract

KRAB C2H2 zinc finger proteins (KZNFs) are the largest and most diverse family of human transcription factors, likely due to diversifying selection driven by novel endogenous retroelements (EREs), but the vast majority lack binding motifs or functional data. Two recent studies analyzed a majority of the human KZNFs using either ChIP-seq (60 proteins) or ChIP-exo (221 proteins) in the same cell type (HEK293). The ChIP-exo paper did not describe binding motifs, however. Thirty-nine proteins are represented in both studies, enabling the systematic comparison of the data sets presented here. Typically, only a minority of peaks overlap, but the two studies nonetheless display significant similarity in ERE binding for 32/39, and yield highly similar DNA binding motifs for 23 and related motifs for 34 (MoSBAT similarity score >0.5 and >0.2, respectively). Thus, there is overall (albeit imperfect) agreement between the two studies. For the 242 proteins represented in at least one study, we selected a highest-confidence motif for each protein, utilizing several motif-derivation approaches, and evaluating motifs within and across data sets. Peaks for the majority (158) are enriched (96% with AUC >0.6 predicting peak vs. nonpeak) for a motif that is supported by the C2H2 “recognition code,” consistent with intrinsic sequence specificity driving DNA binding in cells. An additional 63 yield motifs enriched in peaks, but not supported by the recognition code, which could reflect indirect binding. Altogether, these analyses validate both data sets, and provide a reference motif set with associated quality metrics.
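The abstract uses an AUC threshold (>0.6) for how well a motif distinguishes peak from nonpeak sequences. It does not say how sequences were scored, so the following is only a minimal sketch of one common approach, not the authors' pipeline: score each sequence by its best log-odds match to a position frequency matrix, then compute AUC as the probability that a random peak outscores a random nonpeak. The function names, the toy `GCA` motif, the uniform background, the pseudocount, and the forward-strand-only scan are all illustrative assumptions.

```python
import math

BASES = "ACGT"


def pwm_log_odds(pfm, background=0.25, pseudocount=0.01):
    """Turn a position frequency matrix (list of base->frequency dicts)
    into log2-odds scores against a uniform background (an assumption)."""
    pwm = []
    for column in pfm:
        total = sum(column[b] + pseudocount for b in BASES)
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def best_site_score(seq, pwm):
    """Highest log-odds score over all forward-strand windows.
    Assumes an uppercase A/C/G/T sequence at least as long as the motif."""
    width = len(pwm)
    return max(
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )


def auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive scores higher; ties count as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical toy data: a 3-bp motif "GCA" and four short sequences.
toy_pfm = [
    {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0},
    {"A": 0.0, "C": 1.0, "G": 0.0, "T": 0.0},
    {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},
]
toy_pwm = pwm_log_odds(toy_pfm)
peak_seqs = ["TTGCATT", "AGCAGTT"]      # each contains GCA
nonpeak_seqs = ["TTTTTTT", "ACACACA"]   # neither contains GCA

peak_scores = [best_site_score(s, toy_pwm) for s in peak_seqs]
nonpeak_scores = [best_site_score(s, toy_pwm) for s in nonpeak_seqs]
toy_auc = auc(peak_scores, nonpeak_scores)
```

In this toy case every peak sequence outscores every nonpeak sequence, so `toy_auc` is 1.0. Real peak sets are far noisier, which is why a threshold as modest as 0.6 is informative. A fuller analysis would also scan the reverse strand and choose nonpeak sequences matched for length and base composition.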