Source journal: Chemical informatics

High-Throughput Screening Assay Datasets from the PubChem Database



Abstract

Availability of high-throughput screening (HTS) data in the public domain offers great potential to foster the development of ligand-based computer-aided drug discovery (LB-CADD) methods, which are crucial for drug discovery efforts in academia and industry. LB-CADD method development depends on high-quality HTS assay data, i.e., datasets that contain both active and inactive compounds. The active compounds are hits from primary screens that have been tested in concentration-response experiments and whose target specificity has been validated through suitable secondary screening experiments. Publicly available HTS repositories such as PubChem often provide such data in a convoluted way: compounds classified as inactive need to be extracted from the primary screening record. However, compounds classified as active in the primary screening record are not suitable as a set of active compounds for LB-CADD experiments because of the high false-positive rate. A suitable set of actives can be derived by carefully analysing the results of often five or more assays that are used to confirm and classify the activity of compounds. These assays, in part, build on each other. However, often not all hit compounds from the previous screen have been tested. Sometimes a compound is classified as ‘active’ in an assay although this means it is ‘inactive’ on the target of interest, because it is ‘active’ on a different target protein. Here, a curation process for hierarchically related confirmatory screens is illustrated using two specifically chosen protein use cases. The subsequent re-upload procedure into PubChem is described for the findings of those two scenarios. Further, we provide nine publicly accessible high-quality datasets for future LB-CADD method development that give the scientific community a common baseline for comparing future methods. We also provide a protocol researchers can follow to upload additional datasets for benchmarking.
Keywords: HTS; PubChem; Datasets; LB-CADD
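
The curation logic summarised in the abstract can be illustrated with a minimal Python sketch. It is not the authors' protocol: the function name `curate`, the outcome labels `"active"` and `"inactive"`, the integer compound IDs, and the decision to drop primary hits that were not retested are all assumptions made for illustration. Each assay is represented as a plain dictionary mapping a compound ID to its reported outcome.

```python
from typing import Dict, Iterable, Set, Tuple


def curate(
    primary: Dict[int, str],
    confirmatory: Iterable[Dict[int, str]],
    counter_screens: Iterable[Dict[int, str]] = (),
) -> Tuple[Set[int], Set[int]]:
    """Return (actives, inactives) as sets of hypothetical compound IDs."""
    # Inactives are taken directly from the primary screening record.
    inactives = {cid for cid, outcome in primary.items() if outcome == "inactive"}

    # Primary hits are only candidates because of the high false-positive rate.
    candidates = {cid for cid, outcome in primary.items() if outcome == "active"}

    # Assumption: a candidate survives a confirmatory screen only if it was
    # tested there and reported active; untested hits are dropped.
    for assay in confirmatory:
        candidates = {cid for cid in candidates if assay.get(cid) == "active"}

    # In a counter screen, "active" means activity on a different target,
    # so those compounds are treated as not target-specific and removed.
    for assay in counter_screens:
        candidates = {cid for cid in candidates if assay.get(cid) != "active"}

    return candidates, inactives


# Made-up example data: compound 5 is a primary hit that was never retested.
primary = {1: "active", 2: "active", 3: "active", 4: "inactive", 5: "active"}
confirm = {1: "active", 2: "active", 3: "inactive"}
counter = {1: "inactive", 2: "active"}

actives, inactives = curate(primary, [confirm], [counter])
print(sorted(actives), sorted(inactives))
```

In this sketch, primary hits that fail confirmation, hit the counter screen, or were never retested end up in neither set. Whether such compounds should instead be labelled inactive is a curation decision the sketch does not settle.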
