Journal of Chemical Information and Computer Sciences

MOLECULAR DIVERSITY IN CHEMICAL DATABASES - COMPARISON OF MEDICINAL CHEMISTRY KNOWLEDGE BASES AND DATABASES OF COMMERCIALLY AVAILABLE COMPOUNDS


Abstract

A molecular descriptor space that describes structural diversity has been developed, and large databases of molecules have been mapped into it and compared. The analysis used five chemical databases: CMC and MDDR, knowledge bases containing active medicinal agents; ACD and SPECS, two databases of commercially available compounds; and the Wellcome Registry. Together these databases contained more than 300 000 structures. Topological indices and the free energy of solvation were computed for each compound. Factor analysis was used to reduce the dimensionality of the descriptor space. Low-density observations were deleted to remove outliers, which further reduced the region of descriptor space of interest. The five databases could then be compared efficiently using a metric developed for this purpose. A Riemann gridding scheme subdivided the factor space into subhypercubes to obtain accurate comparisons. Most of the 300 000 structures were highly clustered, but unique structures were also found. The overlap between the biological and commercial databases was analyzed. The metric provides a useful algorithm for choosing screening sets of diverse compounds from large databases. [References: 11]
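
The gridding idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the paper's actual metric, so everything here is an assumption for illustration: compounds are taken to be already reduced to factor-space coordinates, the grid uses equal-width cells, a per-cell count threshold (`min_count`) stands in for low-density deletion, and `overlap_fraction` is a hypothetical cell-occupancy measure, not the authors' metric. All function names are invented.

```python
from collections import Counter

def cell_of(point, lower, width):
    """Map a factor-space point to the integer index of its grid cell."""
    return tuple(int((x - lo) // width) for x, lo in zip(point, lower))

def occupancy(points, lower, width, min_count=1):
    """Return the set of cells holding at least `min_count` points.

    Raising `min_count` drops sparsely populated cells, a simple
    stand-in (assumption) for deleting low-density observations.
    """
    counts = Counter(cell_of(p, lower, width) for p in points)
    return {cell for cell, n in counts.items() if n >= min_count}

def overlap_fraction(cells_a, cells_b):
    """Fraction of the cells occupied by A that B also occupies."""
    if not cells_a:
        return 0.0
    return len(cells_a & cells_b) / len(cells_a)

def pick_diverse(points, lower, width):
    """Keep the first point seen in each occupied cell."""
    chosen = {}
    for p in points:
        chosen.setdefault(cell_of(p, lower, width), p)
    return list(chosen.values())
```

Comparing two databases then reduces to comparing their sets of occupied cells, and `pick_diverse` gives one representative per occupied cell as a crude diverse screening set. The cell width trades resolution against sparsity, and a real analysis would need to choose it with care.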
