Journal: Water Research

Predicting aqueous solubility of environmentally relevant compounds from molecular features: A simple but highly effective four-dimensional model based on Project to Latent Structures



Abstract

The aqueous solubility (log S) of xenobiotic chemicals has been identified as a key characteristic in determining their bioaccessibility/bioavailability and their fate and transport in aquatic environments. Here we explore and evaluate the use of a state-of-the-art data analysis technique (Project to Latent Structures, PLS) to estimate log S of environmentally relevant chemicals. A large number (n = 624) of molecular descriptors were computed for over 1400 organic chemicals and then refined by a feature selection technique. Candidate predictor descriptors were fitted to the data by means of PLS, which was optimized by internal leave-one-out cross-validation and validated against an external data set. The final (best) PLS model, with only four variables (AlogP, X1sol, Mv, and E), exhibited noteworthy stability and good predictive power. It explained 91% of the variance in the data (n = 1400) with an average absolute error of 0.5 log units, across solubilities spanning over 12 orders of magnitude. The newly proposed model is transparent, easily portable from one user to another, and robust enough to accurately estimate log S of a wide range of emerging contaminants.
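The workflow described in the abstract (fit a PLS model on a few descriptors, tune it by leave-one-out cross-validation, then check it on an external set) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic, the "true" coefficients are made up, the four columns merely stand in for AlogP, X1sol, Mv, and E, and the helper names `pls1_fit` and `loo_q2` are hypothetical. It uses a plain NumPy version of single-response PLS (PLS1) with autoscaled descriptors, which may differ from the preprocessing used in the paper.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit single-response PLS (PLS1) and return (coef, intercept) in raw units."""
    x_mean, x_std = X.mean(axis=0), X.std(axis=0, ddof=1)
    y_mean = y.mean()
    Xr = (X - x_mean) / x_std          # autoscaled descriptors (an assumption)
    yr = y - y_mean                    # centered response
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                  # weight vector: covariance direction
        w /= np.linalg.norm(w)
        t = Xr @ w                     # latent score
        tt = t @ t
        p = Xr.T @ t / tt              # X loading
        q_a = yr @ t / tt              # y loading
        Xr = Xr - np.outer(t, p)       # deflate X
        yr = yr - q_a * t              # deflate y
        W.append(w); P.append(p); q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q) / x_std   # back to raw descriptor units
    return coef, y_mean - x_mean @ coef

def loo_q2(X, y, n_components):
    """Leave-one-out cross-validated Q^2 = 1 - PRESS / total sum of squares."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        coef, b0 = pls1_fit(X[keep], y[keep], n_components)
        press += (y[i] - (X[i] @ coef + b0)) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

# Synthetic stand-in data: 4 descriptors, made-up coefficients, Gaussian noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(160, 4))
X[:, 1] += 0.6 * X[:, 0]                       # make two descriptors correlated
true_coef = np.array([-1.0, -0.5, 0.3, 0.8])   # hypothetical, for illustration only
y = X @ true_coef - 2.0 + rng.normal(scale=0.5, size=160)

X_train, y_train = X[:120], y[:120]            # internal (training) set
X_ext, y_ext = X[120:], y[120:]                # external validation set

# Pick the number of latent components with the highest LOO Q^2.
q2 = {a: loo_q2(X_train, y_train, a) for a in range(1, 5)}
best = max(q2, key=q2.get)

coef, b0 = pls1_fit(X_train, y_train, best)
pred = X_ext @ coef + b0
mae = np.mean(np.abs(y_ext - pred))
r2 = 1.0 - np.sum((y_ext - pred) ** 2) / np.sum((y_ext - y_ext.mean()) ** 2)
print(f"components={best}  Q2(LOO)={q2[best]:.3f}  external R2={r2:.3f}  MAE={mae:.3f}")
```

Because the final model is just four coefficients and an intercept, it can be passed around as a short equation, which is in the spirit of the "transparent" and "easily portable" properties the abstract claims.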
