Causality-Guided Feature Selection

Abstract

Identifying meaningful features that drive a phenomenon (response) of interest in complex systems of interconnected factors is a challenging problem. Causal discovery methods have previously been applied to estimate bounds on the causal strengths of factors on a response, or to identify meaningful interactions between factors in complex systems, but these approaches have been used only for inferential purposes. In contrast, we posit that interactions between factors with a potential causal association with a given response could be viable candidates not only for hypothesis generation but also for predictive modeling. In this work, we propose a causality-guided feature selection methodology that identifies factors having a potential cause-effect relationship in complex systems, and selects features by clustering them based on their causal strength with respect to the response. To this end, we estimate statistically significant causal effects on the response for factors taking part in potential causal relationships, while addressing associated technical challenges such as multicollinearity in the data. We validate the proposed methodology for predicting the response in five real-world datasets from the domains of climate science and biology. The selected features show predictive skill and consistent performance across different domains.
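
As a rough illustration of the pipeline the abstract outlines (score each factor, keep statistically significant scores, cluster factors by strength, select a cluster), a minimal Python sketch follows. It is not the authors' implementation. Bootstrapped ridge-regression coefficients stand in for the causal-effect estimates, with ridge chosen only as one common way to cope with multicollinearity, and a two-cluster 1-D k-means stands in for the clustering step. All function names and parameters (`ridge_effects`, `significant_effects`, `select_by_strength_cluster`, `alpha`, `n_boot`) are hypothetical, and the data are synthetic.

```python
import numpy as np


def ridge_effects(X, y, alpha=1.0):
    """Ridge coefficients on standardized features.

    ASSUMPTION: used only as a crude proxy for "causal strength";
    the abstract does not specify the estimator.
    """
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    p = Xs.shape[1]
    # Closed-form ridge solution: (X'X + alpha*I)^-1 X'y
    return np.linalg.solve(Xs.T @ Xs + alpha * np.eye(p), Xs.T @ yc)


def significant_effects(X, y, n_boot=200, alpha=1.0, seed=0):
    """Bootstrap the proxy and zero out effects whose 95% interval covers zero."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    boots = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        boots[b] = ridge_effects(X[idx], y[idx], alpha)
    lo, hi = np.percentile(boots, [2.5, 97.5], axis=0)
    point = ridge_effects(X, y, alpha)
    # Keep the absolute strength only where the interval excludes zero.
    return np.where((lo > 0) | (hi < 0), np.abs(point), 0.0)


def select_by_strength_cluster(strength, n_iter=50):
    """Two-cluster 1-D k-means on strengths; return indices of the stronger cluster."""
    centers = np.array([strength.min(), strength.max()], dtype=float)
    labels = np.zeros(strength.shape[0], dtype=int)
    for _ in range(n_iter):
        # Assign each factor to its nearest center, then update the centers.
        labels = np.abs(strength[:, None] - centers[None, :]).argmin(axis=1)
        for k in (0, 1):
            if np.any(labels == k):
                centers[k] = strength[labels == k].mean()
    return np.flatnonzero(labels == centers.argmax())


# Synthetic demo: only the first two of six factors drive the response.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * rng.normal(size=500)

strength = significant_effects(X, y)
selected = select_by_strength_cluster(strength)
print("strengths:", np.round(strength, 2))
print("selected feature indices:", selected)
```

Regression coefficients of this kind measure association rather than causation, so a faithful version would replace `ridge_effects` with effect estimates derived from a causal discovery step, as the abstract describes.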