Method and apparatus for constructing informative outcomes to guide multi-policy decision making

Abstract

In Multi-Policy Decision-Making (MPDM), many computationally expensive forward simulations are performed to predict the performance of a set of candidate policies. In risk-aware formulations of MPDM, only the worst outcomes affect the decision-making process, so efficiently finding these influential outcomes becomes the core challenge. Recently, stochastic gradient optimization algorithms using a heuristic function were shown to be significantly superior to random sampling. This disclosure shows that accurate gradients can be computed, even through a complex forward simulation, using approaches similar to those in deep networks. The proposed approach finds influential outcomes more reliably and is faster than earlier methods, allowing one to evaluate more policies while simultaneously eliminating the need to design an easily differentiable heuristic function.
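The idea of computing gradients through a forward simulation can be illustrated with a minimal sketch. It is not the method claimed in the disclosure: the one-dimensional dynamics in `step`, the proximity cost in `cost`, and all names and constants below are hypothetical choices made only for illustration. The sketch rolls a toy simulation forward, propagates the derivative of the accumulated cost backward through every time step (as backpropagation does in a deep network), and uses that gradient to push the initial agent position toward a higher-cost, more influential outcome.

```python
import math

DT = 0.1      # hypothetical time step
STEPS = 30    # hypothetical simulation horizon


def step(x, r):
    """Toy transition (assumption): the agent drifts left and is pushed away from the robot."""
    d = x - r
    return x + DT * (-0.5 + d / (1.0 + d * d))


def step_grad(x, r):
    """Derivative of step() with respect to the agent position x."""
    d = x - r
    return 1.0 + DT * (1.0 - d * d) / (1.0 + d * d) ** 2


def cost(x, r):
    """Toy per-step cost (assumption): large when agent and robot are close."""
    d = x - r
    return math.exp(-d * d)


def cost_grad(x, r):
    """Derivative of cost() with respect to the agent position x."""
    d = x - r
    return -2.0 * d * math.exp(-d * d)


def rollout(x0, robot_speed=1.0):
    """Forward simulation: returns the total cost and both trajectories."""
    xs, rs = [x0], [0.0]
    for _ in range(STEPS):
        xs.append(step(xs[-1], rs[-1]))
        rs.append(rs[-1] + DT * robot_speed)
    total = sum(cost(x, r) for x, r in zip(xs, rs))
    return total, xs, rs


def rollout_gradient(x0):
    """Backward pass: d(total cost)/d(x0), accumulated from the last step to the first."""
    total, xs, rs = rollout(x0)
    adjoint = cost_grad(xs[-1], rs[-1])
    for t in range(STEPS - 1, -1, -1):
        adjoint = cost_grad(xs[t], rs[t]) + step_grad(xs[t], rs[t]) * adjoint
    return total, adjoint


def find_influential_outcome(x0, lr=0.01, iters=100):
    """Gradient ascent on the initial position, keeping the highest-cost start seen."""
    best_x, best_cost = x0, rollout(x0)[0]
    x = x0
    for _ in range(iters):
        _, g = rollout_gradient(x)
        x += lr * g
        c = rollout(x)[0]
        if c > best_cost:
            best_x, best_cost = x, c
    return best_x, best_cost
```

Calling `find_influential_outcome(2.0)` returns a perturbed starting position and its cost, which is never lower than the cost of the unperturbed start. In a realistic setting the hand-written derivatives would typically be replaced by an automatic differentiation library, and the perturbation would range over many agents and state variables rather than a single scalar.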