
Policy Iteration for Learning an Exercise Policy for American Options



Abstract

Options are important financial instruments, whose prices are usually determined by computational methods. Computational finance is a compelling application area for reinforcement learning research, where hard sequential decision making problems abound and have great practical significance. In this paper, we investigate reinforcement learning methods, in particular, least squares policy iteration (LSPI), for the problem of learning an exercise policy for American options. We also investigate a method by Tsitsiklis and Van Roy, referred to as FQI. We compare LSPI and FQI with LSM, the standard least squares Monte Carlo method from the finance community. We evaluate their performance on both real and synthetic data. The results show that the exercise policies discovered by LSPI and FQI gain larger payoffs than those discovered by LSM, on both real and synthetic data. Our work shows that solution methods developed in reinforcement learning can advance the state of the art in an important and challenging application area, and demonstrates furthermore that computational finance remains an under-explored area for deployment of reinforcement learning methods.
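For orientation, the LSM baseline named in the abstract is the Longstaff-Schwartz least squares Monte Carlo method: simulate paths of the underlying, then step backward in time, regressing the continuation value on basis functions of the spot price and exercising wherever the immediate payoff exceeds it. The sketch below is a minimal illustration of that idea for an American put, not the paper's implementation; the geometric Brownian motion dynamics, the polynomial basis (1, S, S²), and all parameter values are illustrative assumptions.

```python
import numpy as np

def lsm_american_put(S0=36.0, K=40.0, r=0.06, sigma=0.2, T=1.0,
                     n_steps=50, n_paths=100_000, seed=0):
    """Price an American put with least squares Monte Carlo (LSM).

    All defaults are illustrative, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    disc = np.exp(-r * dt)

    # Simulate geometric Brownian motion paths of the underlying,
    # one column per exercise date from t = dt to t = T.
    z = rng.standard_normal((n_paths, n_steps))
    log_paths = np.cumsum((r - 0.5 * sigma**2) * dt
                          + sigma * np.sqrt(dt) * z, axis=1)
    S = S0 * np.exp(log_paths)                       # (n_paths, n_steps)

    # Cash flow if never exercised early: payoff at maturity.
    cash = np.maximum(K - S[:, -1], 0.0)

    # Backward induction: regress the continuation value on basis
    # functions of the spot price over in-the-money paths, and
    # exercise where the immediate payoff beats the regression fit.
    for t in range(n_steps - 2, -1, -1):
        cash *= disc                                 # discount one step
        itm = K - S[:, t] > 0.0
        if not itm.any():
            continue
        x = S[itm, t]
        basis = np.column_stack([np.ones_like(x), x, x**2])   # 1, S, S^2
        coef, *_ = np.linalg.lstsq(basis, cash[itm], rcond=None)
        continuation = basis @ coef
        exercise = np.maximum(K - x, 0.0)
        ex_now = exercise > continuation
        idx = np.where(itm)[0][ex_now]
        cash[idx] = exercise[ex_now]                 # lock in exercise value

    # Discount from the first exercise date back to time zero.
    return disc * cash.mean()

print(f"LSM American put estimate: {lsm_american_put():.4f}")
```

LSPI and FQI, the reinforcement learning methods compared in the paper, address the same exercise problem but learn a value function via policy iteration over simulated experience rather than the single backward regression pass shown here.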
