METHOD AND APPARATUS FOR ADAPTIVE MULTI-BATCH EXPERIENCE REPLAY FOR CONTINUOUS ACTION CONTROL
Abstract
An adaptive multi-batch experience replay technique for continuous action space control is disclosed. The adaptive multi-batch experience replay (AMBER) method comprises: storing, in a replay memory and in units of multiple batches, information tuples of samples generated based on the updated policy; adjusting the size of the random mini-batch to reduce the average importance sampling weight; calculating the average importance sampling weight of each sample batch in the replay memory; dropping from the replay memory any batch whose calculated average importance sampling weight is greater than a predetermined batch drop coefficient; and updating parameters by performing random mini-batch sampling on the replay memory, drawing only from the batches that were not dropped.
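The batch-drop and sampling steps of the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. All names here (`Batch`, `avg_is_weight`, `retain_batches`, `sample_minibatch`, `policy_prob`, `drop_coeff`, `per_batch_size`) are hypothetical. The sketch assumes that the importance sampling weight of a sample is the ratio of the current policy's probability for the stored action to the probability under the policy that generated it, and that the mini-batch size scales with the number of retained batches; the abstract does not specify either detail.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple

# One stored tuple: (state, action, reward, behavior_prob). behavior_prob is the
# probability the generating policy assigned to the action (assumed layout).
Sample = Tuple[float, float, float, float]


@dataclass
class Batch:
    samples: List[Sample]


def avg_is_weight(batch: Batch, policy_prob: Callable[[float, float], float]) -> float:
    """Average importance sampling weight of one batch under the current policy."""
    ratios = [policy_prob(s, a) / mu for (s, a, _r, mu) in batch.samples]
    return sum(ratios) / len(ratios)


def retain_batches(memory: List[Batch],
                   policy_prob: Callable[[float, float], float],
                   drop_coeff: float) -> List[Batch]:
    """Drop every batch whose average weight exceeds the batch drop coefficient."""
    return [b for b in memory if avg_is_weight(b, policy_prob) <= drop_coeff]


def sample_minibatch(memory: List[Batch],
                     policy_prob: Callable[[float, float], float],
                     drop_coeff: float,
                     per_batch_size: int,
                     rng: random.Random) -> List[Sample]:
    """Draw a random mini-batch from the batches that survived the drop step.

    Assumption: the mini-batch size grows with the number of retained batches.
    """
    kept = retain_batches(memory, policy_prob, drop_coeff)
    pool = [s for b in kept for s in b.samples]
    size = min(len(pool), per_batch_size * len(kept))
    return rng.sample(pool, size)


# Toy usage: a current policy that assigns probability 0.5 to every action.
current_policy = lambda state, action: 0.5
fresh = Batch([(0.0, 0.1, 1.0, 0.5), (0.1, 0.2, 1.0, 0.5), (0.2, 0.3, 1.0, 0.5)])
stale = Batch([(0.0, 0.9, 0.0, 0.1), (0.1, 0.8, 0.0, 0.1)])
memory = [fresh, stale]
minibatch = sample_minibatch(memory, current_policy, drop_coeff=1.2,
                             per_batch_size=2, rng=random.Random(0))
```

In this toy run the `stale` batch has an average weight of 5.0 and is excluded, so the mini-batch is drawn only from `fresh`; a parameter update would then use those samples.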