A method and system for reinforcement learning can include an actor-critic framework comprising an actor and a critic, the actor comprising an actor network and the critic comprising a critic network; and a controller comprising a neural network embedded in the actor-critic framework and which can be tuned according to reinforcement learning based tuning including anti-windup tuning.
展开▼