Reinforcement learning concerns the problem of a learning agent interacting with its environment to achieve a goal. Instead of being given examples of desired behavior, the learning agent must discover by trial and error how to behave in order to get the most reward. The environment is a Markov decision process with state set, S, and action set, A. The agent and the environment interact in a sequence of discrete steps, t = 0, 1, 2, ...
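
The interaction described above can be sketched as a simple loop. This is only an illustrative sketch, not the formulation used in the source: the two-state environment, its reward rule, and the names `step`, `run_episode`, `greedy_policy`, and `random_policy` are all hypothetical and chosen to keep the example small.

```python
import random

# Hypothetical toy MDP (an assumption, not from the source):
# state set S = {0, 1}, action set A = {"stay", "move"}.
STATES = [0, 1]
ACTIONS = ["stay", "move"]


def step(state, action):
    """Toy deterministic dynamics: 'move' flips the state, 'stay' keeps it.

    The reward is 1.0 whenever the next state is 1, otherwise 0.0.
    """
    next_state = 1 - state if action == "move" else state
    reward = 1.0 if next_state == 1 else 0.0
    return next_state, reward


def run_episode(policy, num_steps=5, start_state=0):
    """Run the agent-environment loop for t = 0, 1, ..., num_steps - 1."""
    state = start_state
    trajectory = []
    total_reward = 0.0
    for t in range(num_steps):
        action = policy(state)                    # agent chooses an action
        next_state, reward = step(state, action)  # environment responds
        trajectory.append((t, state, action, reward))
        total_reward += reward
        state = next_state
    return trajectory, total_reward


def greedy_policy(state):
    """Hand-written policy that is optimal for this toy problem."""
    return "move" if state == 0 else "stay"


def random_policy(state, rng=random.Random(0)):
    """Uninformed policy: a stand-in for the exploring side of trial and error."""
    return rng.choice(ACTIONS)


if __name__ == "__main__":
    for name, policy in [("greedy", greedy_policy), ("random", random_policy)]:
        _, total = run_episode(policy)
        print(name, total)
```

Here the agent is never told which action is correct; it only observes the reward returned at each step, which is the signal a learning method would use to improve its policy.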