Body

Reinforcement learning is a general technique that allows an agent to learn an optimal policy and interact with an environment in sequential decision making problems. The goodness of a policy is measured by its value function starting from some initial state. This talk includes a few topics about constructing statistical inference for a policy's value in infinite horizon settings where the number of decision points diverges to infinity. Applications in real world examples will also be discussed.