Past

Stochastic Linear Quadratic Optimal Control Problem: A Reinforcement Learning Method

Abstract

This talk is to adopt a reinforcement learning (RL) method to solve infinite horizon continuous-time stochastic linear quadratic problems, where the drift and diffusion terms in the dynamics may depend on both the state and control. Based on the Bellman’s dynamic programming principle, we presented an online RL algorithm to attain optimal control with partial system information. This algorithm computes the optimal control, rather than estimates the system coefficients, and solves the related Riccati equation. It only requires local trajectory information, which significantly simplifies the calculation process. We shed light on our theoretical findings using two numerical examples.