Discrete Action Dependant Heuristic Dynamic Programming in Control of a Wheeled Mobile Robot
In this paper we propose a discrete tracking control algorithm for a two-wheeled mobile robot. The control algorithm consists of a discrete Adaptive Critic Design (ACD) in the Action Dependant Heuristic Dynamic Programming (ADHDP) configuration, a PD controller, and a supervisory term derived from the Lyapunov stability theorem and based on variable structure systems theory. Adaptive Critic Designs are a group of algorithms that use two independent structures: one estimates the optimal value function from the Bellman equation, and the other estimates the optimal control law. In the ADHDP algorithm, the Actor (ASE, Associative Search Element) estimates the optimal control law, and the Critic (ACE, Adaptive Critic Element) evaluates the quality of control by estimating the optimal value function from the Bellman equation. Both structures are realized as Neural Networks (NN). The ADHDP algorithm does not require a plant model (here, the model of the wheeled mobile robot, WMR) to update the ACE or ASE neural network weights, in contrast to other ACD configurations such as Heuristic Dynamic Programming or Dual Heuristic Programming, which use the plant model. In the presented control algorithm, the Actor-Critic structure is supported by the PD controller and the supervisory term, which guarantee stable tracking during the initial learning phase of the adaptive critic neural networks and robustness in the face of disturbances. The proposed control algorithm was verified on the two-wheeled mobile robot Pioneer-2DX.
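The control structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The scalar toy plant, the linear-in-weights critic with quadratic features (standing in for the critic neural network), the single-weight tanh actor, the plain gradient-descent actor update, and every gain and constant are assumptions chosen only to show how a model-free actor-critic update, a PD term, and a switching supervisory term might fit together.

```python
import math
import random

# All constants below are illustrative assumptions, not values from the paper.
GAMMA = 0.9   # discount factor in the Bellman equation
H = 0.01      # discretisation step
K_D = 2.0     # PD gain acting on the filtered tracking error s
U_MAX = 1.0   # bound on the actor output
RHO = 0.1     # error threshold that switches the supervisory term on
F_SUP = 2.0   # supervisory gain, chosen larger than U_MAX


def actor(w_a, s):
    """Actor (ASE): bounded control signal from a single-weight tanh unit."""
    return U_MAX * math.tanh(w_a * s)


def critic_features(s, u_a):
    """Hypothetical quadratic features of the state and the actor control."""
    return [s * s, s * u_a, u_a * u_a]


def critic(w_c, s, u_a):
    """Critic (ACE): action-dependent value estimate Q(s, u_a)."""
    return sum(w * f for w, f in zip(w_c, critic_features(s, u_a)))


def supervisor(s):
    """Variable-structure style term, active only when |s| reaches RHO."""
    return F_SUP * math.copysign(1.0, s) if abs(s) >= RHO else 0.0


def plant(s, u):
    """Toy open-loop-unstable error dynamics. Used only to simulate the
    system; the weight updates below never call it (model-free)."""
    return s + H * (0.5 * s - u)


def run(steps=500, s0=1.0, lr_c=0.05, lr_a=0.05, seed=0):
    rng = random.Random(seed)
    w_c = [rng.uniform(-0.1, 0.1) for _ in range(3)]  # small random critic weights
    w_a = 0.0
    s = s0
    history = [s]
    for _ in range(steps):
        u_a = actor(w_a, s)
        u = u_a + K_D * s + supervisor(s)   # actor + PD + supervisory term
        s_next = plant(s, u)                # measured next error
        cost = s * s + 0.01 * u_a * u_a     # assumed local cost
        # Critic: temporal-difference error of the Bellman equation.
        td = cost + GAMMA * critic(w_c, s_next, actor(w_a, s_next)) - critic(w_c, s, u_a)
        phi = critic_features(s, u_a)
        w_c = [w + lr_c * td * f for w, f in zip(w_c, phi)]
        # Actor: descend dQ/du_a, obtained from the critic alone.
        dq_du = w_c[1] * s + 2.0 * w_c[2] * u_a
        du_dw = U_MAX * (1.0 - math.tanh(w_a * s) ** 2) * s
        w_a -= lr_a * dq_du * du_dw
        s = s_next
        history.append(s)
    return history, w_c, w_a
```

Calling `run()` returns the error history together with the final critic and actor weights. In this toy setting the PD and supervisory terms alone make |s| decrease whenever |s| is at least `RHO`, because `F_SUP` exceeds `U_MAX` and `K_D` exceeds the assumed open-loop growth rate of 0.5. This mirrors the role the abstract assigns to these terms while the actor and critic weights are still being learned.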
Andrejus H. Marcinkevičius and Algirdas V. Valiulis
Z. Hendzel and M. Szuster, "Discrete Action Dependant Heuristic Dynamic Programming in Control of a Wheeled Mobile Robot", Solid State Phenomena, Vol. 164, pp. 419-424, 2010