A Multi-Step Reinforcement Learning Algorithm

Abstract:

Reinforcement learning (RL) is a machine learning method based on state or action values that approximately solves large-scale Markov Decision Processes (MDPs) or Semi-Markov Decision Processes (SMDPs). A multi-step RL algorithm called Sarsa(λ,k) is proposed as a compromise between Sarsa and Sarsa(λ). It is equivalent to Sarsa when k is 1 and to Sarsa(λ) when k is infinite, so Sarsa(λ,k) adjusts its performance through the choice of k. Two forms of Sarsa(λ,k), the forward view and the backward view, are constructed and proved equivalent under off-line updating.
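
The abstract does not give the update target itself, so the following is only a minimal sketch of one plausible forward-view reading: a λ-weighted mixture of n-step Sarsa returns truncated after k steps. The function names, the argument layout (`rewards`, `q_next`), and the truncation rule are assumptions made for illustration, not the paper's definitions. The sketch is chosen so that k = 1 gives the one-step Sarsa target and an unbounded k gives the usual episodic λ-return, which matches the two limiting cases stated above.

```python
import math


def n_step_return(rewards, q_next, t, n, gamma):
    """n-step Sarsa return from time t (hypothetical helper).

    rewards[i] is the reward received after the action taken at step i.
    q_next[i] is Q(s_{i+1}, a_{i+1}), taken as 0.0 when s_{i+1} is terminal.
    """
    n = min(n, len(rewards) - t)  # do not look past the end of the episode
    g = sum(gamma ** i * rewards[t + i] for i in range(n))
    return g + gamma ** n * q_next[t + n - 1]


def truncated_lambda_return(rewards, q_next, t, k, lam, gamma):
    """Assumed k-step truncated lambda-return used as the update target.

    Mixes the 1..k-1 step returns with weights (1 - lam) * lam**(n - 1)
    and puts the remaining weight lam**(k - 1) on the k-step return.
    """
    k = int(min(k, len(rewards) - t))  # also maps math.inf to the episode end
    total = 0.0
    for n in range(1, k):
        weight = (1.0 - lam) * lam ** (n - 1)
        total += weight * n_step_return(rewards, q_next, t, n, gamma)
    total += lam ** (k - 1) * n_step_return(rewards, q_next, t, k, gamma)
    return total


# Toy three-step episode; the last q_next entry is 0.0 (terminal state).
rewards = [1.0, 0.0, 2.0]
q_next = [0.5, 0.25, 0.0]
sarsa_target = truncated_lambda_return(rewards, q_next, 0, 1, 0.8, 0.9)
full_target = truncated_lambda_return(rewards, q_next, 0, math.inf, 0.8, 0.9)
```

Under this reading, `sarsa_target` is the ordinary one-step Sarsa target and `full_target` is the episodic λ-return; intermediate values of k interpolate between the two.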

Info:

Pages: 3611-3615

Online since: December 2010

Copyright: © 2011 Trans Tech Publications Ltd. All Rights Reserved
