Paper Title:
A Multi-Step Reinforcement Learning Algorithm
  Abstract

Reinforcement learning (RL) is a machine learning method based on state or action values that approximately solves large-scale Markov Decision Processes (MDPs) or Semi-Markov Decision Processes (SMDPs). A multi-step RL algorithm called Sarsa(λ,k) is proposed as a compromise between Sarsa and Sarsa(λ). It is equivalent to Sarsa when k = 1 and to Sarsa(λ) when k is infinite. Sarsa(λ,k) adjusts its performance through the choice of k. Two forms, forward-view Sarsa(λ,k) and backward-view Sarsa(λ,k), are constructed and proved equivalent under off-line updating.
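
The abstract does not state the update rule, so the following is only a minimal sketch of one plausible forward-view target. It assumes Sarsa(λ,k) uses the standard k-step truncated λ-return, which reduces to the one-step Sarsa target at k = 1 and to the full λ-return once k reaches the end of the episode. The function names, the argument layout (`rewards[i]` as the reward following step i, `q_values[i]` as Q(S_i, A_i)), and the discount `gamma` are illustrative assumptions, not the authors' notation.

```python
from typing import Sequence


def n_step_return(rewards: Sequence[float], q_values: Sequence[float],
                  t: int, n: int, gamma: float) -> float:
    """n-step Sarsa return from time t (assumed convention, see lead-in).

    rewards[i] is the reward received after step i, q_values[i] is
    Q(S_i, A_i), and the episode terminates after len(rewards) steps.
    """
    T = len(rewards)
    end = min(t + n, T)
    g = sum(gamma ** (i - t) * rewards[i] for i in range(t, end))
    if end < T:
        # Bootstrap from Q only if the episode has not terminated yet.
        g += gamma ** (end - t) * q_values[end]
    return g


def truncated_lambda_return(rewards: Sequence[float], q_values: Sequence[float],
                            t: int, k: int, lam: float, gamma: float) -> float:
    """Assumed target: (1 - lam) * sum_{n=1}^{k-1} lam^(n-1) G^(n) + lam^(k-1) G^(k)."""
    # All n-step returns past the episode end are identical, so clipping k
    # to the remaining episode length leaves the value unchanged.
    k = min(k, len(rewards) - t)
    g = sum((1 - lam) * lam ** (n - 1)
            * n_step_return(rewards, q_values, t, n, gamma)
            for n in range(1, k))
    g += lam ** (k - 1) * n_step_return(rewards, q_values, t, k, gamma)
    return g
```

Under these assumptions, `truncated_lambda_return(rewards, q_values, t, 1, lam, gamma)` equals the one-step Sarsa target, and any `k` at least as large as the remaining episode length gives the full λ-return, mirroring the two limiting cases named in the abstract.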

  Info
Periodical
Applied Mechanics and Materials (Volumes 44-47)
Edited by
Ran Chen
Pages
3611-3615
DOI
10.4028/www.scientific.net/AMM.44-47.3611
Citation
Z. C. Zhang, K. S. Hu, H. Y. Huang, S. Li, S. Y. Zhao, "A Multi-Step Reinforcement Learning Algorithm", Applied Mechanics and Materials, Vols. 44-47, pp. 3611-3615, 2011
Online since
December 2010

