Optional Feature Vector Generation for Linear Value Function Approximation with Binary Features

Abstract:

Linear value function approximation with binary features plays an important role in reinforcement learning (RL) research. When the value function is updated, a feature vector containing the features to be updated must first be generated. In high dimensional domains, this generation process takes considerably longer, which substantially degrades the algorithm's performance. This paper therefore introduces the Optional Feature Vector Generation (OFVG) algorithm, an improved method for generating feature vectors that can be combined with any online, value-based RL method that uses and expands binary features. Empirical results show that OFVG performs well in high dimensional domains.
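
To make the setting concrete, below is a minimal Python sketch of linear value function approximation with binary features, where a state is represented only by the indices of its active features and a TD(0) update touches just those entries. The class and parameter names (LinearBinaryV, alpha, gamma) are illustrative assumptions; the sketch shows the baseline update setting described in the abstract, not the OFVG algorithm itself.

import numpy as np

class LinearBinaryV:
    """Linear value function over binary features.

    A state is given as the list of indices of its active (value-1)
    features, so evaluation and updates touch only those weights.
    """

    def __init__(self, num_features, alpha=0.1, gamma=0.99):
        self.w = np.zeros(num_features)  # one weight per binary feature
        self.alpha = alpha               # step size (assumed value)
        self.gamma = gamma               # discount factor (assumed value)

    def value(self, active):
        # With binary features, the dot product w . phi(s) reduces to a
        # sum of the weights at the active indices.
        return self.w[active].sum()

    def td_update(self, active, reward, next_active, done):
        # One-step TD(0) update applied only to the active features.
        target = reward if done else reward + self.gamma * self.value(next_active)
        delta = target - self.value(active)
        self.w[active] += self.alpha * delta  # sparse gradient step

# Usage: states expose a short list of active feature indices instead of
# a full, possibly very high dimensional, feature vector.
v = LinearBinaryV(num_features=10_000)
v.td_update(active=[3, 17, 4096], reward=1.0, next_active=[3, 18, 4097], done=False)
print(v.value([3, 17, 4096]))

The point of the sketch is that each evaluation and update costs time proportional to the number of active features, but only after the set of active features has been generated; it is that generation step which, per the abstract, OFVG optimizes in high dimensional domains.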

Info:

Periodical:

Advanced Materials Research (Volumes 756-759)

Pages:

3967-3971

Online since:

September 2013

Copyright:

© 2013 Trans Tech Publications Ltd. All Rights Reserved
