Point-Based Monte Carlo Online Planning in POMDPs

Abstract:

Online planning and learning in partially observable Markov decision processes (POMDPs) are often intractable because the belief space suffers from two curses: the curse of dimensionality and the curse of history. To address this problem, this paper proposes a point-based Monte Carlo online planning approach for POMDPs. The approach performs value backups at selected reachable belief points, rather than over the entire belief simplex, to speed up computation. A Monte Carlo tree search algorithm is then exploited to share action values across the subtrees of the search tree so as to minimise the mean squared error of the estimates. Experimental results show that the proposed algorithm is effective in real-time systems.
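The abstract's core loop pairs bandit-style action selection in the search tree with incremental value backups at the visited belief nodes. As an illustration only (the paper's exact algorithm is not reproduced here), the sketch below shows the standard UCB1 selection rule and running-mean backup that Monte Carlo tree search planners typically use at each tree node; the node structure, action names, and reward model are hypothetical.

```python
import math
import random

def ucb1_select(node, c=1.0):
    """Pick the action maximizing Q(a) + c*sqrt(ln N / N(a)) (UCB1)."""
    total = sum(stats["n"] for stats in node.values())
    best_a, best_score = None, -float("inf")
    for a, stats in node.items():
        if stats["n"] == 0:
            return a  # try every action at least once
        score = stats["q"] + c * math.sqrt(math.log(total) / stats["n"])
        if score > best_score:
            best_a, best_score = a, score
    return best_a

def backup(node, action, reward):
    """Incrementally update the running-mean Q-value of the chosen action."""
    stats = node[action]
    stats["n"] += 1
    stats["q"] += (reward - stats["q"]) / stats["n"]

# Toy usage: one tree node, two illustrative actions; 'stay' pays more
# on average, so UCB1 should concentrate visits on it over time.
random.seed(0)
node = {"stay": {"n": 0, "q": 0.0}, "go": {"n": 0, "q": 0.0}}
for _ in range(500):
    a = ucb1_select(node)
    r = random.gauss(1.0, 0.5) if a == "stay" else random.gauss(0.2, 0.5)
    backup(node, a, r)
```

In a full point-based online planner, each `node` would be attached to a reachable belief point (e.g. a particle set), and the backed-up Q-values would be shared across the subtree rooted at that belief rather than recomputed over the whole belief simplex.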

Info:

Periodical:

Advanced Materials Research (Volumes 846-847)

Pages:

1388-1391

Online since:

November 2013

Copyright:

© 2014 Trans Tech Publications Ltd. All Rights Reserved
