Graduate Thesis Or Dissertation

 

Monte Carlo Tree Search with Fixed and Adaptive Abstractions Public Deposited

Downloadable Content

Download PDF
https://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/kw52jc45j

Descriptions

Attribute NameValues
Creator
Abstract
  • Monte Carlo tree search (MCTS) is a class of online planning algorithms for Markov decision processes (MDPs) and related models that has found success in challenging applications. In the online planning approach, the agent makes a decision in the current state by performing a limited forward search over possible futures and selecting the course of action that is expected to lead to the best outcomes. This thesis proposes a new approach to MCTS based on abstraction and progressive abstraction refinement that makes better use of a limited number of samples. Our first contribution is an analysis of state abstraction in the MCTS setting. We describe a class of state aggregation abstractions that generalizes previously-proposed abstraction criteria and show that the regret due to planning with such abstractions is bounded. We then adapt popular MCTS algorithms to use fixed state abstractions. Our second contribution is a novel approach to MCTS based on abstraction refinement. We propose the Progressive Abstraction Refinement for Sparse Sampling (PARSS) algorithm, which begins by performing sparse sampling with a coarse state abstraction and then refines the abstraction progressively to make it more accurate. The PARSS algorithm provides the same formal guarantees as ordinary sparse sampling, and we show experimentally that PARSS outperforms sparse sampling in the ground state space and with fixed uninformed abstractions. Our third contribution is an extension of the progressive refinement idea to incorporate other kinds of abstraction. For this purpose, we introduce the formalism of abstraction diagrams (ADs) and show that ADs can express diverse kinds of abstraction, including state abstraction, temporal abstraction, and action pruning. We then describe refinement operators for ADs, extending the progressive refinement search framework to abstractions represented as ADs. Our fourth and final contribution is an application of online planning algorithms to the problem of controlling electrical transmission grids to mitigate the effects of equipment failures. Our work in this area is distinguished by the use of a full dynamical model of the power grid, which captures more mechanisms of cascading failure than simpler models. Because of the computational cost of the simulation, we choose simple online planning algorithms that require a small number of simulation trajectories. Our results demonstrate the superiority of the online planning approach to fixed expert policies, while also highlighting the need for faster simulators to enable more sophisticated solution algorithms.
License
Resource Type
Date Available
Date Issued
Degree Level
Degree Name
Degree Field
Degree Grantor
Commencement Year
Advisor
Committee Member
Academic Affiliation
Non-Academic Affiliation
Subject
Rights Statement
Publisher
Peer Reviewed
Language
Replaces

Relationships

Parents:

This work has no parents.

In Collection:

Items