Graduate Thesis Or Dissertation
 

Adiabatic Markov Decision Process : convergence of value iteration algorithm

Publicly Deposited

Downloadable Content

Download PDF
https://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/rf55zb80p

Descriptions

Abstract
  • The Markov Decision Process (MDP) is a well-known framework for devising optimal decision-making strategies under uncertainty. Typically, the decision maker assumes a stationary environment, characterized by a time-invariant transition probability matrix. In many real-world scenarios, however, this assumption is not justified, so the strategy computed under it may not deliver the expected performance. In this thesis, we study the performance of the classic Value Iteration algorithm for solving an MDP in non-stationary environments. Specifically, the non-stationary environment is modeled as a sequence of time-varying transition probability matrices governed by an adiabatic evolution, a notion inspired by quantum mechanics. We characterize the performance of the Value Iteration algorithm as a function of the rate of change of the underlying environment, measuring performance by the rate of convergence to the optimal average reward. We show two examples of queuing systems that make use of our analysis framework.
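
As a rough illustration of the setting described in the abstract, the sketch below runs relative value iteration (a standard average-reward variant of Value Iteration) while the transition matrices drift slowly from one model to another. The linear interpolation schedule, the function names, and the array layout are assumptions made here for illustration only; the thesis's own adiabatic model and convergence analysis are not reproduced.

```python
import numpy as np


def bellman_backup(P, R, h):
    """One average-reward Bellman backup.

    P: transition probabilities, shape (A, S, S), P[a, s, t] = Pr(t | s, a)
    R: rewards, shape (S, A)
    h: current relative value function, shape (S,)
    Returns the backed-up values and the greedy action per state.
    """
    # Q[s, a] = R[s, a] + sum_t P[a, s, t] * h[t]
    Q = R + np.einsum("ast,t->sa", P, h)
    return Q.max(axis=1), Q.argmax(axis=1)


def drifting_value_iteration(P_start, P_end, R, n_steps, ref_state=0):
    """Relative value iteration with one backup per environment step.

    Hypothetical drift model: P_k moves linearly from P_start to P_end
    over n_steps iterations, so a larger n_steps means a slower change.
    """
    n_states = R.shape[0]
    h = np.zeros(n_states)
    policy = np.zeros(n_states, dtype=int)
    gains = []
    for k in range(n_steps):
        t = k / max(n_steps - 1, 1)
        # A convex combination of row-stochastic matrices is row-stochastic.
        P_k = (1.0 - t) * P_start + t * P_end
        Th, policy = bellman_backup(P_k, R, h)
        # The backed-up value at the reference state estimates the average reward.
        gains.append(Th[ref_state])
        # Subtract it so the iterates stay bounded.
        h = Th - Th[ref_state]
    return np.array(gains), h, policy
```

Passing the same array as `P_start` and `P_end` recovers the stationary case, which gives a baseline against which the average-reward estimates under drift can be compared.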

Relationships

Parents:

This work has no parents.

In Collection:

Items