Adiabatic Markov Decision Process : convergence of value iteration algorithm

Duong, Thai

Graduate Thesis Or Dissertation

Public Deposited

Citeable URL: https://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/rf55zb80p

Descriptions

Attribute Name	Values
Creator	Duong, Thai
Abstract	The Markov Decision Process (MDP) is a well-known framework for devising optimal decision-making strategies under uncertainty. Typically, the decision maker assumes a stationary environment characterized by a time-invariant transition probability matrix. However, in many real-world scenarios this assumption is not justified, so the optimal strategy might not deliver the expected performance. In this thesis, we study the performance of the classic Value Iteration algorithm for solving an MDP under non-stationary environments. Specifically, the non-stationary environment is modeled as a sequence of time-varying transition probability matrices governed by an adiabatic evolution inspired by quantum mechanics. We characterize the performance of the Value Iteration algorithm with respect to the rate of change of the underlying environment. Performance is measured in terms of the convergence rate to the optimal average reward. We show two examples of queuing systems that make use of our analysis framework.
License	All rights reserved
Resource Type	Masters Thesis
Date Available	2013-07-02T22:15:57+00:00
Date Issued	2013-06-10
Degree Level	Master's
Degree Name	Master of Science (M.S.)
Degree Field	Electrical and Computer Engineering
Degree Grantor	Oregon State University
Commencement Year	2013
Advisor	Nguyen, Thinh P.
Committee Member	Bose, Bella; Tadepalli, Prasad; Kovchegov, Yevgeniy; Raich, Raviv
Academic Affiliation	Electrical Engineering and Computer Science
Non-Academic Affiliation	Oregon State University. Graduate School
Subject	Queuing theory; Decision making -- Mathematical models; Markov processes
Rights Statement	In Copyright
Publisher	Oregon State University
Peer Reviewed	No
Language	English [eng]
Replaces	http://hdl.handle.net/1957/40061
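
To make the setting in the abstract concrete, the following is a minimal sketch of value iteration running while the transition probabilities change slowly between iterations. It is not taken from the thesis: the linear drift from one kernel to another (`interpolate`), the use of relative value iteration with state 0 as the reference state, and all function and variable names are illustrative assumptions.

```python
def bellman_backup(P, R, h):
    """One average-reward Bellman backup.

    P[a][s][t]: probability of moving from state s to state t under action a.
    R[a][s]:    expected one-step reward for taking action a in state s.
    h:          current relative value estimates, one per state.
    """
    n_states = len(h)
    return [
        max(
            R[a][s] + sum(P[a][s][t] * h[t] for t in range(n_states))
            for a in range(len(P))
        )
        for s in range(n_states)
    ]


def interpolate(P0, P1, weight):
    """Convex combination of two kernels (assumed drift model).

    Each row of the result still sums to 1 when 0 <= weight <= 1.
    """
    return [
        [[(1 - weight) * p0 + weight * p1 for p0, p1 in zip(row0, row1)]
         for row0, row1 in zip(mat0, mat1)]
        for mat0, mat1 in zip(P0, P1)
    ]


def drifting_value_iteration(P0, P1, R, n_steps):
    """Relative value iteration while the kernel drifts from P0 to P1.

    A larger n_steps means a smaller change per iteration, i.e. a slower
    (more nearly adiabatic) environment. Returns the last gain estimate
    and the last relative values.
    """
    h = [0.0] * len(R[0])
    gain = 0.0
    for k in range(n_steps):
        P_k = interpolate(P0, P1, k / max(n_steps - 1, 1))
        backed_up = bellman_backup(P_k, R, h)
        gain = backed_up[0]  # state 0 serves as the reference state
        h = [value - gain for value in backed_up]
    return gain, h


# Hypothetical two-state, two-action example.
P_start = [[[0.5, 0.5], [0.5, 0.5]], [[0.5, 0.5], [0.5, 0.5]]]
P_end = [[[0.0, 1.0], [0.0, 1.0]], [[0.0, 1.0], [0.0, 1.0]]]
R = [[1.0, 0.0], [0.0, 2.0]]
gain, h = drifting_value_iteration(P_start, P_end, R, n_steps=50)
```

Passing the same kernel as both `P0` and `P1` recovers ordinary relative value iteration on a stationary MDP, which gives a baseline against which the drifting case can be compared.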

Relationships

Parents:

This work has no parents.

In Collection:

Graduate Theses and Dissertations (GTD)

Items

Thumbnail	Title	Date Uploaded	Visibility	Actions
	DuongThaiP2013.pdf	2017-08-16	Public	Download
