Multi-criteria average reward reinforcement learning

Natarajan, Sriraam

Graduate Thesis Or Dissertation

Multi-criteria average reward reinforcement learning

Public Deposited

Télécharger le fichier PDF

Citeable URL: https://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/4q77fv17v

Descriptions

Attribute Name	Values
Creator	Natarajan, Sriraam
Abstract	Reinforcement learning (RL) is the study of systems that learn from interaction with their environment. The current framework of Reinforcement Learning is based on receiving scalar rewards, which the agent aims to maximize. But in many real world situations, tradeoffs must be made among multiple objectives. This necessitates the use of vector representation of values and rewards and the use of weights to represent the importance of different objectives. In this thesis, we consider the problem of learning in the presence of time-varying preferences among multiple objectives. Learning a new policy for every possible weight vector is wasteful. Instead we propose a method that allows us store a finite number of policies, choose an appropriate policy for any weight vector and improve upon it. The idea is that though there can be infinitely many weight vectors, a lot of them will have the same optimal policy. We prove this empirically in two domains: a version of the Buridan's ass problem and network routing. We show that while learning is required for the first few weight vectors, later the agent would settle for an already learnt policy and thus would converge very quickly.
Resource Type	Masters Thesis
Date Available	2011-08-25T23:16:36+00:00
Date Issued	2004-06-02
Degree Level	Master's
Degree Name	Master of Science (M.S.)
Degree Field	Computer Science
Degree Grantor	Oregon State University
Commencement Year	2005
Advisor	Tadepalli, Prasad
Committee Member	Sethia, Saurabh Bolte, John Dietterich, Thomas G.
Academic Affiliation	Electrical Engineering and Computer Science
Non-Academic Affiliation	Oregon State University. Graduate School
Subject	Reinforcement learning (Machine learning)
Déclaration de droits	Copyright Not Evaluated
Publisher	Oregon State University
Peer Reviewed	No
Language	English [eng]
Digitization Specifications	File scanned at 300 ppi (Monochrome, 8-bit Grayscale) using ScandAll PRO 1.8.1 on a Fi-6770A in PDF format. CVista PdfCompressor 4.0 was used for pdf compression and textual OCR.
Replaces	http://hdl.handle.net/1957/22859

Des relations

Parents:

This work has no parents.

Dans Collection:

Graduate Theses and Dissertations (GTD)

Articles

La vignette	Titre	Date de téléchargement	Visibilité	actes
	NatarajanSriraam2005.pdf	2017-08-15	Public	Télécharger

Hyrax

Multi-criteria average reward reinforcement learning

Contenu téléchargeable

Descriptions

Des relations

Articles