Multi-criteria average reward reinforcement learning

http://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/4q77fv17v

Abstract
  • Reinforcement learning (RL) is the study of systems that learn from interaction with their environment. The standard RL framework assumes a scalar reward signal that the agent aims to maximize, but many real-world problems require trading off multiple objectives. This calls for vector-valued rewards and values, with a weight vector expressing the relative importance of the objectives. In this thesis, we consider the problem of learning in the presence of time-varying preferences among multiple objectives. Learning a new policy for every possible weight vector is wasteful. Instead, we propose a method that stores a finite number of policies, chooses an appropriate stored policy for any given weight vector, and improves upon it. The key observation is that although there are infinitely many weight vectors, many of them share the same optimal policy. We demonstrate this empirically in two domains: a version of the Buridan's ass problem and network routing. We show that while learning is required for the first few weight vectors, the agent later settles on an already-learned policy and thus converges very quickly.
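The policy-selection step described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the policy names and average-reward vectors are hypothetical, and each stored policy is represented simply as a pair of a name and its vector of per-objective average rewards.

```python
import numpy as np

def best_stored_policy(stored_policies, w):
    """Return the stored policy whose average-reward vector scores highest
    under weight vector w. Each policy is a hypothetical (name, vector) pair;
    the weighted value of a policy is the dot product of its vector with w."""
    return max(stored_policies, key=lambda p: np.dot(p[1], w))

# Hypothetical stored policies with two-objective average-reward vectors.
policies = [
    ("objective_a_heavy", np.array([0.9, 0.1])),
    ("objective_b_heavy", np.array([0.2, 0.8])),
    ("balanced",          np.array([0.6, 0.6])),
]

# For equal preference over the two objectives, the balanced policy wins
# (0.6 > 0.5 for either specialized policy), so no new learning is needed.
w = np.array([0.5, 0.5])
name, value_vec = best_stored_policy(policies, w)
```

The selected policy would then serve as the starting point for further improvement under the new weights, rather than learning from scratch.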
Digitization Specifications
  • File scanned at 300 ppi (Monochrome, 8-bit Grayscale) using ScandAll PRO 1.8.1 on a Fi-6770A in PDF format. CVista PdfCompressor 4.0 was used for pdf compression and textual OCR.
