CLEAN learning to improve coordination and scalability in multiagent systems

http://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/0p096994s

Abstract
Recent advances in multiagent learning have led to exciting new capabilities spanning fields as diverse as planetary exploration, air traffic control, military reconnaissance, and airport security. Such algorithms provide a tangible benefit over traditional control algorithms in that they allow fast responses, adapt to dynamic environments, and generally scale well. Unfortunately, because many existing multiagent learning methods are extensions of single-agent approaches, they are inhibited by three key issues: (i) they treat the actions of other agents as "environmental noise" in an attempt to reduce problem complexity, (ii) they are slow to converge in large systems because the joint action space grows exponentially in the number of agents, and (iii) they frequently rely upon an accurate system model being readily available.

This work addresses these three issues in turn. First, we improve learning performance over existing state-of-the-art techniques by embracing exploration in learning rather than ignoring it or approximating it away. Within multiagent systems, exploration by individual agents significantly alters the dynamics of the environment in which all agents learn. To address this, we introduce the concept of "private" exploration, which enables each agent to present a stationary baseline policy to the other agents so that they can learn more efficiently. In particular, we introduce Coordinated Learning without Exploratory Action Noise (CLEAN) rewards, which improve coordination and performance by using private exploration to remove the negative impact of traditional "public" exploration strategies on learning in multiagent systems.

Next, we leverage the properties of CLEAN rewards that enable private exploration to let agents evaluate multiple candidate actions concurrently in a "batch mode," significantly improving learning speed over the state of the art. Finally, we improve the real-world applicability of these techniques by reducing their requirements. Specifically, computing CLEAN rewards requires an accurate partial model of the system (i.e., an accurate model of the system objective). Unfortunately, many real-world systems are too complex to model or are not known in advance, so an accurate system model is not available a priori. We address this shortcoming by employing model-based reinforcement learning techniques that enable agents to construct their own approximate model of the system objective from their observations and to use this approximate model to calculate their CLEAN rewards.
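The private-exploration and batch-mode ideas described above can be sketched on a toy congestion problem. Everything below is an illustrative assumption, not the dissertation's actual formulation: the objective G, the per-resource capacity, the greedy baseline policy, and the simple averaging update are hypothetical stand-ins chosen to make the mechanism concrete.

```python
import random

# Toy sketch of CLEAN-style learning with private exploration.
# N_AGENTS agents each choose one of N_ACTIONS congestible resources;
# the system objective rewards use up to CAPACITY and penalizes overflow.
N_AGENTS = 10
N_ACTIONS = 3      # three congestible resources agents may choose among
CAPACITY = 4       # assumed per-resource ideal load

def G(joint_action):
    """Toy system objective: credit each resource for its load up to
    CAPACITY, and penalize any load beyond it."""
    loads = [joint_action.count(a) for a in range(N_ACTIONS)]
    return sum(min(l, CAPACITY) - max(0, l - CAPACITY) for l in loads)

def clean_reward(public, i, c):
    """CLEAN-style reward for agent i privately evaluating action c:
    the objective under the counterfactual where only agent i's action
    is swapped, minus the objective the public joint action actually
    produced. No exploratory noise is injected into what others observe."""
    counterfactual = list(public)
    counterfactual[i] = c
    return G(counterfactual) - G(public)

random.seed(0)
# Small random initial values break symmetry between identical agents.
values = [[random.uniform(0, 0.01) for _ in range(N_ACTIONS)]
          for _ in range(N_AGENTS)]
ALPHA = 0.1

def greedy(i):
    return max(range(N_ACTIONS), key=lambda a: values[i][a])

for episode in range(200):
    # Public joint action: every agent executes its stationary greedy
    # policy, presenting a fixed baseline to the other agents.
    public = [greedy(i) for i in range(N_AGENTS)]
    for i in range(N_AGENTS):
        # "Batch mode" private exploration: evaluate every candidate
        # action concurrently against the others' fixed public actions.
        for c in range(N_ACTIONS):
            r = clean_reward(public, i, c)
            values[i][c] += ALPHA * (r - values[i][c])

final = [greedy(i) for i in range(N_AGENTS)]
print("final objective:", G(final), "(best possible for this toy is 10)")
```

The key point of the sketch is that the exploratory evaluation of every candidate action happens offline, against the fixed public actions of the other agents, rather than by acting noisily in the shared environment.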
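For the final contribution, a minimal sketch of how an agent might replace the exact objective with a learned approximation when computing CLEAN rewards. The tabular load-profile model below is a hypothetical stand-in for the model-based reinforcement learning techniques the abstract describes, and the sample values are invented for illustration:

```python
from collections import defaultdict

class ObjectiveModel:
    """Approximates the system objective G as the running mean of
    observed values for each load profile (how many agents chose
    each action), built purely from an agent's own observations."""
    def __init__(self, n_actions):
        self.n_actions = n_actions
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    def _key(self, joint_action):
        # Load profile: occurrence count of each action.
        return tuple(joint_action.count(a) for a in range(self.n_actions))

    def observe(self, joint_action, g):
        key = self._key(joint_action)
        self.totals[key] += g
        self.counts[key] += 1

    def predict(self, joint_action):
        key = self._key(joint_action)
        if self.counts[key] == 0:
            return 0.0      # neutral default for unseen profiles
        return self.totals[key] / self.counts[key]

def approx_clean_reward(model, public, i, c):
    """CLEAN reward computed from the learned model instead of the
    true objective, which may be unknown or too complex to specify."""
    counterfactual = list(public)
    counterfactual[i] = c
    return model.predict(counterfactual) - model.predict(public)

model = ObjectiveModel(n_actions=2)
model.observe([0, 0, 1], 5.0)   # profile (2, 1) observed with G = 5
model.observe([1, 0, 0], 7.0)   # same profile (2, 1), G = 7
model.observe([1, 0, 1], 3.0)   # profile (1, 2), G = 3
print(approx_clean_reward(model, [0, 0, 1], 1, 1))  # → -3.0
```

The tabular average is deliberately crude; the point is only that once an approximate model of the objective exists, the counterfactual swap that defines the CLEAN reward can be evaluated against the model rather than against a hand-supplied system description.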
