Scaling multiagent reinforcement learning

Proper, Scott

Graduate Thesis Or Dissertation

Public Deposited

Citeable URL: https://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/r494vn275

Descriptions

Attribute Name	Values
Creator	Proper, Scott
Abstract	Reinforcement learning in real-world domains suffers from three curses of dimensionality: explosions in the state and action spaces, and high stochasticity, or "outcome space" explosion. Multiagent domains are particularly susceptible to these problems. This thesis describes ways to mitigate these curses in several different multiagent domains, including real-time delivery of products using multiple vehicles with stochastic demands, a multiagent predator-prey domain, and a domain based on a real-time strategy game. It presents several approaches that mitigate each of these curses. To mitigate the problem of state-space explosion, "tabular linear functions" (TLFs) are introduced that generalize tile-coding and linear value functions and allow learning of complex nonlinear functions in high-dimensional state spaces. It is also shown how to adapt TLFs to relational domains, creating a "lifted" version called relational templates. To mitigate the problem of action-space explosion, the replacement of complete joint action space search with a form of hill climbing is described. To mitigate the problem of outcome space explosion, a more efficient calculation of the expected value of the next state is shown, and two real-time dynamic programming algorithms based on afterstates, ASH-learning and ATR-learning, are introduced. Lastly, two approaches that scale by treating a multiagent domain as being formed of several coordinating agents are presented. In the first, "Multiagent H-learning" and "Multiagent ASH-learning" are described, where coordination is achieved through a method called "serial coordination". This technique has the benefit of addressing all three curses of dimensionality simultaneously by reducing the space of states and actions each local agent must consider.
The second approach to multiagent coordination presented is "assignment-based decomposition", which divides the action selection step into an assignment phase and a primitive action selection step. Like the multiagent approach, assignment-based decomposition addresses all three curses of dimensionality simultaneously by reducing the space of states and actions each group of agents must consider. This method is capable of much more sophisticated coordination. Experimental results are presented that show successful application of all methods described. These results demonstrate that the scaling techniques described in this thesis can greatly mitigate the three curses of dimensionality and allow solutions for multiagent domains to scale to large numbers of agents and to complex state and outcome spaces.
License	All rights reserved
Resource Type	Dissertation
Date Available	2009-12-17T17:43:35+00:00
Date Issued	2009-12-01
Degree Level	Doctoral
Degree Name	Doctor of Philosophy (Ph.D.)
Degree Field	Computer Science
Degree Grantor	Oregon State University
Commencement Year	2010
Advisor	Tadepalli, Prasad
Committee Member	Metoyer, Ron; Dietterich, Thomas; Fern, Alan; Higginbotham, Jack
Academic Affiliation	Electrical Engineering and Computer Science
Non-Academic Affiliation	Oregon State University. Graduate School
Subject	Reinforcement learning; Intelligent agents (Computer software)
Rights Statement	In Copyright
Publisher	Oregon State University
Peer Reviewed	No
Language	English [eng]
Replaces	http://hdl.handle.net/1957/13662

Relationships

Parents:

This work has no parents.

In Collection:

Graduate Theses and Dissertations (GTD)

Items

Title	Date Uploaded	Visibility
ProperScott2010.pdf	2017-08-01	Public

ScholarsArchive@OSU
