Graduate Thesis Or Dissertation
 

Scaling multiagent reinforcement learning

https://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/r494vn275

Descriptions

Abstract
  • Reinforcement learning in real-world domains suffers from three curses of dimensionality: explosions in the state and action spaces, and high stochasticity, or "outcome space" explosion. Multiagent domains are particularly susceptible to these problems. This thesis describes ways to mitigate these curses in several multiagent domains, including real-time delivery of products using multiple vehicles with stochastic demands, a multiagent predator-prey domain, and a domain based on a real-time strategy game. Several approaches are presented, each targeting one or more of the curses. To mitigate state-space explosion, "tabular linear functions" (TLFs) are introduced; they generalize tile coding and linear value functions and allow learning of complex nonlinear functions in high-dimensional state spaces. It is also shown how to adapt TLFs to relational domains, creating a "lifted" version called relational templates. To mitigate action-space explosion, complete search of the joint action space is replaced with a form of hill climbing. To mitigate outcome-space explosion, a more efficient calculation of the expected value of the next state is shown, and two real-time dynamic programming algorithms based on afterstates, ASH-learning and ATR-learning, are introduced. Lastly, two approaches are presented that scale by treating a multiagent domain as being formed of several coordinating agents. The first comprises "Multiagent H-learning" and "Multiagent ASH-learning", in which coordination is achieved through a method called "serial coordination". This technique addresses all three curses of dimensionality simultaneously by reducing the space of states and actions each local agent must consider. The second approach is "assignment-based decomposition", which divides the action selection step into an assignment phase and a primitive action selection phase. Like the multiagent approach, assignment-based decomposition addresses all three curses of dimensionality simultaneously by reducing the space of states and actions each group of agents must consider. This method is capable of much more sophisticated coordination. Experimental results are presented that show successful application of all the methods described. These results demonstrate that the scaling techniques described in this thesis can greatly mitigate the three curses of dimensionality and allow solutions for multiagent domains to scale to large numbers of agents and to complex state and outcome spaces.
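The abstract mentions replacing complete joint-action search with a form of hill climbing but does not say which variant the thesis uses. The following is only a minimal sketch of one common form, changing one agent's action at a time. The names `ACTIONS`, `TARGETS`, `toy_value`, `exhaustive_search`, and `hill_climb` are hypothetical, and the toy value function stands in for whatever learned value function would score a joint action.

```python
import itertools
import random

ACTIONS = [0, 1, 2, 3]   # hypothetical per-agent action set
TARGETS = [2, 0, 3, 1]   # hypothetical best action for each agent (toy data)


def toy_value(joint):
    """Toy stand-in for the value of a joint action (higher is better)."""
    return -sum((a - t) ** 2 for a, t in zip(joint, TARGETS))


def exhaustive_search(num_agents, actions, value):
    """Complete search: scores all len(actions) ** num_agents joint actions."""
    return max(itertools.product(actions, repeat=num_agents), key=value)


def hill_climb(num_agents, actions, value, rng, max_sweeps=100):
    """Change one agent's action at a time, keeping only strict improvements."""
    joint = [rng.choice(actions) for _ in range(num_agents)]
    best = value(joint)
    for _ in range(max_sweeps):
        improved = False
        for i in range(num_agents):
            for a in actions:
                if a == joint[i]:
                    continue
                candidate = joint[:i] + [a] + joint[i + 1:]
                score = value(candidate)
                if score > best:
                    joint, best, improved = candidate, score, True
        if not improved:
            break  # local optimum: no single-agent change helps
    return tuple(joint), best


if __name__ == "__main__":
    found, score = hill_climb(len(TARGETS), ACTIONS, toy_value, random.Random(0))
    print(found, score)
    print(exhaustive_search(len(TARGETS), ACTIONS, toy_value))
```

Complete search scores every one of the |A|^n joint actions, whereas each hill-climbing sweep scores at most n(|A| - 1) candidates. The toy value here is separable across agents, so both searches agree; in general, hill climbing only guarantees a local optimum.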
