Learning from action not taken in multiagent systems

http://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/td96k517j

Descriptions

Attribute Name / Values
Creator
Abstract or Summary
  • Coordination in large multiagent systems to achieve a system-level goal is a critical challenge. Even when agents intend to cooperate, there is no guarantee that their actions will lead to a good system objective, especially as the system grows large. One of the primary difficulties in such multiagent systems is the slow learning process: agents must learn how to interact with other agents in a complex and dynamic system while adapting in the presence of other agents that are learning simultaneously. This thesis presents a multiagent learning approach that significantly improves both learning speed and system-level performance by having each agent update its reward estimate (e.g., the value function in reinforcement learning) for all of its available actions, not just the action it actually took. The agent receives rewards for the actions it did not take by estimating the counterfactual reward it would have received had it taken them. Experimental results illustrate that rewards on such "actions not taken" (ANT) are most helpful early in the learning process. Agents can also use their team members to estimate these rewards, in effect learning as a team. Finally, it is shown that fast learning is essential in a dynamic environment: the ANT reward with teams improves learning speed, which yields greater stability in tracking changes in such an environment.
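  • The abstract's core idea — updating value estimates for every available action, not just the one taken, using a counterfactual reward estimate — can be illustrated with a minimal sketch. This is not the thesis's exact algorithm; the class name `ANTLearner`, the caller-supplied `counterfactual` function, and the bandit-style setting are all illustrative assumptions:

    ```python
    import random

    class ANTLearner:
        """Bandit-style learner with Action-Not-Taken (ANT) updates.

        After each step, the action actually taken is updated with its
        observed reward, and every untaken action is updated toward a
        counterfactual estimate of the reward it would have produced.
        """

        def __init__(self, n_actions, alpha=0.1, epsilon=0.1):
            self.values = [0.0] * n_actions   # estimated reward per action
            self.alpha = alpha                # learning rate
            self.epsilon = epsilon            # exploration probability

        def select(self):
            # Epsilon-greedy action selection.
            if random.random() < self.epsilon:
                return random.randrange(len(self.values))
            return max(range(len(self.values)), key=lambda a: self.values[a])

        def update(self, taken, reward, counterfactual):
            # Standard update for the action actually taken ...
            self.values[taken] += self.alpha * (reward - self.values[taken])
            # ... plus ANT updates: every other action moves toward the
            # reward the agent estimates it *would* have received.
            # `counterfactual(a)` is a hypothetical hook; in the thesis this
            # estimate can come from the agent's team members.
            for a in range(len(self.values)):
                if a != taken:
                    est = counterfactual(a)
                    self.values[a] += self.alpha * (est - self.values[a])
    ```

    Because every action's estimate is refreshed on every step, the learner gathers information about the whole action set from a single interaction, which is what drives the early-learning speedup the abstract describes.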
Resource Type
Date Available
Date Copyright
Date Issued
Degree Level
Degree Name
Degree Field
Degree Grantor
Commencement Year
Advisor
Committee Member
Academic Affiliation
Non-Academic Affiliation
Keyword
Subject
Rights Statement
Language
Replaces
Additional Information
  • description.provenance : Submitted by Newsha Khani (khanin@onid.orst.edu) on 2009-06-26T18:21:39Z No. of bitstreams: 1 Newsha_thesis.pdf: 323110 bytes, checksum: e790355acf892fb5b4abebe774097d56 (MD5)
