Using RUBI to partition agents in air traffic problems with hard constraints and reward shaping

Curran, William

Graduate Thesis Or Dissertation

Using RUBI to partition agents in air traffic problems with hard constraints and reward shaping

公开 Deposited

下载PDF文件

Citeable URL: https://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/c534fr33x

Descriptions

Attribute Name	Values
Creator	Curran, William
Abstract	Air traffic flow management over the U.S. airpsace is a difficult problem. Current management approaches lead to hundreds of thousands of hours of delay, costing billions of dollars annually. Weather and airport conditions may instigate this delay, but routing decisions balancing delay with congestion contribute significantly to the propagation of delays throughout the US airspace. The task of managing delay may be seen as a multiagent congestion problem. In this problem there are many tightly coupled agents whose actions collectively impact the system, making it difficult for agents to learn how they individually affect the system. Reward shaping is effective at improving agent learning for soft constraint problems by reducing noise caused by interactions with other agents, so we extend those results to hard constraints that cannot be easily learned, and must be enforced algorithmically. Additionally, congestion must be removed from the system in order to ensure a safe environment for all aircraft. This can be done through the use of a greedy scheduler, enforcing a ground delay on each plane to remove congestion from the system, at the cost of delay. Reward shaping reduces noise caused by agent interactions, and a greedy scheduler removes congestion from the system, but these two approaches cannot be simply combined without a large increase in computational complexity. Agent partitioning can be used to alleviate this complexity by treating each partition of agents as an independent set of agents from other partitions, thus simplifying the computation during each learning step. Our approach is based on the combination of three different methods to perform hard constraint optimization: a greedy scheduler, reward shaping and agent partitioning. We present two agent partitioning algorithms in conjunction with this approach to simplify the learning domain. The first partitioning algorithm uses system features to compute similarities between agents. This is then used to partition agents into small subsets. To remove the assumption of having domain knowledge, we introduce the Reward/Utility Based Impact (RUBI) algorithm. This algorithm develops an effective similarity matrix while requiring no prior domain knowledge. This similarity matrix can then be used in any similarity based partitioning algorithm. Our results show that autonomous partitioning of the agents using system features leads to up to 450x speed over simple hard constraint enforcement, as well as up to a 37% improvement in performance over a greedy scheduling solution, while using RUBI to partition agents led to a better partitioning of agents, as well as a 510x speed up. Both results correspond to hundreds of hours of delay saved in a single day.
License	All rights reserved
Resource Type	Masters Thesis
Date Available	2013-07-02T18:17:35+00:00
Date Issued	2013-06-04
Degree Level	Master's
Degree Name	Master of Science (M.S.)
Degree Field	Computer Science
Degree Grantor	Oregon State University
Commencement Year	2013
Advisor	Tumer, Kagan
Committee Member	Smart, Bill Tadepalli, Prasad Vergara, Hector
Academic Affiliation	Electrical Engineering and Computer Science
Non-Academic Affiliation	Oregon State University. Graduate School
Subject	Multiagent systems Reinforcement learning Air traffic control -- Computer programs
权利声明	In Copyright
Publisher	Oregon State University
Peer Reviewed	No
Language	English [eng]
Replaces	http://hdl.handle.net/1957/40034

关联

Parents:

This work has no parents.

属于 Collection:

Graduate Theses and Dissertations (GTD)

单件

缩略图	标题	上传日期	公开度	行动
	CurranWilliamJ2013.pdf	2017-08-04	公开	下载

蹄兔

Using RUBI to partition agents in air traffic problems with hard constraints and reward shaping

可下载的内容

Descriptions

关联

单件