Graduate Thesis Or Dissertation
 

Robust Long Term Autonomy Through Local Behaviors

Public Deposited

https://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/q524jx251

Descriptions

Abstract
  • Multi-robot teams offer promising solutions for many long-term deployments in remote and dangerous domains, such as extraterrestrial or undersea exploration. However, long-term deployments present many problems that prevent robot teams from operating effectively. Learning over long time scales makes it difficult to assign credit to robots' actions, as rewards distributed at the end of a task do not clearly highlight which actions during deployment were most useful. Additionally, the environment is more likely to change during long deployments, so robots must recognize and adapt to new environments to continue their missions. To overcome these challenges, robots must learn with little feedback, integrate new information over time, and develop new strategies to handle unpredictable challenges. The first contribution of this work is Multi-Fitness Learning (MFL), a learning approach that handles the multi-faceted nature of long-term tasks, and their associated rewards, by exploiting the fact that agents solve one problem at a time. Instead of learning every solution in one monolithic policy, agents using MFL learn individual behaviors, then learn which behavior matters when. This approach is effective in sparsely rewarded long-term deployments with complex coupling between objectives. The second contribution, Reactive Multi-Fitness Learning (R-MFL), identifies and integrates new information into a learner in a non-forgetful manner. MFL's structure is advantageous because agents use multiple behaviors, but it is hampered when agent capabilities are fixed to the initial behavior set. R-MFL addresses this limitation by selecting behavior groups and executing a single behavior from the group. When adaptation is needed, new behaviors are added to the group. This adds new skills without forgetting or overwriting any previous behaviors, and preserves the team coordination learned across agents. The third contribution is a set of hardware experiments validating R-MFL's ability to identify skill gaps and integrate new behaviors by analyzing reward functions. An unexpected disturbance is presented to a pair of ground robots, preventing one of them from successfully observing its objective. Without sensors to directly measure this disturbance, the agents using R-MFL successfully analyze local rewards to ask for, and then integrate, a new behavior to overcome the disturbance. The final contribution of this dissertation is KS-ME, which generates new behaviors to autonomously overcome external disturbances identified by R-MFL. Searching the space of potential behaviors is difficult because robots cannot directly measure quality on the new task without physically trying a behavior. Consequently, agents must intelligently search the behavior space to generate as few candidate solutions as possible. By shaping the behavior space with knowledge about the performance of the current behaviors, agents are able to generate new behaviors that are novel with respect to the current behavior set. Overall, this dissertation addresses learning in complex, long-term tasks by operating on multiple rewards, determining when new behaviors are needed to overcome challenges, and generating new behaviors to match unknown reward functions.
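
The following is a minimal, hypothetical Python sketch of the behavior-selection structure described in the abstract, assuming a single behavior group per agent and a simple bandit-style selector over behaviors; all class and function names (RMFLAgent, act, update, add_behavior) are illustrative assumptions, not the dissertation's actual implementation.

    from dataclasses import dataclass, field
    from typing import Callable, List

    State = List[float]
    Action = List[float]
    Behavior = Callable[[State], Action]  # a pre-learned low-level policy


    @dataclass
    class RMFLAgent:
        """Keeps a set of behaviors and learns which one matters when."""
        behaviors: List[Behavior]
        scores: List[float] = field(default_factory=list)  # selector value per behavior

        def __post_init__(self) -> None:
            # One selector estimate per behavior in the initial set.
            if not self.scores:
                self.scores = [0.0] * len(self.behaviors)

        def act(self, state: State) -> Action:
            # MFL-style selection: choose a single behavior and execute only it.
            chosen = max(range(len(self.behaviors)), key=lambda i: self.scores[i])
            return self.behaviors[chosen](state)

        def update(self, chosen: int, local_reward: float, lr: float = 0.1) -> None:
            # Credit the selected behavior with a locally observed reward
            # rather than a single end-of-deployment signal.
            self.scores[chosen] += lr * (local_reward - self.scores[chosen])

        def add_behavior(self, new_behavior: Behavior) -> None:
            # R-MFL-style integration: append a new skill without overwriting
            # or retraining any existing behavior or its selector estimate.
            self.behaviors.append(new_behavior)
            self.scores.append(0.0)


    # Hypothetical usage: two initial behaviors, then one added after a disturbance.
    go_to_goal: Behavior = lambda state: [1.0, 0.0]
    observe_poi: Behavior = lambda state: [0.0, 1.0]

    agent = RMFLAgent(behaviors=[go_to_goal, observe_poi])
    action = agent.act([0.0, 0.0])                 # executes the currently preferred behavior
    agent.update(chosen=0, local_reward=0.5)
    agent.add_behavior(lambda state: [0.5, 0.5])   # e.g. produced by a KS-ME-like search

The key design point mirrored here is that add_behavior only appends: existing behaviors and their learned selector estimates are left untouched, which is how new skills can be integrated without forgetting.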
Embargo reason
  • Pending Publication
Embargo date range
  • 2022-12-09 to 2023-07-10
