Graduate Thesis Or Dissertation
 

Efficiently Learning Human Preferences for Robot Autonomy

https://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/7s75dm176

Descriptions

Abstract
  • Human-robot teams are invaluable for mapping unknown environments, exploring difficult-to-reach areas, and manipulating inaccessible equipment. However, guiding autonomous robots requires dealing with these dynamic domains while synthesizing a significant amount of data and balancing competing objectives. Current mission planning methods often involve manually specifying low-level parameters of the mission, such as exact waypoints or control inputs. These methods cannot fully cope with the changing surroundings and limited communications that come with operating in these complex conditions. To address this and reduce the burden on human operators, the field has trended towards ever-increasing levels of autonomy. Providing this long-term autonomy requires more usable, robust collaborative mission planning solutions that leverage the strengths of both the robot and the human operator. In this thesis, we propose two novel methods for improving the collaboration of human-robot teams by enabling the robot to learn an operator's preferences for mission planning. These techniques provide the robot with a rich representation of the human's goals while using familiar techniques to speed learning. The first method is trained by making small-scale, iterative improvements to candidate mission plans generated by the robot, similar to the small improvements an operator would make while planning an actual mission. Using a novel coactive learning algorithm, the method learns the operator's preferences from the feature differences between the original and improved mission plans while remaining robust to errors and noise in the operator's corrections. The second proposed method simplifies the queries by asking survey-style rating and ranking questions about candidate plans. These queries are generated by a Gaussian process (GP) active learner that uses the responses to learn the most preferred region of the mission preference space.
The ranking query responses provide the GP with general relational information about several points in the preference space, while the rating query responses provide a specific preference about a single point. A custom probit allows the GP to incorporate the different strengths of each query type into a single preference model. Tests in simulated lake-monitoring missions show that these methods can efficiently and accurately learn an operator's preferences. Additionally, a field trial in which an EcoMapper autonomous underwater vehicle monitors the ecology of a lake validates the use of the coactive learning method. The results demonstrate that these techniques can enable a robot to accurately learn a human operator's preferences, then autonomously plan and perform missions that apply those preferences without relying on regular intervention by the operator.
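The idea of learning from the feature difference between a proposed plan and the operator's improved plan can be illustrated with a minimal sketch. The code below is not the thesis's novel algorithm; it assumes the standard perceptron-style coactive update, in which the weight vector is moved by the difference between the improved and the proposed plan's features. Every name in it (`best_plan`, `operator_improvement`, `coactive_update`, `run`), the random three-dimensional plan features, and the simulated operator with hidden weights are hypothetical and chosen only for illustration.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def best_plan(weights, candidates):
    """Return the candidate feature vector with the highest utility under weights."""
    return max(candidates, key=lambda phi: dot(weights, phi))


def operator_improvement(true_weights, proposed, candidates):
    """Simulated operator (an assumption): return the candidate that improves on
    the proposal by the smallest amount under the hidden true weights, or the
    proposal itself when no candidate is better."""
    base = dot(true_weights, proposed)
    better = [c for c in candidates if dot(true_weights, c) > base]
    if not better:
        return proposed
    return min(better, key=lambda c: dot(true_weights, c))


def coactive_update(weights, proposed, improved):
    """Perceptron-style coactive update: w <- w + (phi(improved) - phi(proposed))."""
    return [w + (i - p) for w, p, i in zip(weights, proposed, improved)]


def run(rounds=200, n_candidates=20, seed=0, true_weights=(1.0, -0.5, 0.25)):
    """Simulate repeated propose/improve/update rounds and record per-round regret."""
    rng = random.Random(seed)
    weights = [0.0] * len(true_weights)
    regrets = []
    for _ in range(rounds):
        # Hypothetical candidate plans, each reduced to a random feature vector.
        candidates = [
            [rng.uniform(-1.0, 1.0) for _ in true_weights]
            for _ in range(n_candidates)
        ]
        proposed = best_plan(weights, candidates)
        improved = operator_improvement(true_weights, proposed, candidates)
        optimal = best_plan(true_weights, candidates)
        # Regret: utility lost by proposing this plan instead of the best one.
        regrets.append(dot(true_weights, optimal) - dot(true_weights, proposed))
        weights = coactive_update(weights, proposed, improved)
    return weights, regrets


if __name__ == "__main__":
    learned, regrets = run()
    print("learned weights:", learned)
    print("mean regret, first 20 rounds:", sum(regrets[:20]) / 20)
    print("mean regret, last 20 rounds:", sum(regrets[-20:]) / 20)
```

In this sketch the operator never states the hidden weights; the learner sees only which plan was preferred over its proposal. Handling noisy or erroneous corrections, which the abstract names as a goal, would need changes beyond this plain update.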
Funding Statement (additional comments about funding)
  • This work was supported in part by NSF grant IIS-1317815 and Office of Naval Research grant N00014-14-1-0905.