Humans are remarkably efficient at learning by interacting with other people and
observing their behavior. Children learn by watching their parents' actions and mimicking
their behavior. When they are unsure about their parents' demonstrations, they communicate
with them, ask questions, and learn from their feedback. Conversely, parents and teachers
ask children to explain their behavior. These explanations help the parents know whether the
children have learned their task correctly. So, why not build intelligent systems that learn
from examples and interaction with humans, and explain their decisions to humans? This
dissertation makes three contributions toward this goal.
The first contribution is toward designing an intelligent system that incorporates
human knowledge into the discovery of hierarchical structure in sequential decision problems.
Given a set of expert demonstrations, we propose a new approach that learns a hierarchical
policy by actively selecting demonstrations and using queries to explicate their intentional
structure at selected points.
The second contribution is a generalization of the framework of adaptive submodularity.
Adaptive submodular optimization, where a sequence of items is selected adaptively to
optimize a submodular function, has been found to have many applications from sensor
placement to active learning. We extend this work to the setting of multiple queries
at each time step, where the set of available queries is randomly constrained. A primary
contribution of this work is to prove the first near-optimal approximation bound for a greedy
policy in this setting. A natural application of this framework is the crowd-sourced active
learning problem, where the set of available experts and examples may vary randomly. We
instantiate the new framework for multi-label learning and evaluate it in multiple benchmark
domains with promising results.
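To make the greedy policy mentioned above concrete, the following is a minimal sketch of greedy submodular maximization in its simplest non-adaptive form, using set coverage as the submodular objective; the function names `marginal_gain` and `greedy_select` are illustrative and not taken from the dissertation, which treats the richer adaptive, randomly constrained setting.

```python
def marginal_gain(covered, item):
    """Marginal gain of a set-valued item: number of new elements it covers."""
    return len(item - covered)

def greedy_select(items, k):
    """Greedily pick up to k items, each maximizing marginal coverage gain."""
    chosen, covered = [], set()
    remaining = list(items)
    for _ in range(k):
        best = max(remaining, key=lambda it: marginal_gain(covered, it))
        if marginal_gain(covered, best) == 0:
            break  # no remaining item adds new coverage
        chosen.append(best)
        covered |= best
        remaining.remove(best)
    return chosen, covered
```

For monotone submodular objectives such as coverage, this greedy rule achieves the classic (1 - 1/e) approximation guarantee; the dissertation's contribution is an analogous near-optimal bound when the selection is adaptive and the available queries are randomly constrained at each step.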
The third contribution of this dissertation is the introduction of a framework for explaining
the decisions of deep neural networks using human-recognizable visual concepts. Our
approach, called interactive naming, is based on enabling human annotators to interactively
group the excitation patterns of the neurons in the critical layer of the network into groups
called "visual concepts". We performed two user studies of visual concepts produced by
human annotators. We found that a large fraction of the activation maps have recognizable
visual concepts, and that there is significant agreement between the different annotators
about their denotations. Many of the visual concepts created by human annotators can be
generalized reliably from a modest number of examples.