Humans are remarkably efficient at learning by interacting with other people and observing their behavior. Children learn by watching their parents’ actions and mimicking their behavior. When they are unsure about their parents’ demonstrations, they communicate with them, ask questions, and learn from their feedback. Conversely, parents and teachers ask children to explain their behavior; these explanations help the parents judge whether the children have learned their task correctly. So why not build intelligent systems that learn from examples and interaction with humans, and explain their decisions to humans? This dissertation makes three contributions toward this goal.
The first contribution is the design of an intelligent system that incorporates human knowledge into the discovery of hierarchical structure in sequential decision problems. Given a set of expert demonstrations, we propose a new approach that learns a hierarchical policy by actively selecting demonstrations and using queries to explicate their intentional structure at selected points.
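To make the active-querying idea concrete, the following is a minimal illustrative sketch, not the dissertation's actual algorithm: given a posterior over candidate subtask labels at each step of a demonstration (the representation here is hypothetical), the learner queries the expert only at the time steps where its most probable label is uncertain.

```python
def select_query_points(trajectory, segment_posteriors, threshold=0.6):
    """Sketch of uncertainty-driven query selection (illustrative only).

    trajectory          -- sequence of time-step identifiers
    segment_posteriors  -- per step, a dict mapping candidate subtask
                           labels to probabilities (hypothetical format)
    threshold           -- query the expert when the top label's
                           probability falls below this value
    """
    queries = []
    for t, posterior in zip(trajectory, segment_posteriors):
        # Query only where the learner is unsure about the intention.
        if max(posterior.values()) < threshold:
            queries.append(t)
    return queries
```

For example, a step whose label distribution is nearly uniform would be selected for a query, while confidently labeled steps are left to the learned model.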
The second contribution is a generalization of the framework of adaptive submodularity. Adaptive submodular optimization, where a sequence of items is selected adaptively to optimize a submodular function, has found many applications, from sensor placement to active learning. We extend this work to the setting of multiple queries at each time step, where the set of available queries is randomly constrained. A primary contribution of this dissertation is to prove the first near-optimal approximation bound for a greedy policy in this setting. A natural application of this framework is the crowd-sourced active learning problem, where the set of available experts and examples may vary randomly. We instantiate the new framework for multi-label learning and evaluate it in multiple benchmark domains with promising results.
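The greedy policy in this setting can be sketched as follows. This is a simplified illustration under assumed details (coverage as the submodular utility, uniform random availability), not the dissertation's exact formulation: at each round only a random subset of queries is available, and the policy picks the available item with the largest marginal gain.

```python
import random

def greedy_adaptive_selection(items, coverage, rounds, k_available):
    """Greedy policy sketch for randomly constrained query sets.

    items       -- candidate queries (e.g., expert/example pairs)
    coverage    -- dict mapping each item to the set of targets it
                   covers; set coverage is a submodular utility
    rounds      -- number of selection rounds
    k_available -- how many queries happen to be available per round
                   (models the random constraint)
    """
    selected, covered = [], set()
    remaining = list(items)
    for _ in range(rounds):
        if not remaining:
            break
        # Only a random subset of queries is available this round.
        available = random.sample(remaining, min(k_available, len(remaining)))
        # Greedy choice: largest marginal gain in coverage.
        best = max(available, key=lambda it: len(coverage[it] - covered))
        selected.append(best)
        covered |= coverage[best]
        remaining.remove(best)
    return selected, covered
```

Because the marginal gain `len(coverage[it] - covered)` can only shrink as `covered` grows, the utility is submodular, which is what makes greedy selection amenable to the kind of approximation guarantee described above.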
The third contribution of this dissertation is the introduction of a framework for explaining the decisions of deep neural networks using human-recognizable visual concepts. Our approach, called interactive naming, enables human annotators to interactively group the excitation patterns of the neurons in the critical layer of the network into groups called "visual concepts". We performed two user studies of visual concepts produced by human annotators. We found that a large fraction of the activation maps have recognizable visual concepts, and that there is significant agreement between the different annotators about their denotations. Many of the visual concepts created by human annotators can be generalized reliably from a modest number of examples.
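One simple way to generalize a named concept from a modest number of examples is a nearest-centroid rule over activation-map embeddings. The sketch below is purely illustrative and is not claimed to be the dissertation's method; the embedding vectors and Euclidean distance are assumptions for the example.

```python
import math

def centroid(vectors):
    """Mean of a list of equal-length embedding vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def classify_concept(embedding, concept_examples):
    """Assign a new activation-map embedding to the nearest concept.

    concept_examples -- dict mapping a concept name (given by a human
                        annotator) to a small list of example embeddings
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    centroids = {name: centroid(vs) for name, vs in concept_examples.items()}
    return min(centroids, key=lambda name: dist(embedding, centroids[name]))
```

With only a handful of annotated examples per concept, a rule of this kind can already label new activation maps, which is the sense in which the named concepts generalize.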