Natural Language Comprehension is a challenging domain of Natural Language Processing. To improve a model’s language comprehension/understanding, one approach would be to enrich the structure of the model to enhance its capability in learning the latent rules of the language.
In this dissertation, we will ﬁrst introduce several deep models for a variety of natural language comprehension tasks including natural language inference and question answering. Previous approaches employ reading mechanisms that do not fully exploit the interdependencies between the input sources like “premise and hypothesis” or “document and query”. In contrast, we explore more sophisticated reading mechanisms to efﬁciently model the relationships between input sources (e.g. “premise and hypothesis” or “document and query”). These mechanisms and models yield better empirical performances, however, due to the black-box nature of deep learning, it is difﬁcult to assess whether the improved models indeed acquire a better understanding of language. Meanwhile, data is often plagued by meaningless or even harmful statistical biases and deep models might achieve high performance by focusing on the biases. This motivates us to study methods for “peaking inside” the black-box deep models to provide explanation and understanding of the models’ behavior. The proposed method (a.k.a. saliency) takes a step toward explaining deep learning-based models based on gradient of the model output with respect to different components like the input layer and inter-mediate layers. Saliency reveals interesting insights and identiﬁes critical information contributing to the model decisions. Besides proposing a model-agnostic interpretation method (saliency), we study model-dependent interpretation solutions and propose two interpretable designs and structures. Finally, we introduce a novel mechanism (saliency learning), which learns from ground-truth explanation signal such that the learned model will not only make the right prediction but also for the right reason. Our experimental results on multiple tasks and datasets demonstrate the effectiveness of the proposed methods, which produce more faithful to right reasons and evidences predictions while delivering better results compared to traditionally trained models.