Graduate Thesis Or Dissertation


New Directions in Search-based Structured Prediction: Multi-Task Learning and Integration of Deep Models Public Deposited

Downloadable Content

Download PDF


Attribute NameValues
  • This thesis studies the problem of structured prediction (SP), where the agent needs to predict a structured output for a given structured input (e.g., Part-of-Speech tagging sequence for an input sentence). Many important applications including machine translation in natural language processing (NLP) and image interpretation in computer vision can be naturally formulated as structured prediction problems. These prediction problems have an exponentially number of candidate outputs, which poses significant challenges for inference and learning. Search-based structured prediction views the prediction of structured outputs as a search process in the space of candidate outputs guided by a learned scoring function. Search-based approaches offer several advantages including, incorporation of expressive representations over inputs and outputs with negligible overhead in inference complexity, and providing a way to trade off accuracy for efficient inference. In this thesis, I make three contributions to advance search-based structured prediction methods towards the goal of improving their accuracy and computational-efficiency. First, I developed a search-based learning approach called “Prune-and-Score” to improve the accuracy of greedy policy based structured prediction for search spaces with large action spaces. The key idea is to learn a pruning function that prunes bad decisions and a scoring function that then selects the best among the remaining decisions. I show the efficacy of this approach for coreference resolution, i.e., clustering mentions that refer to the same entity, which is a hard NLP task. Second, I studied multi-task structured prediction (MTSP) problems in the context of entity analysis, which includes the three tasks of named entity recognition, co-reference resolution, and entity linking. I developed three different search-based learning architectures for MTSP problem that make different trade-offs between speed and accuracy of training and inference. I performed empirical evaluation of the proposed architectures for entity analysis with state-of-the-art results. Finally, I developed the HC-Nets framework by integrating the advances in deep learning with search-based SP methods. I formulate structured prediction as a complete output space search problem, and employ two neural network functions to guide the search: a heuristic function H for generating candidate outputs and a cost function C for selecting the best output from the generated candidates. Unlike prior methods such as structured prediction energy networks and deep value networks that perform gradient-based inference in relaxed space, HC-Nets allows to incorporate prior knowledge in the form of constraints. The decomposition of heuristic and cost functions makes the representation more expressive and learning more modular. Our experimental results show that HC-Nets achieves better performance than prior methods on multiple benchmark domains.
Resource Type
Date Issued
Degree Level
Degree Name
Degree Field
Degree Grantor
Commencement Year
Committee Member
Academic Affiliation
Rights Statement
Peer Reviewed



This work has no parents.

In Collection: