Graduate Thesis Or Dissertation
 

Efficient training and feature induction in sequential supervised learning

Public Deposited

Downloadable Content

Download PDF
https://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/9019s692k

Descriptions

Attribute NameValues
Creator
Abstract
  • Sequential supervised learning problems arise in many real applications. This dissertation focuses on two important research directions in sequential supervised learning: efficient training and feature induction. In the direction of efficient training, we study the training of conditional random fields (CRFs), which provide a flexible and powerful model for sequential supervised learning problems. Existing training algorithms for CRFs are slow, particularly in problems with large numbers of potential input features and feature combinations. In this dissertation, we describe a new algorithm, TreeCRF, for training CRFs via gradient tree boosting. In TreeCRF, the CRF potential functions are represented as weighted sums of regression trees, which provide compact representations of feature interactions. So the algorithm does not explicitly consider the potentially large parameter space. As a result, gradient tree boosting scales linearly in the order of the Markov model and in the order of the feature interactions, rather than exponentially as in previous algorithms based on iterative scaling and gradient descent. Detailed experimental results are provided to evaluate the performance of the TreeCRF algorithm and possible extensions of this algorithm are discussed. We also study the problem of handling missing input values in CRFs, which has been rarely discussed in the literature. Gradient tree boosting also makes it possible to use instance weighting (as in C4.5) and surrogate splitting (as in CART) to handle missing values in CRFs. Experimental studies of the effectiveness of these two methods (as well as standard imputation and indicator feature methods) show that instance weighting is the best method in most cases when feature values are missing at random. In the direction of feature induction, we study the search-based structured learning framework and its application to sequential supervised learning problems. By formulating the label sequence prediction process as an incremental search process from one end of a sequence to the other, this framework is able to avoid complicated inference algorithms in the training process and thus achieves very fast training speed. However, for problems where there exist long range dependencies between the current position and future positions, at each search step, this framework is unable to exploit these dependencies to make accurate predictions. In this dissertation, a multiple-instance learning based algorithm is proposed to automatically extract useful features from future positions as a way to discover and exploit these long range dependencies. Integrating this algorithm with maximum entropy Markov models yields promising experimental results on both synthetic data sets and real data sets that have long range dependencies in sequences.
License
Resource Type
Date Available
Date Issued
Degree Level
Degree Name
Degree Field
Degree Grantor
Commencement Year
Advisor
Committee Member
Academic Affiliation
Non-Academic Affiliation
Subject
Rights Statement
Publisher
Peer Reviewed
Language
Replaces

Relationships

Parents:

This work has no parents.

In Collection:

Items