Sequential supervised learning problems arise in many real applications. This dissertation focuses on two important research directions in sequential supervised learning: efficient training and feature induction.
In the direction of efficient training, we study the training of conditional random fields (CRFs), which provide a flexible and powerful model for sequential...
This paper examines how six online multiclass text classification algorithms perform in the domain of email tagging within the TaskTracer system. TaskTracer is a project-oriented user interface for the desktop knowledge worker. TaskTracer attempts to tag all documents, web pages, and email messages with the projects to which they are...
This thesis includes three studies involving different aspects of modeling protein structure. The first study illustrates the levels of insight available from atomic-resolution protein structures. The second study derives general trends of protein geometry from atomic-resolution structures and shows their implications for modeling. The third study creates a model of...
This thesis presents a progression of novel planning algorithms that culminates in a new family of diverse Monte-Carlo methods for probabilistic planning domains. We provide a proof for performance guarantees and analyze how these algorithms can resolve some of the shortcomings of traditional probabilistic planning methods. The direct policy search...
Many conditions affecting hydrogen (H₂) production by the cyanobacterium, Synechocystis sp. PCC 6803, were optimized to yield maximum H₂ accumulation. Biological H₂ production from photosynthetic species is a promising form of renewable energy since an abundant supply of sunlight hits the Earth every day, and photosynthetic bacteria can harness this...
Markov models are commonly used for joint inference of label sequences. Unfortunately, inference scales quadratically in the number of labels, which is problematic for training methods where inference is repeatedly preformed and is the primary computational bottleneck for large label sets. Recent work has used output coding to address this...
The results of a machine learning from user behavior can be thought of as a program, and like all programs, it may need to be debugged. Providing ways for the user to debug it matters because without the ability to fix errors, users may find that the learned program’s errors...
We consider the problem of tactical assault planning in real-time strategy games where a team of friendly agents must launch an assault on an enemy. This problem offers many challenges including a highly dynamic and uncertain environment, multiple agents, durative actions, numeric attributes, and different optimization objectives. While the dynamics...