Sequential supervised learning problems arise in many real applications. This dissertation focuses on two important research directions in sequential supervised learning: efficient training and feature induction.
In the direction of efficient training, we study the training of conditional random fields (CRFs), which provide a flexible and powerful model for sequential...
Novelty detection plays an important role in machine learning and signal processing. This project studies novelty detection in a new setting where the data object is represented as a bag of instances and associated with multiple class labels, referred to as multi-instance multi-label (MIML) learning. Contrary to the common assumption...
We consider the problem of finding unknown patterns that are recurring across multiple sets. Examples include finding multiple objects that are present in multiple images or a short DNA code that is repeated across multiple DNA sequences. We first consider a simple problem of finding a single unknown pattern in...
An important impact of the genome technology revolution will be the elucidation of mechanisms of cancer pathogenesis, leading to improvements in the diagnosis of cancer and the selection of cancer treatment. Integrated with the extensive existing knowledge about the role of protein-coding mutations in cancer, demystifying the functional...
Machine learning (ML) and deep learning (DL) models impact our daily lives with applications in natural language modeling, image analysis, healthcare, genomics, and bioinformatics. The exponential growth of biological sequence data necessitates accompanying advances in computational methods. Although deep learning is highly effective for detecting and classifying biological sequences, challenges...
Simultaneous translation, which translates concurrently with the source language speech, is widely used in many scenarios including multilateral organizations. However, it is well known to be one of the most challenging tasks for humans due to the simultaneous perception and production in two languages. On the other hand, simultaneous translation...
We describe a series of novel computational models, CERENKOV (Computational Elucidation of the REgulatory NonKOding Variome) and its successors CERENKOV2, CERENKOV3, and Convolutional CERENKOV3, for discriminating regulatory single nucleotide polymorphisms (rSNPs) from non-regulatory SNPs within non-coding genetic loci. The CERENKOV models are designed for recognizing rSNPs in the context of...
Deep learning is becoming the latest trend in sensitive applications, such as healthcare, criminal justice, and finance. As these new applications emerge, adversaries are circumventing them. Further, there have been concerns about the possibility of bias and discrimination in predictive applications. In order to address these issues, we propose an...
Simultaneous speech translation (SimulST) is widely useful in many cross-lingual communication scenarios, including multinational conferences and international traveling. Since text-based simultaneous machine translation (SimulMT) has achieved great success in recent years, the conventional cascaded approach for SimulST uses a pipeline of streaming ASR followed by simultaneous MT but suffers from...
This paper addresses the high model complexity and overconfident frame labeling of state-of-the-art (SOTA) action segmenters. Their complexity is typically justified by the need to sequentially refine action segmentation through multiple stages of a deep architecture. However, this multistage refinement does not take into account uncertainty of frame labeling predicted...