Semi-supervised clustering aims to improve clustering performance by considering user supervision in the form of pairwise constraints. In this paper, we study the active learning problem of selecting pairwise must-link and cannot-link constraints for semi-supervised clustering. We consider active learning in an iterative manner, where in each iteration queries are...
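As a minimal sketch of what must-link and cannot-link constraints mean for a clustering, the snippet below counts how many constraints a given cluster assignment breaks. The function name `count_violations` and the toy points are illustrative assumptions, not part of the paper's method, and the sketch does not cover the active selection of queries.

```python
from typing import Hashable, Iterable, Mapping, Tuple

Pair = Tuple[Hashable, Hashable]


def count_violations(
    labels: Mapping[Hashable, int],
    must_link: Iterable[Pair],
    cannot_link: Iterable[Pair],
) -> int:
    """Count pairwise constraints broken by a cluster assignment.

    A must-link pair is broken when its two points sit in different clusters;
    a cannot-link pair is broken when they sit in the same cluster.
    """
    broken_must = sum(1 for a, b in must_link if labels[a] != labels[b])
    broken_cannot = sum(1 for a, b in cannot_link if labels[a] == labels[b])
    return broken_must + broken_cannot


# Toy assignment (hypothetical data): points "a" and "b" share cluster 0.
labels = {"a": 0, "b": 0, "c": 1}
must_link = [("a", "b"), ("a", "c")]    # ("a", "c") is broken
cannot_link = [("b", "c"), ("a", "b")]  # ("a", "b") is broken
violations = count_violations(labels, must_link, cannot_link)
```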
Macrosomia is a medical term describing a newborn with an excessive birth weight (greater than 4000 g). Fetal macrosomia may lead to pregnancy complications and to an increased risk of health problems for both mother and baby after birth. However, the potential complications may be mitigated by a cesarean delivery. As such,...
Bayesian Optimization (BO) methods are often used to optimize an unknown function f(•) that is costly to evaluate. They typically work in an iterative manner. In each iteration, given a set of observation points, BO algorithms select k ≥ 1 points to be evaluated. The results of those points are...
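The iterative select-then-evaluate loop described above can be sketched as follows. This is only a rough illustration under stated assumptions: the surrogate is a crude kernel-smoothed average with a distance-based exploration bonus, swapped in for brevity instead of the Gaussian process a real BO method would typically use, and the names `predict`, `select_batch`, `optimize`, and `toy_objective` are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Observation = Tuple[float, float]  # (x, f(x))


def predict(x: float, observed: Sequence[Observation], length_scale: float = 0.2) -> float:
    """Kernel-weighted average of observed values (stand-in for a real surrogate)."""
    weights = [math.exp(-((x - xi) / length_scale) ** 2) for xi, _ in observed]
    total = sum(weights)
    if total == 0.0:  # all observations too far away: fall back to the plain mean
        return sum(y for _, y in observed) / len(observed)
    return sum(w * y for w, (_, y) in zip(weights, observed)) / total


def select_batch(observed: Sequence[Observation], pool: Sequence[float],
                 k: int, beta: float = 1.0) -> List[float]:
    """Greedily pick k points scoring high on prediction plus exploration bonus."""
    chosen: List[float] = []
    for _ in range(min(k, len(pool))):
        # Points already evaluated or already in this batch reduce the bonus nearby.
        taken = [x for x, _ in observed] + chosen
        remaining = [c for c in pool if c not in chosen]
        best = max(
            remaining,
            key=lambda c: predict(c, observed) + beta * min(abs(c - t) for t in taken),
        )
        chosen.append(best)
    return chosen


def optimize(f: Callable[[float], float], candidates: Sequence[float],
             init: Sequence[float], n_iter: int, k: int) -> List[Observation]:
    """Each iteration: select k unevaluated points, evaluate them, add the results."""
    observed = [(x, f(x)) for x in init]
    for _ in range(n_iter):
        seen = {x for x, _ in observed}
        pool = [c for c in candidates if c not in seen]
        if not pool:
            break
        batch = select_batch(observed, pool, k)
        observed.extend((x, f(x)) for x in batch)
    return observed


def toy_objective(x: float) -> float:
    """Hypothetical cheap objective standing in for a costly f."""
    return -(x - 0.3) ** 2


grid = [i / 20 for i in range(21)]
history = optimize(toy_objective, grid, init=[0.0, 1.0], n_iter=3, k=2)
best_x, best_y = max(history, key=lambda p: p[1])
```

With k = 1 the loop reduces to the sequential setting; larger k trades some sample efficiency for the ability to evaluate points in parallel.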
Many methods have been explored in the multi-label learning literature, ranging from simple problem transformations to more complex methods that capture correlations among labels. However, almost all existing work fails to address the challenge of incomplete label data. The goal of this project is to extend the work of...
Recent work in machine learning concerns the detection and identification of bird species from audio recordings of their vocalizations. Such analysis can yield valuable ecological information concerning the activity and distribution of species in the wild. Current species-identification methods require individual syllables of bird audio as input, but field-collected audio...
Automatic event extraction from natural text is an important and challenging task for natural language understanding. Traditional event detection methods rely heavily on manually engineered rich features. Recent deep learning approaches alleviate this problem through automatic feature engineering. But such efforts, like traditional methods, have so far only focused on...
This paper introduces an approach to text classification for semi-structured label systems on which standard methods perform poorly. With the perspective that perfect classification for such a system is unattainable, we demonstrate an automated procedure to isolate the learnable elements of the problem. Through analysis of an example dataset,...
Natural Language Comprehension is a challenging domain of Natural Language Processing. One approach to improving a model’s language comprehension is to enrich the structure of the model so that it can better learn the latent rules of the language.
In this dissertation, we will first introduce several deep models...
Information about named entities (real-world objects) is usually harvested from different sources and organized as a multi-relational directed graph in Knowledge Bases (KBs). KBs play essential roles in many NLP modules, including question answering, fact-checking, and search engines. KBs are large but still incomplete: relational information among entities is...
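To make the graph view concrete, here is a minimal sketch of a KB stored as (head, relation, tail) triples, where each triple is a labeled directed edge. The class `KnowledgeBase`, its methods, and the relation names are illustrative assumptions and are not taken from the work summarized above.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple


class KnowledgeBase:
    """Multi-relational directed graph: one labeled edge per stored triple."""

    def __init__(self) -> None:
        # Maps (head, relation) to the set of tail entities.
        self._out: Dict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add(self, head: str, relation: str, tail: str) -> None:
        self._out[(head, relation)].add(tail)

    def tails(self, head: str, relation: str) -> Set[str]:
        """All entities reachable from `head` via an edge labeled `relation`."""
        return set(self._out.get((head, relation), set()))

    def has(self, head: str, relation: str, tail: str) -> bool:
        return tail in self._out.get((head, relation), ())


kb = KnowledgeBase()
kb.add("Paris", "capital_of", "France")
kb.add("Paris", "located_in", "France")
kb.add("France", "located_in", "Europe")

# Incompleteness: the fact (Paris, located_in, Europe) holds in the world
# but was never stored, so a direct lookup does not find it.
missing = not kb.has("Paris", "located_in", "Europe")
```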
Easy-first, a search-based structured prediction approach, has been applied to many NLP tasks including dependency parsing and coreference resolution. This approach employs a learned greedy policy (action scoring function) to make easy decisions first, which constrains the remaining decisions and makes them easier. This thesis studies the problem of learning...
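The easy-first inference loop described above can be sketched generically: at each step, score every currently available action and commit to the highest-scoring ("easiest") one. The sketch below assumes a scoring function is already given; the names `easy_first`, `toy_actions`, `toy_score`, `toy_apply`, and the confidence table `CONF` are hypothetical, and learning the policy (the subject of the thesis) is not shown.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

S = TypeVar("S")  # search state
A = TypeVar("A")  # action


def easy_first(
    state: S,
    legal_actions: Callable[[S], Sequence[A]],
    score: Callable[[S, A], float],
    apply_action: Callable[[S, A], S],
) -> Tuple[S, List[A]]:
    """Greedy inference: repeatedly take the highest-scoring available action."""
    taken: List[A] = []
    while True:
        actions = legal_actions(state)
        if not actions:
            return state, taken
        best = max(actions, key=lambda a: score(state, a))
        state = apply_action(state, best)
        taken.append(best)


# Toy problem (hypothetical): label three slots, most confident slot first.
CONF = {(0, "N"): 0.2, (1, "V"): 0.9, (2, "N"): 0.5}


def toy_actions(state):
    # Only slots that are still undecided can be labeled.
    return [a for a in CONF if state[a[0]] is None]


def toy_score(state, action):
    # A fixed table stands in for a learned scoring function.
    return CONF[action]


def toy_apply(state, action):
    index, label = action
    return state[:index] + (label,) + state[index + 1:]


final_state, order = easy_first((None, None, None), toy_actions, toy_score, toy_apply)
```

In this toy run the scores ignore the state; in a real easy-first system the score of a remaining action would typically depend on the decisions already made, which is how early decisions make later ones easier.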