Developing accurate predictive distribution models requires adequately representing relevant spatial and temporal scales, as these scales are ultimately reflective of the relationships between distributions and influential environmental conditions. In this research, we considered both spatial and temporal scale and the influence each has on predicting broad-scale distributions of two disparate...
Supervised learning programs, such as decision tree learners and neural networks, often must learn Boolean functions. The concept being learned may not easily be expressed in terms of the atomic features given. Constructive induction automatically produces higher level features (combinations of the atomic features), which can improve learning performance. The...
A diagnostic policy species what test to perform next based on the results of previous tests and when to stop and make a diagnosis. Cost-sensitive diagnostic policies perform tradeoffs between (a) the costs of tests and (b) the costs of misdiagnoses. An optimal diagnostic policy minimizes the expected total cost....
Easy-first, a search-based structured prediction approach, has been applied to many NLP tasks including dependency parsing and coreference resolution. This approach employs a learned greedy policy (action scoring function) to make easy decisions first, which constrains the remaining decisions and makes them easier. This thesis studies the problem of learning...
In its simplest form, the process of diagnosis is a decision-making process in which the diagnostician performs a sequence of tests culminating in a diagnostic decision. For example, a physician might perform a series of simple measurements (body tem- perature, weight, etc.) and laboratory measurements (white blood count, CT scan,...
Structural health monitoring (SHM) systems perform automated non-destructive damage detection and characterization for a variety of large structures including civil structures such as bridges and aerospace structures such as aircrafts and space vehicles. The goals of SHM include preventing catastrophic structural failures, increasing reliability, reducing maintenance costs, and increasing the...
Knowledge workers are struggling in the information flood. There is a growing interest in intelligent desktop environments that help knowledge workers organize their daily life. Intelligent desktop environments allow the desktop user to define a set of “activities” that characterize the user’s desktop work. These environments then attempt to identify...
Maintaining the sustainability of the earth’s ecosystems has attracted much attention as these ecosystems are facing more and more pressure from human activities. Machine learning can play an important role in promoting sustainability as a large amount of data is being collected from ecosystems. There are at least three important...
The field of machine learning has made major strides over the last 20 years. This document summarizes the major problem formulations that the discipline has studied, then reviews three tasks in cognitive networking and briefly discusses how aspects of those tasks fit these formulations. After this, it discusses challenges for...
This dissertation explores algorithms for learning ranking functions to efficiently solve search problems, with application to automated planning. Specifically, we consider the frameworks of beam search, greedy search, and randomized search, which all aim to maintain tractability at the cost of not guaranteeing completeness nor optimality. Our learning objective for...
Citizen Science is a paradigm in which volunteers from the general public participate in scientific studies, often by performing data collection. This paradigm is especially useful if the scope of the study is too broad to be performed by a limited number of trained scientists. Although citizen scientists can contribute...
The ability to create reproducible cryptographically secure keys from temporal environments (e.g., images) has the potential to be a contributor to effective cryptographic mechanisms. Due to the noisy nature of these environments, achieving this goal in a user friendly fashion is a very challenging task, especially since there exists a...
Recognizing human actions in videos is a long-standing problem in computer vision with a wide range of applications including video surveillance, content retrieval, and sports analysis. This thesis focuses on addressing efficiency and robustness of video classification in unconstrained real-world settings. The thesis work can be broadly divided into four...
Bayesian networks are used for building intelligent agents that act under uncertainty. They are a compact representation of agents' probabilistic knowledge. A Bayesian network can be viewed as representing a factorization of a full joint probability distribution into the multiplication of a set of conditional probability distributions. Independence of causal...
Many approaches for achieving intelligent behavior of automated (computer) systems involve components that learn from past experience. This dissertation studies computational methods for learning from examples, for classification and for decision
making, when the decisions have different non-zero costs associated with them. Many practical applications of learning algorithms, including transaction...
This dissertation addresses the problem of video labeling at both the frame and pixel levels using deep learning. For pixel-level video labeling, we have studied two problems: i) Spatiotemporal video segmentation and ii) Boundary detection and boundary flow estimation. For the problem of spatiotemporal video segmentation, we have developed recurrent...
Many applications in surveillance, monitoring, scientific discovery, and data cleaning require the identification of anomalies. Although many methods have been developed to identify statistically significant anomalies, a more difficult task is to identify anomalies that are both interesting and statistically significant. Category detection is an emerging area of machine learning...
Multiagent learning with cooperative coevolutionary algorithms is a critical area of research, and is relevant to many real-world applications including air traffic control, distributed sensor network control, and game-theoretic applications such as border patrol. A key difficulty in multiagent learning is the credit assignment problem, where the impact of each...
We are witnessing the rise of the data-driven science paradigm, in which massive amounts of data - much of it collected as a side-effect of ordinary human activity - can be analyzed to make sense of the data and to make useful predictions. To fully realize the promise of this...
Coordination is essential to achieving good performance in cooperative multiagent systems. To date, most work has focused on either implicit or explicit coordination mechanisms, while relatively little work has focused on the benefits of combining these two approaches. In this work we demonstrate that combining explicit and implicit mechanisms can...
This thesis presents a progression of novel planning algorithms that culminates in a new family of diverse Monte-Carlo methods for probabilistic planning domains. We provide a proof for performance guarantees and analyze how these algorithms can resolve some of the shortcomings of traditional probabilistic planning methods. The direct policy search...
Object categorization is one of the fundamental topics in computer vision research. Most current work in object categorization aims to discriminate among generic object classes with gross differences. However, many applications require much finer distinctions. This thesis focuses on the design, evaluation and analysis of learning algorithms for fine- grained...
We consider the problem of supervised classification of bird species from audio recordings in a real-world acoustic monitoring scenario (i.e. audio data is collected in the field with an omnidirectional microphone, without human supervision). Obtaining better data about bird activity can assist conservation efforts, and improve our understanding of their...
Machine learning models for natural language processing have traditionally relied on large numbers of discrete features, built up from atomic categories such as word forms and part-of-speech labels, which are considered completely distinct from each other. Recently however, the advent of dense feature representations coupled with deep learning techniques has...
We developed and investigated machine learning methods that require
minimal preprocessing of the input data, use few training examples, run fast, and
still obtain high levels of accuracy.
Most approaches to designing machine learning programs are based on the
supervised learning paradigm – training examples are chosen randomly and given...