The Pacific Coast Groundfish Fishery harvests a diverse and large grouping of fishes, but it did not become heavily fished until around WWII. This makes the groundfish fishery a comparatively young fishery. Despite its youth, it is one of the largest and most lucrative fisheries in Oregon—with a current harvest...
Learning novel concepts from relational databases is an important problem with applications in several disciplines, such as data management, natural language processing, and bioinformatics. For a learning algorithm to be effective, the input data should be clean and in some desired representation. However, real-world data is usually heterogeneous – the...
Anomaly detection has been used in variety of applications in practice, including cyber-security, fraud detection and detecting faults in safety critical systems, etc. Anomaly detectors produce a ranked list of statistical anomalies, which are typically examined by human analysts in order to extract the actual anomalies of interest. Unfortunately, most...
Structural health monitoring (SHM) systems perform automated non-destructive damage detection and characterization for a variety of large structures including civil structures such as bridges and aerospace structures such as aircrafts and space vehicles. The goals of SHM include preventing catastrophic structural failures, increasing reliability, reducing maintenance costs, and increasing the...
Due to recent advances in computer technology, the cost of collecting and storing data has dropped drastically. This makes it feasible to collect large amounts of information for each data point. This increasing trend in feature dimensionality justifies the need for research on variable selection. Random forest (RF) has demonstrated...
This dissertation addresses the problem of video labeling at both the frame and pixel levels using deep learning. For pixel-level video labeling, we have studied two problems: i) Spatiotemporal video segmentation and ii) Boundary detection and boundary flow estimation. For the problem of spatiotemporal video segmentation, we have developed recurrent...
In supervised learning, label information can be provided at different levels of granularity. For small datasets, it is possible to acquire a label for each data instance. However, in the big-data regime, this fine granularity approach is prohibitively costly. For example, in semi-supervised learning, only a limited number of samples...
Base-pairing interaction are the fundamental blocks of an RNA structure. With increased ability to determine the structure of RNA molecules, an accurate and informative way of classifying these interactions would help to study RNA structure at many levels. Apart from the well-known complementary base pairs and wobble pairs, other interactions...
Assembly planning is a crucial task for every manufacturing product. In general, assembly operations consume more than 30% of the total manufacturing time and cost. Therefore, any effort in optimizing assembly will have a significant impact on the economic success of manufacturing. Finding an optimal assembly plan by hand is...
Developing accurate predictive distribution models requires adequately representing relevant spatial and temporal scales, as these scales are ultimately reflective of the relationships between distributions and influential environmental conditions. In this research, we considered both spatial and temporal scale and the influence each has on predicting broad-scale distributions of two disparate...
In bioacoustics, automatic animal voice detection and recognition from audio recordings is an emerging topic for animal preservation. Our research focuses on bird bioacoustics, where the goal is to segment bird syllables from the recording and predict the bird species for the syllables. Traditional methods for this task addresses the...
The ability to create reproducible cryptographically secure keys from temporal environments (e.g., images) has the potential to be a contributor to effective cryptographic mechanisms. Due to the noisy nature of these environments, achieving this goal in a user friendly fashion is a very challenging task, especially since there exists a...
Machine learning models for natural language processing have traditionally relied on large numbers of discrete features, built up from atomic categories such as word forms and part-of-speech labels, which are considered completely distinct from each other. Recently however, the advent of dense feature representations coupled with deep learning techniques has...
Recognizing human actions in videos is a long-standing problem in computer vision with a wide range of applications including video surveillance, content retrieval, and sports analysis. This thesis focuses on addressing efficiency and robustness of video classification in unconstrained real-world settings. The thesis work can be broadly divided into four...
Hop (Humulus lupulus L. var lupulus) is a plant of great cultural significance, used as a medicinal herb for thousands of years, and for flavor and as a preservative in brewing beer. Studies of the medicinal effects of the unique compounds produced by hop have led to interest from the...
Following thermal neutron induced fission, supervised machine learning and temporal gamma-ray spectroscopy methods were used to identify differences in the delayed gamma-ray spectra of Pu-239 and U-235. The temporal gamma-ray spectroscopy method takes advantage of the time-dependent decay of fission products. Employing Spearman's rank-order correlation coefficient and without prior knowledge...
Maintaining the sustainability of the earth’s ecosystems has attracted much attention as these ecosystems are facing more and more pressure from human activities. Machine learning can play an important role in promoting sustainability as a large amount of data is being collected from ecosystems. There are at least three important...
Machine learning systems are generally trained offline using ground truth data that has been labeled by experts. However, these batch training methods are not a good fit for many applications, especially in the cases where complete ground truth data is not available for offline training. In addition, batch methods do...
Many problems in ecology and conservation biology can be formulated and solved using machine learning algorithms for multi-label classification. This dissertation addresses three topics related to predicting the distributions of multiple species. It improves existing methods and proposes a new modeling paradigm to address the multi-species, multi-label problem. The first...
How can end users efficiently influence the predictions that machine learning systems make on their behalf? Traditional systems rely on users to provide examples of how they want the learning system to behave, but this is not always practical for the user, nor efficient for the learning system. This dissertation...