This paper addresses the high model complexity and overconfident frame labeling of state-of-the-art (SOTA) action segmenters. Their complexity is typically justified by the need to sequentially refine action segmentation through multiple stages of a deep architecture. However, this multistage refinement does not take into account uncertainty of frame labeling predicted...
The use of genetic algorithms to compose music and generate sounds is an area of interest in the artificial intelligence field. Music and instrument sounds have known rules and structures that can be followed which make them well-suited for genetic algorithms. However, genetic algorithms still struggle to generate sounds comparable...
Over 37,000 people die each year in automobile accidents, with many of these fatalities resulting from collisions with emergency vehicles. The rise of autonomous cars creates the need for an accurate and failsafe method of detecting and responding to emergency vehicles safely and on time. This thesis investigates the ability...
Autonomous robotic agents are on their way to becoming in-home personal assistants, construction assistants, and warehouse workers. The degree of autonomy of such systems is reflected by the manner in which we specify goals to them; the abstraction of low-level commands to high-level goals goes hand-in-hand with increased autonomy. In...
Learning to recognize objects is a fundamental and essential step in human perception and understanding of the world. Accordingly, research of object discovery across diverse modalities plays a pivotal role in the context of computer vision. This field not only contributes significantly to enhancing our understanding of visual information but...
While digital inclusivity researchers and software practitioners have been trying to address exclusion biases in Windows, Icons, Menus, and Pointers (WIMP) user interfaces (UIs) for a long time, little has been done to investigate if and how inclusive software design and its methods that have been devised for WIMP UIs...
A video server is the multimedia equivalent of a data file server. As video goes digital video servers that support digital media are required for production and broadcasting in television companies. Grass Valley Group provides comprehensive digital video solutions through a range of video servers called Profiles.
The Profile Media...
Structure Query Language (SQL) is widely used to access data stored in relational database systems. Although a powerful and flexible language, SQL can also be complex and hard to learn. For most new SQL users, it's easy to write SQL statement by following SQL grammar and syntax rules, but it's...
We have developed a framework for Web-based GIS/database applications which allow users to insert, update, delete, and query data with a map interface displayed by Web browsers. The framework was designed so that a Web-based GIS application that uses ArcIMS as a map server can be easily created, customized, and...
Web-Site Generator 3 (WebSiteGen3) is a rapid application development (RAD) tool that generates ASP.NET forms to insert, query, update and delete data stored in the user tables in a SQL Server 2000 database. WebSiteGen3 uses a graphical user interface to show the user tables in a hierarchical tree based on...
Analysis of programs forms an important activity in the field of software engineering. It is necessary to help understand the code, which facilitates comprehensive testing, maintenance and optimization of code. Aristotle is a tool for analyzing programs written in C. We have designed a system on similar lines for Java...
I present a new heuristic search approach to compute approximate answers for the probability query in belief nets. This approach can compute the 'best' bounds for a query in a period of any given time (if time permitted, it will get an exact value). It inherits the essence of Symbolic...
Wireless Application Protocol (WAP) is an industry standard aimed to bring the web to handheld devices. The handheld devices are constrained by battery life and it becomes important that power is conserved across web transactions. The power consumed by the handheld device is directly proportional to the time taken for...
We have developed a prototype web-based GIS application for tracking the locations of moving entities. This application, which is called the Responder application, consists of two parts: the Responder client and the Responder server. The Responder client is a .NET application written in C#. It reads the location data from...
Global Positioning Systems have allowed for precise timing of power system measurements over wide areas. This newly found capability has the potential to provide much greater insight into the operation of the power system and its response to contingencies, but few analytical techniques currently exist that provide enough robustness and...
With sequential computing technology reaching its speed limits, parallel processing is emerging as the key to very-high-speed computation. However, developing a parallel program is by no means a simple task; neither is analyzing the performance of parallel programs.
C* is a high-level data-parallel language that hides explicit message passing and...
High level data-parallel languages are easy to use and shield the programmer from machine specific details. A simple and efficient way of providing an interface to such languages is to develop a machine-independent compiler and a routing library, which isolates the low-level machine dependent communication functions, The compiler translates the...
Component based software technologies are viewed as essential for creating the software systems of the future. However, the use of externally provided component has serious drawbacks for a wide range of software engineering activities often because of a lack of information about the components. One such drawback involves validation of...
Web-based Pesticide Screening Tool (Web-PST) is a web-based software application for evaluating the potential risk of pesticides on the surrounding environment. It uses the formulas and standards specified by the Soil/Pesticide Interaction Screening Procedure Version II (SPISP II). Web-PST closely models the stand-alone Windows application Windows Pesticide Screening Tool (WIN-PST)....
Dataparallel C has been designed for various kinds of architectures and was developed jointly at University of New Hampshire and Oregon State University.
This project dealt with porting the libraries to the nCUBE 2 system and helping a user compile his programs on nCUBE 2 directly from a Sun workstation...
CREEDA-BG (Crop Rotation Economic and Environmental Decision-Aid Budget Generator) is a web-based enterprise budgeting tool that allows a user to manage information on her fields, rotations, crops, and operations, and to view estimates of income and expenses. The user can then use this information to evaluate alternative plans and make...
In this paper we describe a fuzzy logic based control system implemented on a PC architecture. The rule based inference engine of this expert system is easily configured through an ascii text file and is demonstrated to be capable of controlling various simulations, including an inverted pendulum. The controller is...
Component based software technologies are viewed as essential for creating the software systems of the future. However the use of externally provided components has serious drawbacks for a wide range of software engineering activities often because of a lack of information about the components. One such drawback involves validation of...
We present a model for a distributed virtual market place that can be constructed on the Internet to support selling and buying requests, such as those found as classified advertisements. One requirement for a transaction to take place in the virtual market place is that a sell request and a...
We have developed SearchPak, a machine independent parallel searching tool on shared and distributed memory machines. It can be used for combinatorial optimization problems and OR-parallel computations as well. Both depth-first and best-first search of the state space can be performed using the SearchPak. With SearchPak, a user just provides...
Learning latent space representations of high-dimensional world states has been at the core of recent rapid growth in reinforcement learning(RL). At the same time, RL algo- rithms have suffered from ignored uncertainties in the predicted estimates of model-free or model-based methods. In our work, we investigate both of these aspects...
Papers proposing novel machine learning algorithms tend to present the algorithm or technique in question in the best possible light. The standard practice is generally for authors to emphasize their proposed algorithms' performance in the precise setting where it is maximally impressive, often by only fully evaluating their best known...
We explore the application of deep learning to the disparate fields of natural language processing and computational biology. Both the sentences uttered by humans as well as the RNA and protein sequences found within the cells of their bodies can be considered formal languages in computer science, as sets of...
Assessing AI systems is difficult. Humans rely on AI systems in increasing ways, both visible and invisible, meaning a variety of stakeholders need a variety of assessment tools (e.g., a professional auditor, a developer, and an end user all have different needs). We posit that it is possible to provide...
Manufacturing technology has continuously evolved and advanced over the past century; this has led to an increase in the production of consumer and industrial goods driven by simultaneous growth in population and wealth. Despite the resulting economic and labor growth, environmental impacts of manufacturing have increased dramatically due to the...
Top-performing approaches to embodied AI tasks like point-goal navigation often rely on training agents via reinforcement learning over tens of millions (or even billions) of experiential steps – learning neural agents that map directly from visual observations to actions. In this work, we question whether these extreme training durations are...
In this work, we study the problem of learning and improving policies for probabilistic planning problems. In the first part, we train neural network policies for probabilistic planning problems modeled as factored Markov decision problems. The objective is to train problem-specific neural networks via supervised learning to imitate the action...
Causal inference is an important analytical tool to bridge the gap between prediction and decision-making. However, learning a causal network solely from data is a challenging task. In this work, various techniques have been explored for a better and improved causal network learning from data. Firstly, the problem of learning...
Advances in deep learning based image processing have led to their adoption for a wide range of applications, and in tow with these developments is a dramatic increase in the availability of high quality datasets. With this comes the need to accelerate and scale deep learning applications in order to...
Over the last two decades, satisfiability and satisfiability-modulo theory (SAT/SMT) solvers have grown powerful enough to be general purpose reasoning engines throughout software engineering and computer science. However, most practical use cases of SAT/SMT solvers require not just solving a single SAT/SMT problem, but solving sets of related SAT/SMT problems....
We propose an approach for understanding control policies represented as recurrent neural networks. Recent work has approached this problem by transforming such recurrent policy networks into finite-state machines (FSM) and then analyzing the equivalent minimized FSM. While this led to interesting insights, the minimization process can obscure a deeper understanding...
We present a novel multi-objective optimization methodology built upon a multi-agent blackboard framework. This multi-agent blackboard system (MABS) synthesizes blackboard architectures, multi-agent environments, and optimization theory. The blackboard architecture creates the framework for initializing, storing, and solving a multi-objective optimization problem. Multiple agents allow for an optimization problem to be...
Information about named entities (real-world objects) is usually harvested from different sources and organized as a multiple relational directed graph in Knowledge Bases (KBs). KBs play essential roles in many NLP modules including question answering, fact-checking, search engines, etc. KBs are big but still incomplete: relational information among entities is...
The performance of deep learning frameworks could be significantly improved through considering the particular underlying structures for each dataset. In this thesis, I summarize our three work about boosting the performance of deep learning models through leveraging structures of the data. In the first work, we theoretically justify that, for...
It is common practice in the unsupervised anomaly detection literature to create experimental benchmarks by sampling from existing supervised learning datasets. We seek to improve this practice by identifying four dimensions important to real-world anomaly detection applications --- point difficulty, clusteredness of anomalies, relevance of features, and relative frequency of...
This dissertation incorporates coalition formation and probabilistic planning towards a domain-independent automated planning solution scalable to multiple heterogeneous robots in complex domains. The first research direction investigates the effectiveness of Task Fusion and introduces heuristics that improve task allocation and result in better quality plans, while requiring lower computational cost...
Simultaneous translation, which translates concurrently with the source language speech, is widely used in many scenarios including multilateral organizations. However, it is well known to be one of the most challenging tasks for humans due to the simultaneous perception and production in two languages. On the other hand, simultaneous translation...
Diffusion processes in networks are common models for many domains, including species colonization, information/idea cascade, disease propagation and fire spreading. In diffusion networks, a diffusion event occurs when a behavior spreads from one node to the other following a probabilistic model, where the behavior could be species, an idea, a...
Natural Language Comprehension is a challenging domain of Natural Language Processing. To improve a model’s language comprehension/understanding, one approach would be to enrich the structure of the model to enhance its capability in learning the latent rules of the language.
In this dissertation, we will first introduce several deep models...
Machine learning (ML) and deep learning (DL) models impact our daily lives with applications in natural language modeling, image analysis, healthcare, genomics, and bioinformatics. The exponential growth of biological sequence data necessitates accompanying advances in computational methods. Although deep learning is highly effective for detecting and classifying biological sequences, challenges...
RNA structure prediction is a challenging problem, especially with pseudoknots. Recently, there has been a shift from the classical minimum free energy-based methods (MFE) to partition function-based ones that assemble structures based on base-pairing probabilities. Two typical examples of the latter group are the popular maximum expected accuracy (MEA) method...
Our goal is to build a system to model the RNA sequences that reveals their structural information by using efficient dynamic programming algorithms and deep learning approaches. We aim to 1) achieve linear-time for RNA secondary structure prediction based on existing minimum free energy models; 2) utilize deep neural networks...
Learning Analytics and other branches of Educational Research such as Computing Education Research (CER) implicitly assume that students, especially college students, have no barriers to access learning platforms or software packages. This assumption may be attributed to such pervasive beliefs such as "everyone has a device", or "everyone can access...
Situated cognition theory emphasizes the role that social and material contexts have on learning and knowledge application. Several studies of engineering workplace environments have noted differences between the social and material contexts of the workplace and those of undergraduate engineering education. No existing research has studied the social and material...
Most database users do not know formal query languages, such as SQL, and prefer to express their information needs using usable query languages, such as keyword queries. Keyword queries, however, are inherently ambiguous and challenging for the database systems to understand and answer effectively. We propose a novel approach to...
The advent of deep learning models leads to a substantial improvement in a wide range of NLP tasks, achieving state-of-art performances without any hand-crafted features. However, training deep models requires a massive amount of labeled data. Labeling new data as a new task or domain emerges consumes time and efforts...
Automatic music transcription (AMT) is the task, given an acoustic representation of music, to recover a symbolic notation of the written notes expressed by the sound. Transcribing music with multiple notes sounding simultaneously is difficult for both humans and machines. Much existing work on AMT has focused on suitable acoustic...
Learning novel concepts from relational databases is an important problem with applications in several disciplines, such as data management, natural language processing, and bioinformatics. For a learning algorithm to be effective, the input data should be clean and in some desired representation. However, real-world data is usually heterogeneous – the...
Anomaly detection has been used in variety of applications in practice, including cyber-security, fraud detection and detecting faults in safety critical systems, etc. Anomaly detectors produce a ranked list of statistical anomalies, which are typically examined by human analysts in order to extract the actual anomalies of interest. Unfortunately, most...
A series of laboratory experiments were conducted to study the wave field in the inner lagoon excited by ‘long’ incident waves. Three cases were considered: Cases A, B and C presenting incident waves of wavelength with factors of 1, 2 and 2.5 times the width of the reef respectively. The...
There are growing interests in designing polynomial-time approximation schemes (PTAS) for optimization problems in planar graphs. Many NP-hard problems are shown to admit PTAS in planar graphs in the last decade, including Steiner tree, Steiner forest, two- edge-connected subgraphs and so on. We follow this research line and study several...
Recent advancements in joining operations and additive manufacturing now allow complex metal parts to be built up from raw materials, as opposed to being machined down from solid blocks. This not only opens up the design space, but also allows for much more efficient manufacturing. By decomposing a complex part...
Changes in the global climate and forest management practices have given rise to increasing numbers and severity of wildfires. More than five million acres burned in the United States in 2017, while in Canada 7.4 million acres burned. In particular, an increasing amount of dead woody biomass is a key...
The Rust programming language is a systems programming language with a strong static type system. A central feature of Rust’s type system is its unique concept of “ownership”, which enables Rust to give a user safe, low-level control over resources without the overhead of garbage collection. In Haskell, most data...
Automatic event extraction from natural text is an important and challenging task for natural language understanding. Traditional event detection methods heavily rely on manually engineered rich features. Recent deep learning approaches alleviate this problem by automatic feature engineering. But such efforts, like tradition methods, have so far only focused on...
Although machine learning systems are often effective in real-world applications, there are situations in which they can be even better when provided with some degree of end user feedback. This is especially true when the machine learning system needs to customize itself to the end user's preferences, such as in...
Society faces many complex management problems, particularly in the area of shared public resources such as ecosystems. Existing decision making processes are often guided by personal experience and political ideology rather than state-of-the-art scientific understanding. This dissertation envisions a future in which multiple stakeholders are provided with computational tools for...
There is growing commercial interest in the use of unmanned aerial vehicles (UAVs) in urban environments, specifically for package delivery applications. However, the size, complexity and sheer numbers of expected UAVs makes conventional air traffic management that relies on human air traffic controllers infeasible. To enable UAVs to safely and...
In Intense Pulsed Light (IPL) sintering, pulsed large-area visible light from a xenon lamp is absorbed by nanoparticle films or patterns and converted to heat, resulting in rapid sintering of the nanoparticles. This work experimentally characterizes IPL sintering of silver nanoparticle films. A newly observed turning point in the evolution...
Oxo-hydroxo Group 5 metal clusters are an untapped resource to study and advance aqueous solution processing of metal oxide thin films. The tetramethylammonium (TMA) hexatantalate salt (TMA6[H2Ta6O19]) yields dense Ta2O5 films (~95% of the bulk ß-Ta2O5 density) with atomically smooth surfaces (<4 Å root mean square surface roughness). This same...
Appropriate representations of variational software simplify the analysis of their properties.This thesis proposes tailored representations of two kinds variational softwares: difference files of merge commits in Git and feature models. For the former, we use the Choice Edit Model, which is based on the choice calculus, to represent changes introduced...
This work is inspired by problems in natural resource management centered on the challenge of invasive species. Computing optimal management policies for maintaining ecosystem sustainable is challenging. Many ecosystem management problems can be formulated as MDP (Markov Decision Process) planning problems. In a simulator-defined MDP, the Markovian dynamics and rewards...
The thesis focuses on activity recognition from sensor data, which has spurred a great deal of interest due to its impact on health care and security. Previous work on activity recognition from multivariate time series data has mainly applied supervised learning techniques which require a high degree of annotation effort...
In the field of machine learning, clustering and classification are two fundamental tasks. Traditionally, clustering is an unsupervised method, where no supervision about the data is available for learning; classification is a supervised task, where fully-labeled data are collected for training a classifier. In some scenarios, however, we may not...
Most tasks in natural language processing (NLP) try to map structured input (e.g., sentence or word sequence) to some form of structured output (tag sequence, parse tree, semantic graph, translated/paraphrased/compressed sentence), a problem known as “structured prediction”. While various learning algorithms such as the perceptron, maximum entropy, and expectation-maximization have...
Monte Carlo tree search (MCTS) is a class of online planning algorithms for Markov decision processes (MDPs) and related models that has found success in challenging applications. In the online planning approach, the agent makes a decision in the current state by performing a limited forward search over possible futures...
Software testing is the process of evaluating the accuracy and performance of software, and automated software testing allows programmers to develop software more efficiently by decreasing testing costs. We compared two advanced random test generators, a Feedback-Directed Random Test Generator (FDR) and a Feedback-Controlled Random Test Generator (FCR), for an...
Mutation testing is one of the effective approaches measuring test adequacy of test suites. It is widely used in both academia and industry. Unfortunately, the adoption and practical use of mutation testing for Python 2.x programs face three obstacles. First, limited useful mutation operators. Existing mutation testing tools support very...
This thesis focuses on the problem of object tracking. Given a video, the general objective of tracking is to track the location over time of one or more targets in the image sequence. This is a very challenging task as algorithms need to deal with problems such as appearance variations,...
With the development of technologies in genome sequencing and variant detection, a huge number of variants are detected. To further analyze the variants, it requires an efficient tool to annotate the functional effect of variants. This project managed to develop an efficient program to annotate the functional effect of variants...
Machine learning models for natural language processing have traditionally relied on large numbers of discrete features, built up from atomic categories such as word forms and part-of-speech labels, which are considered completely distinct from each other. Recently however, the advent of dense feature representations coupled with deep learning techniques has...
In this thesis, we will study certain generalizations of the classical Shannon Sampling Theorem, which allows for the reconstruction of a pi-band-limited, square-integrable function from its samples on the integers. J. R. Higgins provided a generalization where the integers can be perturbed by less than 1/4, which includes nonuniform and...
Autonomous multiagent teams can be used in complex exploration tasks to both expedite the exploration and improve the efficiency. However, use of multiagent systems presents additional challenges. Specifically, in domains where the agents' actions are tightly coupled, coordinating multiple agents to achieve cooperative behavior at the group level is difficult....
We model the popular board game of Clue as an MDP and evaluate Monte-Carlo policy rollout in a simulated environment pitting different agents and policies against each other. We describe the choices we made in the representation, along with some of the problems we encountered along the way. We find...
Machine learning systems are generally trained offline using ground truth data that has been labeled by experts. However, these batch training methods are not a good fit for many applications, especially in the cases where complete ground truth data is not available for offline training. In addition, batch methods do...
The Open Modeling Environment (OME) is a tool developed to address some known shortcomings in ecological System Dynamics (SD) modeling research. OME provides a common set of methods for interacting directly with spatial information, reducing the need for modelers to create their own methods for doing so. The environment is...
Through passive adaptation to incidental flow, flexible aerodynamic surfaces exploit effects of increased lift, delayed stall and disturbance rejection. Wings of birds, bats, and insects exhibit these passive effects, and at the same time through the use of structural state feedback sensed from the loads on the wing, active control...
The study of variational typing originated from the problem of type inference for variational programs, which encode numerous different but related plain programs. In this dissertation, I present a sound and complete type inference algorithm for inferring types of all plain programs encoded in variational programs. The proposed algorithm runs...
Direct red Solid Oxide Fuel Cell (SOFC) Turbine hybrid plants have the potential to dramatically increase power plant efficiency, decrease emissions, and provide fast response to transient loads. The US Department of Energy's (DOE) Hybrid Performance Project is an experimental hybrid SOFC plant, built at the National
Energy Technology Laboratory...
Monte-Carlo Tree Search (MCTS) is an online-planning algorithm for decision-theoretic planning in domains with stochastic and combinatorial structure. The general applicability of MCTS makes it an ideal first choice to investigate when developing planners for complex applications requiring automated control and planning. The first contribution of this thesis is to...
Automatic analysis of American football videos can help teams develop strategies and extract patterns with less human effort. In this work, we focus on the problem of automatically determining which team is on offense/defense, which is an important subproblem for higher-level analysis. While seemingly mundane, this problem is quite challenging...
Writing a program that performs well in a complex environment is a challenging task. In such problems, a method of deterministic programming combined with reinforcement learning (RL) can be helpful. However, current systems either force developers to encode knowledge in very specific forms (e.g., state-action features), or assume advanced RL...
We investigate the data collection problem in sensor networks. The network consists of a number of stationary sensors deployed at different sites for sensing and storing data locally. A mobile element moves from sites to sites to collect data from the sensors periodically. There are different costs associated with the...
Constructing a panorama from a set of videos is a long-standing problem in computer vision. A panorama represents an enhanced still-image representation of an entire scene captured in a set of videos, where each video shows only a part of the scene. Importantly, a panorama shows only the scene background,...
Pedestrian distraction at roadway crossings has been correlated with a higher risk of pedestrian-vehicle collisions due to the pedestrian's cognitive, visual, and motor attention being drawn to a wide variety of secondary tasks.
This study is different from previous field studies of pedestrian midblock crossings in that the geometric layout...
The purpose of this article is to explore student reasoning with regard to problems in logic, particularly those related to the Principle of Mathematical Induction (PMI). The five case studies presented build off of work done by other researchers, most notably Dubinsky and Harel, who both looked at how students'...
In real networks, identifying dense regions is of great importance. For example, in a network that represents academic collaboration, authors within the densest component of the graph tend to be the most prolific. Dense subgraphs often identify communities in social networks. And dense subgraphs can be used to discover regulatory...
Tensegrity structures are composed of pure compressional elements that are connected via a network of pure tensional elements. The concept of tensegrity promises numerous advantages to the field of robotics. Tensegrity robots are, however, notoriously difficult to control due to their oscillatory nature and nonlinear interaction between the components. Multiagent...
This thesis considers the problem in which a teacher is interested in teaching action policies to computer agents for sequential decision making. The vast majority of policy
learning algorithms o er teachers little flexibility in how policies are taught. In particular,
one of two learning modes is typically considered: 1)...
Easy-first, a search-based structured prediction approach, has been applied to many NLP tasks including dependency parsing and coreference resolution. This approach employs a learned greedy policy (action scoring function) to make easy decisions first, which constrains the remaining decisions and makes them easier. This thesis studies the problem of learning...
Auctions are used to solve resource allocation problem between many agents and many items in real-world settings. Unfortunately, in most cases, it is possible for selfish agents to manipulate the system for their own interest at the expense of the social welfare. Such manipulation can be prevented using the Vickrey-Clarke-Groves...
Novelty detection plays an important role in machine learning and signal processing. This
project studies novelty detection in a new setting where the data object is represented as
a bag of instances and associated with multiple class labels, referred to as multi-instance
multi-label (MIML) learning. Contrary to the common assumption...
Citizen Science is a paradigm in which volunteers from the general public participate in scientific studies, often by performing data collection. This paradigm is especially useful if the scope of the study is too broad to be performed by a limited number of trained scientists. Although citizen scientists can contribute...
Physical activity recognition using accelerometer data is a rapidly emerging field with many real-world applications. Much of the previous work in this area has assumed that the accelerometer data has already been segmented into pure activities, and the activity recognition task has been to classify these segments. In reality, activity...