I present a new heuristic search approach to compute approximate answers for the probability query in belief nets. This approach can compute the 'best' bounds for a query in a period of any given time (if time permitted, it will get an exact value). It inherits the essence of Symbolic...
We have developed SearchPak, a machine independent parallel searching tool on shared and distributed memory machines. It can be used for combinatorial optimization problems and OR-parallel computations as well. Both depth-first and best-first search of the state space can be performed using the SearchPak. With SearchPak, a user just provides...
High level data-parallel languages are easy to use and shield the programmer from machine specific details. A simple and efficient way of providing an interface to such languages is to develop a machine-independent compiler and a routing library, which isolates the low-level machine dependent communication functions, The compiler translates the...
Dataparallel C has been designed for various kinds of architectures and was developed jointly at University of New Hampshire and Oregon State University.
This project dealt with porting the libraries to the nCUBE 2 system and helping a user compile his programs on nCUBE 2 directly from a Sun workstation...
In this paper we describe a fuzzy logic based control system implemented on a PC architecture. The rule based inference engine of this expert system is easily configured through an ascii text file and is demonstrated to be capable of controlling various simulations, including an inverted pendulum. The controller is...
We present a model for a distributed virtual market place that can be constructed on the Internet to support selling and buying requests, such as those found as classified advertisements. One requirement for a transaction to take place in the virtual market place is that a sell request and a...
Wireless Application Protocol (WAP) is an industry standard aimed to bring the web to handheld devices. The handheld devices are constrained by battery life and it becomes important that power is conserved across web transactions. The power consumed by the handheld device is directly proportional to the time taken for...
Structure Query Language (SQL) is widely used to access data stored in relational database systems. Although a powerful and flexible language, SQL can also be complex and hard to learn. For most new SQL users, it's easy to write SQL statement by following SQL grammar and syntax rules, but it's...
A video server is the multimedia equivalent of a data file server. As video goes digital video servers that support digital media are required for production and broadcasting in television companies. Grass Valley Group provides comprehensive digital video solutions through a range of video servers called Profiles.
The Profile Media...
Recent work has shown that AdaBoost can be viewed as an algorithm that maximizes the margin on the training data via functional gradient descent. Under this interpretation, the weight computed by AdaBoost, for each hypothesis generated, can be viewed as a step size parameter in a gradient descent search. Friedman...
Analysis of programs forms an important activity in the field of software engineering. It is necessary to help understand the code, which facilitates comprehensive testing, maintenance and optimization of code. Aristotle is a tool for analyzing programs written in C. We have designed a system on similar lines for Java...
Remote sensing is the most practical way to acquire large amounts of land cover data for monitoring and understanding environmental change, so it is important to be able to map land cover from imagery. Maps defining land cover patches as polygons rather than pixels greatly improve processing efficiency in models...
We have developed a web-based software tool for evaluating the potential risk of pesticides on the surrounding environment. The software tool is called Web-based Pesticide Screening Tool (Web-PST). It uses the formulas and standards specified by the Soil/Pesticide Interaction Screening Procedure Version II (SPISP II). Web-PST closely models the stand-alone...
Regression testing is an expensive testing process used to validate changes made to previously tested software. Different regression testing techniques can have different impacts on the cost-effectiveness of testing. This cost-effectiveness can also vary with different characteristics of test suites. One such characteristic, test suite granularity, reflects the way in...
Component based software technologies are viewed as essential for creating the software systems of the future. However the use of externally provided components has serious drawbacks for a wide range of software engineering activities often because of a lack of information about the components. One such drawback involves validation of...
Component based software technologies are viewed as essential for creating the software systems of the future. However, the use of externally provided component has serious drawbacks for a wide range of software engineering activities often because of a lack of information about the components. One such drawback involves validation of...
Web-Site Generator 3 (WebSiteGen3) is a rapid application development (RAD) tool that generates ASP.NET forms to insert, query, update and delete data stored in the user tables in a SQL Server 2000 database. WebSiteGen3 uses a graphical user interface to show the user tables in a hierarchical tree based on...
Revised Universal Soil Loss Equation (RUSLE) is a standard for estimating soil loss caused by rainfall and overland flow. The current software tool that implements this standard is RUSLE 1.06b, which is a stand-alone DOS application. In this project, we converted the DOS application to a Web application, which we...
We have developed a framework for Web-based GIS/database applications which allow users to insert, update, delete, and query data with a map interface displayed by Web browsers. The framework was designed so that a Web-based GIS application that uses ArcIMS as a map server can be easily created, customized, and...
Environmental and regulatory concerns are causing confined animal feeding operations (CAFO's) to account for phosphorous content when applying wastewater to agricultural fields for disposal. In most cases this requires more land to spread the waste onto so that the phosphorous needs of the crop are not exceeded. A mobile process...
Regression testing is an expensive software engineering activity intended to provide confidence that modifications to a software system have not introduced faults. Test case prioritization techniques help to reduce regression testing cost by ordering test cases in a way that better achieves testing objectives. In this thesis, we are interested...
We have developed a prototype web-based GIS application for tracking the locations of moving entities. This application, which is called the Responder application, consists of two parts: the Responder client and the Responder server. The Responder client is a .NET application written in C#. It reads the location data from...
CREEDA-BG (Crop Rotation Economic and Environmental Decision-Aid Budget Generator) is a web-based enterprise budgeting tool that allows a user to manage information on her fields, rotations, crops, and operations, and to view estimates of income and expenses. The user can then use this information to evaluate alternative plans and make...
Web-based Pesticide Screening Tool (Web-PST) is a web-based software application for evaluating the potential risk of pesticides on the surrounding environment. It uses the formulas and standards specified by the Soil/Pesticide Interaction Screening Procedure Version II (SPISP II). Web-PST closely models the stand-alone Windows application Windows Pesticide Screening Tool (WIN-PST)....
Image segmentation continues to be a fundamental problem in computer vision and image understanding. In this thesis, we present a Bayesian network that we use for object boundary detection in which the MPE (most probable explanation) before any evidence can produce multiple non-overlapping, non-self-intersecting closed contours and the MPE with...
End users develop more software than any other group of programmers, using software authoring devices such as e-mail filtering editors, by-demonstration macro builders, and spreadsheet environments. Despite this, there has been only a little research on finding ways to help these programmers with the dependability of the software they create....
"Collaborative filtering algorithms’ performances have been evaluated using a variety of metrics.
These metrics, such as Mean Absolute Error and Precision, have often ignored recommendations for
which they do not have data. Ignoring these recommendations has provided numbers which do not
accurately represent the user experience. Qualitatively we have seen...
The appropriate separation of concerns is a fundamental engineering principle. A concern, for software developers, is that which must be represented by code in a program; by extension, separation of concerns is the ability to represent a single concern in a single appropriate programming language construct. Advanced separation of concerns...
Protein secondary structure prediction plays a pivotal role in predicting protein folding in three-dimensions. Its task is to assign each residue one of the three secondary structure classes helix, strand, or random coil. This is an instance of the problem of sequential supervised learning in machine learning. This thesis describes...
The thesis focuses on model-based approximation methods for reinforcement
learning with large scale applications such as combinatorial optimization problems.
First, the thesis proposes two new model-based methods to stablize the
value–function approximation for reinforcement learning. The first one is the
BFBP algorithm, a batch-like reinforcement learning process which iterates between...
Software maintenance accounts for a large portion of the software development cost, particularly the process of updating programs either to adapt for requirement change or to enhance design or efficiency. Currently, program updates are generally performed manually by programmers using text editors. This is an unreliable
method because syntax and...
Accessing information on the Web has become ingrained into our daily lives, and we seek information from many different sources, including conference and journal publications, personal web pages, and others. Increasingly, web-based information retrieval systems such as web-based search engines, library on-line catalog systems, and subscription-based federated search systems are...
Domain-independent automated planning is concerned with computing a sequence of actions that can transform an initial state into a desired goal state. Resource production domains form an interesting class of such problems, in that they typically require reasoning about concurrent durative-actions with continuous effects while minimizing some cost function. Although...
Image feature detection and matching are two critical processes for many computer vision tasks. Currently, intensity-based local interest region detectors and local feature-based matching methods are used widely in computer vision applications. But in some applications, such as biological object recognition tasks, within-class changes in pose, lighting, color, and texture...
Current ACI design provisions for shear in reinforced concrete (RC) beams strengthened
with externally bonded carbon fiber-reinforced polymer (CFRP) U-wraps may not adequately account for the effect of scale. This paper describes tests of 6 geometrically scaled beams to identify possible scale effects of such beams. Results indicated that effective...
This thesis addresses the problem of learning dynamic Bayesian network (DBN) models to support reinforcement learning. It focuses on learning regression tree models of the conditional probability distributions of the DBNs. Existing algorithms presume that the stochasticity in the domain can be modeled as a deterministic function with additive noise....
We consider the problem of tactical assault planning in real-time strategy games where a team of friendly agents must launch an assault on an enemy. This problem offers many challenges including a highly dynamic and uncertain environment, multiple agents, durative actions, numeric attributes, and different optimization objectives. While the dynamics...
Automated recognition of object categories in images is a critical step for many real-world computer vision applications. Interest region detectors and region descriptors have been widely employed to tackle the variability of objects in pose, scale, lighting, texture, color, and so on. Different types of object recognition problems usually require...
Knowledge workers are struggling in the information flood. There is a growing interest in intelligent desktop environments that help knowledge workers organize their daily life. Intelligent desktop environments allow the desktop user to define a set of “activities” that characterize the user’s desktop work. These environments then attempt to identify...
Recently, the concept of performance based design has become popular for many types of structures, including port facilities under seismic loading. For the case of pile-supported wharves, the level of performance is generally estimated using the displacement capacity of the structure. Therefore, understanding soil-pile interaction is one of the most...
Ocean wave energy is a new and developing field of renewable energy with great potential. The energy contained in one meter of an average wave off the coast of Newport Oregon could supply dozens of homes with electricity. However, ocean waves are usually quite irregular which leads to large bursts...
The problem of document classification has been widely studied in machine learning and data mining. In document classification, most of the popular algorithms are based on the bag-of-words representation. Due to the high dimensionality of the bag-of-words representation, significant research has been conducted to reduce the dimensionality via different approaches....
This dissertation explores the idea of applying machine learning technologies to help computer users find information and better organize electronic resources, by presenting the research work conducted in the following three applications: FolderPredictor, Stacking Recommendation Engines, and Integrating Learning and Reasoning.
FolderPredictor is an intelligent desktop software tool that helps...
In wood-based composites, the glue-line (interface) between wood-strands affects the stress transfer from one member to the next. The glue-line properties determine the rate of load transfer between phases and these properties depend on wood species, surface preparation, glue properties, glue penetration into wood cells, and moisture content of the...
Customer requirements and engineering specifications will influence the direction of the design in any product, so it is crucial to make the requirements as stable as possible so quality designs can be produced on time within a budget. It would be optimal for requirements and specifications to remain constant but...
Monte-Carlo planning algorithms such as UCT make decisions at each step by
intelligently expanding a single search tree given the available time and then
selecting the best root action. Recent work has provided evidence that it can be
advantageous to instead construct an ensemble of search trees and make a...
This dissertation explores algorithms for learning ranking functions to efficiently solve search problems, with application to automated planning. Specifically, we consider the frameworks of beam search, greedy search, and randomized search, which all aim to maintain tractability at the cost of not guaranteeing completeness nor optimality. Our learning objective for...
A thesis presented on the determination of factors related to deposition, clearance, and dosimetry of the ICRP 30 lung model as related to smoking. Variations in calculations used to determine radiation doses leave a need to determine a standard method to determine absorbed dose from smoking. Standard parameters are used...
Since free riders in P2P network reduce the system's performance, how to maintain and encourage the nodes' cooperation is an important aspect of P2P related research. In this thesis, a P2P system is modeled based on two games: stag hunt game and snowdrift game. To relate the model to the...
The first edition of the Highway Safety Manual (HSM) provides a quantitative
approach to predict the safety of transportation facilities based on the recently
developed scientific methods. This approach, known as the predictive method, was
developed for several states in the United States. Due to differences in driver
population, weather...
Microstructure changes in uranium and uranium/metal alloys due to radiation damage are of great interest in nuclear science and engineering. Titanium has attracted attention because of its similarity to Zr. It has been proposed for use in the second generation of fusion reactors due to its resistance to radiation-induced swelling....
Microchannel process technology (MPT) offers several advantages to the field of nanomanufacturing: 1) improved process control over very short time intervals owing to shorter diffusional distances; and 2) reduced reactor size due to high surface area to volume ratios and enhanced heat and mass transfer. The objective of this thesis...
We consider the problem of strategic adversarial planning in a Real-Time Strategy (RTS) game. Strategic adversarial planning is the generation of a network of high-level tasks to satisfy goals while anticipating an adversary's actions. In this thesis we describe an abstract state and action space used for planning in an...
In this work, I examine the problem of understanding American football in video. In particular, I present several mid-level computer vision algorithms that each accomplish a different sub-task within a larger system for annotating, interpreting, and analyzing collections of American football video. The analysis of football video is useful in...
Bayesian Optimization (BO) methods are often used to optimize an unknown function f(•) that is costly to evaluate. They typically work in an iterative manner. In each iteration, given a set of observation points, BO algorithms select k ≥ 1 points to be evaluated. The results of those points are...
Semi-supervised clustering aims to improve clustering performance by considering user supervision in the form of pairwise constraints. In this paper, we study the active learning problem of selecting pairwise must-link and cannot-link constraints for semisupervised clustering. We consider active learning in an iterative manner where in each iteration queries are...
Air traffic flow management over the U.S. airpsace is a difficult problem. Current management approaches lead to hundreds of thousands of hours of delay, costing billions of dollars annually. Weather and airport conditions may instigate this delay, but routing decisions balancing delay with congestion contribute significantly to the propagation of...
Markov Decision Process (MDP) is a well-known framework for devising the optimal decision making strategies under uncertainty. Typically, the decision maker assumes a stationary environment which is characterized by a time-invariant transition probability matrix. However, in many real-world scenarios, this assumption is not justified, thus the optimal strategy might not...
Physical activity recognition using accelerometer data is a rapidly emerging field with many real-world applications. Much of the previous work in this area has assumed that the accelerometer data has already been segmented into pure activities, and the activity recognition task has been to classify these segments. In reality, activity...
Auctions are used to solve resource allocation problem between many agents and many items in real-world settings. Unfortunately, in most cases, it is possible for selfish agents to manipulate the system for their own interest at the expense of the social welfare. Such manipulation can be prevented using the Vickrey-Clarke-Groves...
Novelty detection plays an important role in machine learning and signal processing. This
project studies novelty detection in a new setting where the data object is represented as
a bag of instances and associated with multiple class labels, referred to as multi-instance
multi-label (MIML) learning. Contrary to the common assumption...
Citizen Science is a paradigm in which volunteers from the general public participate in scientific studies, often by performing data collection. This paradigm is especially useful if the scope of the study is too broad to be performed by a limited number of trained scientists. Although citizen scientists can contribute...
Easy-first, a search-based structured prediction approach, has been applied to many NLP tasks including dependency parsing and coreference resolution. This approach employs a learned greedy policy (action scoring function) to make easy decisions first, which constrains the remaining decisions and makes them easier. This thesis studies the problem of learning...
This thesis considers the problem in which a teacher is interested in teaching action policies to computer agents for sequential decision making. The vast majority of policy
learning algorithms o er teachers little flexibility in how policies are taught. In particular,
one of two learning modes is typically considered: 1)...
Tensegrity structures are composed of pure compressional elements that are connected via a network of pure tensional elements. The concept of tensegrity promises numerous advantages to the field of robotics. Tensegrity robots are, however, notoriously difficult to control due to their oscillatory nature and nonlinear interaction between the components. Multiagent...
In real networks, identifying dense regions is of great importance. For example, in a network that represents academic collaboration, authors within the densest component of the graph tend to be the most prolific. Dense subgraphs often identify communities in social networks. And dense subgraphs can be used to discover regulatory...
The purpose of this article is to explore student reasoning with regard to problems in logic, particularly those related to the Principle of Mathematical Induction (PMI). The five case studies presented build off of work done by other researchers, most notably Dubinsky and Harel, who both looked at how students'...
Pedestrian distraction at roadway crossings has been correlated with a higher risk of pedestrian-vehicle collisions due to the pedestrian's cognitive, visual, and motor attention being drawn to a wide variety of secondary tasks.
This study is different from previous field studies of pedestrian midblock crossings in that the geometric layout...
We investigate the data collection problem in sensor networks. The network consists of a number of stationary sensors deployed at different sites for sensing and storing data locally. A mobile element moves from sites to sites to collect data from the sensors periodically. There are different costs associated with the...
Constructing a panorama from a set of videos is a long-standing problem in computer vision. A panorama represents an enhanced still-image representation of an entire scene captured in a set of videos, where each video shows only a part of the scene. Importantly, a panorama shows only the scene background,...
Through passive adaptation to incidental flow, flexible aerodynamic surfaces exploit effects of increased lift, delayed stall and disturbance rejection. Wings of birds, bats, and insects exhibit these passive effects, and at the same time through the use of structural state feedback sensed from the loads on the wing, active control...
The Open Modeling Environment (OME) is a tool developed to address some known shortcomings in ecological System Dynamics (SD) modeling research. OME provides a common set of methods for interacting directly with spatial information, reducing the need for modelers to create their own methods for doing so. The environment is...
Monte-Carlo Tree Search (MCTS) is an online-planning algorithm for decision-theoretic planning in domains with stochastic and combinatorial structure. The general applicability of MCTS makes it an ideal first choice to investigate when developing planners for complex applications requiring automated control and planning. The first contribution of this thesis is to...
Automatic analysis of American football videos can help teams develop strategies and extract patterns with less human effort. In this work, we focus on the problem of automatically determining which team is on offense/defense, which is an important subproblem for higher-level analysis. While seemingly mundane, this problem is quite challenging...
Writing a program that performs well in a complex environment is a challenging task. In such problems, a method of deterministic programming combined with reinforcement learning (RL) can be helpful. However, current systems either force developers to encode knowledge in very specific forms (e.g., state-action features), or assume advanced RL...
Direct red Solid Oxide Fuel Cell (SOFC) Turbine hybrid plants have the potential to dramatically increase power plant efficiency, decrease emissions, and provide fast response to transient loads. The US Department of Energy's (DOE) Hybrid Performance Project is an experimental hybrid SOFC plant, built at the National
Energy Technology Laboratory...
The study of variational typing originated from the problem of type inference for variational programs, which encode numerous different but related plain programs. In this dissertation, I present a sound and complete type inference algorithm for inferring types of all plain programs encoded in variational programs. The proposed algorithm runs...
Machine learning systems are generally trained offline using ground truth data that has been labeled by experts. However, these batch training methods are not a good fit for many applications, especially in the cases where complete ground truth data is not available for offline training. In addition, batch methods do...
Autonomous multiagent teams can be used in complex exploration tasks to both expedite the exploration and improve the efficiency. However, use of multiagent systems presents additional challenges. Specifically, in domains where the agents' actions are tightly coupled, coordinating multiple agents to achieve cooperative behavior at the group level is difficult....
We model the popular board game of Clue as an MDP and evaluate Monte-Carlo policy rollout in a simulated environment pitting different agents and policies against each other. We describe the choices we made in the representation, along with some of the problems we encountered along the way. We find...
In this thesis, we will study certain generalizations of the classical Shannon Sampling Theorem, which allows for the reconstruction of a pi-band-limited, square-integrable function from its samples on the integers. J. R. Higgins provided a generalization where the integers can be perturbed by less than 1/4, which includes nonuniform and...
With the development of technologies in genome sequencing and variant detection, a huge number of variants are detected. To further analyze the variants, it requires an efficient tool to annotate the functional effect of variants. This project managed to develop an efficient program to annotate the functional effect of variants...
Machine learning models for natural language processing have traditionally relied on large numbers of discrete features, built up from atomic categories such as word forms and part-of-speech labels, which are considered completely distinct from each other. Recently however, the advent of dense feature representations coupled with deep learning techniques has...
This work is inspired by problems in natural resource management centered on the challenge of invasive species. Computing optimal management policies for maintaining ecosystem sustainable is challenging. Many ecosystem management problems can be formulated as MDP (Markov Decision Process) planning problems. In a simulator-defined MDP, the Markovian dynamics and rewards...
Appropriate representations of variational software simplify the analysis of their properties.This thesis proposes tailored representations of two kinds variational softwares: difference files of merge commits in Git and feature models. For the former, we use the Choice Edit Model, which is based on the choice calculus, to represent changes introduced...
This thesis focuses on the problem of object tracking. Given a video, the general objective of tracking is to track the location over time of one or more targets in the image sequence. This is a very challenging task as algorithms need to deal with problems such as appearance variations,...
Monte Carlo tree search (MCTS) is a class of online planning algorithms for Markov decision processes (MDPs) and related models that has found success in challenging applications. In the online planning approach, the agent makes a decision in the current state by performing a limited forward search over possible futures...
Mutation testing is one of the effective approaches measuring test adequacy of test suites. It is widely used in both academia and industry. Unfortunately, the adoption and practical use of mutation testing for Python 2.x programs face three obstacles. First, limited useful mutation operators. Existing mutation testing tools support very...
Software testing is the process of evaluating the accuracy and performance of software, and automated software testing allows programmers to develop software more efficiently by decreasing testing costs. We compared two advanced random test generators, a Feedback-Directed Random Test Generator (FDR) and a Feedback-Controlled Random Test Generator (FCR), for an...
Automatic event extraction from natural text is an important and challenging task for natural language understanding. Traditional event detection methods heavily rely on manually engineered rich features. Recent deep learning approaches alleviate this problem by automatic feature engineering. But such efforts, like tradition methods, have so far only focused on...
Although machine learning systems are often effective in real-world applications, there are situations in which they can be even better when provided with some degree of end user feedback. This is especially true when the machine learning system needs to customize itself to the end user's preferences, such as in...
Most tasks in natural language processing (NLP) try to map structured input (e.g., sentence or word sequence) to some form of structured output (tag sequence, parse tree, semantic graph, translated/paraphrased/compressed sentence), a problem known as “structured prediction”. While various learning algorithms such as the perceptron, maximum entropy, and expectation-maximization have...
The thesis focuses on activity recognition from sensor data, which has spurred a great deal of interest due to its impact on health care and security. Previous work on activity recognition from multivariate time series data has mainly applied supervised learning techniques which require a high degree of annotation effort...
Society faces many complex management problems, particularly in the area of shared public resources such as ecosystems. Existing decision making processes are often guided by personal experience and political ideology rather than state-of-the-art scientific understanding. This dissertation envisions a future in which multiple stakeholders are provided with computational tools for...
In the field of machine learning, clustering and classification are two fundamental tasks. Traditionally, clustering is an unsupervised method, where no supervision about the data is available for learning; classification is a supervised task, where fully-labeled data are collected for training a classifier. In some scenarios, however, we may not...
Oxo-hydroxo Group 5 metal clusters are an untapped resource to study and advance aqueous solution processing of metal oxide thin films. The tetramethylammonium (TMA) hexatantalate salt (TMA6[H2Ta6O19]) yields dense Ta2O5 films (~95% of the bulk ß-Ta2O5 density) with atomically smooth surfaces (<4 Å root mean square surface roughness). This same...
There is growing commercial interest in the use of unmanned aerial vehicles (UAVs) in urban environments, specifically for package delivery applications. However, the size, complexity and sheer numbers of expected UAVs makes conventional air traffic management that relies on human air traffic controllers infeasible. To enable UAVs to safely and...
In Intense Pulsed Light (IPL) sintering, pulsed large-area visible light from a xenon lamp is absorbed by nanoparticle films or patterns and converted to heat, resulting in rapid sintering of the nanoparticles. This work experimentally characterizes IPL sintering of silver nanoparticle films. A newly observed turning point in the evolution...
The Rust programming language is a systems programming language with a strong static type system. A central feature of Rust’s type system is its unique concept of “ownership”, which enables Rust to give a user safe, low-level control over resources without the overhead of garbage collection. In Haskell, most data...