Until now, most hypertext systems have been implemented on large scale computers. With improvements in microprocessors and development of graphical user interfaces, personal computers can run systems that previously needed the power of a mainframe. The low costs and widespread use of PCs will enable many people to use hypertext...
Error-correcting output coding (ECOC) is a method for converting a k-class supervised learning problem into a large number L of two-class supervised learning problems and then combining the results of these L evaluations. Previous research has shown that ECOC can dramatically improve the classification accuracy of supervised learning algorithms that learn to classify...
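The conversion described above can be illustrated with a minimal sketch. It is not the implementation studied in the work: the exhaustive code matrix, the 1-nearest-neighbour base learner, and every name below (exhaustive_code, nearest_label, ecoc_predict, the toy training set) are assumptions chosen only to keep the example self-contained. Each column of the code matrix defines one two-class problem, and a prediction is decoded to the class whose codeword is closest in Hamming distance.

```python
from itertools import product

def exhaustive_code(k):
    """One codeword (row) per class, 2**(k-1) - 1 columns.
    Class 0 is fixed to bit 1 to avoid complementary columns;
    the all-ones column is dropped because it separates nothing."""
    cols = [(1,) + rest for rest in product((0, 1), repeat=k - 1) if not all(rest)]
    return [[col[c] for col in cols] for c in range(k)]

def hamming(a, b):
    """Number of positions where two bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def decode(code, bits):
    """Return the class whose codeword is nearest to the predicted bits."""
    return min(range(len(code)), key=lambda c: hamming(code[c], bits))

def nearest_label(X, labels, x):
    """Hypothetical two-class base learner: 1-nearest-neighbour."""
    def dist(p):
        return sum((a - b) ** 2 for a, b in zip(p, x))
    return min(zip(X, labels), key=lambda pair: dist(pair[0]))[1]

def ecoc_predict(X, y, code, x):
    """Solve one two-class problem per code column, then decode."""
    bits = []
    for j in range(len(code[0])):
        relabeled = [code[c][j] for c in y]   # k-class labels -> binary labels
        bits.append(nearest_label(X, relabeled, x))
    return decode(code, bits)

# Toy data (assumed): one training point per class.
CODE = exhaustive_code(4)
X_TRAIN = [(0, 0), (0, 10), (10, 0), (10, 10)]
Y_TRAIN = [0, 1, 2, 3]

print(ecoc_predict(X_TRAIN, Y_TRAIN, CODE, (9, 1)))
```

With four classes this code has seven columns and a minimum Hamming distance of four between codewords, so decoding still recovers the right class when any single two-class prediction is wrong.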
Artificial Intelligence (AI) planning techniques have been central to automating a gamut of tasks from the mundane route planning and beer production to the ethereal image processing of space-ship images. Of all the planning techniques, hierarchical-decomposition planning has been the technique most employed in industrial-strength planners. Hierarchical-decomposition planning is...
This report addresses the design and implementation of an internet-based grading tool for the "Translators" course. The motivation is to avoid exposing the instructor's Java byte-code to possible reverse-engineering tools and enable students to submit their homework virtually from any machine across the internet. This tool is intended to replace...
The Elliptic Curve Digital Signature Algorithm (ECDSA) is the elliptic curve analog of the Digital Signature Algorithm (DSA) and a federal government approved digital signature method. In this thesis work, software optimization techniques were applied to speed up the ECDSA for a particular NIST curve over GF(p). The Montgomery multiplication...
This dissertation explores and analyzes the performance of several Bayesian anytime inference algorithms for dynamic influence diagrams. These algorithms are compared on the On-Line Maintenance Agent testbed, a software artifact permitting comparison of dynamic reasoning algorithms used by an agent on a variety of simulated maintenance and monitoring tasks. Analysis...
Remote sensing is the most practical way to acquire large amounts of land cover data for monitoring and understanding environmental change, so it is important to be able to map land cover from imagery. Maps defining land cover patches as polygons rather than pixels greatly improve processing efficiency in models...
Reinforcement Learning (RL) is the study of agents that learn optimal
behavior by interacting with and receiving rewards and punishments from an unknown
environment. RL agents typically do this by learning value functions that
assign a value to each state (situation) or to each state-action pair. Recently,
there has been...
Regression testing is a common and necessary task carried out by software practitioners to validate the quality of evolving software systems. Unfortunately, regression testing is often an expensive, time-consuming process, particularly when applied to large software systems. Consequently, practitioners may wish to prioritize the test cases in their regression test...
Coarse resolution imagery, such as that produced by the MODIS instrument, poses the challenge of estimating sub-pixel proportions of different land cover types. This problem is difficult because of the variety and variability of vegetation within individual pixels. This thesis describes and compares two existing algorithms for estimating...
Regression testing is an expensive software engineering activity intended to provide confidence that modifications to a software system have not introduced faults. Test case prioritization techniques help to reduce regression testing cost by ordering test cases in a way that better achieves testing objectives. In this thesis, we are interested...
Farm machinery continues to increase in its importance to the agricultural sector. Depreciation, the decline in value of a durable asset over time, represents one of the largest costs of agricultural production. The general objectives of this study were to update and expand the number of Remaining Value (RV) functions...
In its simplest form, the process of diagnosis is a decision-making process in which the diagnostician performs a sequence of tests culminating in a diagnostic decision. For example, a physician might perform a series of simple measurements (body temperature, weight, etc.) and laboratory measurements (white blood count, CT scan,...
End-user programmers are writing an unprecedented number of programs, due in large part to the significant effort put forth to bring programming power to end users. Unfortunately, this effort has not been supplemented by a comparable effort to increase the correctness of these often faulty programs. To address this need,...
Supervised learning is concerned with discovering the relationship between example sets of features and their corresponding classes. The traditional supervised learning formulation assumes that all examples are independent from one another. The order of the examples contains no information. Nonetheless, many problems have a sequential nature. Classifiers for these problems...
Shape transformation is a technique for gradually changing one geometric shape to another. A recent approach presents the use of thin-plate radial basis functions as opposed to traditional "blobby sphere" implicit functions. Without the explicit evaluation of the energy function, this approach combined the two traditional steps into one by...
This thesis presents the results of two studies that investigate the question of what interruption-styles are most appropriate for end-user programmers who are debugging programs. In the studies, end-user programmers are presented with surprises that encourage them to investigate, use, and learn about debugging devices in their programming environment. We...
Statistics and Metrics Generator (SMG) is a software tool that gathers, stores, and reports
performance and execution metrics for a web-based software installation process. The
purpose of this project was to develop a software utility that provides feedback about software
download and installation process efficiency. A web-based software installation is...
Edit distances are a well-established technique for classification problems. They have been employed successfully in many classification problems including chromosome classification and hand-written digit recognition. Virtually all machine learning algorithms represent the objects to be classified as vectors of features. However, edit distances provide only a measure of the difference...
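As an illustration of how an edit distance can drive classification, here is a minimal sketch that assumes the plain Levenshtein distance (unit-cost insertions, deletions, and substitutions) and a 1-nearest-neighbour rule. The work itself may use a different distance or classifier; the function names and the toy labelled strings are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between sequences a and b (unit costs),
    computed row by row so only two rows are kept in memory."""
    prev = list(range(len(b) + 1))           # distance from "" to b[:j]
    for i, ca in enumerate(a, 1):
        cur = [i]                            # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]

def classify(query, examples):
    """Return the label of the labelled example closest to query."""
    return min(examples, key=lambda ex: edit_distance(query, ex[0]))[1]

# Toy labelled strings (assumed, for illustration only).
EXAMPLES = [("ACGTA", "x"), ("TTTTT", "y")]
print(edit_distance("kitten", "sitting"))
print(classify("ACGTT", EXAMPLES))
```

Note that this rule needs only pairwise distances, which is why it works without a feature-vector representation of the objects.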
Image segmentation continues to be a fundamental problem in computer vision and image understanding. In this thesis, we present a Bayesian network that we use for object boundary detection in which the MPE (most probable explanation) before any evidence can produce multiple non-overlapping, non-self-intersecting closed contours and the MPE with...
Hardwood lumber is a major forest product, and board grading is an important part of its manufacturing and marketing. Computer grading programs have been used to train graders and to grade lumber for board data banks, but they have not been used to machine-grade boards in an industrial environment because...
End-user programming is growing at a rapid rate, but there has been little in the way of tools or environments to improve the correctness of programs created by end users. We present an approach to dynamic assertions in one of the most widely used end-user programming paradigms - namely the...
End users develop more software than any other group of programmers, using software authoring devices such as e-mail filtering editors, by-demonstration macro builders, and spreadsheet environments. Despite this, there has been only a little research on finding ways to help these programmers with the dependability of the software they create....
Alignment of genomic sequences from different species is becoming an increasingly powerful method in biology, and is being used for many purposes. The result of sequence alignments is a list of pairs of matched locations between the pattern string and the text string. However, without any proper visualization tools to...
This thesis presents a novel technique for retiming keyframe-based animation. We call our approach Performance Timing. Keyframing is a standard technique for generating computer animation that typically requires artistic ability and a set of skills for the software package being used. From our experience observing novice animators and their work,...
Controlling a virtual character with a pen input device is difficult. Pen input
devices require freeform gestures, and users are not confined to a particular mapping of a
key or a button that is exactly repeatable. This is a problem since an intuitive motion
gesture for one user might not be...
Graph-based approaches for sequencing motion capture data have produced some of the most realistic and controllable character motion to date. Most previous graph-based approaches have employed a run-time global search to find paths through the motion graph that meet user-defined constraints such as a desired locomotion path. Such searches do...
The appropriate separation of concerns is a fundamental engineering principle. A concern, for software developers, is that which must be represented by code in a program; by extension, separation of concerns is the ability to represent a single concern in a single appropriate programming language construct. Advanced separation of concerns...
In some practical systems, most of the errors are of 1 → 0 type and 0 → 1
errors occur very rarely. In this thesis, first, the capacity of the asymmetric
channel is derived. The capacity of the binary symmetric channel (BSC) and the
Z-channel can be derived from this...
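The relationship between the three channels can be checked numerically. The sketch below is not the derivation from the thesis: it assumes a binary asymmetric channel with crossover probabilities e10 = P(Y=0|X=1) and e01 = P(Y=1|X=0) (names chosen here), and finds the capacity by a ternary search over the input probability q = P(X=1), which works because mutual information is concave in q. Setting e10 = e01 gives the BSC, and setting e01 = 0 gives the Z-channel.

```python
from math import log2

def h2(p):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def mutual_information(q, e10, e01):
    """I(X;Y) in bits for P(X=1) = q on the binary asymmetric channel."""
    p_y1 = q * (1 - e10) + (1 - q) * e01      # P(Y = 1)
    return h2(p_y1) - q * h2(e10) - (1 - q) * h2(e01)

def capacity(e10, e01, iters=200):
    """Maximise I(X;Y) over q by ternary search (numerical assumption)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if mutual_information(m1, e10, e01) < mutual_information(m2, e10, e01):
            lo = m1
        else:
            hi = m2
    return mutual_information((lo + hi) / 2, e10, e01)

print(capacity(0.1, 0.1))   # BSC with crossover probability 0.1
print(capacity(0.5, 0.0))   # Z-channel with 1 -> 0 error probability 0.5
```

For the BSC the result should agree with the closed form 1 - h2(p), and for the Z-channel with the known closed form log2(1 + (1 - p) p^(p/(1-p))).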
Professional software developers do not test code adequately, even though testing tools are widely available. Until developers realize the deficiencies in their tests, inadequate testing of software seems likely to remain a major problem. To support developers writing tests, industry and researchers have proposed systems that visualize “testedness” for end-user...
Machine learning encompasses probabilistic and statistical techniques that can build models from large quantities of extensional information (examples) with minimal dependence on intensional information (domain knowledge). This focus of machine learning is reflected in the never-ending quest for "off-the-shelf" classifiers. To generalize to unseen data, however, we must make use...
The Elliptic Curve Digital Signature Algorithm (ECDSA) is a public key cryptosystem used for creation and verification of digital signatures in electronic documents. In this thesis, we created a Java applet that provides the functionality of the ECDSA using all of the NIST elliptic curves over GF(p). This applet was...
Successful software systems evolve over their lifetimes through the cumulative changes made by software maintainers. As software evolves, the problems resulting from software change worsen, exacerbated by increased system size and complexity, lack of program understanding, amount of effort required to make changes, and number of personnel involved. Experience shows...
“‘Biometrics is at the forefront in our agenda for homeland security,’ declared Asa Hutchinson, the Department of Homeland Security's undersecretary for border and transportation security, at the 2004 Biometric Consortium Conference” [11].
Flashy retinal scanning and voice activated computers were once considered technologies for science fiction movies and novels. Nowadays,...
Functional programming is concerned with referential transparency: a given function applied to a given argument always yields the same result. This property appears to be violated in applications involving uncertainty, such as rolling a die. This thesis defines the background of probabilistic programming and domain-specific...
3D datasets are of great importance in medical imaging. In this thesis we survey and enhance solutions to problems inherently associated with 3D datasets: processing time, noise, and visualization. Efforts include development of a tool kit to provide a multi-threaded processing platform to cut processing time, produce real time visualization...
Protein secondary structure prediction plays a pivotal role in predicting protein folding in three-dimensions. Its task is to assign each residue one of the three secondary structure classes helix, strand, or random coil. This is an instance of the problem of sequential supervised learning in machine learning. This thesis describes...
The thesis focuses on model-based approximation methods for reinforcement
learning with large scale applications such as combinatorial optimization problems.
First, the thesis proposes two new model-based methods to stabilize the
value-function approximation for reinforcement learning. The first one is the
BFBP algorithm, a batch-like reinforcement learning process which iterates between...
Although researchers have begun to explicitly support end-user programmers’ debugging by providing information to help them find bugs, there is little research addressing the right content to communicate to these users. The specific semantic content of these debugging communications matters because, if the users are not actually seeking the information...
The computer game industry continues to progress toward realistic-looking character motion. However, even in state-of-the-art games, the use of motion capture data in character animation may result in errors such as “foot slipping,” where the feet do not match up with the floor properly during translation. Various algorithms have been...
Popular applications such as P2P file sharing, multiplayer gaming, videoconferencing, etc. rely on the efficiency of content distribution from a single source to multiple receivers. Most users of these applications are on the widely prevalent source constraint networks such as the Digital Subscriber Line (DSL) and wireless networks. Overlay multicast...
This thesis describes the implementation of an interface for querying established correspondences between anatomical structures across species. I was the main developer of this query engine, called the Comparative Anatomy Information System. My work involved developing methods to query the knowledge base, perform the specified comparison, display the anatomical hierarchies...
This thesis presents a domain specific visual language designed to allow coaches to create content that exhibits the complex 2D interactions observed in the game of American football. Coaches can visually program the content by using symbols and drawing primitives similar to those that they currently use to design static...
As broadband Internet becomes widely available, Peer-to-Peer (P2P) applications over the Internet are becoming increasingly popular. One such example is a video multicast application in which one source streams a video to a large number of destination nodes through an overlay multicast tree consisting of peers.
These overlay multicast-based applications,...
The high cost of manually producing background characters creates a demand for a
way to automatically generate plausible behaviors. These background extras need to
behave in a manner that is believable such that they do not distract the focus of the
audience from the primary action occurring in the scene....
Traditional application of Voronoi diagrams for space partitioning creates Voronoi regions, with areas determined by the generators’ relative locations and weights. Especially in the area of information space (re)construction, however, there is a need for inverse solutions; i.e., finding weights that result in regions with predefined areas. In this thesis,...
This thesis presents a model for simulating individual pedestrian motion based on empirical data. The model keeps track of a pedestrian’s position, orientation, and body configuration and leverages motion capture data to generate plausible motion. Our model can automatically incorporate a pedestrian’s physical limitations when making movement decisions, since it...
We present an approach for generating a character’s response in anticipation of an impending impact. Protective anticipatory movement is built upon several simple actions that have been identified as response mechanisms in monkeys and in humans. These actions are parameterized by a model of the interaction based on the approaching...
The code reuse problem is a common software engineering problem in scientific computing. As a prevailing programming language in many scientific fields, Fortran does not provide support to address this problem. One particular reason is that Fortran lacks the support for generic programming. By applying program-generation techniques, we developed two...