"‘Biometrics is at the forefront in our agenda for homeland security,’ declared Asa Hutchinson, the Department of Homeland Security's undersecretary for border and transportation security, at the 2004 Biometric Consortium Conference” [11].
Flashy retinal scanning and voice-activated computers were once considered technologies for science fiction movies and novels. Nowadays,...
Functional programming is concerned with referential transparency: given a certain function and its argument, the result will always be the same. However, this property seems to be violated in applications involving uncertainty, such as rolling a die. This thesis defines the background of probabilistic programming and domain-specific...
The Elliptic Curve Digital Signature Algorithm (ECDSA) is a public key cryptosystem used for creation and verification of digital signatures in electronic documents. In this thesis, we created a Java applet that provides the functionality of the ECDSA using all of the NIST elliptic curves over GF(p). This applet was...
Protein secondary structure prediction plays a pivotal role in predicting protein folding in three dimensions. Its task is to assign each residue one of three secondary structure classes: helix, strand, or random coil. This is an instance of the problem of sequential supervised learning in machine learning. This thesis describes...
Popular applications such as P2P file sharing, multiplayer gaming, videoconferencing, etc. rely on the efficiency of content distribution from a single source to multiple receivers. Most users of these applications are on the widely prevalent source constraint networks such as the Digital Subscriber Line (DSL) and wireless networks. Overlay multicast...
As broadband Internet becomes widely available, Peer-to-Peer (P2P) applications over the Internet are becoming increasingly popular. One such example is a video multicast application in which one source streams a video to a large number of destination nodes through an overlay multicast tree consisting of peers.
These overlay multicast-based applications,...
This thesis describes the implementation of an interface for querying established correspondences between anatomical structures across species. I was the main developer of this query engine, called the Comparative Anatomy Information System. My work involved developing methods to query the knowledge base, perform the specified comparison, display the anatomical hierarchies...
The computer game industry continues to progress toward realistic-looking character motion. However, even in state-of-the-art games, the use of motion capture data in character animation may result in errors such as “foot slipping,” where the feet do not match up with the floor properly during translation. Various algorithms have been...
The high cost of manually producing background characters creates a demand for a way to automatically generate plausible behaviors. These background extras need to behave in a manner that is believable such that they do not distract the focus of the audience from the primary action occurring in the scene....
Media applications on the Internet have become increasingly popular as the bandwidth of network links increases. The bottleneck of existing media systems is no longer the link bandwidth at the user’s end, but the server’s ability to handle streaming requests. These existing streaming systems do not scale...
This thesis presents a domain-specific visual language designed to allow coaches to create content that exhibits the complex 2D interactions observed in the game of American football. Coaches can visually program the content by using symbols and drawing primitives similar to those that they currently use to design static...
The thesis focuses on model-based approximation methods for reinforcement learning with large-scale applications such as combinatorial optimization problems. First, the thesis proposes two new model-based methods to stabilize the value-function approximation for reinforcement learning. The first one is the BFBP algorithm, a batch-like reinforcement learning process which iterates between...
We present an approach for generating a character’s response in anticipation of an impending impact. Protective anticipatory movement is built upon several simple actions that have been identified as response mechanisms in monkeys and in humans. These actions are parameterized by a model of the interaction based on the approaching...
Traditional application of Voronoi diagrams for space partitioning creates Voronoi regions, with areas determined by the generators’ relative locations and weights. Especially in the area of information space (re)construction, however, there is a need for inverse solutions; i.e., finding weights that result in regions with predefined areas. In this thesis,...
The code reuse problem is a common software engineering problem in scientific computing. As a prevailing programming language in many scientific fields, Fortran does not provide support to address this problem. One particular reason is that Fortran lacks the support for generic programming. By applying program-generation techniques, we developed two...
Although researchers have begun to explicitly support end-user programmers’ debugging by providing information to help them find bugs, there is little research addressing the right content to communicate to these users. The specific semantic content of these debugging communications matters because, if the users are not actually seeking the information...
This thesis presents a model for simulating individual pedestrian motion based on empirical data. The model keeps track of a pedestrian’s position, orientation, and body configuration and leverages motion capture data to generate plausible motion. Our model can automatically incorporate a pedestrian’s physical limitations when making movement decisions, since it...
Software maintenance accounts for a large portion of the software development cost, particularly the process of updating programs either to adapt for requirement change or to enhance design or efficiency. Currently, program updates are generally performed manually by programmers using text editors. This is an unreliable method because syntax and...
Finding information can cost a significant amount of time, even when the information is already stored on the user’s local computer system. There is significant research aimed at reducing these time costs, but little research into exactly what these costs are or how they impact people’s use of tools and...
Oftentimes in visualization, the goal of using volume datasets is not just to visualize them but also to analyze and compare them. In order to compare the two volumes, we cannot take all the voxels into consideration. The size of a typical volume data set is quite large (maybe a...
The Line Integral Convolution (LIC) is a mainstay of flow visualization. It is, however, computationally intensive, which limits its interactivity. Also, when used to view three-dimensional (3D) vector fields, the resulting images are dense and cluttered, making it difficult to perceive the flow on the interior parts of the field....
Packet loss, delay and time-varying bandwidth are three main problems facing multimedia streaming applications over the Internet. Existing techniques such as Media-aware network protocol, network adaptive source and channel coding, etc. have been proposed to either overcome or alleviate these drawbacks of the Internet. But these techniques either need specialized...
3D datasets acquire great importance in the context of medical imaging. In this thesis we survey and enhance solutions to problems inherently associated with 3D datasets: processing time, noise, and visualization. Efforts include development of a tool kit to provide a multi-threaded processing platform to cut processing time, produce real-time visualization...
Spreadsheets are among the most widely used end-user programming systems. Unfortunately, there is a high incidence of errors in end-user spreadsheets, and some of these errors have high impact. In this dissertation, we describe techniques we have developed to help end users develop safer spreadsheets. As part of our research,...
Until recently, research has not considered whether the design of end-user programming environments, such as spreadsheets, multimedia authoring languages, and CAD systems, affects males and females differently. As a result, we began investigating how the two genders are impacted by end-user programming software and whether attention to gender differences is...
Accessing information on the Web has become ingrained into our daily lives, and we seek information from many different sources, including conference and journal publications, personal web pages, and others. Increasingly, web-based information retrieval systems such as web-based search engines, library on-line catalog systems, and subscription-based federated search systems are...
Most of the work so far in the subfield of Gender HCI has followed a theory-driven approach. Established theories, however, do not take into account specific issues that arise in end-user debugging. We suspected that there may be important information that we were overlooking. We therefore employed a methodology change:...
An n-bit Gray code is an ordered set of all 2^n binary strings of length n. The special property of this listing is that the Hamming distance between consecutive vectors is exactly 1. If the last and first codewords also have a Hamming distance of 1 then the code is said to...
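As a minimal sketch of the definition above, the following Python snippet builds one well-known Gray code, the reflected binary code, and checks the Hamming-distance property. The function names gray_code and hamming are illustrative only, and this construction is an assumption chosen for the example; it is not necessarily the class of codes studied in the thesis.

```python
def gray_code(n):
    """Return all 2**n binary strings of length n (n >= 1) in reflected Gray code order."""
    # The i-th codeword of the reflected binary Gray code is i XOR (i >> 1).
    return [format(i ^ (i >> 1), f"0{n}b") for i in range(2 ** n)]


def hamming(a, b):
    """Count the positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


codes = gray_code(3)

# Every pair of consecutive codewords differs in exactly one bit.
consecutive_ok = all(hamming(a, b) == 1 for a, b in zip(codes, codes[1:]))

# Distance between the last and first codeword (the wrap-around condition).
wrap_distance = hamming(codes[-1], codes[0])
```

For n = 3 the listing starts 000, 001, 011, 010, and the wrap-around distance is also 1 for this particular construction.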
Domain-independent automated planning is concerned with computing a sequence of actions that can transform an initial state into a desired goal state. Resource production domains form an interesting class of such problems, in that they typically require reasoning about concurrent durative-actions with continuous effects while minimizing some cost function. Although...
A basic tradeoff to consider when designing a distributed data-mining framework is the need for a compromise between the cost of communication and computation resources and the accuracy of the mining results. This is essentially a decision of whether it is more efficient to communicate all of the data to...
The goal of many machine learning problems can be formalized as the creation of a function that can properly classify an input vector, given a set of examples of that function. While this formalism has produced a number of success stories, there are notable situations in which it fails. One...
Image feature detection and matching are two critical processes for many computer vision tasks. Currently, intensity-based local interest region detectors and local feature-based matching methods are used widely in computer vision applications. But in some applications, such as biological object recognition tasks, within-class changes in pose, lighting, color, and texture...
There has been little research into how end-user programming environments can provide explanations that could fill a critical information gap for end-user debuggers – help with debugging strategy. To address this need, we designed and prototyped a video-based approach for explaining debugging strategy, and accompanied it with a text-only approach....
Nowadays, sports events are a significant part of everyday entertainment, with local, national, and international championships. A lot of money is invested by broadcasting companies to attract new and more viewers, acquire broadcasting rights, or send entire crews on site to cover such events. Journalists are among the few...
Remote sensors are becoming the standard for observing and recording ecological data in the field. Such sensors can record data at fine temporal resolutions, and they can operate under extreme conditions prohibitive to human access. Unfortunately, sensor data streams exhibit many kinds of errors ranging from corrupt communications to partial...
Protecting end users’ privacy and building trust are the two most important factors needed to support the growth of e-commerce. The increased dependence on the Internet for a wide variety of daily transactions causes a corresponding loss in privacy for many users, as virtually all websites collect data from users directly...
There has been little prior research reporting strategy usage in end-user problem solving, and even less using gender as a factor. Without this type of information, end-user programming systems cannot know the “target” at which to aim, if they are to support male and female end-user programmers’ debugging. As a...
Building intelligent computer assistants has been a long-cherished goal of AI. Many intelligent assistant systems were built and fine-tuned to specific application domains. In this work, we develop a general model of assistance that combines three powerful ideas: decision theory, hierarchical task models and probabilistic relational languages. We use the...
Active participation and collaboration of community members are crucial to the continuation and expansion of open source software projects. Researchers have recognized the value of community in open source development and studied various aspects of it including structure of communities, motivations for participation, and collaboration among members. However, the majority...
Web applications are popular attack targets. Misuse detection systems use signature databases to detect known attacks. However, it is difficult to keep the database up to date with the rate of discovery of vulnerabilities. They also cannot detect zero-day attacks. By contrast, anomaly detection systems learn the normal behavior of...
Forward Error Correction and retransmission are two approaches used to reliably broadcast data in a network with poor quality of service. Under certain assumptions, it has been suggested that a retransmission-based reliable broadcasting scheme using network coding should in theory provide an increase in bandwidth efficiency by combining packets...
DiskGrapher is a graphical visualization tool designed to help users better manage the space on their hard drives. The main goal of DiskGrapher is to provide a different visualization technique to display information, with the goal of providing a more intuitive understanding of the directory structure of the disk than...
Factorization of integers is an important aspect of cryptography since it can be used as an attack against some of the common cryptographic methods being used. There are numerous methods in existence for factoring integers. Some of these are faster than others for general numbers, while others work best on...
Fluid simulation is an interesting research problem with a wide range of applications including mechanical engineering, special effects in movies and games, and scientific simulation. Due to the complex nature of typical fluid flow equations, there are circumstances where a full volumetric fluid simulation may not be necessary to generate...
This thesis addresses the problem of learning dynamic Bayesian network (DBN) models to support reinforcement learning. It focuses on learning regression tree models of the conditional probability distributions of the DBNs. Existing algorithms presume that the stochasticity in the domain can be modeled as a deterministic function with additive noise....
In diversity combining automatic repeat request (ARQ), erroneous packets are combined together forming a single, more reliable, packet. In this thesis, we give a diversity combining scheme for the m-ary unidirectional channel. A system using the given scheme with a t-unidirectional error detecting code is able to correct up to...
Motion capture data is a digital representation of the complex temporal structure of human motion. Motion capture is widely used for data-driven animation in sports, medicine, and entertainment because of its ability to capture complex and realistic motions. Due to its efficiency and cost, methods for reusing collections of motion capture...
Transportation infrastructure provides a vital service for the functionality of a city. The efficient design of road networks poses an interesting topic in computer science for digital content developers. For civil engineers, the visualization of analysis results on infrastructure both efficiently and intuitively is crucial. The following contributions are made...
Recent efforts in user-control of data-driven characters have focused on designing high-level graph data-structures that we call a Behavior Finite State Machine (BFSM). A BFSM is an interactive data-structure that benefits from the advantages of both motion graphs and blend-based techniques for generating animated motion. Each node in a BFSM...
Supervised learning is concerned with discovering the relationship between example sets of features and their corresponding classes. The traditional supervised learning formulation assumes that all examples are independent from one another. The order of the examples contains no information. Nonetheless, many problems have a sequential nature. Classifiers for these problems...
Many applications in surveillance, monitoring, scientific discovery, and data cleaning require the identification of anomalies. Although many methods have been developed to identify statistically significant anomalies, a more difficult task is to identify anomalies that are both interesting and statistically significant. Category detection is an emerging area of machine learning...
Professional software developers do not test code adequately, even though testing tools are widely available. Until developers realize the deficiencies in their tests, inadequate testing of software seems likely to remain a major problem. To support developers writing tests, industry and researchers have proposed systems that visualize “testedness” for end-user...
Probabilistic inference using Bayesian networks is now a well-established approach for reasoning under uncertainty. Among many efficiency-driven techniques which have been developed, the Optimal Factoring Problem (OFP) is distinguished for presenting a combinatorial optimization point of view on the problem. The contribution of this thesis is to extend...
Modern digital still cameras are equipped with just a single CCD for color image acquisition. Since only one spectral band can be recorded in each pixel, a mosaic of red, green and blue color filters is placed in front of the chip. The process of subsequently calculating a full color...
End users develop more software than any other group of programmers, using software authoring devices such as e-mail filtering editors, by-demonstration macro builders, and spreadsheet environments. Despite this, there has been only a little research on finding ways to help these programmers with the dependability of the software they create....
This project describes a web-based information management system which has been widely used by staff members and students at OSU. Without this system, it was very hard for the staff members to manage a variety of fees; for example, if they wanted to change a certain fee, staff members had...
Practical parallel programming demands that the details of distributing data to processors and inter-processor communication be managed by the compiler. These tasks quickly become too difficult for a programmer to do by hand for all but the simplest parallel programs. Yet, many parallel languages still require the programmer...
Distance-based algorithms are machine learning algorithms that classify queries by computing distances between these queries and a number of internally stored exemplars. Exemplars that are closest to the query have the largest influence on the classification assigned to the query. Two specific distance-based algorithms, the nearest neighbor...
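The idea of classifying a query from its closest stored exemplars can be sketched as follows. This is an illustrative k-nearest-neighbor classifier only: the Euclidean metric, the majority vote, and the names nearest_neighbors and classify are assumptions made for the example, not details taken from the thesis.

```python
import math
from collections import Counter


def nearest_neighbors(query, exemplars, k=1):
    """Return the k stored (features, label) pairs closest to the query."""
    # Euclidean distance is an assumption; any distance function could be used.
    ranked = sorted(exemplars, key=lambda ex: math.dist(query, ex[0]))
    return ranked[:k]


def classify(query, exemplars, k=1):
    """Assign the query the most common label among its k nearest exemplars."""
    votes = Counter(label for _, label in nearest_neighbors(query, exemplars, k))
    return votes.most_common(1)[0][0]


# Hypothetical exemplars: two small clusters with labels "a" and "b".
exemplars = [
    ((0.0, 0.0), "a"),
    ((1.0, 0.0), "a"),
    ((5.0, 5.0), "b"),
    ((6.0, 5.0), "b"),
]

label_near_a = classify((0.5, 0.2), exemplars)
label_near_b = classify((5.4, 5.1), exemplars, k=3)
```

With k = 1 only the single closest exemplar decides the label; larger k lets several nearby exemplars vote, which trades sensitivity to noise against locality.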
In its simplest form, the process of diagnosis is a decision-making process in which the diagnostician performs a sequence of tests culminating in a diagnostic decision. For example, a physician might perform a series of simple measurements (body temperature, weight, etc.) and laboratory measurements (white blood count, CT scan,...
We developed and investigated machine learning methods that require minimal preprocessing of the input data, use few training examples, run fast, and still obtain high levels of accuracy.
Most approaches to designing machine learning programs are based on the supervised learning paradigm – training examples are chosen randomly and given...
We consider the problem of tactical assault planning in real-time strategy games where a team of friendly agents must launch an assault on an enemy. This problem offers many challenges including a highly dynamic and uncertain environment, multiple agents, durative actions, numeric attributes, and different optimization objectives. While the dynamics...
Markov models are commonly used for joint inference of label sequences. Unfortunately, inference scales quadratically in the number of labels, which is problematic for training methods where inference is repeatedly performed and is the primary computational bottleneck for large label sets. Recent work has used output coding to address this...
The push towards higher performing and more sensitive mixed signal circuitry has required the parallel development of increasingly more complex and sensitive test and calibration harnesses. Current off-chip methods of test and calibration may require higher pin counts or induce unwanted parasitic interference.
In this thesis, the design of a...
In this dissertation, we present a user-in-the-loop method for the design of an interactive motion data structure that benefits from the advantages of both motion graphs and blend-based techniques. Our novel approach automatically analyzes a traditional motion graph built from labeled motion clips. The result is a more condensed, coarser...
Knowledge workers are struggling in the information flood. There is a growing interest in intelligent desktop environments that help knowledge workers organize their daily life. Intelligent desktop environments allow the desktop user to define a set of “activities” that characterize the user’s desktop work. These environments then attempt to identify...
Ensuring correctness of real-world software applications is a challenging task. Testing can be used to find many bugs, but is typically not sufficient for proving correctness or even eliminating entire classes of bugs. However, formal proof and verification techniques tend to be very heavy weight and are simply not available...
Open source software has become a powerful force in the world of computing. While it was once confined to the domain of technical specialists, people of all types have begun to adopt this software – from the casual web-surfer who uses Firefox, to the professional web developer who codes in PHP or...
As the volume of genetic sequence data increases due to improved sequencing techniques and increased interest, the computational tools available to analyze the data are becoming inadequate. This thesis seeks to improve a few of the computational methods available to access and analyze data in the genetic sequence databases. The...
Coarse resolution imagery, such as that produced by the MODIS instrument, poses the challenge of estimating sub-pixel proportions of different land cover types. This problem is difficult because of the variety and variability of vegetation within individual pixels. This thesis describes and compares two existing algorithms for estimating...
Successful software systems evolve over their lifetimes through the cumulative changes made by software maintainers. As software evolves, the problems resulting from software change worsen, exacerbated by increased system size and complexity, lack of program understanding, amount of effort required to make changes, and number of personnel involved. Experience shows...
Multiparadigm programming languages are a recent development in the realm of programming languages. A multiparadigm programming language allows the use of multiple, differing programming paradigms without departing from a single, unified linguistic framework. Multiparadigm programming languages are claimed to have benefits to both pedagogy and complex application creation. The beneficial...
Automatic painterly rendering systems have been proposed, but they select a single style to generate paintings from images and thus lack the ability to creatively use multiple styles to focus on important objects and deemphasize unimportant parts of the scene. We provide a multi-style painting framework to address this issue...
MIDAS is an application framework developed at the College of Oceanic and Atmospheric Science for interactive remote data acquisition and visualization. The objective is to provide dynamic reconfiguration of the sensing process. The current MIDAS application framework utilizes the code mobility and portability of the Java 2 platform. The Jini technology for...
Events are an important concept in the Microsoft Windows operating system. When a program runs interactively, it uses a user interface or a console to communicate with the user. Background services, however, do not have such mechanisms; instead, they use events to notify the user about changes and to report...
Controlling a virtual character with a pen input device is difficult. Pen input devices require freeform gestures, and users are not confined to a particular mapping of a key or a button that is exactly repeatable. This is a problem since an intuitive motion gesture for one user might not be...
In this research, we have captured, in pattern form, key elements of programming and design in four programming paradigms (imperative, object-oriented, functional and logical) as well as multiparadigm programming. These pattern sets have formed a foundation upon which we were able to build a deeper understanding of multiparadigm programming and...
Learning easily understandable decision rules from examples is one of the classic problems in machine learning. Most learning algorithms for this problem employ some variation of a greedy separate-and-conquer algorithm. In this paper, we describe a system called LERILS that learns highly accurate and comprehensible rules from examples using a...
Almost every student in the School of EECS undergoes the process of Blanket Credit Registration, wherein the student has to fill out the form for registration, meet the concerned professor, and obtain that professor’s approval. The staff of the department has to maintain the details of the student and the professor...
Remote Event Listener (REL) is designed to glue remote events and remote listeners dynamically, and dispatch remote events efficiently and transparently for distributed object-oriented systems. Components can be independently developed and remotely interconnected with REL, and software reusability can be improved. Remote Event Listener along with Remote Method Invocation makes distributed...
The visual programming language Forms/3 currently uses a graphical user interface implemented in Garnet. Garnet was developed by the User Interface Software Group in the Human Computer Interaction Institute at Carnegie Mellon University, but is no longer supported. This paper presents an implementation of a user interface for Forms/3 written...
This paper describes the design and performance of a distributed, multi-tier architecture for scientific information management and data exploration. A novel aspect of this framework is its integration of Java IDL, the CORBA distributed object computing middleware, with JavaBeans, the Java component model, to provide a flexible, interactive framework for...
This is an attempt to increase the power of a spreadsheet and try to use the spreadsheet as a powerful programming tool. The basic idea is to treat each cell of the spreadsheet as an object. The cell (Object) could be programmed, that is, the attributes and the functionality of...
We have developed a framework for Web-based GIS/database (WebGD) applications that allow users to insert, query, and delete data with map interfaces displayed by Web browsers. The framework uses such open source software packages as Minnesota MapServer, PostGIS, and PostgreSQL. With this framework, we can create the map interface of...
Although standard tools have been used for lexical and syntactic analysis since the late 1970s, no standard tools exist for the remaining parts of a compiler. Part of the reason for this deficiency is due to the difficulty of producing elegant tools capable of handling the large amount...
The purpose of this project is to load test and fine-tune the loan search functionality of the Broker Blueprint web application, an innovative Business-to-Business (B2B) online service aiding mortgage lenders and brokers in today's highly competitive mortgage market.
Broker Blueprint enables brokers to search for suitable mortgage loans across...
The Services for Students with Disabilities (SSD) is the department responsible for providing reasonable accommodation to students with documented disabilities. Each term, SSD serves approximately 500 students and receives hundreds of requests for various services. These services range from alternative testing, alternative formats, notetaking, classroom relocation, and requests for tables...
Regression testing is a common and necessary task carried out by software practitioners to validate the quality of evolving software systems. Unfortunately, regression testing is often an expensive, time-consuming process, particularly when applied to large software systems. Consequently, practitioners may wish to prioritize the test cases in their regression test...
We designed a concise way to store and manipulate GIS coverage data in a geospatial database. Our geospatial database is implemented with PostgreSQL and PostGIS. PostgreSQL is an object-relational database, and PostGIS supports various geospatial operations as an SQL extension. In our Oregon Relational Spatial Topology (ORST) approach, topological relationships...
Programming parallel machines has been a difficult and unrewarding task. The short lifespan of parallel machines and their incompatibility have made it difficult to utilize them. Our goal here is to create an environment for parallel computing which allows users to take advantage of parallel computers without writing parallel programs....
WebGen 5 is a software tool for automatically generating Web scripts that display Web forms and operate on data in a database. WebGen 5 is implemented as a collection of templates. Each template, combined with a corresponding configuration file, generates one of the following six types of Web scripts: search,...
This thesis presents a case study of applying machine learning tools to build a predictive model of annual infestations of grasshoppers in Eastern Oregon. The purpose of the study was two-fold. First, we wanted to develop a predictive model. Second, we wanted to explore the capabilities of existing machine learning...
Edit distances are a well-established technique for classification problems. They have been employed successfully in many classification problems including chromosome classification and hand-written digit recognition. Virtually all machine learning algorithms represent the objects to be classified as vectors of features. However, edit distances provide only a measure of the difference...
Hardwood lumber is a major forest product, and board grading is an important part of its manufacturing and marketing. Computer grading programs have been used to train graders and to grade lumber for board data banks, but they have not been used to machine-grade boards in an industrial environment because...
The task of mapping spelled English words into strings of phonemes and stresses ("reading aloud") has many practical applications. Several commercial systems perform this task by applying a knowledge base of expert-supplied letter-to-sound rules. This dissertation presents a set of machine learning methods for automatically constructing letter-to-sound rules by analyzing...
Spreadsheet languages, which include commercial spreadsheets and various research systems, have proven to be flexible tools in many settings. Research shows, however, that spreadsheets often contain faults. This thesis presents an integrated testing and fault localization methodology for spreadsheets. This methodology allows spreadsheet developers to engage in modeless development, testing...
Tree patterns are natural candidates for representing rules and hypotheses in many tasks such as information extraction and symbolic mathematics. A tree pattern is a tree with labeled nodes where some of the leaves may be labeled with variables, whereas a tree instance has no variables. A tree pattern matches...
Many approaches for achieving intelligent behavior of automated (computer) systems involve components that learn from past experience. This dissertation studies computational methods for learning from examples, for classification and for decision making, when the decisions have different non-zero costs associated with them. Many practical applications of learning algorithms, including transaction...
Interconnection networks play important roles in designing high performance computers. Recently two new classes of interconnection networks based on the concept of Gaussian and Eisenstein-Jacobi integers were introduced. In this research, efficient routing and broadcasting algorithms for these networks are developed. Furthermore, constructing edge disjoint Hamiltonian cycles in Gaussian networks...
An interdisciplinary study into the theory of design decisions has yielded a model for tracking design changes in hardware/software systems, but it still needs to be applied to a larger system to test its efficiency at tracking important data. This thesis creates an implementation of PLEXIL, a language in development...