"‘Biometrics is at the forefront in our agenda for homeland security,’ declared Asa Hutchinson, the Department of Homeland Security's undersecretary for border and transportation security, at the 2004 Biometric Consortium Conference” [11].
Flashy retinal scanning and voice-activated computers were once considered technologies for science fiction movies and novels. Nowadays,...
Functional programming is concerned with referential transparency: given a function and its arguments, the result will always be the same. However, this property appears to be violated in applications involving uncertainty, such as rolling a die. This thesis defines the background of probabilistic programming and domain-specific...
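One common way to reconcile uncertainty with referential transparency is to have functions return a whole probability distribution instead of a sampled value. The sketch below illustrates that idea only; it is not taken from the thesis, and the names `die` and `sum_of` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def die(sides=6):
    """Distribution of a fair die as {outcome: probability} (assumed fair)."""
    return {face: Fraction(1, sides) for face in range(1, sides + 1)}


def sum_of(dist_a, dist_b):
    """Distribution of the sum of two independent random variables."""
    result = {}
    for (a, pa), (b, pb) in product(dist_a.items(), dist_b.items()):
        result[a + b] = result.get(a + b, Fraction(0)) + pa * pb
    return result


# Calling die() twice yields equal values, so the function stays pure.
two_dice = sum_of(die(), die())
```

Because `die()` describes every outcome with its probability, the same call always returns the same value, and randomness only enters if a caller later chooses to sample from the distribution.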
The Elliptic Curve Digital Signature Algorithm (ECDSA) is a public key cryptosystem used for creation and verification of digital signatures in electronic documents. In this thesis, we created a Java applet that provides the functionality of the ECDSA using all of the NIST elliptic curves over GF(p). This applet was...
Protein secondary structure prediction plays a pivotal role in predicting protein folding in three-dimensions. Its task is to assign each residue one of the three secondary structure classes helix, strand, or random coil. This is an instance of the problem of sequential supervised learning in machine learning. This thesis describes...
Popular applications such as P2P file sharing, multiplayer gaming, videoconferencing, etc. rely on the efficiency of content distribution from a single source to multiple receivers. Most users of these applications are on the widely prevalent source constraint networks such as the Digital Subscriber Line (DSL) and wireless networks. Overlay multicast...
As broadband Internet becomes widely available, Peer-to-Peer (P2P) applications over the Internet are becoming increasingly popular. One such example is a video multicast application in which one source streams a video to a large number of destination nodes through an overlay multicast tree consisting of peers.
These overlay multicast-based applications,...
This thesis describes the implementation of an interface for querying established correspondences between anatomical structures across species. I was the main developer of this query engine, called the Comparative Anatomy Information System. My work involved developing methods to query the knowledge base, perform the specified comparison, display the anatomical hierarchies...
The computer game industry continues to progress toward realistic-looking character motion. However, even in state-of-the-art games, the use of motion capture data in character animation may result in errors such as “foot slipping,” where the feet do not match up with the floor properly during translation. Various algorithms have been...
The high cost of manually producing background characters creates a demand for a way to automatically generate plausible behaviors. These background extras need to behave believably so that they do not distract the audience from the primary action occurring in the scene....
Media applications on the Internet have become increasingly popular as the bandwidth of network links increases. The bottleneck of existing media systems is no longer the link bandwidth at the user’s end, but the server’s ability to handle streaming requests. These existing streaming systems do not scale...
This thesis presents a domain-specific visual language designed to allow coaches to create content that exhibits the complex 2D interactions observed in the game of American football. Coaches can visually program the content by using symbols and drawing primitives similar to those that they currently use to design static...
The thesis focuses on model-based approximation methods for reinforcement learning with large-scale applications such as combinatorial optimization problems. First, the thesis proposes two new model-based methods to stabilize the value-function approximation for reinforcement learning. The first one is the BFBP algorithm, a batch-like reinforcement learning process which iterates between...
We present an approach for generating a character’s response in anticipation of an impending impact. Protective anticipatory movement is built upon several simple actions that have been identified as response mechanisms in monkeys and in humans. These actions are parameterized by a model of the interaction based on the approaching...
Traditional application of Voronoi diagrams for space partitioning creates Voronoi regions, with areas determined by the generators’ relative locations and weights. Especially in the area of information space (re)construction, however, there is a need for inverse solutions; i.e., finding weights that result in regions with predefined areas. In this thesis,...
Although researchers have begun to explicitly support end-user programmers’ debugging by providing information to help them find bugs, there is little research addressing the right content to communicate to these users. The specific semantic content of these debugging communications matters because, if the users are not actually seeking the information...
This thesis presents a model for simulating individual pedestrian motion based on empirical data. The model keeps track of a pedestrian’s position, orientation, and body configuration and leverages motion capture data to generate plausible motion. Our model can automatically incorporate a pedestrian’s physical limitations when making movement decisions, since it...
Finding information can cost a significant amount of time, even when the information is already stored on the user’s local computer system. There is significant research aimed at reducing these time costs, but little research into exactly what these costs are or how they impact people’s use of tools and...
Oftentimes in visualization, the goal of using volume datasets is not just to visualize them but also to analyze and compare them. To compare two volumes, we cannot take all the voxels into consideration. The size of a typical volume dataset is quite large (maybe a...
The Line Integral Convolution (LIC) is a mainstay of flow visualization. It is, however, computationally intensive, which limits its interactivity. Also, when used to view three-dimensional (3D) vector fields, the resulting images are dense and cluttered, making it difficult to perceive the flow on the interior parts of the field....
Packet loss, delay, and time-varying bandwidth are three main problems facing multimedia streaming applications over the Internet. Existing techniques such as media-aware network protocols and network-adaptive source and channel coding have been proposed to either overcome or alleviate these drawbacks of the Internet. But these techniques either need specialized...
3D datasets acquire great importance in the context of medical imaging. In this thesis we survey and enhance solutions to problems inherently associated with 3D datasets: processing time, noise, and visualization. Efforts include development of a toolkit to provide a multi-threaded processing platform to cut processing time, produce real-time visualization...
Most of the work so far in the subfield of Gender HCI has followed a theory-driven approach. Established theories, however, do not take into account specific issues that arise in end-user debugging. We suspected that there may be important information that we were overlooking. We therefore employed a methodology change:...
Domain-independent automated planning is concerned with computing a sequence of actions that can transform an initial state into a desired goal state. Resource production domains form an interesting class of such problems, in that they typically require reasoning about concurrent durative-actions with continuous effects while minimizing some cost function. Although...
A basic tradeoff to consider when designing a distributed data-mining framework is the need for a compromise between the cost of communication and computation resources and the accuracy of the mining results. This is essentially a decision of whether it is more efficient to communicate all of the data to...
The goal of many machine learning problems can be formalized as the creation of a function that can properly classify an input vector, given a set of examples of that function. While this formalism has produced a number of success stories, there are notable situations in which it fails. One...
Image feature detection and matching are two critical processes for many computer vision tasks. Currently, intensity-based local interest region detectors and local feature-based matching methods are used widely in computer vision applications. But in some applications, such as biological object recognition tasks, within-class changes in pose, lighting, color, and texture...
There has been little research into how end-user programming environments can provide explanations that could fill a critical information gap for end-user debuggers – help with debugging strategy. To address this need, we designed and prototyped a video-based approach for explaining debugging strategy, and accompanied it with a text-only approach....
Nowadays, sports events are a significant part of everyday entertainment, with local, national, and international championships. A lot of money is invested by broadcasting companies to attract new and more viewers, acquire broadcasting rights, or send entire crews on site to cover such events. Journalists are among the few...
Remote sensors are becoming the standard for observing and recording ecological data in the field. Such sensors can record data at fine temporal resolutions, and they can operate under extreme conditions prohibitive to human access. Unfortunately, sensor data streams exhibit many kinds of errors ranging from corrupt communications to partial...
Protecting end users' privacy and building trust are the two most important factors needed to support the growth of e-commerce. The increased dependence on the Internet for a wide variety of daily transactions causes a corresponding loss in privacy for many users, as virtually all websites collect data from users directly...
There has been little prior research reporting strategy usage in end-user problem solving, and even less using gender as a factor. Without this type of information, end-user programming systems cannot know the “target” at which to aim, if they are to support male and female end-user programmers’ debugging. As a...
Building intelligent computer assistants has been a long-cherished goal of AI. Many intelligent assistant systems were built and fine-tuned to specific application domains. In this work, we develop a general model of assistance that combines three powerful ideas: decision theory, hierarchical task models and probabilistic relational languages. We use the...
Active participation and collaboration of community members are crucial to the continuation and expansion of open source software projects. Researchers have recognized the value of community in open source development and studied various aspects of it including structure of communities, motivations for participation, and collaboration among members. However, the majority...
Web applications are popular attack targets. Misuse detection systems use signature databases to detect known attacks. However, it is difficult to keep the database current given the rate at which vulnerabilities are discovered. Such systems also cannot detect zero-day attacks. By contrast, anomaly detection systems learn the normal behavior of...
Forward Error Correction and retransmission are two approaches used to reliably broadcast data in a network with poor quality of service. Under certain assumptions, it has been suggested that a retransmission-based reliable broadcasting scheme using network coding should in theory provide an increase in bandwidth efficiency by combining packets...
DiskGrapher is a graphical visualization tool designed to help users better manage the space on their hard drives. The main goal of DiskGrapher is to provide a different visualization technique for displaying information, one that gives a more intuitive understanding of the directory structure of the disk than...
Factorization of integers is an important aspect of cryptography because it can be used to attack some of the common cryptographic methods in use. There are numerous methods in existence for factoring integers. Some of these are faster than others for general numbers, while others work best on...
Fluid simulation is an interesting research problem with a wide range of applications including mechanical engineering, special effects in movies and games, and scientific simulation. Due to the complex nature of typical fluid flow equations, there are circumstances where a full volumetric fluid simulation may not be necessary to generate...
This thesis addresses the problem of learning dynamic Bayesian network (DBN) models to support reinforcement learning. It focuses on learning regression tree models of the conditional probability distributions of the DBNs. Existing algorithms presume that the stochasticity in the domain can be modeled as a deterministic function with additive noise....
In diversity combining automatic repeat request (ARQ), erroneous packets are combined to form a single, more reliable packet. In this thesis, we give a diversity combining scheme for the m-ary unidirectional channel. A system using the given scheme with a t-unidirectional error detecting code is able to correct up to...
Motion capture data is a digital representation of the complex temporal structure of human motion. Motion capture is widely used for data-driven animation in sports, medicine, and entertainment because of its ability to capture complex and realistic motions. Due to its efficiency and cost, methods for reusing collections of motion capture...
Transportation infrastructure provides a vital service for the functionality of a city. The efficient design of road networks poses an interesting topic in computer science for digital content developers. For civil engineers, efficient and intuitive visualization of analysis results on infrastructure is crucial. The following contributions are made...
Recent efforts in user control of data-driven characters have focused on designing a high-level graph data structure that we call a Behavior Finite State Machine (BFSM). A BFSM is an interactive data structure that benefits from the advantages of both motion graphs and blend-based techniques for generating animated motion. Each node in a BFSM...
Supervised learning is concerned with discovering the relationship between example sets of features and their corresponding classes. The traditional supervised learning formulation assumes that all examples are independent from one another. The order of the examples contains no information. Nonetheless, many problems have a sequential nature. Classifiers for these problems...
Many applications in surveillance, monitoring, scientific discovery, and data cleaning require the identification of anomalies. Although many methods have been developed to identify statistically significant anomalies, a more difficult task is to identify anomalies that are both interesting and statistically significant. Category detection is an emerging area of machine learning...
Modern digital still cameras are equipped with just a single CCD for color image acquisition. Since only one spectral band can be recorded in each pixel, a mosaic of red, green and blue color filters is placed in front of the chip. The process of subsequently calculating a full color...
This project describes a web-based information management system which has been widely used by staff members and students at OSU. Without this system, it was very hard for the staff members to manage a variety of fees; for example, if they wanted to change a certain fee, staff members had...
Practical parallel programming demands that the details of distributing data to processors and interprocessor communication be managed by the compiler. These tasks quickly become too difficult for a programmer to do by hand for all but the simplest parallel programs. Yet, many parallel languages still require the programmer...
Distance-based algorithms are machine learning algorithms that classify queries by computing distances between these queries and a number of internally stored exemplars. Exemplars that are closest to the query have the largest influence on the classification assigned to the query. Two specific distance-based algorithms, the nearest neighbor...
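The general idea of distance-based classification can be sketched as a distance-weighted nearest-neighbor vote. This is a minimal illustration, not the specific algorithms studied in the thesis: the Euclidean metric, the inverse-distance weighting, and the name `classify` are all assumptions.

```python
import math
from collections import defaultdict


def classify(query, exemplars, k=3):
    """Label a query by a distance-weighted vote of its k nearest exemplars.

    exemplars is a list of (feature_vector, label) pairs.
    """
    # Keep the k stored exemplars closest to the query (Euclidean distance).
    nearest = sorted(exemplars, key=lambda ex: math.dist(query, ex[0]))[:k]
    votes = defaultdict(float)
    for features, label in nearest:
        # Closer exemplars get larger weight; epsilon avoids division by zero.
        votes[label] += 1.0 / (math.dist(query, features) + 1e-9)
    return max(votes, key=votes.get)


exemplars = [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((6, 5), "b")]
label_near_origin = classify((0.2, 0.1), exemplars)
```

Setting `k=1` reduces this to the plain nearest neighbor rule, in which only the single closest exemplar decides the class.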
We consider the problem of tactical assault planning in real-time strategy games where a team of friendly agents must launch an assault on an enemy. This problem offers many challenges including a highly dynamic and uncertain environment, multiple agents, durative actions, numeric attributes, and different optimization objectives. While the dynamics...
Markov models are commonly used for joint inference of label sequences. Unfortunately, inference scales quadratically in the number of labels, which is problematic for training methods where inference is repeatedly performed and is the primary computational bottleneck for large label sets. Recent work has used output coding to address this...
The push towards higher-performing and more sensitive mixed-signal circuitry has required the parallel development of increasingly complex and sensitive test and calibration harnesses. Current off-chip methods of test and calibration may require higher pin counts or induce unwanted parasitic interference.
In this thesis, the design of a...
In this dissertation, we present a user-in-the-loop method for the design of an interactive motion data structure that benefits from the advantages of both motion graphs and blend-based techniques. Our novel approach automatically analyzes a traditional motion graph built from labeled motion clips. The result is a more condensed, coarser...
Knowledge workers are struggling in the information flood. There is a growing interest in intelligent desktop environments that help knowledge workers organize their daily life. Intelligent desktop environments allow the desktop user to define a set of “activities” that characterize the user’s desktop work. These environments then attempt to identify...
Ensuring correctness of real-world software applications is a challenging task. Testing can be used to find many bugs, but is typically not sufficient for proving correctness or even eliminating entire classes of bugs. However, formal proof and verification techniques tend to be very heavyweight and are simply not available...
Open source software has become a powerful force in the world of computing. While it was once confined to the domain of technical specialists, people of all types have begun to adopt this software – from the casual web-surfer who uses Firefox, to the professional web developer who codes in PHP or...
As the volume of genetic sequence data increases due to improved sequencing techniques and increased interest, the computational tools available to analyze the data are becoming inadequate. This thesis seeks to improve a few of the computational methods available to access and analyze data in the genetic sequence databases. The...
Multiparadigm programming languages are a recent development in the realm of programming languages. A multiparadigm programming language allows the use of multiple, differing programming paradigms without departing from a single, unified linguistic framework. Multiparadigm programming languages are claimed to have benefits to both pedagogy and complex application creation. The beneficial...
Automatic painterly rendering systems have been proposed, but they select a single style to generate paintings from images and therefore lack the ability to creatively use multiple styles to emphasize important objects and deemphasize unimportant parts of a scene. We provide a multi-style painting framework to address this issue...
MIDAS is an application framework developed at the College of Oceanic and Atmospheric Science for interactive remote data acquisition and visualization. The objective is to provide dynamic reconfiguration of the sensing process. The current MIDAS application framework utilizes the code mobility and portability of the Java 2 platform. The Jini technology for...
Events are an important concept in the Microsoft Windows operating system. When a program runs interactively, it uses a user interface or a console to communicate with the user. Background services, however, do not have such mechanisms; instead, they use events to notify the user about changes and to report...
Controlling a virtual character with a pen input device is difficult. Pen input devices require freeform gestures, and users are not confined to a particular mapping of a key or a button that is exactly repeatable. This is a problem since an intuitive motion gesture for one user might not be...
In this research, we have captured, in pattern form, key elements of programming and design in four programming paradigms (imperative, object-oriented, functional and logical) as well as multiparadigm programming. These pattern sets have formed a foundation upon which we were able to build a deeper understanding of multiparadigm programming and...
Learning easily understandable decision rules from examples is one of the classic problems in machine learning. Most learning algorithms for this problem employ some variation of a greedy separate-and-conquer algorithm. In this paper, we describe a system called LERILS that learns highly accurate and comprehensible rules from examples using a...
Almost every student in the School of EECS undergoes the process of Blanket Credit Registration, wherein the student has to fill out the registration form, meet the professor concerned, and obtain the professor's approval. The staff of the department has to maintain the details of the student and the professor...
Remote Event Listener (REL) is designed to glue remote events and remote listeners dynamically, and dispatch remote events efficiently and transparently for distributed object-oriented systems. Components can be independently developed and remotely interconnected with REL, and software reusability can be improved. Remote Event Listener along with Remote Method Invocation makes distributed...
The visual programming language Forms/3 currently uses a graphical user interface implemented in Garnet. Garnet was developed by the User Interface Software Group in the Human Computer Interaction Institute at Carnegie Mellon University, but is no longer supported. This paper presents an implementation of a user interface for Forms/3 written...
This paper describes the design and performance of a distributed, multi-tier architecture for scientific information management and data exploration. A novel aspect of this framework is its integration of Java IDL, the CORBA distributed object computing middleware, with JavaBeans, the Java component model, to provide a flexible, interactive framework for...
This work attempts to increase the power of the spreadsheet and to use it as a powerful programming tool. The basic idea is to treat each cell of the spreadsheet as an object. The cell (object) could be programmed, that is, the attributes and the functionality of...
We have developed a framework for Web-based GIS/database (WebGD) applications that allow users to insert, query, and delete data with map interfaces displayed by Web browsers. The framework uses such open source software packages as Minnesota MapServer, PostGIS, and PostgreSQL. With this framework, we can create the map interface of...
Although standard tools have been used for lexical and syntactic analysis since the late 1970s, no standard tools exist for the remaining parts of a compiler. Part of the reason for this deficiency is the difficulty of producing elegant tools capable of handling the large amount...
The purpose of this project is to load test and fine-tune the loan search functionality of the Broker Blueprint web application, an innovative Business-to-Business (B2B) online service aiding mortgage lenders and brokers in today's highly competitive mortgage market.
Broker Blueprint enables brokers to search for suitable mortgage loans across...
The Services for Students with Disabilities (SSD) is the department responsible for providing reasonable accommodation to students with documented disabilities. Each term, SSD serves approximately 500 students and receives hundreds of requests for various services. These services include alternative testing, alternative formats, notetaking, classroom relocation, and requests for tables...
Regression testing is a common and necessary task carried out by software practitioners to validate the quality of evolving software systems. Unfortunately, regression testing is often an expensive, time-consuming process, particularly when applied to large software systems. Consequently, practitioners may wish to prioritize the test cases in their regression test...
We designed a concise way to store and manipulate GIS coverage data in a geospatial database. Our geospatial database is implemented with PostgreSQL and PostGIS. PostgreSQL is an object-relational database, and PostGIS supports various geospatial operations as an SQL extension. In our Oregon Relational Spatial Topology (ORST) approach, topological relationships...
Programming parallel machines has been a difficult and unrewarding task. The short lifespan of parallel machines and their incompatibility have made it difficult to utilize them. Our goal here is to create an environment for parallel computing which allows users to take advantage of parallel computers without writing parallel programs....
WebGen 5 is a software tool for automatically generating Web scripts that display Web forms and operate on data in a database. WebGen 5 is implemented as a collection of templates. Each template, combined with a corresponding configuration file, generates one of the following six types of Web scripts: search,...
This thesis presents a case study of applying machine learning tools to build a predictive model of annual infestations of grasshoppers in Eastern Oregon. The purpose of the study was twofold. First, we wanted to develop a predictive model. Second, we wanted to explore the capabilities of existing machine learning...
Spreadsheet languages, which include commercial spreadsheets and various research systems, have proven to be flexible tools in many settings. Research shows, however, that spreadsheets often contain faults. This thesis presents an integrated testing and fault localization methodology for spreadsheets. This methodology allows spreadsheet developers to engage in modeless development, testing...
Tree patterns are natural candidates for representing rules and hypotheses in many tasks such as information extraction and symbolic mathematics. A tree pattern is a tree with labeled nodes where some of the leaves may be labeled with variables, whereas a tree instance has no variables. A tree pattern matches...
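The abstract is cut off before it defines matching, so the sketch below assumes the usual reading: a pattern matches an instance when some consistent substitution of subtrees for variables makes the two trees identical. The tuple encoding, the `?` prefix for variables, and the name `match` are hypothetical choices made for this illustration.

```python
def match(pattern, instance, bindings=None):
    """Return variable bindings if pattern matches instance, else None.

    Trees are tuples (label, child, child, ...); a leaf is (label,).
    A pattern variable is a string starting with "?" (assumed encoding).
    """
    bindings = {} if bindings is None else bindings
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in bindings:
            # A repeated variable must be bound to the same subtree.
            return bindings if bindings[pattern] == instance else None
        return {**bindings, pattern: instance}
    # Labels and number of children must agree at every labeled node.
    if pattern[0] != instance[0] or len(pattern) != len(instance):
        return None
    for p_child, i_child in zip(pattern[1:], instance[1:]):
        bindings = match(p_child, i_child, bindings)
        if bindings is None:
            return None
    return bindings


pattern = ("plus", "?x", ("times", "?x", "?y"))
instance = ("plus", ("a",), ("times", ("a",), ("b",)))
result = match(pattern, instance)
```

Here `?x` appears twice in the pattern, so the match succeeds only when both occurrences line up with the same subtree of the instance.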
Interconnection networks play important roles in designing high performance computers. Recently two new classes of interconnection networks based on the concept of Gaussian and Eisenstein-Jacobi integers were introduced. In this research, efficient routing and broadcasting algorithms for these networks are developed. Furthermore, constructing edge disjoint Hamiltonian cycles in Gaussian networks...
An interdisciplinary study into the theory of design decisions has yielded a model for tracking design changes in hardware/software systems, but it still needs to be applied to a larger system to test its efficiency at tracking important data. This thesis creates an implementation of PLEXIL, a language in development...
This paper examines how six online multiclass text classification algorithms perform in the domain of email tagging within the TaskTracer system. TaskTracer is a project-oriented user interface for the desktop knowledge worker. TaskTracer attempts to tag all documents, web pages, and email messages with the projects to which they are...
In the field of Human-Computer Interaction, provenance refers to the complete history and genealogy of a document. Provenance can be useful in identifying related resources, such as different versions of the same document or resources used in the creation of a new document. Though methods of provenance collection and applications...
Programmers spend a substantial fraction of their debugging time navigating through source code, yet little is known about how programmers navigate. With the continuing growth in size and complexity of software, this fraction of time is likely to increase, which presents challenges to those seeking both to understand and...
The problem of document classification has been widely studied in machine learning and data mining. In document classification, most of the popular algorithms are based on the bag-of-words representation. Due to the high dimensionality of the bag-of-words representation, significant research has been conducted to reduce the dimensionality via different approaches....
Analysis, visualization, and design of vector fields on surfaces have a wide variety of major applications in both scientific visualization and computer graphics. On the one hand, analysis and visualization of vector fields provide critical insights to the flow data produced from simulation or experiments of various engineering processes. On...
This project aims to implement indexing for Web 2.0 applications. Ajax applications consist of a set of states which are generated by the user through events such as click, focus, and blur. By saving these DOM states we can index information obtained from dynamically generated web content. To prevent...
Sequential supervised learning problems arise in many real applications. This dissertation focuses on two important research directions in sequential supervised learning: efficient training and feature induction.
In the direction of efficient training, we study the training of conditional random fields (CRFs), which provide a flexible and powerful model for sequential...
This dissertation explores the idea of applying machine learning technologies to help computer users find information and better organize electronic resources, by presenting the research work conducted in the following three applications: FolderPredictor, Stacking Recommendation Engines, and Integrating Learning and Reasoning.
FolderPredictor is an intelligent desktop software tool that helps...
Communicating dynamic motion content, such as exercise, with a static medium, such as paper, is difficult. The technology exists for presenting 3D animated exercise content to patients; however, the tools for allowing exercise domain experts to effectively author the content do not exist. We conducted two formative studies with exercise...
This thesis presents a progression of novel planning algorithms that culminates in a new family of diverse Monte-Carlo methods for probabilistic planning domains. We provide a proof for performance guarantees and analyze how these algorithms can resolve some of the shortcomings of traditional probabilistic planning methods. The direct policy search...
We took the back-propagation algorithms of Werbos for recurrent and feed-forward neural networks and implemented them on machines with graphics processing units (GPUs). The parallelism of these units gave our implementations a 10- to 100-fold increase in speed. For nets with fewer than 20 neurons the machine performed faster...
Reinforcement learning in real-world domains suffers from three curses of dimensionality: explosions in state and action spaces, and high stochasticity or "outcome space" explosion. Multiagent domains are particularly susceptible to these problems. This thesis describes ways to mitigate these curses in several different multiagent domains, including real-time delivery of products...
The results of a machine learning from user behavior can be thought of as a program, and like all programs, it may need to be debugged. Providing ways for the user to debug it matters because without the ability to fix errors, users may find that the learned program’s errors...
End users' programs are fraught with errors, costing companies millions of dollars. One reason may be that researchers and tool designers have not yet focused on end-user debugging strategies. To investigate this possibility, this dissertation presents eight empirical studies and a new strategy-based end-user debugging tool for Excel, called StratCel....
Linear transformation for dimension reduction is a well established problem in the field of machine learning. Due to the numerous observability of parameters and data, processing of the data in its raw form is computationally complex and difficult to visualize. Dimension reduction by means of feature extraction offers a strong...
Virtual environments and simulations are being used increasingly to both visualize and understand data as well as to create scenarios for training and analysis purposes. In this paper, we are interested in the use of simulation and visualization of interactive virtual agents to create realistic motions for training scenarios. We...
A financial processor is the most important component of a credit union's IT infrastructure. A database storing member demographic information, account balances, and transaction history, it performs financial calculations, such as interest, dividends, and maturities. It also provides a user interface, allowing tellers and financial service representatives to manage accounts...
A relatively new model of error control is the limited magnitude error over high radix channels. In this error model, the error magnitude does not exceed a certain limit known beforehand. In this dissertation, we study systematic error control codes for common channels under the assumption that the maximum error...