Scientists in the biological sciences need to retrieve information from a variety of data collections, traditionally maintained in SQL databases, in order to conduct research. Because current assistant tools are designed primarily for business and financial users, scientists have been forced to use the notoriously difficult command-line SQL interface, supplied...
The data used for the construction of genome maps is imperfect, therefore the mapping of a physically linear structure must take place in a very uneven feature space. As the number of genes to be ordered grows, it appears to be impractical to use exhaustive search techniques to find the...
Although much effort has been invested to build applications that support group work, collaborative applications have not found easy success. The cost of adopting and maintaining collaborative applications has prevented their widespread use, especially among small distributed groups. Application developers have had difficulties recognizing the extra effort required by groups...
This paper describes the design and performance of a distributed, multi-tier architecture for scientific information management and data exploration. A novel aspect of this framework is its integration of Java IDL, the CORBA distributed object computing middleware with JavaBeans, the Java Component model to provide a flexible, interactive framework for...
Reasoning about any realistic domain always involves a degree of uncertainty.
Probabilistic inference in belief networks is one effective way of reasoning under
uncertainty. Efficiency is critical in applying this technique, and many researchers
have been working on this topic. This thesis is the report of our research in this...
This paper investigates the feasibility of compiling the functionality of a decision
theoretic problem solving engine into a set of rules or functionally similar construct.
The decision theoretic engine runs in exponential time, while the rule set runs in
linear time at worst. The main question that will determine the...
This report addresses the design and implementation of an internet-based grading tool for the "Translators" course. The motivation is to avoid exposing the instructor's Java byte-code to possible reverse-engineering tools and enable students to submit their homework virtually from any machine across the internet. This tool is intended to replace...
Development of graphical user interface (GUI) applications is difficult since the process can be both complicated and tedious. We propose a solution directed at reducing programming time and effort required to build a GUI application. Our solution is based on the Petri Network, the Oregon SpeedCode Universe (OSU) Application Framework,...
Various ecological and hydrological models require estimates of the amount and spatial distribution of monthly and annual precipitation. PRISM is an analytical model that distributes point measurements of monthly, seasonal and annual precipitation to a geographic grid. In order to use this model effectively, good graphical user interfaces were needed....
Polyhedra are geometric representations of linear systems of equations and
inequalities. Since polyhedra are used to represent the iteration domains of nested
loop programs, procedures for operating on polyhedra can be used for doing loop
transformations and other program restructuring transformations which are needed
in parallelizing compilers. Thus a need...
The most important part of parallel computation is communication. Except in the most embarassingly parallel examples, processors cannot work cooperatively to solve a problem unless they can communicate. One way to solve the problem of communication is to use an interconnection network. Processors are located at nodes of the network,...
Cryptographic systems are used for secure communications in the government, in industry, and by individuals. Many cryptographic systems base their security on our inability to factor quickly. For this reason, in the last few decades several fast factoring algorithms have been developed. The Quadratic Sieve is one of the best...
Distance-based algorithms are machine learning algorithms that classify queries
by computing distances between these queries and a number of internally stored
exemplars. Exemplars that are closest to the query have the largest in
uence on
the classi cation assigned to the query. Two speci c distance-based algorithms, the
nearest neighbor...
Reinforcement Learning (RL) is the study of learning agents that improve
their performance from rewards and punishments. Most reinforcement learning
methods optimize the discounted total reward received by an agent, while, in many
domains, the natural criterion is to optimize the average reward per time step. In this
thesis, we...
The project aims at building an application that would simulate a transparency that can either be overlaid on top of the graphical display of another application or used as a stand-alone by accepting input from a keyboard or a mouse to enhance a presentation. The project provides an environment to...
We developed and investigated machine learning methods that require
minimal preprocessing of the input data, use few training examples, run fast, and
still obtain high levels of accuracy.
Most approaches to designing machine learning programs are based on the
supervised learning paradigm – training examples are chosen randomly and given...
An active object system is a transition-based object-oriented system suitable for the design of various concurrent systems. An AOS consists of a collection of interacting objects, where the behavior of each object is determined by the transition statements provided in the class of that object. A transition statement is a...
Oregon Speedcode Universe (OSU) is an excellent environment for producing rapid prototypes -- unfortunately, it has no facilities for producing animated sequences. Should we make animation sequences inside OSU or import them? Interactive or not? Given the event-driven nature of Macintosh applications, how do we integrate an on-going event like...
How might capabilities for algorithm animation be seamlessly integrated into a programming language that is both visual and declarative? Until now, visual programming language researchers have not attempted to answer that question, making the fruits of algorithm animation available only to users of textual programming languages. Users of visual programming...
As the volume of genetic sequence data increases due to improved sequencing techniques and increased interest, the computational tools available to analyze the data are becoming inadequate. This thesis seeks to improve a few of the computational methods available to access and analyze data in the genetic sequence databases. The...
Preparing code for parallelization is one of the most time consuming programming tasks; approximately 24% of the program development activity is spend in this phase. In message passing systems, this is usually achieved by the addition of run-time message-passing library calls. The primitives provided by different message-passing libraries are similar,...
A key aspect of how we understand the world revolves around an ability to
manipulate our surroundings to experiment. From the scientific method through
theories of child development, the ability to experiment is deemed critical; however,
few studies have been performed to understand the strengths and weaknesses of
different experimental...
This paper describes the results of a research project to design new graphical user interlace to an existing applications software product, using the techniques of user-centered design. The project began with a literature review to determine the currently accepted methodology in the area of user-centered design, with emphasis on the...
Generalized Radial Basis Functions were used to construct networks
that learn input-output mappings from given data. They are
developed out of a theoretical framework for approximation based
on regularization techniques and represent a class of three-layer
networks similar to backpropagation networks with one hidden
layer.
A network using Gaussian base...
This project is an interactive environment which allows the user to explore various sound generation techniques. The three methods of sound generation used are (1) FM synthesis, (2) Additive synthesis, and (3) Karplus/Strong synthesis of plucked strings. The project also provides the capability to display the generated sound waveforms and...
Although standard tools have been used for lexical and syntactic analysis since the late 1970's, no
standard tools exist for the remaining parts of a compiler. Part of the reason for this de ciency is due to
the di culty of producing elegant tools capable of handling the large amount...
The practice of literate programming is not widespread because existing literate programming systems have some undesirable characteristics such as programming language and text processor dependence and lack of flexible tools for viewing and manipulation of the source file. This dissertation describes the literate programming system AOPS (Abstraction Oriented Programming System)...
In a typical method of database design, ER diagrams are used to represent the conceptual schema of the database. Use-cases are now widely used to capture requirements in designing the basic architecture of an object-oriented system. Use-cases are highly effective in designing forms required for data insertion, retrieval, update, and...
Parallel computing on heterogeneous workstation clusters has proved to be a very efficient use of available resources, increasing their overall utilization. However, for it to be a viable alternative to expensive, dedicated parallel machines, a number of key issues need to be resolved. One of the major challenges of heterogeneous...
Balanced codes, in which each codeword contains equally many 1's and 0's, are useful in such applications as in optical transmission and optical recording. When balanced codes are used, the same number of 1's and 0's pass through the channel after the transmission of every word, so the channel is...
Capacity Planning is a gradually recognized need in UNIX environments. However, the effort to address this problem has been more or less event driven and is often discarded after crises are over. The main cause to this situation is that the tools to support the CP function in UNIX environments...
An important problem in molecular biology is to predict the secondary
structure of proteins from their primary structure. The primary structure of a
protein is the sequence of amino acid residues. The secondary structure is an
abstract description of the shape of the folded protein, with regions identified
as alpha...
We describe a set of data flow techniques and code transformations that translate a single instruction stream, multiple data stream (SIMD) Dataparallel C program into a semantically equivalent single program, multiple data stream (SPMD) C program suitable for execution on shared memory multiprocessor computers, such as the Sequent Balance and...
There are three families of exact methods used for probabilistic inference in
belief nets. It is necessary to compare them and analyze the advantages and
the disadvantages of each algorithm, and know the time cost of making
inferences in a given belief network. This paper discusses the factors that
influence...
The general problem of application development of interactive GUI applications has been addressed by toolkits, libraries, user interface management systems, and more recently domain-specific application frameworks. However, the most sophisticated solution offered by frameworks still lacks a number of features which are addressed by this research: 1) limited functionality --...
The coverage of a learning algorithm is the number of concepts that can be learned by that algorithm from samples of a given size for given accuracy and confidence parameters. This thesis begins by asking whether good learning algorithms can be designed by maximizing their coverage. There are three questions...
Supervised learning programs, such as decision tree learners and neural networks, often must learn Boolean functions. The concept being learned may not easily be expressed in terms of the atomic features given. Constructive induction automatically produces higher level features (combinations of the atomic features), which can improve learning performance. The...
The task of mapping spelled English words into strings of phonemes and stresses ("reading aloud") has many practical applications. Several commercial systems perform this task by applying a knowledge base of expert-supplied letter-to-sound
rules. This dissertation presents a set of machine learning methods for automatically constructing letter-to-sound rules by analyzing...
Knowledge compilation improves search-intensive problem-solvers that are easily specified but inefficient. One promising approach improves efficiency by constructing a database of problem-instance/best-action pairs that replace problem-solving search with efficient lookup. The database is constructed by reverse enumeration- expanding the complete search space backwards, from the terminal problem instances. This approach...
Culprit Tracking is a technique to make lazy evaluation in a programming language even lazier. We sought to develop such a technique after noting poorly-distributed performance characteristics of graphical user interfaces (GUIs) programmed in lazy languages. A characteristic aspect of GUI programs is the intensive screen I/O. These programs are...
Visualization is a useful method to teach classes in algorithms and data structures. Many animation systems of algorithms and data structures have been implemented. In these systems, students can set initial conditions, run algorithms and view the result graphically. In this project, a small visualization tool library is implemented using...
Debanoir is a distributed, collaborative editor for building Bayesian networks. It is composed of a front-end editor client and back-end server programs. The editor client is a Java applet that can be invoked from any Internet browser that supports Java. The server, as of now, runs on the machine from...
The problem of handling dependent evidence is an important practical issue for applications of reasoning with uncertainty in artificial intelligence. The existing solutions to the problem are not satisfactory because of their ad hoc nature, complexities, or limitations. In this dissertation, we develop a general framework that can be used...
First, we study hard real-time scheduling problems where each task is defined by a four tuple (r, c, p, d): r being its release time, c computation time, p period, and d deadline. The question is whether all tasks can meet their deadlines on one processor. If not, how many...
We describe a CASE tool for designing hard real-time applications, called HaRTS. The design tool supports a hierarchical design diagram which combines the control and data flow of a hard real-time application. The design hierarchy separates a design into self-contained subdesigns. Yet, the design can be flatten to give you...
Application supporting a Graphical User Interface (GUI) are difficult to create. Their inherent complexity, their interdependence with many other disciplines, and the inadequacy of the existing tools leave the programmer with too much to do. In particular, almost no help is provided to create application-specific code. Some visual formalisms such...
A Kalman filter was developed to recover information on short period variations of ocean circulation from the satellite altimeter signals for the North Pacific Ocean. The ocean circulation at each grid point is specified by three variables: sea level height h, zonal velocity u, and meridional velocity v. In our...
The problem addressed in this paper is that of usability of the Petri Net Editor used to create applications in the Objex framework. This editor puts a lot of burden on the user, in that the user needs to remember all the GUI resources previously created for the application, has...
In a constant weight code, each code word contains a constant number of 1's. If this number is equal to half the length of the code word then the code is called balanced. These codes find many applications in computer and communication systems noise reduction in VLSI systems, fault masking...
With the strong shift toward network-centric cluster-computing workload management has become increasingly important in both industrial and academic environments. To fully harness the distributed computing resources of the 32 PC's in our department, I developed a workload monitoring system together with a few utilities for users to submit either sequential...
A graph may he drawn in many different ways. We investigate how to draw a graph nicely, in the sense of being visually pleasing. We discuss the history of this field, and look at several algorithms for drawing graphs.
For planar graphs this problem has been algorithmically solved: that is,...
While much work has been done in estimating software reliability, little attention is paid to predict reliability as early as at the design time. In this report, we present our initial research results of building an early stage software reliability prediction model.
In Part I, we will first investigate and...
Index size savings from three techniques are measured. The three techniques are: 1) eliminating common, low information words found in a "stop list" (such as: of, the, at, etc.), 2) truncating terms by eliminating word stems (such as: -s, -ed, -ing, etc.), and 3) simple data compression. Savings are measured...
Multiparadigm languages are languages that are designed to support more than one style of programming. Leda is a strongly-typed multiparadigm programming language that supports imperative, functional, object-oriented, and logic programming. The constraint programming paradigm is a declarative style of programming where the programmer is able to state relationships among some...
One current research goal of Artificial Intelligence and Machine Learning is to build learning systems that robustly improve their planning performance with experience [Tade91]. This work concentrates on learning decomposition rules, i.e., learning rules that guide the planning process by determining the order in which operators are to be applied...
Software development has been characterized by severe schedule slippage, cost overrun and the inability of the developer to estimate with acceptable accuracy the resources and schedule required early in the requirements analysis and functional design phase when critical investment decisions must be made. This estimation difficulty has emerged as one...
Exception handling is a general purpose tool found in many modern programming language, but most work to date is based on textual languages. We present an approach to exception handling in a new context, the visual programming language Forms/3. The visual nature of Forms/3 supports an effective approach to exception...
To be competitive in the development and production of complex electronic systems, a company must constantly scrutinize its manufacturing processes to find ways to reduce costs and increase productivity. One of the more costly problems is the inability to troubleshoot and repair faulty circuit boards on the spot. Rather, they...
The goal of Inductive Learning is to produce general rules from a set of seen examples, which can then be applied to other unseen examples. ID3 is an inductive learning algorithm that can be used for the classification task. The input to the algorithm is a set of tuples of...
Hewlett-Packard (HP) is one of the world's largest computer companies and the foremost producer of test and measurement instruments. In Corvallis, Oregon, HP manufactures several precision products on high speed, automated assembly lines. The alignment process of a cap to a base part is one of the essential processes in...
The majority of database users interact with database systems by manipulating forms. This paper discusses the fundamental mechanisms underlying forms and considers how these mechanisms affect the behavior of forms. We then review the forms supported by commercial products. None of the packages reviewed provide all of the features that...
Forms are the most common means of interacting with databases. Database develop1;11ent products offer differing capabilities for creating forms. We will compare the form design features available using 4th Dimension and Oracle database products. These two products have the most complete set of features compared to other micro computer database...
Practical parallel programming demands that the details of distributing data to processors and inter- processor communication be managed by the compiler. These tasks quickly become too di cult for a programmer to do by hand for all but the simplest parallel programs. Yet, many parallel languages still require the programmer...
Porting a graphical user interface application from one to another window system is usually very costly because it requires a thorough knowledge of the provided application frameworks or toolkits on each system. And the learning curve for graphical user interface application frameworks or toolkits is usually very steep.
The solution...
In this paper we describe a fuzzy logic based control system implemented on a PC architecture. The rule based inference engine of this expert system is easily configured through an ascii text file and is demonstrated to be capable of controlling various simulations, including an inverted pendulum. The controller is...
We believe concreteness, direct manipulation and responsiveness in a visual programming language increase its usefulness. However, these characteristics present a challenge in generalizing programs for reuse, especially when concrete examples are used as one way of achieving concreteness. In this thesis, we present a technique to solve this problem by...
Until now, attempts to extend the one-way constraint evaluation model of the spreadsheet paradigm to support complex objects, such as colored circles or user-defined types, have led to approaches featuring either a direct way of creating objects graphically or strong compatibility with the spreadsheet paradigm, but not both. This inability...
Oregon SpeedCode Universe (OSU) is a rapid prototyping system which automates the production of applications from a description of the user interface. GraphLab is a domain-specific tool which deals with several major problems concerning OSU: (1) limited functionality, that is, lacking graphical capabilities in generated applications, (2) indirect approach to...
Application programming has become increasingly difficult because of the high demands of creating a Graphical User Interface. One solution to this problem is to provide an object oriented application framework which provides much of the functionality of the GUI in a reusable library. However, application frameworks do not provide reusable...
This thesis presents a case study of applying machine learning tools to build a predictive
model of annual infestations of grasshoppers in Eastern Oregon. The purpose of the
study was two-fold. First, we wanted to develop a predictive model. Second, we wanted
to explore the capabilities of existing machine learning...
The secondary structure of a 16S rRNA molecule is a graphical, two dimensional representation used by molecular biologists in determining evolutionary relationships between different organisms. By comparing two secondary structures, scientists can obtain knowledge of how 'related' one species of bacteria may be to another species [OLSE 1986]. To date,...
Djang et al. [1998] introduced a new model of types for declarative visual programming languages (VPLs). Implicit static typing is used in their type model, in order to eliminate the programming mechanisms associated with type declarations, provide immediate visual feedback with respect to type errors and guarantee type safe programs....
I present a new heuristic search approach to compute approximate answers for the probability query in belief nets. This approach can compute the 'best' bounds for a query in a period of any given time (if time permitted, it will get an exact value). It inherits the essence of Symbolic...
Existing parallel file systems are proving inadequate in two important arenas:
programmability and performance. Both of these inadequacies can largely be traced
to the fact that nearly all parallel file systems evolved from Unix and rely on a Unix-oriented,
single-stream, block-at-a-time approach to file I/O. This one-size-fits-all
approach to parallel...
One of the tasks of image processing systems is to characterize images by thinning, or skeletonizing techniques. The standard method, based on a disk growing technique, often results in disconnected skeletons. Since the standard technique is based on a square grid system, the disk shape is also square. The. purpose...
The visual programming language Forms/3 currently uses a graphical user interface implemented in Garnet. Garnet was developed by the User Interface Software Group in the Human Computer Interaction Institute at Carnegie Mellon University, but is no longer supported. This paper presents an implementation of a user interface for Forms/3 written...
Experimental test beds to study Real Time agents are common topic in AI. A principal contribution of this paper is to develop a framework for such experimental test beds. Design and implementation of the test bed and the extension of the framework for different equipment and agents are discussed. Results...
A multiparadigm language is one which combines features of different language paradigms. Leda is a strongly typed, compiled, multi.paradigm language with facilities for imperative, functional, object oriented and relational programming. This report describes the type checker of the Leda compiler and the implementation of first class functions required for functional...
The processor allocation in an n-dimensional hypercube multiprocessor using buddy strategy, gray code strategy, and a new strategy is implemented on Macintosh. Our implementations show that when processor relinquishment is not considered (i.e. static allocation), all these strategies are optimal in the sense that each incoming request sequence is always...
Programming parallel machines has been a difficult and unrewarding task. The short lifespan of parallel machines and their incompatibility have made it difficult to utilize them. Our goal here is to create an environment for parallel computing which allows users to take advantage of parallel computers without writing parallel programs....
The goal of this project was to investigate the addition of parameterized types to the Java programming language. Two different parametric polymorphism mechanisms were developed and compared. The first was a preprocessor and the second was a compiler.
Parameterized types allow a programmer to create generic programs. Much as a...
Distance learning enables students to study effectively from remote locations. With
televised lectures now commonplace, many disciplines can provide effective distance
learning. While the lecture format serves many disciplines, it does not serve those that
require laboratory work. Currently, there is little distance learning support for laboratory
work.
Second Best...
Two types of parallel computers commonly used or solving large scientific problems are clusters of workstations and distributed-memory multicomputers. Each system has strengths and weaknesses for this task. Workstation clusters have a high performance to cost ratio and the advantage of the latest processors. Workstations are commonly under-utilized and can...
The problem addressed in this paper is that of the extension of multimedia mechanism into Objex, an object oriented framework.
Several multimedia productions have been developed recently, such as QuickTime™, HyperCard, MacSpeaker, Multi-Ad Creator 3.0, etc. [Heid 91]. The common problems with them, however, are that they are all implemented...
Applications supporting a graphical user interface (GUI) are difficult to write. While existing tools can accelerate software development, they suffer from a number of problems that limit their helpfulness. They offer too little functionality, and support only a small part of the GUI software development task. They lack architectural models...
As the World Wide Web continues to grow, people clearly want to do much more
with it than just publish static pages of text and graphics. While such increased interactivity
has traditionally been accomplished through the use of server-side CGI scripts,
much recent research on Web browsers has been on...
Multicomputers and multiprocessors, first introduced in the mid 1980s, employ large numbers of microprocessors working in parallel to achieve high performance at low cost. These parallel machines represent an exciting new generation in supercomputing, but their usefulness is currently limited by inadequate programming languages and environments. This problem is being...
In real-time control systems, the value of a control decision depends not
only on the correctness of the decision but also on the time when that decision
is available. Recent work in real-time decision making has used machine learning
techniques to automatically construct reactive controllers, that is, controllers
with little...
The Java programming language is a recent entry in the family of object-oriented languages and like all new languages has yet to achieve widespread standardization of coding styles for the unique aspects it possesses. This paper and project address this issue by presenting a complete and rationalized style guide for...
This research explores the hypothesis that methods from decision theory and machine learning can be combined to provide practical solutions to current manufacturing control problems. This hypothesis is explored by developing an integrated approach to solving one manufacturing problem - the optimization of die-level functional test.
An integrated circuit (IC)...
Knowledge-Based Model Construction (KBMC) has generated a lot of attention
due to its importance as a technique for generating probabilistic or decision-theoretic
models whose range of applicability in AI has been vastly increased. However, no
one has tried to analyze the essential issues in KBMC, to determine if there exists...