Maintaining variation in software is a difficult problem that poses serious challenges for the understanding and editing of software artifacts. Although the C preprocessor (CPP) is often the default tool used to introduce variability to software, because of its simplicity and flexibility, it is infamous for its obtrusive syntax and...
The gradient of a velocity vector field is an asymmetric tensor field which can provide critical insight that is difficult to infer from traditional trajectory-based vector field visualization techniques. I describe the structures in the eigenvalue and eigenvector fields of the gradient tensor and how these structures can be used...
In many traditional computer graphics applications, rendered scenes typically utilize 3D meshes to represent objects within an environment. As the demand to further improve the realism of graphics applications increases, such as for movies and games, it is becoming more important to represent the inner volumes of object meshes. In...
Advanced computer architectures are centered around parallel computer systems. This project focuses on experiments with two parallel computer architectures: the Sequent Balance 21000 shared-memory multiprocessor and the Cogent XTM distributed system.
A set of benchmark programs is implemented using the C language and the Dynix parallel programming library on...
It is common practice in the unsupervised anomaly detection literature to create experimental benchmarks by sampling from existing supervised learning datasets. We seek to improve this practice by identifying four dimensions important to real-world anomaly detection applications --- point difficulty, clusteredness of anomalies, relevance of features, and relative frequency of...
This paper continues exploration in the area of
programming for parallel computers. The appendix to the
paper contains an extensive survey of the literature related
to parallel computers and parallel programming techniques.
The paper itself presents a new approach to solving
the Laplace equation on a parallel computer. A new...
Scientists in the biological sciences need to retrieve information from a variety of data collections, traditionally maintained in SQL databases, in order to conduct research. Because current assistant tools are designed primarily for business and financial users, scientists have been forced to use the notoriously difficult command-line SQL interface, supplied...
A project is described wherein a source code browser was implemented based upon tasks performed by software engineers looking for source code modules with only UNIX utilities. These tasks were discovered by observing the actions of software engineers looking for code modules in a library. The success of the implementation...
With sequential computing technology reaching its speed limits, parallel processing is emerging as the key to very-high-speed computation. However, developing a parallel program is by no means a simple task; neither is analyzing the performance of parallel programs.
C* is a high-level data-parallel language that hides explicit message passing and...
Collaborative filtering algorithms’ performances have been evaluated using a variety of metrics.
These metrics, such as Mean Absolute Error and Precision, have often ignored recommendations for
which they do not have data. Ignoring these recommendations has provided numbers which do not
accurately represent the user experience. Qualitatively we have seen...
Recent studies have shown that novel continuous dropout methods can be viewed as a Bayesian interpretation of model parameters, though most such studies have shown results using normal distributions. As the posterior distributions over neural network nodes and parameters are intractable, given that they are a result of artificial construction...
Distributed Version Control Systems (DVCS) have seen an increase in popularity relative to traditional Centralized Version Control Systems (CVCS). Yet we know little about whether VCS tools meet the needs of software developers when managing software change or whether developers are benefiting from the extra power of DVCS. Without such...
Through comparison and analysis of selected holidays in countries with significantly different features (Kyrgyzstan, Russia, and the United States), this study aims to gauge the impact of holiday celebrations in influencing cultural values in those countries studied. K. Zygulski’s classification system for holidays was applied to divide the holidays of...
Farm machinery continues to increase in its importance to the agricultural sector. Depreciation, the decline in value of a durable asset over time, represents one of the largest costs of agricultural production. The general objectives of this study were to update and expand the number of Remaining Value (RV) functions...
This paper reviews McCabe's cyclomatic complexity and Halstead's laws; it discusses studies in current literature relating the metrics to software. The studies are reproduced using data obtained from a large software project developed in a major electronics firm. Problems that occur when deriving the metrics are discussed; the result of...
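As a minimal illustration of the first metric reviewed above: McCabe's cyclomatic complexity of a control-flow graph is V(G) = E - N + 2P, where E is the number of edges, N the number of nodes, and P the number of connected components. The function name and the hand-built example graph below are hypothetical and are not taken from the paper.

```python
def cyclomatic_complexity(edges, num_nodes, num_components=1):
    """Return McCabe's V(G) = E - N + 2P for a control-flow graph."""
    return len(edges) - num_nodes + 2 * num_components


# Hypothetical control-flow graph of a routine containing one if/else:
# node 0 (entry) branches to node 1 (then) or node 2 (else),
# and both branches rejoin at node 3 (exit).
if_else_edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
complexity = cyclomatic_complexity(if_else_edges, num_nodes=4)
```

A single decision point gives a complexity of 2, one more than straight-line code.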
This paper compares three classes of algorithms for finding
Hamiltonian circuits in graphs. Two of the classes are exhaustive
search procedures and this study finds them to have an exponential
dependence on the size of the graph. The third class of algorithms,
based on Warnsdorff's rule, is found to be...
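Warnsdorff's rule was originally stated for the knight's tour: always move to the square with the fewest onward moves. The sketch below assumes the natural generalisation to arbitrary graphs (move to the unvisited neighbor that itself has the fewest unvisited neighbors). It is only an illustration of the rule, not the algorithms studied in the paper, and the function name and example graph are hypothetical.

```python
def warnsdorff_path(adjacency, start):
    """Greedy attempt at a Hamiltonian path using Warnsdorff's rule.

    adjacency maps each vertex to a list of its neighbors.
    Returns the path as a list, or None if the greedy walk gets stuck.
    """
    visited = {start}
    path = [start]
    current = start
    while len(path) < len(adjacency):
        candidates = [v for v in adjacency[current] if v not in visited]
        if not candidates:
            return None  # dead end: the heuristic does not backtrack
        # Warnsdorff's rule: prefer the candidate with the fewest
        # unvisited onward neighbors.
        current = min(
            candidates,
            key=lambda v: sum(1 for w in adjacency[v] if w not in visited),
        )
        visited.add(current)
        path.append(current)
    return path


# Hypothetical example: a triangle 0-1-2 with a pendant vertex 3 on 2.
# Visiting 2 directly after 0 would strand either 1 or 3.
example_graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
example_path = warnsdorff_path(example_graph, 0)
```

The path is a Hamiltonian circuit only if its last vertex is also adjacent to the start, which must be checked separately.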
The data used for the construction of genome maps is imperfect, therefore the mapping of a physically linear structure must take place in a very uneven feature space. As the number of genes to be ordered grows, it appears to be impractical to use exhaustive search techniques to find the...
This thesis describes a syntax-directed compiler-compiler
called COMCOM which has been implemented by the author on the CDC
3300 under the OS-3 operating system. The theory and terminology
of the parsing method and compiler-compilers in general are briefly
discussed. COMCOM uses Floyd's operator precedence bottom-up
parsing technique which avoids backup...
This project aims at building a Java application and a Java applet that would simulate a Stern-Gerlach laboratory in Quantum Mechanics. The project provides a tool for allowing the student to quickly design and run on the computer screen a number of experiments involving spin systems. This application could be...
Although much effort has been invested to build applications that support group work, collaborative applications have not found easy success. The cost of adopting and maintaining collaborative applications has prevented their widespread use, especially among small distributed groups. Application developers have had difficulties recognizing the extra effort required by groups...
This thesis presents a model for simulating individual pedestrian motion based on empirical data. The model keeps track of a pedestrian’s position, orientation, and body configuration and leverages motion capture data to generate plausible motion. Our model can automatically incorporate a pedestrian’s physical limitations when making movement decisions, since it...
The objective of this thesis is to develop a tool
which can be used to implement query processing algorithms
produced by a query optimizer. The tool should have the
following properties:
(1) it should support a description of the query solution in a dataflow-like language,
(2) it should support data...
The aim of this thesis is to study the past 10 years of security vulnerabilities reported against the Linux kernel and all existing mitigation techniques that prevent the exploitation of those vulnerabilities. To study them systematically, the vulnerabilities were categorized into classes and sub-classes based on their type.
This thesis first...
This thesis addresses the problem of temporal action segmentation in videos, where the goal is to label every video frame with the appropriate action class present. We focus on the domain of NFL football videos, where action classes represent common football play types. For action segmentation, we use a temporal...
Advances in deep learning based image processing have led to their adoption for a wide range of applications, and in tow with these developments is a dramatic increase in the availability of high quality datasets. With this comes the need to accelerate and scale deep learning applications in order to...
This paper describes the design and performance of a distributed, multi-tier architecture for scientific information management and data exploration. A novel aspect of this framework is its integration of Java IDL, the CORBA distributed object computing middleware, with JavaBeans, the Java component model, to provide a flexible, interactive framework for...
Functional programming is concerned with referential transparency: given a certain function and its parameter, the result will always be the same. However, this seems to be violated in applications involving uncertainty, such as rolling a die. This thesis defines the background of probabilistic programming and domain-specific...
Reasoning about any realistic domain always involves a degree of uncertainty.
Probabilistic inference in belief networks is one effective way of reasoning under
uncertainty. Efficiency is critical in applying this technique, and many researchers
have been working on this topic. This thesis is the report of our research in this...
Nonnegative matrices have a myriad of applications in the biological, social, and physical genres. Of particular importance are the primitive matrices. A nonnegative matrix, M, is primitive exactly when there is a positive integer, k, such that Mᵏ has only positive entries; that is, all the entries in M[superscript...
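The definition of primitivity above can be checked directly. The sketch below tests successive powers of the matrix, working only with the zero/positive pattern of its entries. It relies on Wielandt's bound, a standard result not stated in the abstract: an n × n primitive matrix has a positive power with exponent at most (n - 1)² + 1. The function name is hypothetical.

```python
def is_primitive(matrix):
    """Check whether a square nonnegative matrix has a strictly positive power."""
    n = len(matrix)
    # Only the zero/positive pattern matters, so use booleans.
    pattern = [[entry > 0 for entry in row] for row in matrix]
    power = pattern  # pattern of M^1
    # Wielandt's bound: checking exponents 1 .. (n - 1)^2 + 1 suffices.
    for _ in range((n - 1) ** 2 + 1):
        if all(all(row) for row in power):
            return True
        # Boolean matrix product: pattern of the next power of M.
        power = [
            [any(power[i][k] and pattern[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
    return False
```

For example, `[[0, 1], [1, 1]]` is primitive because its square is positive, whereas the permutation matrix `[[0, 1], [1, 0]]` is not, since its powers alternate between itself and the identity.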
A radix 2ⁿ non-restoring division algorithm is described. The
algorithm is designed to be compatible with hardware multiprecision
multiplication methods currently used in high speed digital computers.
This enables the same hardware, with only changes in
control logic, to implement both multiplication and
division....
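For orientation, the sketch below shows the textbook radix-2 form of non-restoring division for unsigned integers. It is not the higher-radix algorithm of the report; the function name and the `bits` parameter are assumptions made for illustration. Unlike restoring division, a negative partial remainder is not corrected immediately. The next step adds the divisor instead of subtracting it, and a single correction is applied at the end.

```python
def non_restoring_divide(dividend, divisor, bits):
    """Radix-2 non-restoring division of unsigned integers.

    dividend must fit in `bits` bits and divisor must be positive.
    Returns (quotient, remainder).
    """
    remainder = 0
    quotient = 0
    for i in range(bits - 1, -1, -1):
        next_bit = (dividend >> i) & 1
        if remainder >= 0:
            # Shift in the next dividend bit and subtract the divisor.
            remainder = 2 * remainder + next_bit - divisor
        else:
            # Previous remainder was negative: add instead of restoring.
            remainder = 2 * remainder + next_bit + divisor
        # The quotient bit is 1 exactly when the partial remainder is non-negative.
        quotient = (quotient << 1) | (1 if remainder >= 0 else 0)
    if remainder < 0:
        remainder += divisor  # single final correction step
    return quotient, remainder
```

With `bits=4`, dividing 13 by 3 yields quotient 4 and remainder 1.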
This paper investigates the feasibility of compiling the functionality of a decision
theoretic problem solving engine into a set of rules or functionally similar construct.
The decision theoretic engine runs in exponential time, while the rule set runs in
linear time at worst. The main question that will determine the...
In this thesis, we present semantic equivalence rules for an extension of the choice calculus and sound operations for an implementation of variational lists. The choice calculus is a calculus for describing variation and the formula choice calculus is an extension with formulas. We prove semantic equivalence rules for the...
Continuous Improvement (CI) of academic computing programs is a main requirement of accreditation. Academic computing programs must have a well-documented CI plan in order to be granted accreditation. Based on the existing literature, we developed a comprehensive CI (or 360-CI) model consisting of 8 components: course, curriculum, administration, faculty, research,...
Smart home devices are becoming increasingly popular, and by 2021 an estimated 80 million devices are expected to be in U.S. households. As a result, the privacy and security threats involved with these devices have also been scaling up in recent years. Smart home cameras have been hacked, private conversations...
The local presence or absence of individual botanical species can be predicted with high accuracy by a simple feed-forward neural network, using only local climate data to make inference. This study proposes a framework for learning these predictive models, demonstrates highly accurate predictions for species with a sufficiently large area...
In this thesis we introduce alpha and beta tree acceptors,
generalizations of tree automata. The alpha tree acceptors recognize
a tree by final symbol and the beta tree acceptors by final state. We
show that alpha and beta tree acceptors recognize the same sets of
Gorn trees and demonstrate that...
In order to aid comparison of estimates of genetic parameters between dominant and codominant markers for the population genetics community, we developed a genetic dominance simulation program to determine how dominance and biallelism could affect the estimation of population genetic statistics. The simulation indicates that genetic diversities within populations based...
This report addresses the design and implementation of an internet-based grading tool for the "Translators" course. The motivation is to avoid exposing the instructor's Java byte-code to possible reverse-engineering tools and enable students to submit their homework virtually from any machine across the internet. This tool is intended to replace...
Development of graphical user interface (GUI) applications is difficult since the process can be both complicated and tedious. We propose a solution directed at reducing programming time and effort required to build a GUI application. Our solution is based on the Petri Network, the Oregon SpeedCode Universe (OSU) Application Framework,...
Various ecological and hydrological models require estimates of the amount and spatial distribution of monthly and annual precipitation. PRISM is an analytical model that distributes point measurements of monthly, seasonal and annual precipitation to a geographic grid. In order to use this model effectively, good graphical user interfaces were needed....
The task of designing and building the user interface portion of a Macintosh application is radically different than the same task on a more conventional computer with a conventional operating system. Just the fact that it is radically different makes it very time consuming to learn how to program this...
The new Technical Report Management System (TRMS) is a client/server web application developed for the Computer Science Department. It provides web services like online browsing, viewing, searching and online database administration including creating and modifying the existing bibliographic records. It is a multi-tiered, component-based application that deploys the latest J2EE...
The Elliptic Curve Digital Signature Algorithm (ECDSA) is a public key cryptosystem used for creation and verification of digital signatures in electronic documents. In this thesis, we created a Java applet that provides the functionality of the ECDSA using all of the NIST elliptic curves over GF(p). This applet was...
The general problem of classroom scheduling is well known to be NP-complete. A typical classroom scheduling problem is that of assigning students to a limited number of pieces of laboratory equipment. Frequently several identical pieces of equipment are available only at fixed hours and have limited access due to physical...
The history of a software project plays a vital role in the software development process. Version control systems enable users of a software repository to look at the evolution of the source code, and see the changes that led to newer versions. Currently, version control systems provide commands that can...
Polyhedra are geometric representations of linear systems of equations and
inequalities. Since polyhedra are used to represent the iteration domains of nested
loop programs, procedures for operating on polyhedra can be used for doing loop
transformations and other program restructuring transformations which are needed
in parallelizing compilers. Thus a need...
The most important part of parallel computation is communication. Except in the most embarrassingly parallel examples, processors cannot work cooperatively to solve a problem unless they can communicate. One way to solve the problem of communication is to use an interconnection network. Processors are located at nodes of the network,...
Although there has been research into ways to design spreadsheet systems to improve the processes of creating new spreadsheets and of understanding existing ones, little attention has been given to helping users of these environments test their spreadsheets. To help address this need, we introduce two visual approaches to testing...
The well-known local adjustment algorithm for training a
threshold logic unit, TLU, is extended to a local adjustment
algorithm for training a network of TLUs. Computer simulations
show that the extension is unsatisfactory.
A new logic for a committee of TLUs, called modified veto logic,
and a local adjustment algorithm...
This paper demonstrates the effectiveness of a heuristic for
the Traveling Salesman Problem based purely on the efficient
storage of multiple partial sub-tours. The heuristic is among the
best available for the solution of large scale geometric Traveling
Salesman Problems. Additionally, a version of the heuristic can be
used to...
The minimal kernel for a network data base management
system is developed. It includes format specification for
the object schema and the data base file, and routines for
performing the basic network functions.
The object schema defines the relationship between the
schema, records, sets, and fields. It informs the system...
The reaction of fluid or gas flowing around an obstacle is a common engineering problem. Computer simulations are often used to measure and visualize the physical processes involved. In this report, we will discuss a parallel implementation of a simulation using the Smooth Particle Hydrodynamics (SPH) approach to fluid flow....
Cryptographic systems are used for secure communications in the government, in industry, and by individuals. Many cryptographic systems base their security on our inability to factor quickly. For this reason, in the last few decades several fast factoring algorithms have been developed. The Quadratic Sieve is one of the best...
Remote sensors are becoming the standard for observing and recording ecological data in the field. Such sensors can record data at fine temporal resolutions, and they can operate under extreme conditions prohibitive to human access. Unfortunately, sensor data streams exhibit many kinds of errors ranging from corrupt communications to partial...
Prettyprinters are software tools that format program source code so that it conforms to certain standards of consistency and hence improves readability. Traditionally, these standards were fixed for a particular prettyprinter as indicated by a literature survey, with very little or no supporting evidence that the formatting style improves readability....
Rapid Prototyping is a technique that has been created to alleviate some of the problems inherent in the traditional approach to software development. Various prototyping techniques currently employed are presented. The user interface aspect of system design is studied and where the benefit of prototyping user interfaces lies is shown....
Allegro is a network database management system being developed at Oregon State University. This project adds a user-friendly query facility to the system. The user is presented with a pictorial display of the network records and a query interface modeled on the Query-By-Example system. By request the user may be...
We took the back-propagation algorithms of Werbos for recurrent and feed-forward neural networks and implemented them on machines with graphics processing units (GPU). The parallelism of these units gave our implementations a 10 to 100 fold increase in speed. For nets with fewer than 20 neurons the machine performed faster...
Interconnection Networks have been used as a high performance communication fabric in parallel processor architectures. Parallel processors built using off-the-shelf components, called clusters, are becoming increasingly attractive because of their high performance to cost ratio over parallel computers.
Many web servers and database servers make efficient use of clustering from...
A rule-based transformational model for program development and a meta-tool based on this model are presented. The meta-tool can be instantiated to create various program development tools such as tools for building reusable software components, language-directed editors, language-to-language translators, program instrumentation, structured document generators, and...
Domain-independent automated planning is concerned with computing a sequence of actions that can transform an initial state into a desired goal state. Resource production domains form an interesting class of such problems, in that they typically require reasoning about concurrent durative-actions with continuous effects while minimizing some cost function. Although...
Signal is a multimedia messaging application developed by Open Whisper Systems in 2015 which allows its users to communicate securely with one another through the use of a complex encryption scheme. The set of algorithms used in combination to provide the services of the Signal application to its users is called...
Considerable progress has been made in Computer-Aided Design (CAD) techniques to assist Mechanical Engineers in the detail design stages and adaptive redesign tasks. Currently, CAD tools support the designer in system layouts, sizing of components, drafting, analytical calculations, generating NC machining data and even motion or energy simulations. However, the...
A software system to study network algorithms was implemented on UNIX. Each part of a network algorithm is written as a single C program which becomes a virtual node in the network. During a simulation, all virtualized nodes run as separate processes on a single PDP 11/40. Inter-node communication is...
A cost reduction analysis is performed by coordinating
the exchange of LANDSAT (formerly ERTS) data between a CDC
3300 and a PDP8/L minicomputer. The LANDSAT data is displayed
on a 4002 Tektronix terminal by means of a grayscale output.
Large amounts of data and number manipulation are processed
in the...
A specialized ATMS for efficiently computing equivalence relations in multiple contexts is introduced. This specialized ATMS overcomes the problems with existing solutions to reasoning with equivalence relations. The most direct implementation of an equivalence relation in the ATMS (encoding the reflexive, transitive, and symmetric rules in the consumer architecture) produces redundant equality...
Given k terminal pairs (s₁,t₁), (s₂,t₂), ..., (sₖ,tₖ) in an edge-weighted graph G, the k Shortest Vertex-Disjoint Paths problem is to find a collection P₁, P₂, ..., Pₖ of vertex-disjoint paths with minimum total length, where Pᵢ is an sᵢ-to-tᵢ path. As a special case of the...
Distance-based algorithms are machine learning algorithms that classify queries
by computing distances between these queries and a number of internally stored
exemplars. Exemplars that are closest to the query have the largest influence on
the classification assigned to the query. Two specific distance-based algorithms, the
nearest neighbor...
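The simplest distance-based classifier described above, a one-nearest-neighbor rule, can be sketched in a few lines. Euclidean distance and the function name are assumptions made for illustration; the abstract does not say which distance measure the thesis uses.

```python
import math


def nearest_neighbor_classify(exemplars, query):
    """Return the label of the stored exemplar closest to the query.

    exemplars is a list of (feature_vector, label) pairs;
    Euclidean distance is assumed.
    """
    _, label = min(exemplars, key=lambda ex: math.dist(ex[0], query))
    return label


# Hypothetical stored exemplars with two features each.
stored = [((0.0, 0.0), "a"), ((5.0, 5.0), "b")]
predicted = nearest_neighbor_classify(stored, (1.0, 1.0))
```

Here the query (1, 1) lies closer to the exemplar at the origin, so it receives that exemplar's label.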
Program understanding is a very important part of the testing and maintenance phases of programming projects. Overall program knowledge, understanding the various parts of the program, and understanding how they communicate are key steps in understanding the program. Hence data communication among modules contributes much to program complexity and...
Traditionally, people learn to perform object assembly tasks by following the steps in a paper-based instruction manual. Using augmented reality (AR) technology, the instructions could instead be computer generated and appear directly within the user’s workspace as they perform the task. Literature suggests AR’s feasibility in improving performance and learning...
Controlling the "complexity" or "understandability"
of computer software is important because of its impact on
program testing and maintenance. Of the large number of
complexity metrics that have been developed to measure the
complexity of a computer program, most assess the
"micro-complexity" of each subprogram and few assess the
"macro-complexity"...
Object categorization is one of the fundamental topics in computer vision research. Most current work in object categorization aims to discriminate among generic object classes with gross differences. However, many applications require much finer distinctions. This thesis focuses on the design, evaluation and analysis of learning algorithms for fine-grained...
Reinforcement Learning (RL) is the study of learning agents that improve
their performance from rewards and punishments. Most reinforcement learning
methods optimize the discounted total reward received by an agent, while, in many
domains, the natural criterion is to optimize the average reward per time step. In this
thesis, we...
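The two optimization criteria contrasted above can be written out for a finite reward sequence. The sketch below is only an illustration of the definitions, not a method from the thesis; the function names, the reward streams, and the discount factor 0.5 are hypothetical.

```python
def discounted_return(rewards, gamma):
    """Discounted total reward: sum of gamma**t * r_t over time steps t."""
    return sum(r * gamma ** t for t, r in enumerate(rewards))


def average_reward(rewards):
    """Average reward per time step."""
    return sum(rewards) / len(rewards)


# Two hypothetical reward streams of equal length.
early_payoff = [10, 0, 0, 0, 0, 0]   # one large reward immediately
steady_payoff = [0, 3, 3, 3, 3, 3]   # smaller rewards that keep arriving
```

With a discount factor of 0.5, the discounted criterion prefers `early_payoff`, while the average-reward criterion prefers `steady_payoff`, so the two criteria can rank the same behaviors differently.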
“Until relatively recently, mankind was not aware that there was a separable binocular depth sense. Through the ages, people like Euclid and Leonardo understood that we see different images of the world with each eye. But it was Wheatstone who in 1838 explained to the world, with his stereoscope and...
Automatic music transcription (AMT) is the task, given an acoustic representation of music, to recover a symbolic notation of the written notes expressed by the sound. Transcribing music with multiple notes sounding simultaneously is difficult for both humans and machines. Much existing work on AMT has focused on suitable acoustic...
This paper presents data obtained from measurements of the X2 Interpreter. The measurements were made to determine how to increase the interpreter's efficiency. The subroutine call operation was found to consume a significant percentage of the execution time.
As of February 2012, approximately 46% of American adults own a smartphone. The graphics quality of these devices gets better each year. However, they still have many more limitations in graphics processing and storage space than desktop computers. This means that applications on these devices should focus on optimizing their...
There is a significant amount of research analyzing the effect of race, gender, and other common demographical data on student interest and performance in computer science. However, there is relatively little research concerning less common demographic populations, such as introverts, artistic students, and visual learners. This study investigates if these...
This survey looks at five different polygon breaking algorithms and compares them. The criteria for comparison are:
1. Complexity of the algorithm.
2. The number and type of subpolygons produced.
3. The classes of input polygons that can be broken.
The best algorithm is then coded and a comparison is...
Program comprehension is important in program testing, debugging, and maintenance. Programming style impacts program understanding. However, there has not been any systematic identification of individual style factors and their contribution to program comprehension. In this thesis we present a programming style taxonomy composed of three classes: typographic (program layout and...
A timing attack on a cryptosystem allows the attacker to deduce the secret key information based on the timing differences with respect to different inputs given to an encryption or decryption algorithm. Cryptosystems can take variable amounts of time to process due to performance optimizations in software, branching or conditional...
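A common source of the input-dependent timing described above is a comparison that stops at the first mismatch. The sketch below contrasts such a comparison with Python's standard-library `hmac.compare_digest`, which is designed to take time independent of where the inputs differ. This is a generic illustration and not an attack or countermeasure from the thesis; the two function names are hypothetical.

```python
import hmac


def leaky_equal(secret, guess):
    """Byte-by-byte comparison that returns at the first mismatch.

    Its running time grows with the length of the matching prefix,
    which is the kind of difference a timing attack can exploit.
    """
    if len(secret) != len(guess):
        return False
    for a, b in zip(secret, guess):
        if a != b:
            return False  # early exit leaks the mismatch position
    return True


def constant_time_equal(secret, guess):
    """Comparison intended not to leak where the inputs differ."""
    return hmac.compare_digest(secret, guess)
```

Both functions return the same answers; only their timing behavior differs.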
Monitoring the performance of electrical utility assets is a critical activity for power companies. Gas chromatography has long been established as the analytical technique of choice for assessment of transformer fault conditions by detection of the presence of key fault gases through analysis of transformer oil. Chromatography is one of...
Programming style is a highly individualistic and important part of programming. Yet, there is no general agreement on the definition of programming style nor even the qualities that make up programming style. Existing automated programming style analyzer programs either produce a battery of numbers, leaving the weight and/or importance of...
Alignment of genomic sequences from different species is becoming an increasingly powerful method in biology, and is being used for many purposes. The result of sequence alignments is a list of pairs of matched locations between the pattern string and the text string. However, without any proper visualization tools to...
The project aims at building an application that would simulate a transparency that can either be overlaid on top of the graphical display of another application or used as a stand-alone by accepting input from a keyboard or a mouse to enhance a presentation. The project provides an environment to...
The system described is an interface between a student and a problem-solving production system that solves some class of problems. Its purpose is to help a student learn some part of the realm of problem solving. As a student attempts to solve a problem the system controls the firing of...
Recently several minimal perfect hashing functions for small static word sets have been developed. However, they are limited to sets of 50 words or less. In this paper, a Two Level Minimal Perfect Hash Function for large data sets is given. It partitions a large static set into small sets...
As XML becomes more and more popular, easy-to-use and powerful XML query languages are in great need. Xing is a visual query and restructuring language for XML documents. The objective of this project is to develop a basic version of Xing, including a user-oriented XML query interface and a simple...
A basic tradeoff to consider when designing a distributed data-mining framework is the need for a compromise between the cost of communication and computation resources and the accuracy of the mining results. This is essentially a decision of whether it is more efficient to communicate all of the data to...
Conversion of software written for one machine or
operating system to equivalent software for another machine
or operating system is shown to be economically attractive
using source-to-source translation. The features of an
automatic converter are described using a Pascal-to-C
translator as an example. Solutions to the problems of
denesting procedures,...
This paper addresses the high model complexity and overconfident frame labeling of state-of-the-art (SOTA) action segmenters. Their complexity is typically justified by the need to sequentially refine action segmentation through multiple stages of a deep architecture. However, this multistage refinement does not take into account uncertainty of frame labeling predicted...
In this dissertation, we address action segmentation in videos under limited supervision. The goal of action segmentation is to predict an action class for each frame of a video. The limited supervision means ground truth labels of video frames are not available in training. We focus on three types of...
Active contour models have been widely applied to image segmentation and
analysis. They have been successfully used in contour detection for object recognition,
computer vision, computer graphics, and biomedical image processing such as X-ray,
MRI and Ultrasound images.
The energy-minimizing active contour models or snakes were developed by Kass,
Witkin...
Humans are remarkably efficient in learning by interacting with other people and observing their behavior. Children learn by watching their parents’ actions and mimicking their behavior. When they are not sure about their parents’ demonstration, they communicate with them, ask questions, and learn from their feedback. On the other hand,...
Semi-supervised clustering aims to improve clustering performance by considering user supervision in the form of pairwise constraints. In this paper, we study the active learning problem of selecting pairwise must-link and cannot-link constraints for semisupervised clustering. We consider active learning in an iterative manner where in each iteration queries are...
We developed and investigated machine learning methods that require
minimal preprocessing of the input data, use few training examples, run fast, and
still obtain high levels of accuracy.
Most approaches to designing machine learning programs are based on the
supervised learning paradigm – training examples are chosen randomly and given...