A relatively new model of error control is the limited magnitude error over high radix channels. In this error model, the error magnitude does not exceed a certain limit known beforehand. In this dissertation, we study systematic error control codes for common channels under the assumption that the maximum error...
This paper provides a measurement-based performance evaluation of the Optimized Link State Routing (OLSR) protocol. Two versions of OLSR, OLSR-ETX and OLSR-ETT, are implemented and evaluated on a mesh network that we built from off-the-shelf commercial components. OLSR-ETX uses the Expected Transmission Count (ETX) metric, whereas OLSR-ETT uses the Expected...
Visual information presented in diagrams promotes information processing both in an individual and in collaborative work. Previous literature has identified the role of diagrams in understanding information processing in a variety of disciplines. In software engineering, diagrams are a prevalent method involved in process development: diagrams are used for system...
Monte-Carlo planning algorithms such as UCT make decisions at each step by
intelligently expanding a single search tree given the available time and then
selecting the best root action. Recent work has provided evidence that it can be
advantageous to instead construct an ensemble of search trees and make a...
While there are many ways to evaluate a user interface design, the user's mental workload and situation awareness (SA) are particularly important considerations in the supervisory control of safety-critical systems. Typically, operators of these systems must monitor high-volume, time-sensitive status information. Interface design for this domain can be challenging and...
Asymmetric tensor fields are useful for understanding fluid flow and solid deformation. They present new challenges, however, for traditional tensor field visualization techniques such as hyperstreamline placement and glyph packing. This is because the physical behavior of tensors inside real domains where eigenvalues are real is fundamentally different from the...
3D object recognition is a very difficult and important problem in computer vision, arising in a wide range of applications. Typically in 3D object recognition, interest points are extracted from images and then matched. A shortcoming of this approach is that points only carry local visual information. Therefore, there could...
In text classification, labeling features is often less time consuming than labeling entire documents. In situations where very little labeled training data is available, feature relevance feedback has the potential to dramatically increase classification performance. We review previous work on incorporating feature relevance feedback in the form of labeled features...
This dissertation explores algorithms for learning ranking functions to efficiently solve search problems, with application to automated planning. Specifically, we consider the frameworks of beam search, greedy search, and randomized search, which all aim to maintain tractability at the cost of not guaranteeing completeness nor optimality. Our learning objective for...
Trends in wireless networks are increasingly pointing towards a future with multi-hop
networks deployed in multi-channel environments. In this thesis, we present the design
for iMAC—a protocol targeted at medium access control in such environments. iMAC
uses control packets on a common control channel to facilitate a three-way handshake
between...
Generating solutions to Sokoban levels is an NP-hard problem that is difficult for even modern day computers to solve due to its complexity. This project explores the creation of a Sokoban solver by eliminating as many potential moves as possible to greatly limit the overall search space. This reduction is...
Keeping FOSS developers motivated is a challenging problem, and their motivation levels can affect the team's productivity and satisfaction. Using reputation systems as a motivator has become the de facto standard for many online communities, rewarding users' activity through badges of honor or achievement levels....
The gradient of a velocity vector field is an asymmetric tensor field which can provide critical insight that is difficult to infer from traditional trajectory-based vector field visualization techniques. I describe the structures in the eigenvalue and eigenvector fields of the gradient tensor and how these structures can be used...
This study introduces a layered model for rendering human teeth. Human teeth exhibit complex light interaction due to their layered structure. While the lighting responses of teeth have been studied in the dental industry for the production of realistic looking dentures, to our knowledge this is the first study of...
Since free riders in P2P networks reduce the system's performance, how to maintain and encourage the nodes' cooperation is an important aspect of P2P-related research. In this thesis, a P2P system is modeled based on two games: the stag hunt game and the snowdrift game. To relate the model to the...
For building robust software applications, it is important for the software engineer to make efficient use of the available building blocks. Learning the basic language constructs is only the first step in this process. It is becoming increasingly important for software engineers, especially students, to get acquainted with the available...
Dimensionality reduction (DR) is an efficient approach to reduce the size of
data by capturing the informative intrinsic features and discarding the noise. DR
methods can be grouped through a variety of categories, e.g. supervised/unsupervised,
linear/non-linear or parametric/non-parametric. Objective function based
methods can be grouped into convex and non...
Parallel processors are classified into two classes: shared-memory multiprocessors and distributed-memory multiprocessors. In the shared-memory system, processors communicate through a common memory unit. However, in the distributed multiprocessor system, each processor has its own memory unit and the communications among the processors are performed through an interconnection network. Thus,...
This dissertation addresses two fundamental problems in computer vision—namely,
multitarget tracking and event recognition in videos. These problems are challenging
because uncertainty may arise from a host of sources, including motion blur,
occlusions, and dynamic cluttered backgrounds. We show that these challenges can be
successfully addressed by using a multiscale,...
This dissertation addresses a number of inter-related and fundamental problems in computer vision. Specifically, we address object discovery, recognition, segmentation, and 3D pose estimation in images, as well as 3D scene reconstruction and scene interpretation. The key ideas behind our approaches include using shape as a basic object feature, and...
Visual programming languages (VPLs) have been widely used to support end-user programming. However, end users are still not able to reuse code as actively as professional programmers, even when given abundant resources such as a large VPL program repository. One reason may be that current VPL development environments lack features...
Beaversource provides both code-hosting tools and social networking in one place. Students and faculty at Oregon State University have been using Beaversource to host their projects, both for classwork and research. Several usability problems were reported in a survey conducted on Beaversource last year. Some of these issues were severe...
Free/Open Source Software (FOSS) is a powerful development paradigm for
creating software. Increasingly more FOSS projects, like Firefox and Android, are
integrated into mainstream technology. It is important that FOSS projects serve their
diverse user base well. Several surveys have found that existing FOSS communities
are very homogeneous populations and...
The Internet is growing rapidly in terms of websites, users and uses. People use the Internet for reference, shopping, social networking, communications, business and much more. Though the Internet is useful, there are many risks associated with its use, like malicious websites, identity theft, hateful content and fraudulent practices. Online...
A distributed system is a network of multiple autonomous computational nodes designed primarily for performance scalability and robustness. The performance of a distributed system depends critically on how tasks and resources are distributed among the nodes. Thus, a main thrust in distributed system research is to design schemes for distributing...
Acting intelligently to efficiently solve sequential decision problems requires the ability to extract hierarchical structure from the underlying domain dynamics, exploit it for optimal or near-optimal decision-making, and transfer it to related problems instead of solving every problem in isolation. This dissertation makes three contributions toward this goal.
The first...
Within the past several years, the technology of high-throughput sequencing has transformed the study of biology by offering unprecedented access to life's fundamental building block, DNA. With this transformation's potential, a host of brand-new challenges have emerged, many of which lend themselves to being solved through computational methods. From de...
Experimental game theory is the use of game theoretic abstractions—games, players, and strategies—in experiments and simulations. It is often used in cases where traditional, analytical game theory fails or is difficult to apply. This thesis collects three previously published papers that provide domain-specific language (DSL) support for defining and executing...
In a software development cycle, programs go through many iterations. Identifying and
understanding program changes is a tedious but necessary task for programmers, especially when
software is developed in a collaborative environment. Existing tools used by the programmers
either lack in finding the structural differences, or report the differences as...
While there are powerful keyword search systems that index all kinds of resources including emails and web pages, people have trouble recalling semantic facts such as the name, location, edit dates and keywords that uniquely identify resources in their personal repositories. Reusing information exacerbates this problem. A rarely used approach...
Free/Open Source Software (FOSS) communities often use open bug reporting to allow users to participate by reporting bugs. This practice can lead to more duplicate reports, as inexperienced users can be less rigorous about researching existing bug reports. The purpose of this research is to determine the extent of this...
The meteoric rise and prevalent usage of wireless networking technologies for mobile
communication applications have captured the attention of the media and the imagination of
the public in the past decade. One such proliferation is experienced in Wireless Sensor
Networks (WSNs), where multimedia enabled elements are fused with integrated
sensors to empower tightly...
A fundamental problem in computer vision is to partition an image into meaningful segments. While image segmentation is required by many applications, the thesis focuses on segmentation of computed tomography (CT) images for analysis and quality control of composite materials. The key research contribution of this thesis is a novel...
It is possible to purchase, for as little as $10,000, a cluster of computers with the capability to rival the supercomputers of only a few years ago. Now, users that have little to no experience developing distributed applications or managing a cluster are in a position to do so. To...
Streaming media and interactive television viewing experiences are becoming more commonplace with the introduction of services such as Netflix Streaming, the Apple TV, and Google TV, aided by the increased adoption of broadband internet. As these services make their way into the living room, and developers struggle to accommodate more...
Modern technology has enabled the advancement of biological research through the use of powerful machines and computers as well as innovative computer programs. Advances in sequencing technology and software enable us to make de novo assemblies of organism genomes, and the development of specialized computer programs can automate routine but...
Application Programming Interfaces (APIs) enable software developers to utilize and create functionality that would otherwise take a lot of time and effort to build from scratch. Consequently, an essential part of software engineering training is for students to learn how to use APIs effectively. The existing jTutors system enables an...
As of February 2012, approximately 46% of American adults own a smartphone. The graphics quality of these devices gets better each year. However, they still have many more limitations in graphics processing and storage space than desktop computers. This means that applications on these devices should focus on optimizing their...
Recent work in machine learning concerns the detection and identification of bird species from audio recordings of their vocalizations. Such analysis can yield valuable ecological information concerning the activity and distribution of species in the wild. Current species-identification methods require individual syllables of bird audio as input, but field-collected audio...
Throughout Europe, Northern Africa, and the Near East, hundreds of Roman ruins lie scattered about. Many Roman aqueducts, bridges, roads, and even buildings remain standing over two thousand years after their construction, as functional as the day they were built. In the modern United States, however, many public works projects...
In this work, I examine the problem of understanding American football in video. In particular, I present several mid-level computer vision algorithms that each accomplish a different sub-task within a larger system for annotating, interpreting, and analyzing collections of American football video. The analysis of football video is useful in...
There is a growing interest in bringing online and streaming content to the television. Gaming platforms such as the PS3, Xbox 360 and Wii are at the center of this digital convergence; platforms for accessing new media services. This presents a number of interface challenges, as controllers designed for gaming...
Spreadsheets are a widely used end-user programming tool. Field audits have found that 80-90% of spreadsheets created by end users contain textual and formula errors. Such errors may have severe negative consequences for users in terms of productivity, credibility, or profits. To solve the problem of spreadsheet errors,...
Networks of distributed, remote sensors are providing ecological scientists with a view of our environment that is unprecedented in detail. However, these networks are subject to harsh conditions, which lead to malfunctions in individual sensors and failures in network communications. This behavior manifests as corrupt or missing measurements in the...
Researchers/engineers in the field of software testing have valued coverage as a testing metric for decades now. There have been various empirical results that have shown that as coverage increases the ability of the test program to detect a fault also increases. As a result numerous coverage techniques have been...
This project addresses the problems of manually placing facial landmarks on a portrait and finding a fast way to warp the annotated image of a face. While there are many approaches to automatically find facial landmarks, most of them provide insufficient results in uncontrolled environments. Thus I introduce a method...
Buses can be impractical for those who must adhere to a strict schedule or depend on them for emergencies. While variations from the official bus schedule are understandable and largely unavoidable, a lack of communication discourages adoption at a rate disproportionate with their actual likelihood. Even if a bus is...
Programmers often have to choose components online for reuse based on software quality. To help with this choice, most component repositories (SourceForge, CodeProject, etc.) provide information such as user ratings and reviews of components. However, the reusability of components is not immediately obvious from this material. To make things worse,...
Bayesian Optimization (BO) methods are often used to optimize an unknown function f(•) that is costly to evaluate. They typically work in an iterative manner. In each iteration, given a set of observation points, BO algorithms select k ≥ 1 points to be evaluated. The results of those points are...
The study of physical activity is important in improving people’s health as it can help people understand the relationship between physical activity and health. Accelerometers, due to their small size, low cost, convenience and ability to provide objective information about the frequency, intensity, and duration of physical activity, have...
How can an agent generalize its knowledge to new circumstances? To learn effectively, an agent acting in a sequential decision problem must make intelligent action selection choices based on its available knowledge. This dissertation focuses on Bayesian methods of representing learned knowledge and develops novel algorithms that exploit the represented...
This project presents a new, and more versatile, method for performing Relief Mapping (also known as Parallax Occlusion Mapping), utilizing rates of change in texture coordinates across a polygon surface to calculate the texture sampling offsets used in the ray-tracing portion of the Relief Mapping algorithm. This new technique relies...
This thesis presents an efficient computational voxelization approach that utilizes the graphics pipeline. Our approach is hybrid in that it performs a precise gap-free computational voxelization, employs fixed-function components of the GPU, and utilizes the stages of the graphics pipeline to improve parallelism. This approach makes use of the latest...
Worst-case analysis is often meaningless in practice. Some problems never reach the anticipated worst-case complexity. Other solutions get bogged down with impractical constants during implementation, despite having favorable asymptotic running times. In this thesis, we investigate these contrasts in the context of finding maximum flows in planar digraphs. We suggest...
Partial programming is a field of study where users specify an outline or skeleton of a program, but leave various parts undefined. The undefined parts are then completed by an external mechanism to form a complete program. Adaptation-Based Programming (ABP) is a method of partial programming that utilizes techniques from...
We develop efficient coordination techniques that support inelastic traffic in large-scale distributed dynamic spectrum access (DSA) networks. By means of any learning algorithm, the proposed techniques enable DSA users to locate and exploit spectrum opportunities effectively, thereby increasing their achieved throughput (or "rewards" to be more general). Basically, learning algorithms...
Object categorization is one of the fundamental topics in computer vision research. Most current work in object categorization aims to discriminate among generic object classes with gross differences. However, many applications require much finer distinctions. This thesis focuses on the design, evaluation and analysis of learning algorithms for fine-grained...
Semi-supervised clustering aims to improve clustering performance by considering user supervision in the form of pairwise constraints. In this paper, we study the active learning problem of selecting pairwise must-link and cannot-link constraints for semi-supervised clustering. We consider active learning in an iterative manner where in each iteration queries are...
End-user programmers face many barriers in programming. Research has seen many programming environments that attempted to lower or remove the barriers but despite these efforts, empirical studies continue to report barriers users face. To investigate this issue, we took a theory-informed approach. Using theories from design, creativity, and problem solving...
Object recognition is a fundamental problem in computer vision. Recognition is
required by many applications. This thesis presents a distance based approach to
recognize objects. We are interested in objects that belong to very similar classes,
where each class has large variations. This problem is called fine-grained object
recognition. Given...
In wireless sensor networks (WSNs) nodes are battery powered. Therefore, the available
energy resources of sensor nodes should be managed efficiently in order to increase
the network lifetime. As a result, researchers have proposed routing schemes in order to
maximize network lifetime. Even though these schemes increase the network lifetime,...
Free and open source software (FOSS) projects primarily rely on the efforts of volunteer contributors from around the world. For this reason, recruiting and retaining contributors is vital to the sustainability and growth of FOSS projects. This notion became the jumping-off point for this three-part investigation into the cultural structure...
Air traffic flow management over the U.S. airspace is a difficult problem. Current management approaches lead to hundreds of thousands of hours of delay, costing billions of dollars annually. Weather and airport conditions may instigate this delay, but routing decisions balancing delay with congestion contribute significantly to the propagation of...
Free / Open Source Software developers come from a myriad of different backgrounds. While some contribute for personal reasons, many become involved because they receive compensation from corporations or foundations. The motivation for participating in a project can have dramatic impacts on how and what contribution an individual makes. These...
Professional software engineers have an arsenal of techniques such as unit testing and assertions to check their specifications, but these techniques require tools, motivation, experience and training that programmers without professional software engineering training may not have. As a result, professionals in other fields, such as scientific modelers, face greater...
Over the past few decades, the ratio of women to men in many traditionally male-dominated fields has become much more equal. However, in science, technology, engineering, and math (STEM) fields the ratio has not improved at the same rate. In computer science the ratio is still very uneven. Today women...
Physical activity recognition using accelerometer data is a rapidly emerging field with many real-world applications. Much of the previous work in this area has assumed that the accelerometer data has already been segmented into pure activities, and the activity recognition task has been to classify these segments. In reality, activity...
In this thesis I present the choice calculus, a formal language for representing variation in software and other structured artifacts. The choice calculus is intended to support variation research in a way similar to the lambda calculus in programming language research. Specifically, it provides a simple formal basis for presenting,...
The study of the diversity of multivariate objects shares common characteristics across disciplines, including ecology and organizational management. Nevertheless, experts in these two disciplines have adopted somewhat separate diversity concepts and analysis techniques, limiting the ability of potentially sharing and cross comparing these concerns. Moreover, while complex diversity data may...
The art of software engineering inherently requires high-level problem solving and perseverance, as programmers and designers wrestle with complex design and implementation challenges in the process of turning loose concepts and ideas into working code. In the current developer ecosystem, engineers are commonly incentivized extrinsically by monetary rewards, approval or...
Protein-protein interactions underlie all biological processes and are a field of study that has wide implications throughout many other fields including medicine, genetics, biology, and ecology. Proteins are the building blocks and primary actors of life. They work together to accomplish virtually every task within a cell, including metabolism, signal...
Realistic (ideally photorealistic) real-time rendering has remained an elusive goal in computer graphics. While photorealistic rendering has certainly been achieved at the expense of tremendous computational resources and corresponding rendering times, real-time rendering typically must accept a great number of compromises to achieve adequate performance, such as aliasing artifacts, the...
This M.S. thesis presents an interactive software tool that I have developed in the course of the past two years. This interactive tool is called AISO. AISO is an interactive image segmentation and annotation tool designed to allow users to segment an image – such as those produced with...
As non-renewable resources dwindle and costs increase, it becomes ever more important for people to understand and control their electricity usage. Eco-feedback devices are being developed to increase user awareness and reduce consumption. In order for feedback devices to be successfully adopted into the home, however, they must be appealing...
Software developers frequently need to perform code maintenance tasks, but doing so requires time-consuming navigation through code. A variety of tools are aimed at easing this navigation by using models to predict places in the code that a developer might want to visit, and then providing shortcuts so that the...
Citizen Science is a paradigm in which volunteers from the general public participate in scientific studies, often by performing data collection. This paradigm is especially useful if the scope of the study is too broad to be performed by a limited number of trained scientists. Although citizen scientists can contribute...
Image classification is a difficult problem, often requiring large training sets to get satisfactory results. However this is a task that humans perform very well, and incorporating user feedback into these learning algorithms could help reduce the dependency on large amounts of labeled training data. This process has already been...
We consider the problem of supervised classification of bird species from audio recordings in a real-world acoustic monitoring scenario (i.e. audio data is collected in the field with an omnidirectional microphone, without human supervision). Obtaining better data about bird activity can assist conservation efforts, and improve our understanding of their...
Software maintenance tasks often require finding information within existing code, which is time-consuming and difficult even for professional programmers. For example, programmers may need to know what code implements certain functionality or what is the purpose of certain code. In response, researchers have developed tools to help programmers find information...
We investigate a search and coverage planning problem, where an area of interest has to be explored by a number of vehicles, given a fixed time budget. A good coverage plan has a low probability of a target remaining unobserved. We introduce a formal problem statement, suggest a greedy algorithm...
Auctions are used to solve resource allocation problems among many agents and many items in real-world settings. Unfortunately, in most cases, it is possible for selfish agents to manipulate the system for their own interest at the expense of the social welfare. Such manipulation can be prevented using the Vickrey-Clarke-Groves...
Routing from a single source node to multiple destination nodes using node disjoint paths (NDP) has many important applications in parallel systems. For example, if a source node wants to send distinct messages to distinct destination nodes, then the one-to-many NDP routing is useful.
Unlike parallel systems with shared-memory, each...
Maintaining variation in software is a difficult problem that poses serious challenges for the understanding and editing of software artifacts. Although the C preprocessor (CPP) is often the default tool used to introduce variability to software, because of its simplicity and flexibility, it is infamous for its obtrusive syntax and...
This thesis addresses a fundamental computer vision problem, that of action recognition. The goal of action recognition is to recognize a class of human actions in a given video. Action recognition has a wide range of applications, including automated surveillance, sports video analysis, internet-based searches etc. The main challenge is...
Easy-first, a search-based structured prediction approach, has been applied to many NLP tasks including dependency parsing and coreference resolution. This approach employs a learned greedy policy (action scoring function) to make easy decisions first, which constrains the remaining decisions and makes them easier. This thesis studies the problem of learning...
We consider the problem of wireless spectrum management in cognitive wireless networks that maximizes the revenue for a spectrum operator. Specifically, we study the problem of how a wireless spectrum operator can optimally allocate its limited spectrum to various classes of users/devices who pay differently for their spectrum per unit time....
Macrosomia is a medical term describing a newborn with an excessive birth weight (greater than 4000g). Fetal macrosomia may lead both to pregnancy complications and to an increased risk of health problems for mother and baby after birth. But the potential complications may be mitigated by a cesarean delivery. As such,...
Software engineers often need help with discovering and learning how to use APIs. For example, software engineers who are starting to learn Java and want to implement a certain feature in a program might want to reuse existing APIs in order to save time versus rewriting it themselves...
This thesis considers the problem in which a teacher is interested in teaching action policies to computer agents for sequential decision making. The vast majority of policy
learning algorithms offer teachers little flexibility in how policies are taught. In particular,
one of two learning modes is typically considered: 1)...
Tensegrity structures are composed of pure compressional elements that are connected via a network of pure tensional elements. The concept of tensegrity promises numerous advantages to the field of robotics. Tensegrity robots are, however, notoriously difficult to control due to their oscillatory nature and nonlinear interaction between the components. Multiagent...
In real networks, identifying dense regions is of great importance. For example, in a network that represents academic collaboration, authors within the densest component of the graph tend to be the most prolific. Dense subgraphs often identify communities in social networks. And dense subgraphs can be used to discover regulatory...
The purpose of this study is to explore kernel machine learning methods for species distribution modeling. Previous studies have shown the success of Generalized Boosted Regression Models; however, kernel methods have been unexplored for species distribution modeling. Using the eBird dataset, four machine learning methods were tested for accuracy and...
This thesis presents an interactive software tool for tracking a moving object in a video. In particular, we focus on the problem of tracking a player in American football videos. Object tracking is one of the fundamental problems in computer vision. It is one of the most important components in...
An age-wave is upon us where many older adults are reaching retirement. Technically experienced older adults have skills that could be directly applied to free/open source software (FOSS) communities, such as project management, programming, and/or knowledge of a rapidly growing end-user population. FOSS is a widely popular, low-cost way to...
This thesis addresses a basic problem in computer vision, that of semantic labeling of images. Our work is aimed at object detection in biological images for evolutionary biology research. In particular, our goal is to detect nematocysts in Scanning Electron Microscope (SEM) images. This biological domain presents challenges for existing...
End-user programming has become widespread. The increasing size of this population and the prevalence of barriers that they face have sparked the development of approaches that promote end-user programming by helping them overcome barriers and teaching them programming. Despite the fact that these approaches have done well in achieving those...
End-user programmers often struggle to create programs that run quickly and effectively, which can be a major deterrent in completing their tasks as desired. Current research has primarily focused on catching user mistakes, such as errors or misused formulas. However, end users deal with issues other than just correctness. In...
We are witnessing the rise of the data-driven science paradigm, in which massive amounts of data - much of it collected as a side-effect of ordinary human activity - can be analyzed to make sense of the data and to make useful predictions. To fully realize the promise of this...
Communication in MLS cross-domain environments faces many challenges. The three most important are efficient key management, privacy preservation, and covert channels. We propose an Efficient, Secure and Covert Channel Capacity Bounded Protocol with three algorithms that address these challenges: the Efficient Attribute-based Fine-Grained Authentication (EAFA) algorithm, Anonymous...
One of the tasks that continues to prove difficult in robotics is the ability to grasp objects of varying shapes. It is time-consuming to acquire large amounts of real-world data in order to train accurate classifiers that can predict the success or failure of a grasp. To solve this issue,...