Maintaining variation in software is a difficult problem that poses serious challenges for the understanding and editing of software artifacts. Although the C preprocessor (CPP) is often the default tool used to introduce variability to software, because of its simplicity and flexibility, it is infamous for its obtrusive syntax and...
The gradient of a velocity vector field is an asymmetric tensor field which can provide critical insight that is difficult to infer from traditional trajectory-based vector field visualization techniques. I describe the structures in the eigenvalue and eigenvector fields of the gradient tensor and how these structures can be used...
In many traditional computer graphics applications, rendered scenes typically utilize 3D meshes to represent objects within an environment. As the demand to further improve the realism of graphics applications increases, such as for movies and games, it is becoming more important to represent the inner volumes of object meshes. In...
Recent studies have shown that novel continuous dropout methods can be viewed as a Bayesian interpretation of model parameters, though most such studies have shown results using normal distributions. As the posterior distributions over neural network nodes and parameters are intractable, given that they are a result of artificial construction...
Distributed Version Control Systems (DVCS) have seen an increase in popularity relative to traditional Centralized Version Control Systems (CVCS). Yet we know little about whether VCS tools meet the needs of software developers when managing software change or whether developers are benefiting from the extra power of DVCS. Without such...
The aim of this thesis is to study the past 10 years of security vulnerabilities reported against the Linux kernel and all existing mitigation techniques that prevent the exploitation of those vulnerabilities. To systematically study the security vulnerabilities, they were categorized into classes and sub-classes based on their type. This thesis first...
This thesis addresses the problem of temporal action segmentation in videos, where the goal is to label every video frame with the appropriate action class present. We focus on the domain of NFL football videos, where action classes represent common football play types. For action segmentation, we use a temporal...
In this thesis, we present semantic equivalence rules for an extension of the choice calculus and sound operations for an implementation of variational lists. The choice calculus is a calculus for describing variation and the formula choice calculus is an extension with formulas. We prove semantic equivalence rules for the...
The history of a software project plays a vital role in the software development process. Version control systems enable users of a software repository to look at the evolution of the source code, and see the changes that led to newer versions. Currently, version control systems provide commands that can...
Given k terminal pairs (s₁,t₁), (s₂,t₂), ..., (sₖ,tₖ) in an edge-weighted graph G, the k Shortest Vertex-Disjoint Paths problem is to find a collection P₁, P₂, ..., Pₖ of vertex-disjoint paths with minimum total length, where Pᵢ is an sᵢ-to-tᵢ path. As a special case of the...
Object categorization is one of the fundamental topics in computer vision research. Most current work in object categorization aims to discriminate among generic object classes with gross differences. However, many applications require much finer distinctions. This thesis focuses on the design, evaluation and analysis of learning algorithms for fine-grained...
“Until relatively recently, mankind was not aware that there was a separable binocular depth sense. Through the ages, people like Euclid and Leonardo understood that we see different images of the world with each eye. But it was Wheatstone who in 1838 explained to the world, with his stereoscope and...
Automatic music transcription (AMT) is the task, given an acoustic representation of music, to recover a symbolic notation of the written notes expressed by the sound. Transcribing music with multiple notes sounding simultaneously is difficult for both humans and machines. Much existing work on AMT has focused on suitable acoustic...
As of February 2012, approximately 46% of American adults own a smartphone. The graphics quality of these devices gets better each year. However, they still have many more limitations in graphics processing and storage space than desktop computers. This means that applications on these devices should focus on optimizing their...
Humans are remarkably efficient in learning by interacting with other people and observing their behavior. Children learn by watching their parents’ actions and mimicking their behavior. When they are not sure about their parents’ demonstration, they communicate with them, ask questions, and learn from their feedback. On the other hand,...
Semi-supervised clustering aims to improve clustering performance by considering user supervision in the form of pairwise constraints. In this paper, we study the active learning problem of selecting pairwise must-link and cannot-link constraints for semi-supervised clustering. We consider active learning in an iterative manner where in each iteration queries are...
Partial programming is a field of study where users specify an outline or skeleton of a program, but leave various parts undefined. The undefined parts are then completed by an external mechanism to form a complete program. Adaptation-Based Programming (ABP) is a method of partial programming that utilizes techniques from...
Heatmap regression has become one of the mainstream approaches to localizing facial landmarks. As Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) become popular for solving computer vision tasks, extensive research has been done on these architectures. However, the loss function for heatmap regression is rarely studied. In...
The Internet is growing rapidly in terms of websites, users and uses. People use the Internet for reference, shopping, social networking, communications, business and much more. Though the Internet is useful, there are many risks associated with its use, like malicious websites, identity theft, hateful content and fraudulent practices. Online...
In this work we propose a curve approximation method that operates in the curvature domain. The curvature is represented using one of several different types of basis functions (linear, quadratic, spline, sinusoidal, orthogonal polynomial), and the curve's geometry is reconstructed from that curvature basis. Our hypothesis is that different curvature...
The relationship between the public and new technologies has historically been a tumultuous one, with public perceptions ranging from excited rapid adoption to standoffish pessimism. In 2011, IBM tried to use competition as a means of showcasing a new technology to the public. This thesis is a work of rhetorical...
Within the past several years the technology of high-throughput sequencing has transformed the study of biology by offering unprecedented access to life's fundamental building block, DNA. With this transformation's potential a host of brand-new challenges have emerged, many of which lend themselves to being solved through computational methods. From de...
This thesis considers the problem of training convolutional neural networks for online visual tracking. A major challenge for single object visual tracking is that most training sets with frame-level track annotations are quite small, due to the prohibitive cost of manual annotation. Current training approaches either supplement the annotations with...
We model the popular board game of Clue as an MDP and evaluate Monte-Carlo policy rollout in a simulated environment pitting different agents and policies against each other. We describe the choices we made in the representation, along with some of the problems we encountered along the way. We find...
Macrosomia is a medical term describing a newborn with an excessive birth weight (greater than 4000g). Fetal macrosomia may lead both to pregnancy complications and to an increased risk of health problems for mother and baby after birth. However, the potential complications may be mitigated by a cesarean delivery. As such,...
There is a growing interest in bringing online and streaming content to the television. Gaming platforms such as the PS3, Xbox 360 and Wii are at the center of this digital convergence, serving as platforms for accessing new media services. This presents a number of interface challenges, as controllers designed for gaming...
A fundamental problem in computer vision is to partition an image into meaningful segments. While image segmentation is required by many applications, the thesis focuses on segmentation of computed tomography (CT) images for analysis and quality control of composite materials. The key research contribution of this thesis is a novel...
This M.S. thesis presents an interactive software tool, called AISO, that I have developed over the past two years. AISO is an interactive image segmentation and annotation tool designed to allow users to segment an image – such as those produced with...
Anomaly detection has been used in a variety of practical applications, including cyber-security, fraud detection, and fault detection in safety-critical systems. Anomaly detectors produce a ranked list of statistical anomalies, which are typically examined by human analysts in order to extract the actual anomalies of interest. Unfortunately, most...
Question answering forums like Reddit have been quite effective in improving social interaction and disseminating useful information. Community members ask a variety of questions related to a subject which are answered by other community members. The answers are given ratings by other members. In this thesis we study the problem...
As a general solution to the problem of managing structural and content variability in relational databases, in previous work we have introduced the Variational Database Management System (VDBMS). VDBMS consists of a representation of a variational database (VDB) and a corresponding typed query language (v-query). However, since this is a...
The topic of species distribution modeling has been one of increasing interest in recent years. As climate change is becoming of even more interest to researchers, more tools are needed to better analyze and predict various climate change scenarios. One particular area of interest is that of species distribution modeling....
Tensor mathematics provides a powerful language to visualize and analyze physical phenomena. In the last three decades, tensors have been used in various application areas. The visualization and analysis of tensor fields have seen much advancement, both in 2D and 3D. However, the physical interpretations of the topological analysis are...
Urban green space is associated with multiple physical and mental health outcomes. Several benefits of green space, such as stress reduction and attention restoration, are dependent on visual perception of green space exposures. However, traditional green space exposure measures do not capture street-level exposures. In this project, we apply deep...
There is growing interest in designing polynomial-time approximation schemes (PTAS) for optimization problems in planar graphs. Many NP-hard problems have been shown to admit a PTAS in planar graphs in the last decade, including Steiner tree, Steiner forest, two-edge-connected subgraphs and so on. We follow this research line and study several...
While there are many ways to evaluate a user interface design, the user's mental workload and situation awareness (SA) are particularly important considerations in the supervisory control of safety-critical systems. Typically, operators of these systems must monitor high-volume, time-sensitive status information. Interface design for this domain can be challenging and...
Programmers often have to choose components online for reuse based on software quality. To help with this choice, most component repositories (SourceForge, CodeProject, etc.) provide information such as user ratings and reviews of components. However, the reusability of components is not immediately obvious from this material. To make things worse,...
Sports analytics is rapidly evolving today through the use of computer vision systems that automatically extract huge amounts of information inherently present in multimedia data without much human assistance. This information can facilitate a better understanding of patterns and strategies in various sports. However, for non-professional teams, due to expense...
Generating solutions to Sokoban levels is an NP-hard problem that is difficult for even modern day computers to solve due to its complexity. This project explores the creation of a Sokoban solver by eliminating as many potential moves as possible to greatly limit the overall search space. This reduction is...
How can an agent generalize its knowledge to new circumstances? To learn effectively, an agent acting in a sequential decision problem must make intelligent action selection choices based on its available knowledge. This dissertation focuses on Bayesian methods of representing learned knowledge and develops novel algorithms that exploit the represented...
Bayesian Optimization (BO) methods are often used to optimize an unknown function f(·) that is costly to evaluate. They typically work in an iterative manner. In each iteration, given a set of observation points, BO algorithms select k ≥ 1 points to be evaluated. The results of those points are...
The Intel Xeon Phi is a relative newcomer to the scientific computing scene. In recent years, GPUs have been used extensively for mathematical simulations. The Xeon Phi is Intel’s response to the use of these cards. Like the GPU, it is highly parallelizable but can be programmed like a...
In data centers, running multiple isolated workloads while getting the most performance out of available hardware is key. For many years Virtual Machines (VMs) have been an enabler, but native containers, which offer isolation similar to virtual machines while reducing overhead costs associated with emulating hardware resources, have become an increasingly...
Large databases and data warehouses are becoming prevalent for the storage and management of energy data. Accelerating the rates at which data can be retrieved is beneficial not only to allow for more efficient search of the data, but also to be integrated with other energy system tools. In this...
Branched covering spaces are a mathematical concept which originates from complex analysis and topology and has found applications in tensor field topology and geometry re-meshing. Given a manifold surface and an N-way rotational symmetry field, a branched covering space is a manifold surface that has an N-to-1 map to the...
Building software systems that adapt to a changing environment is challenging. Developers cannot anticipate all the changes in advance, and even if they could, the effort required to handle such situations is too onerous for practical purposes. Self-Adaptive Software (SAS) adapts itself to a changing environment. The area of...
In this research, we address the problem of learning a single causal network structure from multiple datasets generated from different experiments. The experiments can be observational or interventional. We assume that each dataset is generated by an unknown causal network altered under different experimental conditions (interventions, manipulations or perturbations). As...
Many methods have been explored in the multi-label learning literature, ranging from simple problem transformations to more complex methods that capture correlations among labels. However, most existing work does not address the challenge of incomplete label data. The goal of this project is to extend the work of...
Reasoning about the 3D shape of objects is important for successful computer vision applications in robotics, 3D rendering and modeling. In this thesis, we address two problems: First, given an image, we generate the 3D shape of the foreground object that appears in the image. Second, we predict the class label of the input...
We investigate a search and coverage planning problem, where an area of interest has to be explored by a number of vehicles, given a fixed time budget. A good coverage plan has a low probability of a target remaining unobserved. We introduce a formal problem statement, suggest a greedy algorithm...