Alignment of genomic sequences from different species is becoming an increasingly powerful method in biology, and is being used for many purposes. The result of sequence alignments is a list of pairs of matched locations between the pattern string and the text string. However, without any proper visualization tools to...
A large number of sequential decision-making problems in uncertain environments can be modeled as Markov Decision Processes (MDPs). In such settings, an agent observes the state of the environment at each time step and then executes an action, causing a stochastic transition to a new state of the environment...
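The observe-act-transition loop described in this abstract can be illustrated with a minimal sketch. The two-state MDP, its action names, its transition probabilities, and the fixed policy below are all hypothetical values chosen for illustration; none of them come from the thesis.

```python
import random

# Hypothetical two-state MDP (illustrative values only):
# transitions[state][action] -> list of (next_state, probability).
transitions = {
    "s0": {"stay": [("s0", 0.9), ("s1", 0.1)], "move": [("s0", 0.2), ("s1", 0.8)]},
    "s1": {"stay": [("s1", 0.9), ("s0", 0.1)], "move": [("s1", 0.2), ("s0", 0.8)]},
}

def step(state, action, rng):
    """Sample the next state from the stochastic transition model."""
    next_states, probs = zip(*transitions[state][action])
    return rng.choices(next_states, weights=probs, k=1)[0]

def rollout(start, policy, horizon, rng):
    """Observe the state, execute the policy's action, transition; repeat."""
    state = start
    trajectory = [state]
    for _ in range(horizon):
        action = policy(state)            # agent chooses an action from the observed state
        state = step(state, action, rng)  # environment transitions stochastically
        trajectory.append(state)
    return trajectory

rng = random.Random(0)  # seeded so the sketch is reproducible
trajectory = rollout("s0", lambda s: "move", horizon=5, rng=rng)
print(trajectory)
```

The rollout returns the start state followed by one sampled state per time step, so a horizon of 5 yields six states.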
Partial programming is a field of study where users specify an outline or skeleton of a program, but leave various parts undefined. The undefined parts are then completed by an external mechanism to form a complete program. Adaptation-Based Programming (ABP) is a method of partial programming that utilizes techniques from...
We present a method for decentralized, multi-robot exploration in adverse environments where communication is minimal. A key conceptual feature of our method is enabling implicit coordination between robots by training a Convolutional Neural Network (CNN) as a heuristic for planning using Monte Carlo Tree Search (MCTS). Our method consists of...
Anomaly detection aims at detecting the points that appear different from the majority of the data, such that they are suspected to be generated from a different distribution. Anomaly detectors have been applied in many different fields, such as detecting fraudulent behaviors in bank transactions, finding broken sensors in a...
Bayesian Optimization (BO) methods are often used to optimize an unknown function f(•) that is costly to evaluate. They typically work in an iterative manner. In each iteration, given a set of observation points, BO algorithms select k ≥ 1 points to be evaluated. The results of those points are...
We explore the application of deep learning to the disparate fields of natural language processing and computational biology. Both the sentences uttered by humans and the RNA and protein sequences found within the cells of their bodies can be considered formal languages in computer science, as sets of...
For a certain class of Z²-actions, we provide a proof of a conjecture that the ratio of the Perron eigenvalues of the transfer matrices of the free boundary restrictions converges to the entropy of that action. Also, a novel method for computing the entropy of Z²-actions is conjectured.
The thesis focuses on activity recognition from sensor data, which has spurred a great deal of interest due to its impact on health care and security. Previous work on activity recognition from multivariate time series data has mainly applied supervised learning techniques which require a high degree of annotation effort...
In real networks, identifying dense regions is of great importance. For example, in a network that represents academic collaboration, authors within the densest component of the graph tend to be the most prolific. Dense subgraphs often identify communities in social networks. And dense subgraphs can be used to discover regulatory...
In this thesis, we introduce a novel Explanation Neural Network (XNN) to explain the predictions made by a deep network. The XNN works by embedding a high-dimensional activation vector of a deep network layer non-linearly into a low-dimensional explanation space while retaining faithfulness, i.e., the original deep learning predictions can...
In this dissertation, we address action segmentation in videos under limited supervision. The goal of action segmentation is to predict an action class for each frame of a video. The limited supervision means ground truth labels of video frames are not available in training. We focus on three types of...
This dissertation addresses the problem of recognizing human activities in videos. Our focus is on activities with stochastic structure, where the activities are characterized by variable space-time arrangements of actions, and conducted by a variable number of actors. These activities occur frequently in sports and surveillance videos. They may appear...
Citizen Science is a paradigm in which volunteers from the general public participate in scientific studies, often by performing data collection. This paradigm is especially useful if the scope of the study is too broad to be performed by a limited number of trained scientists. Although citizen scientists can contribute...
As one of the most popular data types, the point cloud is widely used in various applications, including computer vision, computer graphics and robotics. The capability to directly measure 3D point clouds is invaluable in those applications as depth information could remove a lot of the segmentation ambiguities in...
Sequential supervised learning problems arise in many real applications. This dissertation focuses on two important research directions in sequential supervised learning: efficient training and feature induction.
In the direction of efficient training, we study the training of conditional random fields (CRFs), which provide a flexible and powerful model for sequential...
This dissertation delves into understanding, characterizing, and addressing dataset shift in deep learning, a pervasive issue for deployed machine learning systems. Integral aspects of the problem are examined: We start with the use of counterfactual explanations in order to characterize the behavior of deep reinforcement learning agents in visual input...
The ability to extract uncertainties from predictions is crucial for the adoption of deep learning systems to safety-critical applications. Uncertainty estimates can be used as a failure signal, which is necessary for automating complex tasks where safety is a concern. Furthermore, current deep learning systems do not provide uncertainty estimates,...
This dissertation addresses object recognition in challenging settings, where distinct object classes are visually very similar (e.g., species of birds and insects) and/or access to training examples of object classes is limited (e.g., due to the associated high costs of data annotation). In this dissertation, we present a variety of...
Constructing a panorama from a set of videos is a long-standing problem in computer vision. A panorama represents an enhanced still-image representation of an entire scene captured in a set of videos, where each video shows only a part of the scene. Importantly, a panorama shows only the scene background,...
Correctness and efficiency are important properties of programs. However, to support maintenance and debugging, the programs should also be understandable. Program explanations also play a vital role in educational settings, enhancing the understanding of programs among students.
Proof trees provide a sound basis for generating dynamic explanations of programs. But...
In this work, we study the network coding technique, its relation to random matrices, and their applications to communication systems. The dissertation consists of three main contributions. First, we propose efficient algorithms for data synchronization via a broadcast channel using random network coding. Second, we study the resiliency of network coding...
Learning novel concepts from relational databases is an important problem with applications in several disciplines, such as data management, natural language processing, and bioinformatics. For a learning algorithm to be effective, the input data should be clean and in some desired representation. However, real-world data is usually heterogeneous – the...
Until a few years ago, wireless-capable laptops were considered novelties by many. It is now hard to find a laptop or a hand-held computing device that is not wireless-ready. As wireless devices are becoming commodities, they have also become an indispensable part of modern society. Not surprisingly, research in...
Machine learning systems are generally trained offline using ground truth data that has been labeled by experts. However, these batch training methods are not a good fit for many applications, especially in the cases where complete ground truth data is not available for offline training. In addition, batch methods do...
Tensegrity structures are composed of pure compressional elements that are connected via a network of pure tensional elements. The concept of tensegrity promises numerous advantages to the field of robotics. Tensegrity robots are, however, notoriously difficult to control due to their oscillatory nature and nonlinear interaction between the components. Multiagent...
Knowledge workers are struggling in the information flood. There is a growing interest in intelligent desktop environments that help knowledge workers organize their daily life. Intelligent desktop environments allow the desktop user to define a set of “activities” that characterize the user’s desktop work. These environments then attempt to identify...
Networks of distributed, remote sensors are providing ecological scientists with a view of our environment that is unprecedented in detail. However, these networks are subject to harsh conditions, which lead to malfunctions in individual sensors and failures in network communications. This behavior manifests as corrupt or missing measurements in the...
Protein secondary structure prediction plays a pivotal role in predicting protein folding in three-dimensions. Its task is to assign each residue one of the three secondary structure classes helix, strand, or random coil. This is an instance of the problem of sequential supervised learning in machine learning. This thesis describes...
Image classification is a difficult problem, often requiring large training sets to get satisfactory results. However, this is a task that humans perform very well, and incorporating user feedback into these learning algorithms could help reduce the dependency on large amounts of labeled training data. This process has already been...
Paradoxes in voting have been of interest to voting theorists since the 1800s, when Condorcet demonstrated the key example of a voting paradox: voters with individually transitive rankings produce an election outcome which is not transitive. With Arrow's Impossibility Theorem, the hope of finding a fair voting method which accurately...
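The Condorcet example mentioned in this abstract can be made concrete with a minimal sketch. The three ballots below are the standard textbook illustration of the paradox, not data from the thesis, and the helper names are hypothetical. Each voter's ranking is transitive, yet the pairwise majority preferences form a cycle.

```python
from itertools import combinations

# Three hypothetical voters, each with a transitive ranking (best first).
ballots = [
    ["A", "B", "C"],
    ["B", "C", "A"],
    ["C", "A", "B"],
]

def majority_prefers(x, y, ballots):
    """Return True if a strict majority of ballots rank x above y."""
    wins = sum(b.index(x) < b.index(y) for b in ballots)
    return wins > len(ballots) / 2

def pairwise_winners(ballots):
    """Map each pair of candidates to its majority winner (None on a tie)."""
    candidates = sorted(ballots[0])
    result = {}
    for x, y in combinations(candidates, 2):
        if majority_prefers(x, y, ballots):
            result[(x, y)] = x
        elif majority_prefers(y, x, ballots):
            result[(x, y)] = y
        else:
            result[(x, y)] = None
    return result

winners = pairwise_winners(ballots)
print(winners)  # A beats B, B beats C, yet C beats A: a majority cycle
```

With these ballots, each pairwise contest is decided two votes to one, so no candidate beats every other candidate and the group preference is not transitive.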
Many large-scale data analysis applications involve data that can vary over both time and space. Often the primary goal of analyzing spatiotemporal data is identifying trends, movements, and sudden changes with respect to time, location, or both. This can include a variety of applications in economics (housing prices, unemployment, job...
Movement intent decoders, which interpret volitional movement intent from human bioelectric signals, can be incorporated into modern neuroprostheses to offer people living with limb loss or paralysis the potential to regain their lost motor control. Machine learning methods have become the research standard for continuous decoders with high degrees of...
Autonomous robotic agents are on their way to becoming in-home personal assistants, construction assistants, and warehouse workers. The degree of autonomy of such systems is reflected by the manner in which we specify goals to them; the abstraction of low-level commands to high-level goals goes hand-in-hand with increased autonomy. In...
Data variations are prevalent in real-world applications. For example, software vendors have to handle numerous variations in the business requirements, conventions, and environmental settings of a software product. In database-backed software, the database of each version may have a different schema and content. As another example, data scientists often need...
In this dissertation, we present a user-in-the-loop method for the design of an interactive motion data structure that benefits from the advantages of both motion graphs and blend-based techniques. Our novel approach automatically analyzes a traditional motion graph built from labeled motion clips. The result is a more condensed, coarser...
Coordinating multiple robots to achieve a complex task requires solving two distinct control problems: the high-level control problem of ensuring that each robot aims to perform a useful task (e.g., coordination) and the low-level control problem of ensuring that each robot actually performs the correct actions to achieve its task...
Traditionally, networking protocol designs have placed much emphasis on point-to-point reliability and efficiency. With the recent rise of mobile and multimedia applications, other considerations such as power consumption and/or Quality of Service (QoS) are becoming increasingly important factors in designing network protocols. As such, we present a new flexible framework...
This dissertation addresses few-shot object segmentation in images. The goal of segmentation is to label every image pixel with a class of the object occupying that pixel, where the class may represent a semantic object category or instance. In few-shot segmentation, training and test datasets have different classes. Every new...
In an increasingly computation-driven world, algorithms and mathematical models significantly impact decision making across various fields. To foster trust and understanding, it is crucial to provide users with clear and concise explanations of the reasoning behind the results produced by computational tools, especially when recommendations appear counterintuitive. Legal frameworks in...
Automated recognition of object categories in images is a critical step for many real-world computer vision applications. Interest region detectors and region descriptors have been widely employed to tackle the variability of objects in pose, scale, lighting, texture, color, and so on. Different types of object recognition problems usually require...
Multi-relation aggregation queries process the join operator before computing the aggregation function. This join is arguably the most costly operation since traditional join algorithms spend the majority of their time trying to join the parts of the relations that do not generate any output tuples. This causes slow response times with...
We investigate a number of techniques for increasing throughput and quality of media applications over wireless networks. A typical media communication application such as video streaming imposes strict requirements on the delay and throughput of its packets, which unfortunately, cannot be guaranteed by the underlying wireless network due inherently to...
The advancement of artificial intelligence (AI) has led to transformative developments across multiple sectors, fostering innovation and redefining our interactions with technology. As AI matures and becomes integrated into society, it offers numerous opportunities to address global challenges and revolutionize a wide array of human endeavors. These advances are driven...
This dissertation addresses two fundamental problems in computer vision, namely multitarget tracking and event recognition in videos. These problems are challenging because uncertainty may arise from a host of sources, including motion blur, occlusions, and dynamic cluttered backgrounds. We show that these challenges can be successfully addressed by using a multiscale,...
Given a video, we would like to recognize group activities, localize video parts where these activities occur, and detect actors involved in them. To this end, we propose a novel, mid-level feature, called control point, for representing group activities. The control points are aimed at summarizing visual cues, lifting from...
This dissertation addresses a number of inter-related and fundamental problems in computer vision. Specifically, we address object discovery, recognition, segmentation, and 3D pose estimation in images, as well as 3D scene reconstruction and scene interpretation. The key ideas behind our approaches include using shape as a basic object feature, and...
Most tasks in natural language processing (NLP) try to map structured input (e.g., sentence or word sequence) to some form of structured output (tag sequence, parse tree, semantic graph, translated/paraphrased/compressed sentence), a problem known as “structured prediction”. While various learning algorithms such as the perceptron, maximum entropy, and expectation-maximization have...
In recent years there have been many improvements in the reliability of critical infrastructure systems. Despite these improvements and despite targeted efforts to improve the operation and control of the electric grid, the power systems industry has seen relatively small advances in this regard. For instance, today's power system is...
There are nearly two million limb amputees living in the United States of America. Loss of limbs results in profound changes in one's life. However, the underlying neural circuitry and much of the ability to sense and control movements of their missing limb is retained even after limb loss. This...