Alignment of genomic sequences from different species is becoming an increasingly powerful method in biology, and is being used for many purposes. The result of a sequence alignment is a list of pairs of matched locations between the pattern string and the text string. However, without any proper visualization tools to...
A large number of sequential decision-making problems in uncertain environments
can be modeled as Markov Decision Processes (MDPs). In such settings, an agent
observes the state of the environment at each time step and then executes an
action, causing a stochastic transition to a new state of the environment...
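The observe-act-transition loop described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the two-state model, the action names, the probabilities, and the helper names (TRANSITIONS, step, run_episode) are all hypothetical and are not taken from the thesis.

```python
import random

# Hypothetical transition model (not from the thesis):
# TRANSITIONS[state][action] is a list of (next_state, probability) pairs.
TRANSITIONS = {
    "s0": {"stay": [("s0", 0.9), ("s1", 0.1)], "move": [("s0", 0.2), ("s1", 0.8)]},
    "s1": {"stay": [("s1", 0.9), ("s0", 0.1)], "move": [("s1", 0.2), ("s0", 0.8)]},
}


def step(state, action, rng):
    """Sample the next state from the transition distribution for (state, action)."""
    next_states, probs = zip(*TRANSITIONS[state][action])
    return rng.choices(next_states, weights=probs, k=1)[0]


def run_episode(policy, start, horizon, seed=0):
    """Observe the state, pick an action with the policy, then transition stochastically."""
    rng = random.Random(seed)
    state = start
    trajectory = [state]
    for _ in range(horizon):
        action = policy(state)            # the agent acts on the observed state
        state = step(state, action, rng)  # the environment moves to a new state
        trajectory.append(state)
    return trajectory
```

For example, run_episode(lambda s: "move", "s0", 5) returns a list of six visited states; which states appear depends on the random seed.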
Protein secondary structure prediction plays a pivotal role in predicting protein folding in three dimensions. Its task is to assign each residue one of the three secondary structure classes: helix, strand, or random coil. This is an instance of the problem of sequential supervised learning in machine learning. This thesis describes...
This thesis examines the mixing times for one-dimensional interacting particle systems. We use the coupling method to study the mixing rates for particle systems on the circle which move according to specific permutations, e.g., transpositions and 3-cycles.
Building intelligent computer assistants has been a long-cherished goal of AI. Many intelligent assistant systems were built and fine-tuned to specific application domains. In this work, we develop a general model of assistance that combines three powerful ideas: decision theory, hierarchical task models and probabilistic relational languages. We use the...
Coordinating multiple robots to achieve a complex task requires solving two distinct control problems: the high-level control problem of ensuring that each robot aims to perform a useful task (e.g., coordination) and the low-level control problem of ensuring that each robot actually performs the correct actions to achieve its task...
This thesis addresses the problem of learning dynamic Bayesian network (DBN) models to support reinforcement learning. It focuses on learning regression tree models of the conditional probability distributions of the DBNs. Existing algorithms presume that the stochasticity in the domain can be modeled as a deterministic function with additive noise....
For a certain class of Z²-actions, we provide a proof of a conjecture that the ratio of the Perron eigenvalues of the transfer matrices of the free boundary restrictions converges to the entropy of that action. Also, a novel method for computing the entropy of Z²-actions is conjectured.
Many applications in surveillance, monitoring, scientific discovery, and data cleaning require the identification of anomalies. Although many methods have been developed to identify statistically significant anomalies, a more difficult task is to identify anomalies that are both interesting and statistically significant. Category detection is an emerging area of machine learning...
In this dissertation, we present a user-in-the-loop method for the design of an interactive motion data structure that benefits from the advantages of both motion graphs and blend-based techniques. Our novel approach automatically analyzes a traditional motion graph built from labeled motion clips. The result is a more condensed, coarser...
Automated recognition of object categories in images is a critical step for many real-world computer vision applications. Interest region detectors and region descriptors have been widely employed to tackle the variability of objects in pose, scale, lighting, texture, color, and so on. Different types of object recognition problems usually require...
Knowledge workers are struggling in the information flood. There is a growing interest in intelligent desktop environments that help knowledge workers organize their daily life. Intelligent desktop environments allow the desktop user to define a set of “activities” that characterize the user’s desktop work. These environments then attempt to identify...
Until a few years ago, wireless-capable laptops were considered novelties by many. It is now hard to find a laptop or a hand-held computing device that is not wireless-ready. As wireless devices are becoming commodities, they have also become an indispensable part of modern society. Not surprisingly, research in...
The problem of document classification has been widely studied in machine learning and data mining. In document classification, most of the popular algorithms are based on the bag-of-words representation. Due to the high dimensionality of the bag-of-words representation, significant research has been conducted to reduce the dimensionality via different approaches....
Sequential supervised learning problems arise in many real applications. This dissertation focuses on two important research directions in sequential supervised learning: efficient training and feature induction.
In the direction of efficient training, we study the training of conditional random fields (CRFs), which provide a flexible and powerful model for sequential...
This dissertation explores the idea of applying machine learning technologies to help computer users find information and better organize electronic resources, by presenting the research work conducted in the following three applications: FolderPredictor, Stacking Recommendation Engines, and Integrating Learning and Reasoning.
FolderPredictor is an intelligent desktop software tool that helps...
The use of autonomous robots in complex exploration tasks is rapidly increasing. Indeed, robots can provide speed and cost effectiveness in many tasks, as well as allow operation in environments that are hostile to humans. In this dissertation we: 1) provide two adaptive navigation algorithms; 2) develop a coordination mechanism;...
Reinforcement learning in real-world domains suffers from three curses of dimensionality: explosions in state and action spaces, and high
stochasticity or "outcome space" explosion. Multiagent domains are particularly susceptible to these problems. This thesis describes ways to mitigate these curses in several different multiagent domains, including real-time delivery of products...
We investigate a number of techniques for increasing throughput and quality of media applications over wireless networks. A typical media communication application such as video streaming imposes strict requirements on the delay and throughput of its packets, which unfortunately, cannot be guaranteed by the underlying wireless network due inherently to...
This dissertation addresses two fundamental problems in computer vision—namely,
multitarget tracking and event recognition in videos. These problems are challenging
because uncertainty may arise from a host of sources, including motion blur,
occlusions, and dynamic cluttered backgrounds. We show that these challenges can be
successfully addressed by using a multiscale,...
This dissertation addresses a number of inter-related and fundamental problems in computer vision. Specifically, we address object discovery, recognition, segmentation, and 3D pose estimation in images, as well as 3D scene reconstruction and scene interpretation. The key ideas behind our approaches include using shape as a basic object feature, and...
Given a video, we would like to recognize group activities, localize video parts where these activities occur, and detect actors involved in them. To this end, we propose a novel, mid-level feature, called control point, for representing group activities. The control points are aimed at summarizing visual cues, lifting from...
Acting intelligently to efficiently solve sequential decision problems requires the ability to extract hierarchical structure from the underlying domain dynamics, exploit it for optimal or near-optimal decision-making, and transfer it to related problems instead of solving every problem in isolation. This dissertation makes three contributions toward this goal.
The first...
This thesis studies cooperative techniques that rely on femtocell user diversity to improve the downlink communication quality of macrocell users. We analyze and evaluate the achievable performance of these techniques in the downlink of Rayleigh fading channels. We provide an approximation of both the bit-error rate (BER) and the...
A fundamental problem in computer vision is to partition an image into meaningful segments. While image segmentation is required by many applications, the thesis focuses on segmentation of computed tomography (CT) images for analysis and quality control of composite materials. The key research contribution of this thesis is a novel...
Networks of distributed, remote sensors are providing ecological scientists with a view of our environment that is unprecedented in detail. However, these networks are subject to harsh conditions, which lead to malfunctions in individual sensors and failures in network communications. This behavior manifests as corrupt or missing measurements in the...
The study of physical activity is important in improving people’s health as it can help people understand the relationship between physical activity and health. Accelerometers, due to their small size, low cost, convenience, and ability to provide objective information about the frequency, intensity, and duration of physical activity, have...
Bayesian Optimization (BO) methods are often used to optimize an unknown function f(•) that is costly to evaluate. They typically work in an iterative manner. In each iteration, given a set of observation points, BO algorithms select k ≥ 1 points to be evaluated. The results of those points are...
Partial programming is a field of study where users specify an outline or skeleton of a program, but leave various parts undefined. The undefined parts are then completed by an external mechanism to form a complete program. Adaptation-Based Programming (ABP) is a method of partial programming that utilizes techniques from...
Traditionally, networking protocol designs have placed much emphasis on point-to-point reliability and efficiency. With the recent rise of mobile and multimedia applications, other considerations such as power consumption and/or Quality of Service (QoS) are becoming increasingly important factors in designing network protocols. As such, we present a new flexible framework...
We consider the problem of wireless spectrum management in cognitive wireless networks that maximizes the revenue for a spectrum operator. Specifically, we study the problem of how a wireless spectrum operator can optimally allocate its limited spectrum to various classes of users/devices who pay differently for their spectrum per unit time....
Image classification is a difficult problem, often requiring large training sets to get satisfactory results. However, this is a task that humans perform very well, and incorporating user feedback into these learning algorithms could help reduce the dependency on large amounts of labeled training data. This process has already been...
Citizen Science is a paradigm in which volunteers from the general public participate in scientific studies, often by performing data collection. This paradigm is especially useful if the scope of the study is too broad to be performed by a limited number of trained scientists. Although citizen scientists can contribute...
Tensegrity structures are composed of pure compressional elements that are connected via a network of pure tensional elements. The concept of tensegrity promises numerous advantages to the field of robotics. Tensegrity robots are, however, notoriously difficult to control due to their oscillatory nature and nonlinear interaction between the components. Multiagent...
This thesis addresses a basic problem in computer vision, that of semantic labeling of images. Our work is aimed at object detection in biological images for evolutionary biology research. In particular, our goal is to detect nematocysts in Scanning Electron Microscope (SEM) images. This biological domain presents challenges for existing...
This dissertation addresses the problem of recognizing human activities in videos. Our focus is on activities with stochastic structure, where the activities are characterized by variable space-time arrangements of actions, and conducted by a variable number of actors. These activities occur frequently in sports and surveillance videos. They may appear...
In real networks, identifying dense regions is of great importance. For example, in a network that represents academic collaboration, authors within the densest component of the graph tend to be the most prolific. Dense subgraphs often identify communities in social networks. And dense subgraphs can be used to discover regulatory...
This thesis presents an interactive software tool for tracking a moving object in a video. In particular, we focus on the problem of tracking a player in American football videos. Object tracking is one of the fundamental problems in computer vision. It is one of the most important components in...
Paradoxes in voting have been of interest to voting theorists since the 1800s, when Condorcet demonstrated the key example of a voting paradox: voters with individually transitive rankings produce an election outcome which is not transitive. With Arrow's Impossibility Theorem, the hope of finding a fair voting method which accurately...
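The paradox mentioned in this abstract can be checked with a small worked example. The sketch below assumes the standard three-voter, three-candidate profile often used to illustrate it; the ballots and the helper names (BALLOTS, majority_prefers, pairwise_winners, condorcet_winner) are illustrative assumptions and do not come from the thesis.

```python
from itertools import combinations

# Hypothetical profile: each ballot ranks candidates from most to least preferred.
# Every individual ranking is transitive.
BALLOTS = [("A", "B", "C"), ("B", "C", "A"), ("C", "A", "B")]


def majority_prefers(ballots, x, y):
    """Return True if strictly more voters rank x above y than y above x."""
    wins = sum(1 for b in ballots if b.index(x) < b.index(y))
    return wins > len(ballots) - wins


def pairwise_winners(ballots):
    """Map each unordered candidate pair to its majority winner (None on a tie)."""
    candidates = sorted(ballots[0])
    result = {}
    for x, y in combinations(candidates, 2):
        if majority_prefers(ballots, x, y):
            result[(x, y)] = x
        elif majority_prefers(ballots, y, x):
            result[(x, y)] = y
        else:
            result[(x, y)] = None
    return result


def condorcet_winner(ballots):
    """Return the candidate who beats every other candidate head to head, if any."""
    candidates = sorted(ballots[0])
    for c in candidates:
        if all(majority_prefers(ballots, c, d) for d in candidates if d != c):
            return c
    return None
```

With this profile, the majority prefers A to B, B to C, and C to A, so the group preference forms a cycle and condorcet_winner(BALLOTS) returns None.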
Constructing a panorama from a set of videos is a long-standing problem in computer vision. A panorama represents an enhanced still-image representation of an entire scene captured in a set of videos, where each video shows only a part of the scene. Importantly, a panorama shows only the scene background,...
In recent years there have been many improvements in the reliability of critical infrastructure systems. Despite these improvements and despite targeted efforts to improve the operation and control of the electric grid, the power systems industry has seen relatively small advances in this regard. For instance, today's power system is...
Machine learning systems are generally trained offline using ground truth data that has been labeled by experts. However, these batch training methods are not a good fit for many applications, especially in the cases where complete ground truth data is not available for offline training. In addition, batch methods do...
Counting problems are rich in opportunities for students to make meaningful mathematical connections and develop non-algorithmic thinking; their accessible nature and applications to computer science make counting problems a valuable part of mathematics curricula. However, students struggle in various ways with counting, and while previous studies have indicated that listing...
In this work, we study network coding techniques, their relation to random matrices, and their applications to communication systems. The dissertation consists of three main contributions. First, we propose efficient algorithms for data synchronization via a broadcast channel using random network coding. Second, we study the resiliency of network coding...
Software testing is a very important task during software development and it can be used to improve the quality and reliability of the software system. One potential way to reduce the cost and increase the efficiency of software testing is to generate test data automatically. Search-based approaches successfully generate unit...
Recognizing human actions in videos is a long-standing problem in computer vision with a wide range of applications including video surveillance, content retrieval, and sports analysis. This thesis focuses on addressing efficiency and robustness of video classification in unconstrained real-world settings. The thesis work can be broadly divided into four...
Machine learning models for natural language processing have traditionally relied on large numbers of discrete features, built up from atomic categories such as word forms and part-of-speech labels, which are considered completely distinct from each other. Recently however, the advent of dense feature representations coupled with deep learning techniques has...
Markov Decision Processes (MDPs) are the de facto formalism for studying sequential decision-making problems with uncertainty, ranging from classical problems such as inventory control and path planning, to more complex problems such as reservoir control under rainfall uncertainty and emergency response optimization for fire and medical emergencies. Most prior research...
Most tasks in natural language processing (NLP) try to map structured input (e.g., sentence or word sequence) to some form of structured output (tag sequence, parse tree, semantic graph, translated/paraphrased/compressed sentence), a problem known as “structured prediction”. While various learning algorithms such as the perceptron, maximum entropy, and expectation-maximization have...
The thesis focuses on activity recognition from sensor data, which has spurred a great deal of interest due to its impact on health care and security. Previous work on activity recognition from multivariate time series data has mainly applied supervised learning techniques which require a high degree of annotation effort...