Automatic painterly rendering systems have been proposed but they opted for selecting a single style to generate paintings from images, which lacks the ability of creatively using multiple styles to focus important objects and deemphasize unimportant part of the scenes. We provide a multi-style painting framework to
address this issue...
The Line Integral Convolution (LIC) is a mainstay of flow visualization. It is, however, computationally intensive, which limits its interactivity. Also, when used to view three-dimensional (3D) vector fields, the resulting images are dense and cluttered, making it difficult to perceive the flow on the interior parts of the field....
In this dissertation, we address action segmentation in videos under limited supervision. The goal of action segmentation is to predict an action class for each frame of a video. The limited supervision means ground truth labels of video frames are not available in training. We focus on three types of...
Recognizing human actions in videos is a long-standing problem in computer vision with a wide range of applications including video surveillance, content retrieval, and sports analysis. This thesis focuses on addressing efficiency and robustness of video classification in unconstrained real-world settings. The thesis work can be broadly divided into four...
The advent of deep learning models leads to a substantial improvement in a wide range of NLP tasks, achieving state-of-art performances without any hand-crafted features. However, training deep models requires a massive amount of labeled data. Labeling new data as a new task or domain emerges consumes time and efforts...
Information about named entities (real-world objects) is usually harvested from different sources and organized as a multiple relational directed graph in Knowledge Bases (KBs). KBs play essential roles in many NLP modules including question answering, fact-checking, search engines, etc. KBs are big but still incomplete: relational information among entities is...
Narratives are central to communication and the human experience. For a computer system to understand a narrative, it must be able to identify the key facts or plot elements that describe what happened or how the world has changed. These element are called events;establishing a document’s events and the relationships...
Heatmap regression has became one of the mainstream approaches to localize facial landmarks. As Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) are becoming popular in solving computer vision tasks, extensive research has been done on these architectures. However, the loss function for heatmap regression is rarely studied. In...
We are witnessing the rise of the data-driven science paradigm, in which massive amounts of data - much of it collected as a side-effect of ordinary human activity - can be analyzed to make sense of the data and to make useful predictions. To fully realize the promise of this...
The abilities of plant biologists to characterize the genetic basis of physiological traits are limited by their abilities to obtain quantitative data representing precise details of trait variation and mainly to collect this data on a high-throughput scale at low cost. Deep learning-based methods have demonstrated unprecedented potential to automate...
This dissertation addresses object recognition in challenging settings, where distinct object classes are visually very similar (e.g., species of birds and insects) and/or access to training examples of object classes is limited (e.g., due to the associated high costs of data annotation). In this dissertation, we present a variety of...
This thesis focuses on the problem of object tracking. Given a video, the general objective of tracking is to track the location over time of one or more targets in the image sequence. This is a very challenging task as algorithms need to deal with problems such as appearance variations,...
We explore the application of deep learning to the disparate fields of natural language processing and computational biology. Both the sentences uttered by humans as well as the RNA and protein sequences found within the cells of their bodies can be considered formal languages in computer science, as sets of...
Enabled by a rich ecosystem of Machine Learning (ML) libraries, programming using learned models, i.e., Software-2.0, has gained substantial adoption. However, we do not know what challenges developers encounter when they use ML libraries. With this knowledge gap, researchers miss opportunities to contribute to new research directions, tool builders do...
Within the past several years the technology of high-throughput sequencing has transformed the study of biology by offering unprecedented access to life's fundamental building block, DNA. With this transformation's potential a host of brand-new challenges have emerged, many of which lend themselves to being solved through computational methods. From de...
This thesis studies the problem of structured prediction (SP), where the agent needs to predict a structured output for a given structured input (e.g., Part-of-Speech tagging sequence for an input sentence). Many important applications including machine translation in natural language processing (NLP) and image interpretation in computer vision can be...
This dissertation addresses few-shot object segmentation in images. The goal of segmentation is to label every image pixel with a class of the object occupying that pixel, where the class may represent a semantic object category or instance. In few-shot segmentation, training and test datasets have different classes. Every new...
Most tasks in natural language processing (NLP) try to map structured input (e.g., sentence or word sequence) to some form of structured output (tag sequence, parse tree, semantic graph, translated/paraphrased/compressed sentence), a problem known as “structured prediction”. While various learning algorithms such as the perceptron, maximum entropy, and expectation-maximization have...
Secure Computation is a powerful tool that enables a set of parties to jointly compute any function over their private inputs, without a trusted third party. Private Set Intersection is a specific case of two-party Secure Computation, where Alice (with private set X) and Bob (with private set Y) specifically...
A distributed system is a network of multiple autonomous computational nodes designed primarily for performance scalability and robustness. The performance of a distributed system depends critically on how tasks and resources are distributed among the nodes. Thus, a main thrust in distributed system research is to design schemes for distributing...
Natural Language Comprehension is a challenging domain of Natural Language Processing. To improve a model’s language comprehension/understanding, one approach would be to enrich the structure of the model to enhance its capability in learning the latent rules of the language.
In this dissertation, we will first introduce several deep models...