This report presents an efficient method for semi-supervised video object segmentation – the problem of identifying foreground pixels occupied by a target object. The target is specified by the ground-truth mask in the first video frame. While the state of the art achieves a segmentation accuracy greater than 80%, it...
Heatmap regression has become one of the mainstream approaches to localizing facial landmarks. As Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) have become popular for solving computer vision tasks, extensive research has been done on these architectures. However, the loss function for heatmap regression is rarely studied. In...
Sports analytics is rapidly evolving through the use of computer vision systems that automatically extract huge amounts of information inherently present in multimedia data without much human assistance. This information can facilitate a better understanding of patterns and strategies in various sports. However, for non-professional teams, due to expense...
This thesis addresses a fundamental computer vision problem, that of action recognition. The goal of action recognition is to recognize a class of human actions in a given video. Action recognition has a wide range of applications, including automated surveillance, sports video analysis, and internet-based search. The main challenge is...
Object recognition is a fundamental problem in computer vision. Recognition is required by many applications. This thesis presents a distance-based approach to recognizing objects. We are interested in objects that belong to very similar classes, where each class has large variations. This problem is called fine-grained object recognition. Given...
This project covers the construction of a Stereo Camera System, its integration with a Velodyne VLP-16 LIDAR, and the creation of a dataset intended to aid in the development of vision algorithms for forestry applications. This project is the first step in a future multi-stage project to implement computer vision systems for...
This dissertation addresses the problem of video labeling at both the frame and pixel levels using deep learning. For pixel-level video labeling, we have studied two problems: i) Spatiotemporal video segmentation and ii) Boundary detection and boundary flow estimation. For the problem of spatiotemporal video segmentation, we have developed recurrent...
A fundamental problem in computer vision is to partition an image into meaningful segments. While image segmentation is required by many applications, the thesis focuses on segmentation of computed tomography (CT) images for analysis and quality control of composite materials. The key research contribution of this thesis is a novel...
A general discrete-time, adaptive, multidimensional framework is introduced for estimating the motion of one or several object features from their successive non-linear projections on an image plane. The motion model consists of a set of linear difference equations with parameters estimated recursively from a non-linear observation equation. The model dimensionality...
This dissertation addresses two fundamental problems in computer vision, namely multitarget tracking and event recognition in videos. These problems are challenging because uncertainty may arise from a host of sources, including motion blur, occlusions, and dynamic cluttered backgrounds. We show that these challenges can be successfully addressed by using a multiscale,...