Learning to recognize objects is a fundamental step in human perception and understanding of the world. Accordingly, research on object discovery across diverse modalities plays a pivotal role in computer vision. This field not only contributes significantly to enhancing our understanding of visual information but...
In open set recognition, a classifier must label instances of known classes while detecting instances of unknown classes not encountered during training. To detect unknown classes while still generalizing to new instances of existing classes, this thesis introduces a dataset augmentation technique called counterfactual image generation. This approach, based on...
Labeling videos is costly, time-consuming, and tedious. These costs escalate in applications such as medical diagnosis or autonomous driving, where annotation requires domain expertise. Few-shot action recognition aims to solve this problem through annotation-efficient learning mechanisms.
This thesis presents MetaUVFS as the first Unsupervised Meta-learning algorithm for...
The ability of plant biologists to characterize the genetic basis of physiological traits is limited by their ability to obtain quantitative data that capture precise details of trait variation and, above all, to collect these data at high throughput and low cost. Deep learning-based methods have demonstrated unprecedented potential to automate...
Deep neural networks currently form the backbone of many applications where safety is a critical concern, such as autonomous driving and medical diagnostics. Unfortunately, these systems currently fail to detect out-of-distribution (OOD) inputs and can be prone to making dangerous errors when exposed to them. In addition, these same systems...
In this thesis, we introduce a novel Explanation Neural Network (XNN) to explain the predictions made by a deep network. The XNN works by non-linearly embedding a high-dimensional activation vector from a layer of the deep network into a low-dimensional explanation space while retaining faithfulness, i.e., the original deep learning predictions can...
As one of the most popular data types, the point cloud is widely used in various applications, including computer vision, computer graphics and robotics. The capability to directly measure 3D point clouds is invaluable in those applications as depth information could remove a lot of the segmentation ambiguities in...
This thesis consists of two major components. The first part is concerned with video object instance segmentation (VOS), which is the task of assigning per-pixel labels to each frame of a video sequence to indicate foreground object instance membership, given the ground-truth mask for the first frame. VOS has myriad applications, from video...
The advancement of artificial intelligence (AI) has led to transformative developments across multiple sectors, fostering innovation and redefining our interactions with technology. As AI matures and becomes integrated into society, it offers numerous opportunities to address global challenges and revolutionize a wide array of human endeavors. These advances are driven...
Deep learning has recently revolutionized robot perception in many canonical robotic applications, such as autonomous driving. However, a similar transformation has yet to occur in harsher environments, including underwater and underground settings. This is due in part to the difficulty of deploying robots in these environments, which lack large real...