Learning to recognize objects is a fundamental and essential step in human perception and understanding of the world. Accordingly, research on object discovery across diverse modalities plays a pivotal role in computer vision. This field not only contributes significantly to enhancing our understanding of visual information but...
In this thesis, we introduce a novel Explanation Neural Network (XNN) to explain the predictions made by a deep network. The XNN works by embedding a high-dimensional activation vector of a deep network layer non-linearly into a low-dimensional explanation space while retaining faithfulness, i.e., the original deep learning predictions can...
In this thesis, a new learning algorithm is introduced that is targeted towards individual fairness. In order to be individually fair, mispredictions need to be avoided as each such prediction means the learning algorithm was unfair towards some individual. Therefore, achieving individual fairness implies having a perfect classifier, which is...
Deep learning and neural networks have been widely used in research, and deep learning has empowered many tasks such as point cloud segmentation and shape recognition. One of the main advantages of deep interaction point cloud segmentation is that it allows the feature extraction to be learned through neural network based...
This thesis consists of two major components. The first part is concerned with video object instance segmentation (VOS), which is the task of assigning per-pixel labels per frame of a video sequence to indicate foreground object instance membership, given the first-frame ground-truth mask. VOS has myriad applications, from video...
The performance of deep learning frameworks can be significantly improved by considering the particular underlying structure of each dataset. In this thesis, I summarize our three works on boosting the performance of deep learning models by leveraging the structure of the data. In the first work, we theoretically justify that, for...
Deep learning has recently revolutionized robot perception in many canonical robotic applications, such as autonomous driving. However, a similar transformation has yet to occur in harsher environments, including underwater and underground. This is due in part to the difficulty in deploying robots in these environments, which lack large real...
As one of the most popular data types, the point cloud is widely used in various applications, including computer vision, computer graphics and robotics. The capability to directly measure 3D point clouds is invaluable in those applications as depth information could remove a lot of the segmentation ambiguities in...
Semantic image segmentation is a relatively difficult task in computer vision. With the advent of deep learning, semantic image segmentation is increasingly of interest to researchers because of the excellent predictions from Convolutional Neural Networks (CNNs). However, CNNs have proven to struggle with obtaining the global context of an image due to...
Labeling videos is costly, time-consuming and tedious. These costs can escalate in applications such as medical diagnosis or autonomous driving where we need domain expertise for annotation. Few-shot action recognition aims to solve this problem by annotation-efficient learning mechanisms.
This thesis presents MetaUVFS as the first Unsupervised Meta-learning algorithm for...