The advent of deep learning models has led to substantial improvements across a wide range of NLP tasks, achieving state-of-the-art performance without any hand-crafted features. However, training deep models requires a massive amount of labeled data, and labeling new data as new tasks or domains emerge is time-consuming and demands domain expertise. As a result, approaches that address data scarcity have attracted increasing attention in recent years, including, but not limited to, transfer learning, zero-shot learning, and weak supervision. We present three different methods for learning from limited labeled data.

In our first work, we present a Transfer Learning method that transfers knowledge between two domains (source and target) with disparate label sets. Our approach exploits the relationship between the source and target labels to enhance the transfer of the learned knowledge. We apply our method to two NLP tasks: Event Typing and Text Classification.

In our second work, we address the problem of modeling tasks with evolving type ontologies. We present a Zero-shot Fine-Grained Entity Typing (ZS-FGET) approach that exploits the Wikipedia description of a type to construct the representation of that type, so that the type can be recognized with zero training examples. Since FGET deals with a large number of types organized into a hierarchy, Distant Supervision is employed to automatically collect training data, leading to significant label noise. Several methods have been proposed to tackle FGET, some of which propose special ways to learn robustly from data with noisy labels. Most of these methods are evaluated on three publicly available benchmark datasets: FIGER, OntoNotes, and BBN. However, there are some fundamental issues in the empirical evaluation of these methods.
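The description-based zero-shot idea above can be sketched minimally as follows: encode each type's Wikipedia description into a vector, encode the entity mention, and predict the type whose description embedding is most similar. The type names, the toy vectors, and the `predict_type` helper are illustrative assumptions, not the thesis's actual model.

```python
# Hedged sketch: zero-shot typing by matching a mention against type
# representations built from description text. Embeddings here are toy vectors.
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def predict_type(mention_vec, type_vecs):
    """Return the type whose description embedding is closest to the mention."""
    return max(type_vecs, key=lambda t: cosine(mention_vec, type_vecs[t]))

# Toy embeddings standing in for encoded Wikipedia type descriptions.
type_vecs = {
    "/person/athlete": [0.9, 0.1, 0.0],
    "/organization/company": [0.1, 0.9, 0.2],
}
mention = [0.8, 0.2, 0.1]  # stand-in encoding of an entity mention in context
print(predict_type(mention, type_vecs))  # prints "/person/athlete"
```

Because no per-type training examples are needed, a new type can be added to the ontology simply by embedding its description.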
Critically, most existing evaluations report only the overall performance across all types, which can be dominated by performance on the coarse types and provides very little information about how well these methods work on the fine-grained types. This is further compounded by the fact that the testing sets of two of the three benchmarks have very poor coverage of the fine-grained types. In our final work, we present a new empirical study that re-evaluates the most recently proposed FGET methods by introducing new testing sets with significantly improved coverage of fine-grained types and by examining not only the overall performance but also per-level, type-specific performance. Our analysis of the tested methods reveals new insights about them and suggests directions for future improvement.
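The per-level evaluation described above can be sketched as follows, assuming types are slash-paths (e.g. "/person/artist") whose depth gives the hierarchy level; the `per_level_f1` helper and the example sets are illustrative assumptions, not the study's exact metric code.

```python
# Hedged sketch: F1 computed separately per hierarchy level, so coarse-type
# performance cannot mask fine-grained-type performance.
from collections import defaultdict

def level(t):
    """Depth of a slash-path type: "/person" -> 1, "/person/artist" -> 2."""
    return t.strip("/").count("/") + 1

def per_level_f1(gold, pred):
    """gold, pred: parallel lists of type sets, one set per mention."""
    counts = defaultdict(lambda: [0, 0, 0])  # level -> [tp, fp, fn]
    for g, p in zip(gold, pred):
        for lvl in {level(t) for t in g | p}:
            gl = {t for t in g if level(t) == lvl}
            pl = {t for t in p if level(t) == lvl}
            tp, fp, fn = counts[lvl]
            counts[lvl] = [tp + len(gl & pl), fp + len(pl - gl), fn + len(gl - pl)]
    scores = {}
    for lvl, (tp, fp, fn) in counts.items():
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        scores[lvl] = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return scores

# A prediction correct at level 1 but wrong at level 2 scores 1.0 / 0.0:
gold = [{"/person", "/person/artist"}]
pred = [{"/person", "/person/athlete"}]
print(per_level_f1(gold, pred))  # prints {1: 1.0, 2: 0.0}
```

An overall score over both levels would report 0.5 here, hiding the complete failure on the fine-grained level, which is exactly the evaluation gap the study targets.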