In supervised learning, label information can be provided at different levels of granularity. For small datasets, it is feasible to acquire a label for each data instance; in the big-data regime, however, such fine-grained labeling is prohibitively costly. Several learning settings relax this requirement. In semi-supervised learning, only a limited number of samples are labeled while a large number remain unlabeled. In positive and unlabeled learning, due to the cost of obtaining negative samples, labels are provided only for a subset of the positive samples, while the unlabeled set mixes positive and negative samples. In multi-instance learning, labels are provided for groups (bags) of samples rather than for individual instances. All of these approaches offer a practical way to reduce labeling effort, but they introduce machine learning challenges both in the required methodology and in the performance attainable by the resulting methods. Recent works include maximum-margin methods with potentially heuristic objectives, which can yield low accuracy, and complicated probabilistic models whose inference is computationally expensive. This work focuses on probabilistic models with exact yet efficient inference. In many cases, the proposed frameworks outperform, sometimes significantly, the current state-of-the-art methods. In addition to the algorithmic frameworks developed for these various weakly supervised learning problems, an algorithm and the corresponding theory are developed for the positive and unlabeled learning setting.
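The three weak supervision settings above differ only in how label information attaches to the data. The following minimal sketch (a hypothetical toy dataset, not part of this work) illustrates each labeling regime; `None` marks an unlabeled sample:

```python
import random

random.seed(0)
n = 10
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(n)]  # toy features
y_true = [random.randint(0, 1) for _ in range(n)]  # ground truth, hidden in practice

# Semi-supervised: only a few instances are labeled; the rest are unlabeled (None).
labeled_idx = {0, 1, 2}
y_semi = [y_true[i] if i in labeled_idx else None for i in range(n)]

# Positive-unlabeled (PU): a subset of the positives is labeled 1;
# all remaining samples, positive or negative, stay unlabeled (None).
pos_idx = [i for i, y in enumerate(y_true) if y == 1]
labeled_pos = set(pos_idx[: len(pos_idx) // 2])
y_pu = [1 if i in labeled_pos else None for i in range(n)]

# Multi-instance: a label attaches to a bag of instances, not to each instance;
# here a bag is positive if it contains at least one positive instance.
bags = [list(range(0, 5)), list(range(5, 10))]
bag_labels = [int(any(y_true[i] for i in bag)) for bag in bags]
```

Note that in the PU regime no sample ever carries an explicit negative label, which is precisely what makes the setting challenging.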