Multi-instance data, in which each object (e.g., a document) is a collection of instances
(e.g., words), are widespread in machine learning, signal processing, computer vision,
bioinformatics, music, and the social sciences. Existing probabilistic models, e.g., latent
Dirichlet allocation (LDA), probabilistic latent semantic indexing (pLSI), and discrete
component analysis (DCA), have been...