Protein secondary structure prediction plays a pivotal role in predicting protein folding in three-dimensions. Its task is to assign each residue one of the three secondary structure classes helix, strand, or random coil. This is an instance of the problem of sequential supervised learning in machine learning. This thesis describes...
This thesis addresses the problem of learning dynamic Bayesian network (DBN) models to support reinforcement learning. It focuses on learning regression tree models of the conditional probability distributions of the DBNs. Existing algorithms presume that the stochasticity in the domain can be modeled as a deterministic function with additive noise....
Automated recognition of object categories in images is a critical step for many real-world computer vision applications. Interest region detectors and region descriptors have been widely employed to tackle the variability of objects in pose, scale, lighting, texture, color, and so on. Different types of object recognition problems usually require...
Knowledge workers are struggling in the information flood. There is a growing interest in intelligent desktop environments that help knowledge workers organize their daily life. Intelligent desktop environments allow the desktop user to define a set of “activities” that characterize the user’s desktop work. These environments then attempt to identify...
The problem of document classification has been widely studied in machine learning and data mining. In document classification, most of the popular algorithms are based on the bag-of-words representation. Due to the high dimensionality of the bag-of-words representation, significant research has been conducted to reduce the dimensionality via different approaches....
This dissertation explores the idea of applying machine learning technologies to help computer users find information and better organize electronic resources, by presenting the research work conducted in the following three applications: FolderPredictor, Stacking Recommendation Engines, and Integrating Learning and Reasoning.
FolderPredictor is an intelligent desktop software tool that helps...
Bayesian Optimization (BO) methods are often used to optimize an unknown function f(•) that is costly to evaluate. They typically work in an iterative manner. In each iteration, given a set of observation points, BO algorithms select k ≥ 1 points to be evaluated. The results of those points are...
Citizen Science is a paradigm in which volunteers from the general public participate in scientific studies, often by performing data collection. This paradigm is especially useful if the scope of the study is too broad to be performed by a limited number of trained scientists. Although citizen scientists can contribute...
Tensegrity structures are composed of pure compressional elements that are connected via a network of pure tensional elements. The concept of tensegrity promises numerous advantages to the field of robotics. Tensegrity robots are, however, notoriously difficult to control due to their oscillatory nature and nonlinear interaction between the components. Multiagent...
In real networks, identifying dense regions is of great importance. For example, in a network that represents academic collaboration, authors within the densest component of the graph tend to be the most prolific. Dense subgraphs often identify communities in social networks. And dense subgraphs can be used to discover regulatory...