Markov models are commonly used for joint inference of label sequences. Unfortunately, inference scales quadratically in the number of labels, which is problematic for training methods where inference is repeatedly preformed and is the primary computational bottleneck for large label sets. Recent work has used output coding to address this...
In recent years, model-free Deep Reinforcement Learning (RL) has become an increasingly popular alternative to more traditional model-based or optimization-based control methods in solving robotic legged locomotion. However, deploying RL in the real world can be a significant undertaking. Constructing reward functions which compel controllers to learn the desired behavior...
This paper describes a method for finding areas of interest on a two-dimensional grid map used in the real-time strategy engine Stratagus. The method involves discovering chokepoints where through all simulation agents must pass. Using a set of tunable parameters, a full set of chokepoints are located. The redundant and...
Dynamic bipedal locomotion is among the most difficult and yet relevant problems in modern robotics. While a multitude of classical control methods for bipedal locomotion exist, they are often brittle or limited in capability. In recent years, work in applying reinforcement learning to robotics has lead to superior performance across...
We are witnessing the rise of the data-driven science paradigm, in which massive amounts of data - much of it collected as a side-effect of ordinary human activity - can be analyzed to make sense of the data and to make useful predictions. To fully realize the promise of this...
We consider the problem of tactical assault planning in real-time strategy games where a team of friendly agents must launch an assault on an enemy. This problem offers many challenges including a highly dynamic and uncertain environment, multiple agents, durative actions, numeric attributes, and different optimization objectives. While the dynamics...
How can an agent generalize its knowledge to new circumstances? To learn
effectively an agent acting in a sequential decision problem must make intelligent action selection choices based on its available knowledge. This dissertation focuses on Bayesian methods of representing learned knowledge and develops novel algorithms that exploit the represented...
A five year study of file-system metadata shows that the number of files increases by 200% and only a select few file-types contribute for over 35% of the files that exist on a file-system. It is difficult to point out a permanent selection of files that a user really cares...
This thesis presents a progression of novel planning algorithms that culminates in a new family of diverse Monte-Carlo methods for probabilistic planning domains. We provide a proof for performance guarantees and analyze how these algorithms can resolve some of the shortcomings of traditional probabilistic planning methods. The direct policy search...
Learning latent space representations of high-dimensional world states has been at the core of recent rapid growth in reinforcement learning(RL). At the same time, RL algo- rithms have suffered from ignored uncertainties in the predicted estimates of model-free or model-based methods. In our work, we investigate both of these aspects...