Index Catalog // ScholarsArchive@OSU

1. Revisiting output coding for sequential supervised learning

Creator:: Hao, Guohua
Abstract:: Markov models are commonly used for joint inference of label sequences. Unfortunately, inference scales quadratically in the number of labels, which is problematic for training methods where inference is repeatedly performed and is the primary computational bottleneck for large label sets. Recent work has used output coding to address this...
Resource Type:: Masters Thesis
Full Text:: xit and the predictions of the previous h binary CRFs at sequence positions from t − l to t + r (l

2. Robust Reference-Free Sim-to-Real Reinforcement Learning for Bipedal Locomotion

Creator:: Siekmann, Jonah A.
Abstract:: In recent years, model-free Deep Reinforcement Learning (RL) has become an increasingly popular alternative to more traditional model-based or optimization-based control methods in solving robotic legged locomotion. However, deploying RL in the real world can be a significant undertaking. Constructing reward functions which compel controllers to learn the desired behavior...
Resource Type:: Masters Thesis
Full Text:: distribution. The reward r = R(s, a) is a scalar signal that expresses how good a particular state-action pair

3. Finding and Using Chokepoints in Stratagus

Creator:: Brewster, Benjamin
Abstract:: This paper describes a method for finding areas of interest on a two-dimensional grid map used in the real-time strategy engine Stratagus. The method involves discovering chokepoints through which all simulation agents must pass. Using a set of tunable parameters, a full set of chokepoints is located. The redundant and...
Resource Type:: Capstone Project
Full Text:: . Listing 6 shows each function in its entirety. void CoutRect(RECT *r) { cout << "RECT top: " << r

4. Recurrent Neural Networks for Robotic Control of a Human-Scale Bipedal Robot

Creator:: Siekmann, Jonah A.
Abstract:: Dynamic bipedal locomotion is among the most difficult and yet relevant problems in modern robotics. While a multitude of classical control methods for bipedal locomotion exist, they are often brittle or limited in capability. In recent years, work in applying reinforcement learning to robotics has led to superior performance across...
Resource Type:: Honors College Thesis
Full Text:: following reward function: R = 0.30 · exp(−orienterr) + 0.20 · exp(−ẋerr) + 0.20 · exp(−ẏerr) + 0.10 · exp

5. Integrating learning and search for structured prediction

Creator:: Doppa, Janardhan Rao
Abstract:: We are witnessing the rise of the data-driven science paradigm, in which massive amounts of data - much of it collected as a side-effect of ordinary human activity - can be analyzed to make sense of the data and to make useful predictions. To fully realize the promise of this...
Resource Type:: Dissertation
Full Text:: , Drew Bagnell, Roni Khardon, Andrew McCallum, Lise Getoor, Jason Eisner, Hal Daumé III, John Langford

6. UCT for tactical assault battles in real-time strategy games

Creator:: Balla, Radha-Krishna
Abstract:: We consider the problem of tactical assault planning in real-time strategy games where a team of friendly agents must launch an assault on an enemy. This problem offers many challenges including a highly dynamic and uncertain environment, multiple agents, durative actions, numeric attributes, and different optimization objectives. While the dynamics...
Resource Type:: Masters Thesis
Full Text:: the best action a from state s. Finally, after the trajectory reaches a terminal state the reward R

7. Bayesian methods for knowledge transfer and policy search in reinforcement learning

Creator:: Wilson, Aaron
Abstract:: How can an agent generalize its knowledge to new circumstances? To learn effectively an agent acting in a sequential decision problem must make intelligent action selection choices based on its available knowledge. This dissertation focuses on Bayesian methods of representing learned knowledge and develops novel algorithms that exploit the represented...
Resource Type:: Dissertation
Full Text:: (Parameter set Ψ). There are N tasks. For each task the agent has made R observations. The class

8. Reinforcement Learning for P2P Backup Applications

Creator:: Mall, Shikhar
Abstract:: A five-year study of file-system metadata shows that the number of files increases by 200% and only a select few file types account for over 35% of the files that exist on a file system. It is difficult to point out a permanent selection of files that a user really cares...
Resource Type:: Capstone Project
Full Text:: ytes/second) Q(s, a) = Q(s, a) + α × (R(s) + β max_a' Q(s', a') − Q(s, a

9. Monte-Carlo planning for probabilistic domains

Creator:: Bjarnason, Ronald V.
Abstract:: This thesis presents a progression of novel planning algorithms that culminates in a new family of diverse Monte-Carlo methods for probabilistic planning domains. We provide a proof for performance guarantees and analyze how these algorithms can resolve some of the shortcomings of traditional probabilistic planning methods. The direct policy search...
Resource Type:: Dissertation
Full Text:: transition function and a reward, commonly represented as a 4-tuple (S, A, T, R). All objects and features of

10. Investigating Latent State and Uncertainty Representations in Reinforcement Learning

Creator:: Koul, Anurag
Abstract:: Learning latent space representations of high-dimensional world states has been at the core of recent rapid growth in reinforcement learning (RL). At the same time, RL algorithms have suffered from ignored uncertainties in the predicted estimates of model-free or model-based methods. In our work, we investigate both of these aspects...
Resource Type:: Dissertation
Full Text:: QBN’s to quantize memory and observations. (3.) Insertion and Fine Tuning. The modules labeled O, R are
