Partial programming is a field of study where users specify an outline or skeleton of a program,
but leave various parts undefined. The undefined parts are then completed by an external
mechanism to form a complete program. Adaptation-Based Programming (ABP) is a method
of partial programming that utilizes techniques from the field of reinforcement learning (RL), a
subfield of machine learning, to find good completions of those partial programs.
An ABP user writes a partial program in some host programming language. At various
points where the programmer is uncertain of the best course of action, they include choices
that non-deterministically select amongst several options. Additionally, users indicate program
success through a reward construct somewhere in their program. The resulting non-deterministic
program is completed by treating it as an equivalent RL problem and solving the problem with
techniques from that field. Over repeated executions, the RL algorithms within the ABP system
will learn to select choices at various points that maximize the reward received.
This thesis explores various aspects of ABP such as the semantics of different implementations,
including different design trade-offs encountered with each approach. The goal of all
approaches is to present a model for programs that adapt to their environment based on the
points of uncertainty within the program that the programmer has indicated.
The first approach presented in this work is an implementation of ABP as a domain-specific
language embedded within a functional language. This language provides constructs for common
patterns and situations that arise in adaptive programs. This language proves to be compositional
and to foster rapid experimentation with different adaptation methods (e.g. learning
algorithms). A second approach presents an implementation of ABP as an object-oriented library
that models adaptive programs as formal systems from the field of RL called Markov Decision
Processes (MDPs). This approach abstracts away many of the details of the learning algorithm
from the casual user and uses a fixed learning algorithm to control the program adaptation rather
than allowing it to vary. This abstraction results in an easier-to-use library, but limits the scenarios that ABP can effectively be used in. Moreover, treating adaptive programs as MDPs leads to some unintuitive situations where seemingly reasonably programs fail to adapt efficiently.
This work addresses this problem with algorithms that analyze the adaptive program's structure and data flow to boost the rate at which these problematic adaptive programs learn thus increasing the number of problems that ABP can effectively be used to solve.