|
There have been late-breaking changes to this event (May 2008). We've simplified things to make it a little less scary. Please read the new details below.

Creators: Adam White and Brian Tanner, University of Alberta; Shimon Whiteson, Universiteit van Amsterdam

In the future, robots will be used in many homes, offices and construction sites. It would be useful if these robots could learn to perform new tasks on the job with little dependence on human guidance and training. The polyathlon is meant to simulate this scenario: the agent faces a series of unknown tasks and must learn, online, how to solve each one without any prior task knowledge or pretraining. The polyathlon raises a number of interesting algorithmic challenges, such as transfer learning, feature construction, adaptive representations and parameter-free learning.

The First Annual Reinforcement Learning Competition, at NIPS 2006, featured a pentathlon in which participants were allowed to train their agents on two environments. Agents were tested on five environments: the two known environments and three unknown ones. The team from Rutgers University won the 2006 Pentathlon.

Technical Details
Observation Space: k dimensional, continuous or discrete valued
k will not exceed 20
Action Space: 1 dimensional, discrete valued; a small set of discrete actions
Rewards: unknown
UPDATED MAY 2008
Observation Space: 4 dimensional, continuous valued in [0,1]
Action Space: 4 discrete actions
Rewards: a reward range (possibly loose) will be provided by the task spec. More documentation of the task specification string can be found here.

We've made these changes so that all MDPs look identical to the participants, so the agent will not easily be able to identify which domain an MDP represents. Some MDPs WILL have redundant actions and unnecessary observations. Please direct any questions to the forums.
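
To make the updated interface concrete, here is a minimal sketch of an agent that fits it: 4 observations in [0,1], 4 discrete actions, and a reward range taken from the task spec. It is not the competition's reference agent and does not use the competition software. The class name `QAgent`, the helpers `discretize` and `scale_reward`, the grid size `BINS`, and all learning parameters are assumptions chosen for illustration. The learning rule is plain tabular Q-learning over a uniform grid, which is only one of many possible approaches.

```python
import random

N_OBS = 4       # observation dimensions, from the updated spec
N_ACTIONS = 4   # number of discrete actions, from the updated spec
BINS = 5        # assumption: grid cells per observation dimension


def discretize(obs, bins=BINS):
    """Map an observation in [0,1]^4 to a tuple of grid-cell indices."""
    # Clamp so that a value of exactly 1.0 falls in the last cell.
    return tuple(max(0, min(int(x * bins), bins - 1)) for x in obs)


def scale_reward(r, r_min, r_max):
    """Rescale a reward to [0,1] using the range given by the task spec.

    The range may be loose, so real rewards may use only part of [0,1].
    """
    if r_max <= r_min:
        return 0.0
    return min(max((r - r_min) / (r_max - r_min), 0.0), 1.0)


class QAgent:
    """Hypothetical epsilon-greedy tabular Q-learning agent."""

    def __init__(self, r_min, r_max, alpha=0.1, gamma=0.99,
                 epsilon=0.1, seed=0):
        self.r_min, self.r_max = r_min, r_max
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        self.q = {}  # grid cell -> list of action values

    def values(self, state):
        return self.q.setdefault(state, [0.0] * N_ACTIONS)

    def act(self, obs):
        """Pick a random action with probability epsilon, else a greedy one."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(N_ACTIONS)
        v = self.values(discretize(obs))
        best = max(v)
        # Break ties at random so redundant actions are not favored by index.
        return self.rng.choice([a for a, x in enumerate(v) if x == best])

    def learn(self, obs, action, reward, next_obs, terminal):
        """Apply one Q-learning update for the observed transition."""
        r = scale_reward(reward, self.r_min, self.r_max)
        target = r
        if not terminal:
            target += self.gamma * max(self.values(discretize(next_obs)))
        v = self.values(discretize(obs))
        v[action] += self.alpha * (target - v[action])
```

A caller would construct `QAgent(r_min, r_max)` with the bounds read from the task spec, then alternate `act` and `learn` on each step. A uniform grid treats all four observations as equally relevant, so it does nothing to detect the unnecessary observations or redundant actions mentioned above; handling those is part of the challenge.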
|