We propose a sequential learning algorithm aimed at robot control. It is initialised by a teacher who guides the robot through a series of example solutions to a problem. Left alone, the controller chooses its next action by prediction from a variable-order Markov chain model, selected to minimise an MDL (minimum description length) criterion based on the generalised code length L_a of the past robot-environment interaction. The user specifies the parameter a, and can thereby direct the robot towards exploratory behaviour if confidence in the teacher is low (a<0), or towards goal-seeking, exploitative behaviour if confidence in the teacher is high (a>0). The novelty of the proposed method lies in the use of generalised code length in the MDL model selection criterion.
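The model-selection step described in the abstract can be illustrated with a small sketch. It is not the authors' method: the abstract does not define the generalised code length, so the sketch substitutes the ordinary prequential (Shannon) code length with add-one estimates, and it compares fixed Markov orders rather than a full variable-order model. The function names `prequential_code_length` and `select_and_predict` are hypothetical, and the symbol sequence stands in for a recorded teacher demonstration.

```python
import math
from collections import defaultdict


def prequential_code_length(seq, order, alphabet):
    """Bits needed to encode seq symbol by symbol with an order-`order`
    Markov model using add-one (Laplace) estimates.

    Assumption: ordinary Shannon code length is used here in place of the
    generalised code length, which the abstract does not define.
    """
    counts = defaultdict(lambda: defaultdict(int))
    bits = 0.0
    for t, sym in enumerate(seq):
        ctx = tuple(seq[max(0, t - order):t])  # shorter context at the start
        seen = counts[ctx]
        total = sum(seen.values())
        p = (seen[sym] + 1) / (total + len(alphabet))
        bits -= math.log2(p)
        seen[sym] += 1  # update only after the symbol has been coded
    return bits, counts


def select_and_predict(seq, max_order):
    """Pick the order with the shortest code length, then predict the
    most frequent successor of the current context under that order."""
    alphabet = sorted(set(seq))
    best_order, best_bits, best_counts = 0, math.inf, {}
    for order in range(max_order + 1):
        bits, counts = prequential_code_length(seq, order, alphabet)
        if bits < best_bits:  # strict comparison: ties go to the lower order
            best_order, best_bits, best_counts = order, bits, counts
    ctx = tuple(seq[max(0, len(seq) - best_order):])
    seen = best_counts.get(ctx, {})
    prediction = max(alphabet, key=lambda s: seen.get(s, 0))
    return best_order, prediction


# Hypothetical demonstration: a teacher repeats the action cycle A, B, C.
demo = list("ABCABCABCABCABCABC")
order, next_action = select_and_predict(demo, max_order=3)
```

On this periodic demonstration, a first-order model already captures the cycle, so higher orders only add cost from extra, sparsely observed contexts. A parameter such as a would enter through the code-length function, which is the part this sketch leaves in its standard form.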
RoboMat 07: Coimbra, Portugal, 17-19 September 2007, p. 51-57
<em>Workshop of Robotics and Mathematics (RoboMat 2007)</em>