Christensen, David Johan (4); Schultz, Ulrik Pagh (5); Stoy, Kasper (5)

Affiliations:
1. Department of Electrical Engineering, Technical University of Denmark
2. Automation and Control, Department of Electrical Engineering, Technical University of Denmark
3. Centre for Playware, Automation and Control, Department of Electrical Engineering, Technical University of Denmark
4. Copenhagen Center for Health Technology, Technical University of Denmark
5. University of Southern Denmark
In this paper, we present a distributed reinforcement learning strategy for morphology-independent lifelong gait learning for modular robots. All modules run identical controllers that locally and independently optimize their action selection based on the robot's velocity as a global, shared reward signal. We evaluate the strategy experimentally, mainly on simulated but also on physical modular robots. We find that the strategy: (i) for six of seven configurations (3–12 modules), converges in 96% of the trials to the best known action-based gaits within 15 min, on average, (ii) can be transferred to physical robots with comparable performance, (iii) can be applied to learn simple gait control tables for both M-TRAN and ATRON robots, (iv) enables an 8-module robot to adapt to faults and changes in its morphology, and (v) can learn gaits for robots with up to 60 modules, although a divergence effect becomes substantial beyond 20–30 modules. These experiments demonstrate the advantages of a distributed learning strategy for modular robots, such as simplicity of implementation, low resource requirements, morphology independence, reconfigurability, and fault tolerance.
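The abstract describes modules that each run identical controllers and independently optimize their action selection against the robot's velocity as a shared reward. As a rough illustration only (not the paper's implementation), such a per-module rule might look like the epsilon-greedy value-learning sketch below; the names `ModuleLearner` and `measured_velocity`, the parameter values, and the action set of phase offsets are all our own illustrative assumptions:

```python
import random


class ModuleLearner:
    """Per-module independent learner (hypothetical sketch, not the
    paper's algorithm). Each module keeps a value estimate for every
    candidate action (e.g., a phase offset in a gait control table)
    and updates it from the robot's velocity, which is broadcast to
    all modules as a shared reward."""

    def __init__(self, actions, alpha=0.1, epsilon=0.2, seed=None):
        self.actions = list(actions)
        self.alpha = alpha        # learning rate
        self.epsilon = epsilon    # exploration probability
        self.q = {a: 0.0 for a in self.actions}
        self.rng = random.Random(seed)
        self.last_action = self.actions[0]

    def select_action(self):
        # epsilon-greedy choice over this module's own action set
        if self.rng.random() < self.epsilon:
            self.last_action = self.rng.choice(self.actions)
        else:
            self.last_action = max(self.actions, key=lambda a: self.q[a])
        return self.last_action

    def update(self, velocity_reward):
        # incremental value update driven by the global reward signal
        a = self.last_action
        self.q[a] += self.alpha * (velocity_reward - self.q[a])


def measured_velocity(chosen_actions):
    # Toy stand-in for the robot's measured forward velocity: here,
    # modules choosing a 90-degree offset contribute most to speed.
    return sum(1.0 for a in chosen_actions if a == 90) / len(chosen_actions)


# Every module runs identical code; only the shared reward couples them.
modules = [ModuleLearner(actions=[0, 90, 180, 270], seed=i) for i in range(8)]
for step in range(500):
    chosen = [m.select_action() for m in modules]
    reward = measured_velocity(chosen)
    for m in modules:
        m.update(reward)
```

Because each module updates only its own value table from the common reward, no module needs knowledge of the robot's overall morphology, which is consistent with the morphology independence and reconfigurability claimed in the abstract.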
Robotics and Autonomous Systems, 2013, Vol 61, Issue 9