Mao, Hua1; Chen, Yingke1; Jaeger, Manfred7; Nielsen, Thomas Dyhre7; Larsen, Kim Guldstrand5; Nielsen, Brian4
1 Department of Computer Science, The Technical Faculty of IT and Design, Aalborg University, VBN2 The Faculty of Engineering and Science (TECH), Aalborg University, VBN3 Machine Intelligence, The Technical Faculty of IT and Design, Aalborg University, VBN4 Aalborg U Robotics, The Faculty of Humanities, Aalborg University, VBN5 CISS - Center for Embedded Software Systems, The Technical Faculty of IT and Design, Aalborg University, VBN6 Distributed Systems and Semantics, The Technical Faculty of IT and Design, Aalborg University, VBN7 Distributed, Embedded and Intelligent Systems, The Technical Faculty of IT and Design, Aalborg University, VBN
Constructing an accurate system model for formal model verification can be both resource demanding and time-consuming. To alleviate this shortcoming, algorithms have been proposed for automatically learning system models based on observed system behaviors. In this paper we extend the algorithm on learning probabilistic automata to reactive systems, where the observed system behavior is in the form of alternating sequences of inputs and outputs. We propose an algorithm for automatically learning a deterministic labeled Markov decision process model from the observed behavior of a reactive system. The proposed learning algorithm is adapted from algorithms for learning deterministic probabilistic finite automata, and extended to include both probabilistic and nondeterministic transitions. The algorithm is empirically analyzed and evaluated by learning system models of slot machines. The evaluation is performed by analyzing the probabilistic linear temporal logic properties of the system as well as by analyzing the schedulers, in particular the optimal schedulers, induced by the learned models.
Electronic Proceedings in Theoretical Computer Science, 2012, Vol 103, p. 49-63