Langseth, Helge (5); Nielsen, Thomas Dyhre (6); Pérez-Bernabé, Inmaculada (7); Salmerón, Antonio (7)
(1) Machine Intelligence, The Faculty of Engineering and Science (ENG), Aalborg University, VBN; (2) Aalborg U Robotics, The Faculty of Humanities, Aalborg University, VBN; (3) Department of Computer Science, The Faculty of Engineering and Science (ENG), Aalborg University, VBN; (4) The Faculty of Engineering and Science (TECH), Aalborg University, VBN; (5) NTNU; (6) Distributed, Embedded and Intelligent Systems, The Faculty of Engineering and Science (ENG), Aalborg University, VBN; (7) UAL
In this paper we investigate methods for learning hybrid Bayesian networks from data. First, we utilize a kernel density estimate of the data in order to translate the data into a mixture of truncated basis functions (MoTBF) representation using a convex optimization technique. When utilizing a kernel density representation of the data, the estimation method relies on the specification of a kernel bandwidth. We show that in most cases the method is robust with respect to the choice of bandwidth, but for certain data sets the bandwidth has a strong impact on the result. Based on this observation, we propose an alternative learning method that relies on the cumulative distribution function of the data. Empirical results demonstrate the usefulness of the approaches: even though the methods produce estimators that are slightly poorer than the state of the art (in terms of log-likelihood), they are significantly faster, which indicates that the MoTBF framework can be used for inference and learning in reasonably sized domains. Furthermore, we show how a particular subclass of MoTBF potentials (learnable by the proposed methods) can be exploited to significantly reduce complexity during inference.
International Journal of Approximate Reasoning, 2014, Vol 55, Issue 4, p. 940-966