posted on 2000-04-01, 00:00authored byScott Davies, Andrew W Moore
Recently developed techniques have made it possible to quickly learn accurate probability density functions from data in low-dimensional continuous spaces. In particular, mixtures of Gaussians can be fitted to data very
quickly using an accelerated EM algorithm that employs multi-resolution kdtrees (Moore, 1999). In this paper, we propose a kind of Bayesian network
in which low-dimensional mixtures of Gaussians over different subsets of the
domain's variables are combined into a coherent joint probability model over
the entire domain. The network is also capable of modelling complex dependencies between discrete variables and continuous variables without requiring
discretization of the continuous variables. We present efficient heuristic algorithms for automatically learning these networks from data, and perform
comparative experiments illustrating how well these networks model real scientific data and synthetic data. We also briefly discuss some possible improvements to the networks, as well as their possible application to anomaly
detection, classification, probabilistic inference, and compression.