Learning Mixtures of Product Distributions over Discrete Domains

We consider the problem of learning mixtures of product distributions over discrete domains in the distribution learning framework introduced by Kearns et al. [Proceedings of the 26th Annual Symposium on Theory of Computing (STOC), Montr´eal, QC, 1994, ACM, New York, pp. 273–282]. We give a poly(n/ε)-time algorithm for learning a mixture of k arbitrary product distributions over the n-dimensional Boolean cube {0, 1}n to accuracy ε, for any constant k. Previous polynomial-time algorithms could achieve this only for k = 2 product distributions; our result answers an open question stated independently in [M. Cryan, Learning and Approximation Algorithms for Problems Motivated by Evolutionary Trees, Ph.D. thesis, University of Warwick, Warwick, UK, 1999] and [Y. Freund and Y. Mansour, Proceedings of the 12th Annual Conference on Computational Learning Theory, 1999, pp. 183–192]. We further give evidence that no polynomial-time algorithm can succeed when k is superconstant, by reduction from a difficult open problem in PAC (probably approximately correct) learning. Finally, we generalize our poly(n/ε)-time algorithm to learn any mixture of k = O(1) product distributions over {0, 1, . . . , b − 1}n, for any b = O(1).