Learning to Reason with a Scalable Probabilistic Logic
Learning to reason and understand the world’s knowledge is a fundamental problem in Artificial Intelligence (AI). Traditional symbolic AI methods were popular in the 1980s, when first-order logic rules were mostly handwritten, and reasoning algorithms were built on top of them. In the 90s, more and more researchers became interested in statistical methods that deal with the uncertainty of the data, using probabilistic models.
While it is always hypothesized that both the symbolic and statistical approaches are necessary for building intelligent systems, in practice, bridging the two in a combined framework often brings intractability—most probabilistic first-order logics are simply not efficient enough for real-world sized tasks. For example, Markov Logics [83] integrate first-order logics with Markov random field theory, but when mapping the entities in a knowledge base (KB) to the propositional theory (i.e., grounding), the size of the network depends on the number of facts in the KB—i.e., O(nk ) where k is the arity of the predicate, and n is the number of KB constants.
In this thesis, we design a new probabilistic logic programming paradigm to address various scalability issues in probabilistic logics. We propose a group of scalable methods for inference, learning, and inducing the structure of probabilistic logics. More specifically, we propose a scalable probabilistic logic called ProPPR [105] to combine the best of the symbolic and statistical worlds. ProPPR can be viewed as a probabilistic version of Prolog, and we associate a feature vector for each clause to learn weights from data. The learned weights are used to control search during inference. ProPPR’s inference scheme is very special: instead of performing potentially intractable global inference, ProPPR uses a provably-correct approximate personalized PageRank to conduct local grounding, whose inference time is independent of the size of the KB. To test ProPPR for large, real-world relational learning problems, we show that it can be used as a recursive relational learning engine [108] for large scale KB inference tasks.
Another challenging problem in statistical relational learning is structure learning: learning logic programs from a KB. Prior approach can take up to a month to learn the theory for a small KB [50]. This thesis provides a scalable solution to finding inference rules: we create a higher-level of abstraction, and reduce the unknown structure learning problem to parameter learning in ProPPR. To refine the learning process, we also introduce an iterated structural gradient algorithm for pruning the search space. This thesis also connects structured sparsity in machine learning and predicate invention in inductive logic programming [107]. To improve structure learning and incorporate latent variable modeling, we have also designed a rather radical approach—learning low-dimensional continuous latent embeddings of logical formulas [103].
History
Date
2016-05-06Degree Type
- Dissertation
Department
- Language Technologies Institute
Degree Name
- Doctor of Philosophy (PhD)