Budgeted Distribution Learning of Belief Net Parameters

Li, Liuyang; Poczos, Barnabas; Szepesvari, Csaba; Greiner, Russ

doi:10.1184/R1/6475496.v1

journal contribution

posted on 2010-01-01, 00:00 authored by Liuyang Li, Barnabas Poczos, Csaba Szepesvari, Russ Greiner

Most learning algorithms assume that a data set is given initially. We address the common situation where data is not available initially, but can be obtained, at a cost. We focus on learning Bayesian belief networks (BNs) over discrete variables. As such BNs are models of probability distributions, we consider the “generative” challenge of learning the parameters, for a fixed structure, that best match the true distribution. We focus on the budgeted learning setting, where there is a known fixed cost c_i for acquiring the value of the i-th feature for any specified instance, and a known total budget to spend acquiring all information. After formally defining this problem from a Bayesian perspective, we first consider non-sequential algorithms that must decide, before seeing any results, which features of which instances to probe. We show this is NP-hard, even if all variables are independent, then prove that the greedy allocation algorithm IGA is optimal when the costs are uniform and the features are independent, but can otherwise be sub-optimal. We then show that general (sequential) policies perform better, and explore the challenges of learning the parameters for general belief networks in this setting, describing conditions for when the obvious round-robin algorithm will, versus will not, work optimally. We also explore the effectiveness of this and various other heuristic algorithms.
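To make the non-sequential setting concrete, the following is a minimal sketch of a greedy allocation under a budget. It is not the paper's IGA. As assumptions made only for illustration, it treats each feature as an independent binary variable with a Beta(a, b) prior, and it uses expected posterior variance as a stand-in loss, which the abstract does not specify. The names `expected_posterior_variance` and `greedy_allocation` are hypothetical.

```python
from typing import List, Tuple


def expected_posterior_variance(a: float, b: float, n: int) -> float:
    """Expected variance of a Beta(a, b) parameter after n Bernoulli probes.

    Closed form: a*b / (s * (s + 1) * (s + n)) with s = a + b.
    This is an illustrative proxy loss, not the paper's objective.
    """
    s = a + b
    return (a * b) / (s * (s + 1.0) * (s + n))


def greedy_allocation(
    priors: List[Tuple[float, float]],
    costs: List[int],
    budget: int,
) -> List[int]:
    """Decide up front how many probes each feature gets.

    At every step, buy the affordable probe with the largest reduction in
    the proxy loss per unit cost; stop when nothing affordable helps.
    """
    counts = [0] * len(priors)
    remaining = budget
    while True:
        best_i, best_gain = -1, 0.0
        for i, ((a, b), c) in enumerate(zip(priors, costs)):
            if c > remaining:
                continue  # this probe no longer fits in the budget
            before = expected_posterior_variance(a, b, counts[i])
            after = expected_posterior_variance(a, b, counts[i] + 1)
            gain = (before - after) / c
            if gain > best_gain:
                best_i, best_gain = i, gain
        if best_i < 0:
            return counts
        counts[best_i] += 1
        remaining -= costs[best_i]


if __name__ == "__main__":
    # Two features with uniform costs: a vague prior and a sharper one.
    print(greedy_allocation([(1.0, 1.0), (10.0, 10.0)], [1, 1], 3))
```

The allocation is fixed before any probe result is seen, which is what distinguishes it from the sequential policies the abstract goes on to discuss.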

Date

2010-01-01

Keywords

Machine Learning

Licence

In Copyright
