posted on 2002-09-01, 00:00 · authored by Maria-Florina Balcan, Avrim Blum, Nathan Srebro
We continue the investigation of natural conditions
for a similarity function to allow learning, without
requiring the similarity function to be a valid kernel,
or referring to an implicit high-dimensional
space. We provide a new notion of a “good similarity
function” that builds upon the previous definition
of Balcan and Blum (2006) but improves
on it in two important ways. First, as with the previous
definition, any large-margin kernel is also a
good similarity function in our sense, but the translation
now results in a much milder increase in the
labeled sample complexity. Second, we prove that
for distribution-specific PAC learning, our new notion
is strictly more powerful than the traditional
notion of a large-margin kernel. In particular, we
show that for any hypothesis class C there exists
a similarity function under our definition allowing
learning with O(log |C|) labeled examples.
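(A brief gloss, ours rather than the abstract's: the logarithmic rate is what one expects from standard L₁-margin bounds, under which linear separators of bounded 1-norm over d features need

$$m = O\!\left(\frac{\log d}{\gamma^{2}}\right)$$

labeled examples; taking d = |C| similarity-based features and constant margin γ gives O(log |C|).)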
However, in a lower bound which may be of independent
interest, we show that for any class C of pairwise
uncorrelated functions, there is no kernel with
margin γ ≥ 8/√|C| for all f ∈ C, even if one
allows average hinge-loss as large as 0.5. Thus,
the sample complexity for learning such classes
with SVMs is Ω(|C|). This extends work of Ben-David et al. (2003)
and Forster and Simon (2006), who give hardness results
with comparable margin bounds, but at much lower error rates.
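As a gloss under the standard assumption that margin-based SVM guarantees scale as 1/γ², the margin ceiling translates directly into the stated sample complexity:

$$\gamma \le \frac{8}{\sqrt{|C|}} \;\Longrightarrow\; \frac{1}{\gamma^{2}} \ge \frac{|C|}{64} = \Omega(|C|).$$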
Our new notion of similarity relies upon L₁-regularized
learning, and our separation result is related to a
separation between what is learnable with L₁ vs. L₂ regularization.
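To make the L₁-regularized route concrete, here is a minimal sketch of a "similarities as features" learning rule in the spirit the abstract describes. Everything specific in it is an illustrative assumption rather than the paper's construction: the RBF-style similarity, the number of landmarks, and scikit-learn's L1-penalized logistic regression standing in for a generic L₁-regularized linear learner.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical similarity function (an RBF-style similarity; the framework
# allows any bounded pairwise similarity, not only valid kernels).
def sim(x, y, width=1.0):
    return np.exp(-np.sum((x - y) ** 2) / (2 * width ** 2))

# Toy labeled data: two Gaussian blobs in 5 dimensions.
X = np.vstack([rng.normal(-1.0, 1.0, (100, 5)),
               rng.normal(+1.0, 1.0, (100, 5))])
y = np.array([0] * 100 + [1] * 100)

# Step 1: pick "landmark" points (drawn here from the data for brevity)
# and represent each example by its similarities to the landmarks.
landmarks = X[rng.choice(len(X), size=30, replace=False)]
F = np.array([[sim(x, lm) for lm in landmarks] for x in X])

# Step 2: fit a sparse linear separator over the similarity features.
# The L1 penalty keeps the separator sparse, which is what ties the
# labeled sample complexity to log(#landmarks) rather than #landmarks.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
clf.fit(F, y)

print("training accuracy:", clf.score(F, y))
print("nonzero weights:  ", np.count_nonzero(clf.coef_))
```

The design point this illustrates is the contrast drawn in the abstract: an SVM works with the similarity implicitly through an L₂-type margin, whereas the L₁ route treats similarities as explicit features and exploits sparsity, which is where the logarithmic dependence on the number of features comes from.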