Fast training for large-scale one-versus-all linear classifiers using tree-structured initialization

Abstract

We consider the problem of training one-versus-all (OVA) linear classifiers for multiclass or multilabel classification when the number of labels is large. A naive extension of OVA to this setting, even with hundreds of cores, typically requires hours of training on large real-world datasets. We propose a novel algorithm called OVA-Primal++ that speeds up OVA training by using a tree-structured training order, where each classifier is trained with its parent's classifier as initialization. OVA-Primal++ is both theoretically and empirically faster than the naive OVA algorithm, yet still enjoys the same high parallelizability and small memory footprint. Extensive experiments on multiclass and multilabel classification datasets validate the effectiveness of our method.
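
The core mechanism, warm-starting each label's classifier from its parent in a label tree, can be illustrated with a short sketch. The following minimal NumPy example is not the paper's implementation: the tree construction and the primal solver used in OVA-Primal++ are not reproduced, and the names train_logreg, ova_tree_train, children (a dict mapping each label to its child labels), and root are assumptions made for illustration.

    import numpy as np
    from collections import deque

    def train_logreg(X, y, w0, lr=0.1, iters=100):
        # Binary logistic regression by plain gradient descent,
        # initialized from w0 (the warm start).
        w = w0.copy()
        n = X.shape[0]
        for _ in range(iters):
            p = 1.0 / (1.0 + np.exp(-X @ w))   # predicted probabilities
            w -= lr * (X.T @ (p - y)) / n      # gradient of the logistic loss
        return w

    def ova_tree_train(X, Y, children, root):
        # Train one binary (one-versus-all) classifier per label,
        # visiting labels in breadth-first order over a label tree;
        # each child label is warm-started from its parent's weights.
        # Y[:, k] is the 0/1 indicator vector for label k.
        d = X.shape[1]
        W = np.zeros((Y.shape[1], d))
        queue = deque([(root, np.zeros(d))])   # (label, initial weights)
        while queue:
            k, w_init = queue.popleft()
            W[k] = train_logreg(X, Y[:, k].astype(float), w_init)
            for c in children.get(k, []):
                queue.append((c, W[k]))        # child inherits parent's solution
        return W

Because siblings depend only on their common parent, all labels at the same tree level can be trained in parallel, consistent with the parallelizability claim above.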

Publication
In SIAM International Conference on Data Mining, 2019

Bibtex

@InProceedings{2019HuangFast,
  author =       {H. Fang and M. Cheng and C.-J. Hsieh and M. P. Friedlander},
  title =        {Fast training for large-scale one-versus-all linear
                  classifiers using tree-structured initialization},
  year =         2019,
  booktitle =    {Proc. SIAM Int. Conf. Data Mining (SDM19)}
}
Huang Fang
Researcher

My research interests include optimization, learning theory, algorithm design, and data mining.
