Coordinate descent converges faster with the Gauss-Southwell rule than random selection

Julie Nutini, Mark Schmidt, Issam H. Laradji, Michael Friedlander, Hoyt Koepke

Links

Paper Python codePresentation slidesPoster

Summary

The paper presents new analysis of the coordinate descent algorithm that uses the Gauss Southwell (GS) rule, which can be much faster than random selection; it shows that using exact coordinate optimization and/or lipschitz constants can improve the GS convergence rate. The paper also contains analysis on using fast approximation techniques for the GS rule and analysis for different proximal-gradient GS rules.

Using the code

The code serves two main purposes,