A MATLAB interface for L-BFGS-Bby Peter Carbonetto
Dept. of Computer Science
University of British Columbia
L-BFGS-B is a collection of Fortran 77 routines for solving large-scale nonlinear optimization problems with bound constraints on the variables. One of the key features of the nonlinear solver is that knowledge of the Hessian is not required; the solver computes search directions by keeping track of a quadratic model of the objective function with a limited-memory BFGS (Broyden-Fletcher-Goldfarb-Shanno) approximation to the Hessian.1 The algorithm was developed by Ciyou Zhu, Richard Byrd and Jorge Nocedal. For more information, go to the original distribution site for the L-BFGS-B software package.
I've designed an interface to the L-BFGS-B solver so that it can be called like any other function in MATLAB.2 See the text below for more information on installing and calling it in MATLAB. Along the way, I've also developed a C++ class that encapsulates all the "messy" details in executing the L-BFGS-B code. See below for instructions on how to use this class for your own C++ code.
If you have any questions, praise, or comments, or would like to report a bug, do not hesitate to contact the author. I've tested this software in MATLAB version 7.2 on both the Mac OS X and Linux operating systems.
LicenseThis work is licensed under the GNU Public license. Please cite the source as Peter Carbonetto, Department of Computer Science, University of British Columbia.
These installation instructions assume you have a UNIX-based operating system, such as Mac OS X or Linux. (If you have a Windows operating system, Guillaume Jacquenot has generously provided instructions for compiling the software on Windows with gnumex and Mingw.) These instructions also assume you have GNU make installed.
Download the source. First, download and unpack the tar archive. It contains some MATLAB files (ending in .m), some C++ source and header files (ending in .h and .cpp), a single Fortran 77 source file solver.f containing the L-BFGS-B routines, and a Makefile. I've included a version of the Fortran routines that is more recent than what is available for download at the distribution site at Northwestern University.
What we will do is create a MEX file, which is basically a file that contains a routine that can be called from MATLAB as if it were a built-it function. To learn about MEX files, I refer you to this document at the MathWorks website.
Install the C++ and Fortran 77 compilers. In order to build the MEX file, you will need both a C++ compiler and a Fortran 77 compiler. Unfortunately, you can't just use any compiler. You have to use the precise one supported by MATLAB. For instance, if you are running Mac OS X 7.2 on a Linux operating system, you will need to install the GNU compiler collection (GCC) version 3.4.4. Even if you already have a compiler installed on Linux, it may be the wrong version, and if you use the wrong version things could go horribly wrong! It is important that you use the correct version of the compiler, otherwise you will encounter linking errors. Refer to this document to find out which compiler is supported by your version of MATLAB.
Configure MATLAB. Next, you need to set up and configure MATLAB to build MEX Files. This is explained quite nicely in this MathWorks support document.
Simple compilation procedure. (Thanks to Stefan Harmeling.) You might be able to build and compile the MEX File from source code simply by calling MATLAB's mex program with the C++ and Fortran source files as inputs like so:
If that doesn't work (you get linking errors or it makes MATLAB crash when you try to call it) then you may have to follow the complicated installation instructions below.
Modify the Makefile. You are almost ready to build the MEX
file. But before you do so, you need to edit the Makefile to coincide
with your system setup. At the top of the file are several variables
you may need to modify. The variable
On my laptop running Mac OS X 10.3.9 with MATLAB 7.2, my settings for the Makefile are
On my Linux machine, I set the variables in the Makefile like so:
It may be helpful to look at the GCC documentation in order to understand what these various compiler flags mean.
Build the MEX file. If you are in the directory containing all
the source files, typing
A brief tutorial
I've written a short script
The basic MATLAB function call is
The first input argument
The script starts by generating a bunch of canonical images that represent topics; if a pixel is white then a word at that position is more likely to appear in that topic. Note that the order of the pixels in the image is unimportant, since LDA is a "bag of words" model. In another figure we show a small sample of the data generated from the topics. Each image represents a document, and the pixels in the image depict the word proportions in that document.5
Next, the script runs the L-BFGS-B solver to find a local minimum
to the variational objective. Buried in the M-file
This example will give us the opportunity to demonstrate some of
the more complicated aspects of the our MATLAB interface. The first
It is instructive to examine the callback routines. They look like:
Notice that each entry to the cell array is its own input argument to the callback routines. The same applies for the output arguments of the gradient callback function.
The sixth input argument is a cell array that contains auxiliary data. This data will be passed to the MATLAB functions that evaluate the objective and gradient. The seventh input argument specifies a callback routine that is called exactly once for every iteration. This callback routine is most useful for examining the progress of the solver. The remaining input arguments are label/value pairs that override some of the default settings of L-BFGS-B.
After converging to a solution, we display topic samples drawn from the variational approximation. A good solution should come close to recovering the true topics from which the data was generated (although they might not be in the same order).
Don't be surprised if it takes many iterations before the optimization algorithm converges on a local minimum. (In repeated trials, I found it as many as two thousand iterations to converge to a stationary point of the objective.) This particular problem demonstrates the limitations of limited-memory quasi-Newton approximations to the Hessian; the low storage requirements can come at the cost of slow convergence to the solution.
The C++ interface
One of the nice byproducts of writing a MATLAB interface for
L-BFGS-B is that I've ended up with a neat little C++ class that
encapsulates the nuts and bolts of executing the solver. A brief
description of the Program class can be found in the header file
Program is an abstract class, which means that its impossible to
instantiate a Program object. This is because the class has member
functions that aren't yet implemented. These are called pure virtual
functions. In order to use the Program class, one needs to define
another class that inherits the Program class and that implements the
pure virtual functions. In fact, the class MatlabProgram (in
The new child class must declare and implement these two functions:
See the header file for a detailed description of these functions. The only remaining detail is calling the Program constructor. After that, it is just a matter of declaring a new object of type MyProgram then calling the function runSolver.
Contents of the tar archive
1 Ciyou Zhu, Richard H. Byrd, Peihuang Lu and Jorge Nocedal (1997). Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization. ACM Transactions on Mathematical Software, Vol. 23, No. 4., pp. 550-560.
3 Willi Hock and Klaus Schittkowski (1981). Test Examples for Nonlinear Programming Codes. Lecture Notes in Economics and Mathematical Systems, Vol. 187, Springer-Verlag.
4 David M. Blei, Andrew Y. Ng, Michael I. Jordan (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, Vol. 3, pp. 993-1022.
5 Tom L. Griffiths and Mark Steyvers (2004). Finding scientific topics. Proceedings of the National Academy of Sciences, Vol. 101, pp. 5228-5235.
March 3, 2007