blogreg

Mark Schmidt (2006)

This is a set of Matlab routines I wrote for the course STAT535D: Statistical Computing and Monte Carlo Methods by A. Doucet. It implements different Markov Chain Monte Carlo (MCMC) strategies for sampling from the posterior distribution over the parameter values for binary Probit and Logistic Regression models with a Gaussian prior on the parameter values. Specifically, we are sampling from:

P(y=1|w,x) = f(w'x)
w ~ N(0,v)

In the above, x is a set of p features, y is the class label (-1 or 1), w is the parameters we want to estimate, N(0,v) denotes the prior Normal distribution on w (with mean 0 and inverse covariance matrix v). For Logistic Regression, f(a) is the sigmoid function 1/(1+exp(-a)), while for Probit Regression it is the Gaussian cumulative distribution function.

Methods

Probit Regression: There are 2 strategies implemented for sampling from the Probit model. The first strategy (probit2GibbsSample.m) is the auxiliary variable method of Albert and Chib ("Bayesian Analysis of Binary and Polychotomous Response Data", JASA 1993). This method introduces a set of auxiliary variables z to allow efficient block-Gibbs sampling. In this model, the auxiliary variables z (conditioned on w) follow truncated Normal distributions, while the parameters w (conditioned on z) follow a multivariate Normal.

The second Probit Regression sampling strategy (probit2Sample.m) uses the same model, but implements the Composition sampler of Holmes and Held ("Bayesian auxiliary variable models for binary and polychotomous regression", 2004). This model jointly samples w and z, by directly sampling z from its marginal distribution (integrating over w).

Logistic Regression: There are 3 strategies implemented for sampling from the Logistic model. The first strategy (logist2SampleMH.m) uses the Metropolis-Hastings algorithm outlined in Johnson and Albert ("Ordinal Data Modeling", Springer 1999). The Iteratively-Reweighted Least Squares algorithm is used to find the Maximum a Posteriori (MAP) estimate of w, and this value is used to initialize the Markov Chain. The Asymptotic Covariance Matrix and an adaptively updated kernel width parameter are used to make proposals.

The 2nd strategy for the Logistic model (logist2Sample.m) is the Logistic variant of the Holmes and Held Probit Regression sampler. Rather than having a unit variance as in the Probit model, in the Logistic model the variances of the z variables lambda are obtained in this model by sampling from a Kolmogorv-Smirnov distribution. This block-Gibbs sampler updates z and w jointly conditioned on lambda (as in the Probit model), then samples lambda conditioned on z and w.

The 3rd strategy for the Logistic model (logist2Sample2.m) is the 2nd block-Gibbs sampling strategy of Holmes and Held. In this second approach, z and lambda are updated jointly given w (z is sampled from a truncated Logistic distribution), then w is sampled conditioned on z and lambda.

Sparse Logistic Regression: A 4th strategy was implemented for a slightly different Logistic model (logist_FS_Sample.m). In this model, we have an additional set of variables gamma that indicate whether a variable is included in the model. The effect of this is that each sample only depends on a subset of the variables, and sampling gamma lets us examine a posterior distribution over whether each variable is 'relevant' to the classification. This function implements the method described in Holmes and Held, which augments the 2nd Logistic strategy above with reversible-jump trans-dimensional moves to update gamma.

Usage

All the methods have a common interface:
s = *(X,y,v,nSamples)
(where * is one of the methods above, and the output will be nVariables by nSamples)

Also included is IRLS code that returns the MAP estimate of the Logistic model (and optionally the Asymptotic Covariance matrix). This code has the interface:
w = L2LogReg_IRLS(X,y,v)

Example

Running 'example_ProbRegSamp' will load a data set, generate 500 samples from the 2 Probit Regression samplers, then display histograms of the samples for the first 9 variables. Running 'example_LogRegSamp' will apply the 3 Logistic Regression samplers, and running 'example_LogRegRJSamp' will apply the Reversible-Jump Sparse Logistic Regression sampler.

Download

The complete set of .m files are available here. The report for this class project is available here. Some of the samplers also use RANDRAW.

The blogreg package contains many sub-directories that must be present on the Matlab path for the files to work. You can add these sub-directories to the Matlab path by typing (in Matlab) 'addpath(genpath(blogerg_dir))', where 'blogreg_dir' is the directory that the zip file is extracted to.

Note that a bug in the sampleLambda.m function was fixed on July 18, 2013 (thanks to Sharon Chiang for pointing this out).

Citations

If you use this code in a publication, please cite the work using the following information:

M. Schmidt. Bayesian Logistic Regression through Auxiliary Variables. CS535D Project Report, 2006.

Multinomial Logistic Regression

Hongxia Yang wrote a version of this code for multinomial logistic regression. This was added to the archive (in the 'multinomial' directory) on September 17, 2009. The file 'MLogist_example.m' shows how to run this code.

Mark Schmidt > Software > blogreg