LCI Forum Talk

PMTK is a new Matlab package for probabilistic modeling of data, being developed by Matt Dunham and Kevin Murphy, with additional contributions from Mark Schmidt, Cody Severinski and others. The toolkit is built around the "holy trinity" of Bayesian statistics, graphical models and machine learning. It provides a unified framework that encompasses a large fraction of the most widely used statistical models, including multivariate Gaussians, mixture models, (sparse) linear and logistic regression models, and directed and undirected graphical models. It also supports a wide variety of algorithms for both Bayesian inference (including exact computation and deterministic and stochastic approximations) and MAP/ML estimation (including EM, bound optimization, and conjugate and projected gradient methods).

PMTK is primarily designed to accompany Murphy's textbook "Machine learning: a probabilistic approach" (work in progress), but can also be used independently of it. Consequently, PMTK provides a simple, unified interface to a large variety of methods. It uses the latest object-oriented features of Matlab (introduced in 2008a) to control the complexity in a disciplined way. In addition to providing this layer of common "syntactic sugar" on top of existing code, it aims to provide implementations of the most commonly used models and algorithms that are both readable and reasonably efficient.

PMTK version 1, which was started in August 2008, is already available as open source on Google Code (*). However, we are currently in the middle of a complete reimplementation, based on a cleaner design. In this talk, we will describe the design principles behind the new version (PMTK 2), give a few live demos, and compare it to other existing toolkits, such as BNT, BUGS and Weka.

* http://www.cs.ubc.ca/~murphyk/pmtk/