Machine learning textbook
Machine Learning: a Probabilistic Perspective
A new textbook by Kevin Murphy (in preparation, to be published
by MIT Press, August 2012)
My book (MLaPP) is similar
to Bishop's
Pattern Recognition and Machine Learning,
Hastie et al.'s
The Elements of Statistical Learning,
and Wasserman's
All of Statistics,
with the following key differences:
- MLaPP is more accessible to undergrads.
It presupposes a background in probability, linear algebra,
calculus, and programming;
however, the mathematical level ramps up slowly, with more difficult
sections clearly denoted as such. This makes the book suitable for
both undergrads and grads.
Appendices provide summaries of the relevant mathematical background,
on topics such as linear algebra, optimization and classical
statistics, making the book self-contained.
- MLaPP is more practically oriented.
In particular, it comes with MATLAB software
to reproduce almost every figure, and to implement
almost every algorithm, discussed in the book.
It includes many worked examples of the methods applied to real
data, with readable source code online.
- MLaPP covers various important topics that are not discussed in
these other books, such as conditional random fields,
deep learning,
etc.
- MLaPP is "more Bayesian" than the Hastie or Wasserman books,
but "more frequentist" than the Bishop book. In particular, in MLaPP,
we make extensive use of MAP estimation, which we regard as "poor
man's Bayes". We prefer this view to the regularization interpretation of
MAP, because then all the methods in the book (except cross
validation...) can be viewed as probabilistic inference,
or some approximation thereof. The MAP interpretation also allows for
an easy "upgrade path" to more accurate methods of approximate
Bayesian inference, such as empirical Bayes, variational Bayes, MCMC,
SMC, etc.
- The emphasis is on simple parametric models (linear and logistic
regression, discriminant analysis / naive Bayes, mixture models, factor
analysis, graphical models, etc.), which are the ones most often used
in practice.
However, we also briefly discuss non-parametric models, such as Gaussian
processes and Dirichlet processes.
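
To make the MAP-versus-regularization point above concrete, here is a minimal sketch in Python (not taken from the book's MATLAB software). It assumes a one-parameter linear model y = w*x + noise with known noise variance sigma2 and a zero-mean Gaussian prior on w with variance tau2; the function names, the synthetic data, and these parameter values are all hypothetical choices for illustration. Under those assumptions, penalized least squares with penalty sigma2 / tau2 and the posterior mode give the same estimate.

```python
import random


def ridge_1d(xs, ys, lam):
    """Regularization view: minimize sum((y - w*x)^2) + lam * w^2."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / (sxx + lam)


def map_1d(xs, ys, sigma2, tau2):
    """Probabilistic view: y ~ N(w*x, sigma2), w ~ N(0, tau2).

    The posterior over w is Gaussian, so its mode (the MAP estimate)
    equals its mean. Returns (posterior mode, posterior variance).
    """
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    post_precision = sxx / sigma2 + 1.0 / tau2
    post_mode = (sxy / sigma2) / post_precision
    return post_mode, 1.0 / post_precision


# Synthetic data; all values here are illustrative assumptions.
random.seed(0)
true_w, sigma2, tau2 = 2.0, 0.25, 1.0
xs = [random.uniform(-1.0, 1.0) for _ in range(20)]
ys = [true_w * x + random.gauss(0.0, sigma2 ** 0.5) for x in xs]

w_ridge = ridge_1d(xs, ys, lam=sigma2 / tau2)
w_map, w_var = map_1d(xs, ys, sigma2, tau2)
print("ridge estimate:", w_ridge)
print("MAP estimate:  ", w_map, "posterior variance:", w_var)
```

The two estimates agree up to floating-point error, but only the probabilistic version also yields a posterior variance, which is one small instance of the "upgrade path" from a point estimate towards fuller Bayesian inference.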