About the course

The analysis of data (DNA, music, images, video, news, blogs, medical records, software, computer game logs, multimedia, social networks, environmental signals) is an important frontier in computer science. This frontier is expanding rapidly thanks to new developments in mathematical modelling, algorithms, data management and computing infrastructure. It is having a profound impact not only on science and medicine, but also on e-commerce, marketing and business in general. Inference and learning with massive datasets is the key ingredient of the intelligent machines of the future.

This course will provide an introduction to this exciting, growing field. It will teach the basic principles and skills required for analysing data in a principled way: finding statistical patterns, dimensionality reduction, clustering, classification and prediction. Students will also have the opportunity to learn Python, a widely used programming language.

Logistics

Time: Mon Wed Fri 1:00-2:00pm

Location: Dempster 301

Instructor: Nando de Freitas (nando@cs)

Office hours: Fri 2:00-3:00pm (ICICS 104)

TAs: Nimalan Mahendran (nimalan@cs) and Eric Brochu (ebrochu@cs)

Tutorial 1: Mon 9:00-10:00am (Frank Forward 519)

Tutorial 2: Fri 3:00-4:00pm (FSC 1613)

Newsgroup: ubc.courses.cpsc.340

Textbooks

There is no official textbook. I will provide plenty of handouts, but I do recommend the following books:
  • Information Theory, Inference, and Learning Algorithms. It's free and good!
  • The Elements of Statistical Learning.
  • Pattern Recognition and Machine Learning.
  • All of Statistics.
  • Pattern Classification.

Grading

  • Assignments: 20%
  • Midterm 1: 20% (Wed Oct 21)
  • Midterm 2: 20% (Wed Nov 18)
  • Final: 40% (3:30pm, December 9)
  • There will also be a special research project for advanced students.
  • The instructor has the right to change the marking scheme under reasonable circumstances.
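
As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes each component is recorded as a percentage out of 100; the names `WEIGHTS` and `course_mark` and the example scores are made up for illustration, and the marking scheme itself may change as noted above.

```python
# Weights taken from the grading list; they sum to 1.0.
WEIGHTS = {
    "assignments": 0.20,
    "midterm1": 0.20,
    "midterm2": 0.20,
    "final": 0.40,
}


def course_mark(scores):
    """Return the weighted course mark.

    Assumes every entry in `scores` is a percentage in [0, 100].
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, purely for illustration.
example = {"assignments": 90, "midterm1": 70, "midterm2": 80, "final": 75}
print(course_mark(example))
```

The final exam counts twice as much as any other single component, so a change of 10 points on the final moves the course mark by 4 points.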

Assignments

  • Assignments will involve both written and Python programming problems.
  • All assignments are due at the specified time. 20% off for each day late. Assignments will not be accepted more than 5 days late.
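
The late policy can be sketched in Python as follows. This is only one reading of the rule, not an official formula: it assumes "20% off" means 20 percentage points of the assignment's maximum per day, that a partial day counts as a full day, and that a submission more than 5 days late scores zero. The names `PENALTY_PER_DAY`, `MAX_DAYS_LATE` and `late_adjusted_mark` are made up for illustration.

```python
import math

PENALTY_PER_DAY = 20  # percentage points per day late (assumed reading of "20% off")
MAX_DAYS_LATE = 5     # later submissions are not accepted


def late_adjusted_mark(raw_mark, days_late):
    """Apply the late penalty to a mark given as a percentage in [0, 100].

    `days_late` may be fractional; a partial day is rounded up (an assumption).
    """
    if days_late <= 0:
        return float(raw_mark)
    whole_days = math.ceil(days_late)
    if whole_days > MAX_DAYS_LATE:
        return 0.0  # not accepted
    return max(0.0, raw_mark - PENALTY_PER_DAY * whole_days)


# Hypothetical example: an 85% assignment handed in a day and a half late.
print(late_adjusted_mark(85, 1.5))
```

Under this reading, an assignment handed in exactly 5 days late already loses the full 100 points, which is consistent with the cutoff.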
LATEST:

    • The machine learning book by Hastie, Tibshirani and Friedman is now online: The Elements of Statistical Learning.
    • Chapters 14, 15 and 20 of the artificial intelligence book by Stuart Russell and Peter Norvig are strongly recommended reading for this course. I'll provide partial photocopies of chapters 14 and 15 in class. Chapter 20 is available online.
    • This AIspace page at UBC has lots of videos and applets about inference in directed probabilistic graphical models (aka Bayesian networks or belief networks).
    • For graphical models and Beta-Bernoulli models, I recommend A Tutorial on Learning with Bayesian Networks by David Heckerman.
    • Kevin Murphy has compiled a nice page about Bayesian learning.
    • Wikipedia tutorial on the SVD.
    • The following handout should help you with linear algebra revision: PDF
    • The homework should be handed in on Wednesday at the beginning of class. Please note that messy homework will be penalized: it is your responsibility to ensure that the material is presented in a clear written form. All pseudocode must be handed in. Please don't forget to add your name and student number.

USEFUL LINKS: