CPSC 540: Machine learning

About the course

Machine learning (ML) is one of the fastest growing areas of science. It is largely responsible for the rise of giant data companies such as Google, and it has been central to the development of lucrative products, such as Microsoft’s Kinect, Amazon’s recommender system, the spam detection systems of Facebook, and the advertising engines of these and many other companies. ML is the key enabling technology behind face detection in consumer cameras, news personalization, book and movie recommender systems, image and video search, credit card fraud detection, speech recognition systems, and many more applications that most people have begun to take for granted. ML has also begun to make it possible to have automatically-driven cars, more efficient energy management systems, and improved systems for health-care management.

Academically, ML is one of the fastest growing fields on all fronts: theory, methodology and application. For historical reasons, ML is strongly connected to computer science and statistics departments in North America. However, it is also revolutionizing biology, astrophysics, engineering, and many other areas of science. ML innovations such as boosting and SVMs have strongly impacted statistics in recent years, and the interplay of statistics and ML has left us with tools such as random forests (a key component of the Kinect sensor). Tools from bandits and reinforcement learning are impacting operations research in business and health-care.

Logistics

Time: Tue/Thu 12:30-2:00pm

Location: Dempster 110

Instructor: Nando de Freitas (nando@cs)

Office hours: Wednesday 3:00-4:00pm (ICICS 146) and Thursday 2:00-3:00pm (ICICS 204)

TA: Bobak Shahriari (bshahr@cs)

Bobak's office hours: Friday 2:00-4:00pm (ICICS 146)

Online discussion: cpsc540 discussion group

Lecture videos

Textbook

My favourite book for this course is Machine Learning: a Probabilistic Perspective by Kevin Murphy. If you're serious about ML, buy a copy. Kevin Murphy has also developed a MATLAB toolbox for his book, called PMTK. However, the language for this course will be Python.

The machine learning book by Hastie, Tibshirani and Friedman, The Elements of Statistical Learning, is also a great resource and is free online. The book by Rajaraman and Ullman, Mining of Massive Datasets, is also available online and is a great source of ideas for large-scale implementation of machine learning and recommender systems.

Grading

Assignments: 30%

Exam: 30% (April 4)

Project: 40%

The instructor reserves the right to change the marking scheme under reasonable circumstances, provided the change is agreed to by a majority vote in class.

Assignments

Assignments will involve both written problems and Python programming problems.

All assignments are due on the specified date at 12:30pm. They are to be handed in at the classroom where the lecture takes place.

If an assignment is due on a Tuesday and you hand it in on the following Thursday, it is penalized 20%. If it is due on a Thursday and you hand it in on the following Tuesday, it is penalized 40%. Hence, if it is due on a Tuesday and handed in on the following Tuesday, it is penalized 60%. Delaying the hand-in to the Thursday after that increases the penalty to 80%; beyond that, the assignment receives a mark of zero.
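The late-penalty arithmetic above can be sketched in Python. This is only an illustration, not an official calculator: the function name late_penalty and its arguments are hypothetical. It assumes penalties simply accumulate per missed lecture (20% for a Tuesday-to-Thursday gap, 40% for a Thursday-to-Tuesday gap) and that anything above 80% becomes a zero mark. The policy states this explicitly only for Tuesday deadlines; extending it beyond the first late lecture for Thursday deadlines is an assumption.

```python
def late_penalty(due_day: str, lectures_late: int) -> int:
    """Return the penalty in percent (100 means a zero mark).

    due_day: "Tue" or "Thu", the lecture day on which the assignment was due.
    lectures_late: how many lectures after the due lecture it was handed in.
    Hypothetical helper; the additive rule is an assumption (see lead-in).
    """
    if due_day not in ("Tue", "Thu"):
        raise ValueError("due_day must be 'Tue' or 'Thu'")
    if lectures_late < 0:
        raise ValueError("lectures_late must be non-negative")

    # Cost of slipping from one lecture to the next, keyed by the starting day.
    step = {"Tue": 20, "Thu": 40}

    penalty = 0
    day = due_day
    for _ in range(lectures_late):
        penalty += step[day]
        day = "Thu" if day == "Tue" else "Tue"

    # Beyond an 80% penalty the assignment receives a zero mark.
    return 100 if penalty > 80 else penalty


# Due Tuesday, handed in the next Tuesday (two lectures late): 20 + 40.
print(late_penalty("Tue", 2))  # 60
```

Under these assumptions, an assignment due on a Tuesday reaches 80% at three lectures late and a zero mark at four.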

Messy homework will be penalized: it is your responsibility to ensure that the material is presented in a clear written form. Always hand in all pseudocode. Please don't forget to add your name and student number, and please staple your homework.

Academic honesty is important. If you find the answer to a homework problem on a website, in a book, etc., please acknowledge this on the front page of your homework. You will not be penalized. We like people who acknowledge their sources, and we don't mind if you seek help from friends to solve the homework problems. In fact, we encourage teamwork. However, you must ensure that you understand what you are doing; otherwise you are missing out on the learning experience. Not doing the homework will also hurt your performance on the exam.