Topics in Computer Graphics / AI (CPSC 533R/532R)

Deep Learning models for Computer Graphics and Computer Vision

Winter Term 2, 2019/2020 - Course Information

This page is from a past lecture. If you are looking for the ongoing/upcoming course, select it here.

Computer Graphics (CG) enables artists to realize their creative visions. Technically, one is concerned with efficiently simulating light transport for photorealistic rendering, finding the right parametrization for the shape and appearance of objects, and making these digital models accessible to the artist. Creating digital content is a tedious process, which Computer Vision (CV) alleviates by reconstructing real scenes from natural images.

This course will explore the recent trend of using Deep Learning (DL) for the above-mentioned tasks, covering neural rendering, generative networks, disentangled representation learning, and reconstruction using neural networks. The first third of the course will introduce the essential tools (CNN architectures, attention models, output representations, and batch-norm layers) needed to understand the advanced topics covered in the main part (self-supervised and generative models, such as VAEs and GANs).

A central theme is the learning of neural representations from labeled and unlabeled data. At one end, we will discuss approaches that can reconstruct a person photorealistically from a single image, down to the scale of wrinkles, by using a volumetric grid with thousands of parameters. At the other end, we will study techniques for learning compact and meaningful embeddings, such as a parametric face model that decomposes expression, appearance, and pose and makes these latent dimensions controllable by the artist. At the end of the course, students will be able to work with a diverse spectrum of DL tools and be prepared to conduct research on how to create models that both yield high visual quality and have artistic utility.

Changes due to COVID-19, running online from March 16th

Summary: Most things (course times, office hours, reviews, presentations) will stay the same, except that meetings are online and grading was refined to reduce workload.

Details: We will meet at the usual times, now on Collaborate Ultra on Canvas. You can access Collaborate via
Canvas -> CPSC 532R/533R 201 -> Collaborate Ultra menu item -> Join Course Room.
When Collaborate Ultra starts, your sound and video settings will be checked. Select a headset (e.g., the one you use with your phone) to ensure good sound quality. Once your input devices are set, you can join the running session and participate in class as usual. For privacy reasons, we will not use video for now. There is a 'raise hand' button at the bottom of the screen; use it as you would raise your hand in a real class. Instead of using "Join Course Room" in the instructions above, there is also the option to join by phone for those who have an unreliable internet connection. Please be on time (the room opens 10 minutes in advance!) so that the others are not kept waiting.

Online Peer Discussion: Please send in the reviews of the papers we read as usual. Because classroom discussion will be hampered in the online format, we will use the Canvas assignment and evaluation system for peer discussion. You will receive a peer evaluation request on the day of the presentation. Everybody will be assigned one review at random. It would be great if you could 1) answer the questions posed by your fellow student and 2) comment on the future work idea by the next day (Wednesday evening for the Tuesday class and Friday evening for the Thursday class). You may also 3) give feedback on the quality of the paper review, which may help and encourage the author. For all three points, use the evaluation comment box. We recommend writing in another editor and using copy and paste, because this edit box is small. Moreover, post particularly difficult and interesting questions and your thoughts on them via Piazza, so that we can discuss them all together. While the reviews are still mandatory, the peer review is optional and will not be graded.

Preparing your presentation: You can either present live (error-prone in case of an unreliable internet connection or software failure) or pre-record your presentation as described in the following slides (recording instructions), in which case we will watch it together during the normal lecture time.

The presentation should still be 20 minutes long. Export your presentation (voice and slides) as a video. Let us know if you have any issues following the instructions above. PowerPoint and Keynote are easy to use. They let you record a normal presentation and re-record the speech for individual slides, should you not be happy with the initial outcome. You can still develop slides in Google Slides and other tools; simply export them to the PowerPoint format and use PowerPoint only for recording.

If you haven't done such a recording before, try it out as soon as possible.

New grading scheme: The final project will contribute either 5% or 40% of the course grade:

Original Grading Scheme from the syllabus: Course project (40%), Assignments (25%), Presentation (35%)
Revised Scheme: Course project (5%), Assignments (50%), Presentation (45% = 15% reviews + 30% presentation)

Students will automatically receive the weighting that maximizes their course grade; you do not need to contact me or fill out a form.
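The rule above can be sketched in a few lines of Python. This is only an illustration, not the official grade computation: the function names and the example scores are hypothetical, and the sketch assumes each component is scored on a 0 to 100 scale. The split of the original 35% presentation weight into 10% reviews and 25% presentation is taken from the Grading section below.

```python
# Weights in percent of the course grade, as listed in the two schemes above.
ORIGINAL = {"project": 40, "assignments": 25, "reviews": 10, "presentation": 25}
REVISED = {"project": 5, "assignments": 50, "reviews": 15, "presentation": 30}


def weighted_grade(scores, weights):
    """Combine component scores (assumed 0-100) using percentage weights."""
    return sum(scores[name] * weight for name, weight in weights.items()) / 100.0


def course_grade(scores):
    """Return the better of the two weightings, as described in the text."""
    return max(weighted_grade(scores, ORIGINAL), weighted_grade(scores, REVISED))


# Hypothetical student with strong assignments and a weaker project.
example = {"project": 60, "assignments": 90, "reviews": 80, "presentation": 85}
print(weighted_grade(example, ORIGINAL))  # 75.75
print(weighted_grade(example, REVISED))   # 85.5
print(course_grade(example))              # 85.5
```

In this example the revised scheme wins because the assignment score is high; a student with a strong project would keep the original weighting instead.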

In the revised scheme, the assignments receive a larger (relative) weight since they were completed before the COVID-19 disruption. We will make the presentation grading available soon and publish the mean score. This will enable all of you to make an informed decision on the time/grade tradeoff when working on the final project. Note that the Presentation score depends on the presentation as well as the reviews you submit.

End changes due to COVID-19

Registration and official listing: CPSC 532R CPSC 533R

Instructor: Helge Rhodin (rhodin@cs.ubc.ca)
Office hours: Wednesday 11 am – noon, room ICCS X653

Teaching Assistant: Yuchi Zhang
Office hours: Tuesday 1 pm – 2 pm, room ICCS X341

Classes: TR 9:30-11:00 (winter term 2)
Room: ICCS 246

Piazza: Piazza.com/ubc.ca/winterterm22019/cs532533

Compute: The assignments and course projects will deal with images and require CUDA-capable GPUs for training. You can use your own GPU, free cloud services such as Google Colab, or the UBC machines lin01.students.cs.ubc.ca to lin25.students.cs.ubc.ca.

Prerequisites

Machine Learning. It is recommended to first study CPSC 340 or an equivalent course. Before starting this course, the following ML terms should be familiar to you: stochastic gradient descent and momentum; classification and regression; weight initialization and data whitening; training, validation and test set; convolution and translational invariance; and regularization and weight decay. Please contact me if you have any doubts.

Python. The exercises are designed to prepare students for the practical project and provide a step-by-step introduction to the PyTorch machine learning framework. While no prior knowledge of PyTorch is required, basic Python experience is expected; no Python programming tutorial will be offered.

Prior courses in CG and CV are a plus and course projects that build upon these are encouraged.

Grading

The following describes the original scheme; changes due to COVID-19 are explained above.
Assignments (25%):

A1. Playing with PyTorch (5%)
A2. 3D human pose estimation, end-to-end training (10%)
A3. Learning shape models, unsupervised training (10%)

Presenting research (35%):

Reviewing papers (10%)
Presenting one paper (25%)

Project (40%)

Proposal and experiment design
Coding and evaluation
LaTeX report
Short project presentation

Auditing

Audits are permitted on a case-by-case basis. Auditing students have to participate actively in the reading part of the class, i.e., reading the papers, writing reviews, and participating in the discussion. Auditing students are more than welcome to also work on the assignments.

Lectures (preliminary schedule)

Date		Content	Reading
W1	Jan 7	Introduction lecture slides - Challenges in using deep learning for creative tasks - Course expectations and grading - First steps in PyTorch Homework 1 release assignment1_V1.1.zip	SIGGRAPH program / video Pytorch intro
W1	Jan 9	Deep learning basics and best practices lecture slides suppl. slides lecture2.ipynb - regression/classification, objective functions - stochastic gradient descent, vanishing and exploding gradients.	Deep Learning Book - Chapter 8 Adam Optimizer
W2	Jan 14	Network architectures for image processing lecture slides - Which neural network architectures work, why and how? - ResNet, DenseNet, UNet, FlowNet, MaskRCNN	Deep Learning Book - Chapter 9 ResNet, Unet
W2	Jan 16	Advanced architectures and representing sparse 2D keypoints lecture slides lecture4.ipynb - heat maps, part-affinity fields - regression vs. classification Homework 1 due. Homework 2 release Assignment2.zip (updated)	Heat Maps Part Affinity Fields
W3	Jan 21	Representing 3D skeletons and point clouds lecture slides - PointNet, articulated skeletons - Chamfer distance and other metrics (MPJPE, PCK) - Affine and perspective transformations	PointNet
W3	Jan 23	Representing and learning shapes lecture slides - voxels, implicit functions, location maps - uv-coordinates, graph CNN, spiral convolution	Dense Pose Location Maps Spiral convolution
W4	Jan 28	Representation learning I (deterministic) lecture slides - principal component analysis (PCA) - auto-encoder (AE) Homework 2 due. ~~Homework 3 release~~	PCA face model Deep Learning Book - Chapter 14
W4	Jan 30	Representation learning II (probabilistic) lecture slides - variational autoencoder (VAE) - generative adversarial network (GAN) Homework 3 release Assignment3.zip (posted Feb. 1)	Deep Learning Book - Chapter 20
W5	Feb 4	Sequential decision making lecture slides - Monte Carlo methods - reinforcement learning	Deep Learning Book - Chapter 17
W5	Feb 6	GANs and unpaired image translation lecture slides - cycle consistency - style transfer ~~Homework 3 due~~	Cycle Gan Style transfer
W6	Feb 11	Attention models lecture slides - spatial transformers, RoI pooling, attention maps - camera models and multi-view Homework 3 due (new deadline)	RoI pooling, Spatial Transformer Multi-view Geometry
W6	Feb 13	Project Pitches (3 min pitch) Project proposal due
W7		Midterm Break (no class)	-
W8	Feb 25	Conditional content generation Park et al., Semantic Image Synthesis with Spatially-Adaptive Normalization paper Li et al., Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments paper	Read one of the two papers listed for each class. Submit review on the day before.
W8	Feb 27	Motion transfer Chan et al., Everybody Dance Now paper Gao et al., Automatic Unpaired Shape Deformation Transfer paper
W9	March 3	Hints on writing the paper review Character animation Rhodin et al., Interactive Motion Mapping for Real-time Character Control paper Holden et al., Phase-Functioned Neural Networks for Character Control paper
W9	March 5	Self-supervised learning Vondrick et al., Tracking Emerges by Colorizing Videos paper Doersch et al., Unsupervised visual representation learning by context prediction paper
W10	March 10	Recording presentations Novel view synthesis Hinton et al., Transforming Auto-encoders, paper Rhodin et al., Unsupervised Geometry-Aware Representation for 3D Human Pose Estimation, paper
W10	March 12	Differentiable rendering Rhodin et al., A Versatile Scene Model with Differentiable Visibility Applied to Generative Pose Estimation, paper Liu et al., Soft Rasterizer: A Differentiable Renderer for Image-based 3D Reasoning, paper Sitzmann et al., DeepVoxels: Learning Persistent 3D Feature Embeddings paper
W11	March 17	Learning person models Lorenz et al., Unsupervised Part-Based Disentangling of Object Shape and Appearance paper Rhodin et al., Neural Scene Decomposition for Human Motion Capture paper
W11	March 19	Object parts and physics Li et al., GRASS: Generative Recursive Autoencoders for Shape Structures, paper Xie et al., tempoGAN: A Temporally Coherent, Volumetric GAN for Super-resolution Fluid Flow paper
W12	March 24	Objective functions and log-likelihood Christopher Bishop, Mixture Density Networks paper Jonathan T. Barron, A General and Adaptive Robust Loss Function paper
W12	March 26	Self-supervised object detection Crawford et al., Spatially invariant unsupervised object detection with convolutional neural networks. paper Bielski and Paolo Favaro, Emergence of Object Segmentation in Perturbed Generative Models, paper
W13	March 31	Mesh processing Bagautdinov et al., Modeling Facial Geometry using Compositional VAEs paper Verma et al., Feastnet: Feature-steered graph convolutions for 3d shape analysis paper
W13	April 2	Neural rendering Saito et al., PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization paper How to write a paper/report lecture slides
W14	April 7	Project Presentations. (10 min talk per group, first half of groups)	-
	April 9	Project Presentations. (10 min talk per group, second half of groups)	-
	April 14	Project Report submission. (8 page PDF document, 11:59 pm)	-
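
For orientation, two of the metrics named in the W3 lecture on 3D skeletons and point clouds (MPJPE and Chamfer distance) can be sketched in plain Python. This is a minimal sketch and not the lecture's reference code: the function names are hypothetical, points are assumed to be coordinate tuples, and the Chamfer distance shown is the unsquared, mean-reduced symmetric variant (other variants use squared distances or sums).

```python
import math


def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance between
    corresponding predicted and ground-truth joints."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt need the same, non-zero number of joints")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two non-empty point sets.

    Assumption: unsquared distances, averaged per set, then summed.
    """
    def one_sided(src, dst):
        # For every source point, distance to its nearest neighbor in dst.
        return sum(min(math.dist(s, d) for d in dst) for s in src) / len(src)

    return one_sided(a, b) + one_sided(b, a)


# Two joints, one of them off by one unit along z.
print(mpjpe([(0, 0, 0), (1, 0, 0)], [(0, 0, 1), (1, 0, 0)]))  # 0.5
# One point set has an extra point at distance 5 from the origin.
print(chamfer_distance([(0, 0, 0)], [(0, 0, 0), (3, 4, 0)]))  # 2.5
```

MPJPE requires a one-to-one joint correspondence, whereas the Chamfer distance compares unordered point sets of possibly different sizes, which is why it suits point clouds.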

Assignments

While the assignments add points to the final grade, their primary goal is to prepare you for the project. The first assignment is to set up a general training framework that lets one experiment with different datasets, neural network architectures, and objective functions. This framework will be extended and used in the remaining two assignments and also the final project. Instead of step-by-step instructions, exercises are posed as high-level tasks. Questions are set up to train independent thinking, which is integral to solving actual problems in research and industry settings. This does not mean that you are without support; you are highly encouraged to search online and consult the abundance of deep learning tutorials as a basis for your solution. Furthermore, office hours are set up to provide personal assistance.

Academic Integrity. Assignments must be solved individually using the available course material and other online sources. You are allowed neither to copy nor to look at parts of any of the assignments from anyone else. Accordingly, it is not allowed to post solutions online or distribute them in (private) forums. The university policies on academic integrity are rigorously applied.

Submission. Solutions must be handed in through the Canvas system and be kept private.

Deadline and grading. Assignments will be due on the dates specified in the schedule, always at 11:59 pm PST. A submission that is one day late will still be accepted, but its score is reduced by 25%. The grading is based on correctness and completeness, as detailed in each exercise description.
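
As a worked example of the late policy, the following sketch applies the one-day penalty. It rests on two assumptions that the text does not state explicitly: the 25% is taken off the earned score (multiplicatively), and submissions more than one day late receive no points. The function name is hypothetical.

```python
def late_adjusted_score(score, days_late):
    """Apply the one-day late penalty described above.

    Assumptions: the penalty is 25% of the earned score, and anything
    later than one day is not accepted (scored as 0).
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if days_late == 0:
        return float(score)
    if days_late == 1:
        return score * 0.75
    return 0.0


print(late_adjusted_score(80, 0))  # 80.0
print(late_adjusted_score(80, 1))  # 60.0
print(late_adjusted_score(80, 2))  # 0.0
```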

Research papers

The second half of the course will constitute a reading group, focusing on seminal and recent research. Each student will write reviews every week and is expected to participate actively in the discussion, along with presenting once during the course. Preferences will be considered for the presentation topic assignment.

Reviews. All students are expected to read one of the two papers that will be presented and discussed in each class. Nevertheless, it is recommended to at least skim through both papers. Each presentation will be followed by an in-depth discussion on technical aspects, relations to other works, and practical utility. To seed this discussion, a short review of the selected paper has to be submitted on the day before the class. Try to raise non-obvious aspects with the following:

A summary of the main contribution
One question that is well suited for discussion
One strength
One weakness

These reviews will be distributed across all course participants. Keep your review concise (less than half a page) and self-contained.

Presentation. The presentation should be compact and precisely 20 minutes long. Focus on the most crucial points, those needed to understand the content, novelty, and impact of the given work. Most importantly, draw links to previous presentations, lecture content, and the related work you know. Such an embedding into a broader context will make your presentation unique and stimulating, especially for those who have read the underlying paper in detail. You can take the paper illustrations and other available online material, such as videos supplementing the paper, as a basis for your presentation, as long as you reference the source adequately on each slide. You will be the expert and be required to study the paper in sufficient depth to answer questions and trigger a rich discussion. Practice your talk beforehand, ideally in front of a small audience, to get the timing and the messages right. Note that a part of your audience will have read the paper in detail and others will only have skimmed through it. Therefore, try to address experts and newcomers alike. The tutor will assist you in this and give feedback on your planned presentation once.

Deadlines and grading.

The review must be submitted on the evening before the class, by 11:59 pm via Canvas. Missing out on a total of two reviews will be tolerated. Full points for a review are given if all four of the above-listed bullet points are answered. On the day of your presentation, send in your slides exported as PDF instead of the review (see below).
The presentation slides must be handed in and discussed with the tutor at least two working days before the presentation. It is your responsibility to set up a meeting (~30 min duration) with the TA in advance. This session is for your own benefit and will not be graded. Submit your final slides on Canvas under the Slide upload/Presentation Assignment.
Presentation. Arrive early on the day of your presentation and ensure that you have all the necessary equipment with you and that your presentation runs smoothly. We grade the presentation based on quality, talk structure, and the context established to related work.

Project

The whole course is structured to prepare for the final research project. The goal is to either extend an existing approach, combine existing tools for solving a new problem, or come up with an entirely new algorithm. Some topics and open research problems are given in class, but we highly encourage you to come up with your own idea on the basis of your unique background and experience. Projects can deviate from the core lecture topics, as long as they are within CG, CV, and ML. You are encouraged to work in groups of two. Joining forces will allow you to complement your strengths and work on more challenging and rewarding problems. Nevertheless, both of you must have unique responsibilities, which have to be explicitly listed, and be involved in all of the following project stages:

Project proposal. You must deliver a 3-minute pitch and an accompanying written plan (one page, 11pt font). The proposal must cover the research idea, the possible algorithmic contributions, and an outline of the evaluation.
Literature review. Study the most related work. For instance, you can start with those papers discussed in class that are similar to your task, and search for more papers that cite and are cited by them. Google Scholar has the appropriate tools for that. This will form the base of your related work section.
Development and coding. Make use of the frameworks developed during the course assignments and utilize other online sources as a basis. The project is not meant to re-invent the wheel but to develop something new on top of existing works. The code must be submitted alongside the report, and your contributions and the sources of the remaining parts must be documented in the code and report.
Evaluation. Design and run experiments that validate the correctness of your contribution and, ideally, show qualitative or quantitative improvements on existing solutions. Showing that an approach does not work well can also be a contribution, but is generally harder to get "waterproof".
Report writing. In any case, the report becomes an expanded version of the proposal plan that fills out the motivation, related work, method description, and evaluation sections. It should follow the overall structure of a research paper and be 8 pages long in the double-column format of CVPR [link]. Start early on the writing, e.g., compile the related work section while you do the literature review. Writing is an incremental process that requires many iterations until the outcome is mathematically sound, is explained clearly, and reads well.
Supplemental document (optional). In case the 8 pages are insufficient, proofs and implementation details can be given in an appendix, and a video can be submitted to give additional qualitative results in motion. However, the main report must remain self-contained.

Deadlines and grading.

Project proposal, pitch, report, and final presentations will be due on the dates and times specified in the schedule above (proposal and pitch before the mid-term break, presentations in the last two weeks, submitting the report on the last day of classes). A submission that is one day late has its gained points reduced by 25%. The grade will mainly depend on the project definition, how well you present it in the report, how distinctly you position your work in the related literature, how thorough your experiments are, and how thoughtfully your contribution is derived.