\documentclass{article}
\usepackage{fullpage}
\usepackage{color}
\usepackage{amsmath}
\usepackage{url}
\usepackage{verbatim}
\usepackage{graphicx}
\usepackage{parskip}
\usepackage{amssymb}
\usepackage{nicefrac}
\usepackage{listings} % For displaying code
% Answers
\def\ans#1{\\\gre{Answer: #1}}
% Colors
\definecolor{blu}{rgb}{0,0,1}
\def\blu#1{{\color{blu}#1}}
\definecolor{gre}{rgb}{0,.5,0}
\def\gre#1{{\color{gre}#1}}
\definecolor{red}{rgb}{1,0,0}
\def\red#1{{\color{red}#1}}
\def\norm#1{\|#1\|}
% Math
\def\R{\mathbb{R}}
\def\argmax{\mathop{\rm arg\,max}}
\def\argmin{\mathop{\rm arg\,min}}
\newcommand{\mat}[1]{\begin{bmatrix}#1\end{bmatrix}}
\newcommand{\alignStar}[1]{\begin{align*}#1\end{align*}}
% LaTeX
\newcommand{\fig}[2]{\includegraphics[width=#1\textwidth]{a0f/#2}}
\newcommand{\centerfig}[2]{\begin{center}\includegraphics[width=#1\textwidth]{#2}\end{center}}
\def\items#1{\begin{itemize}#1\end{itemize}}
\def\enum#1{\begin{enumerate}#1\end{enumerate}}
\begin{document}
\title{CPSC 340 Assignment 0 (due Friday September 15 ATE)}
% \author{Warm-up}
\date{}
\maketitle
\vspace{-4em}
\emph{Rationale for Assignment 0}: CPSC 340 is tough because it combines knowledge and skills across several disciplines. To succeed
in the course, you will need:
\begin{itemize}
\item Math to the level of the course prerequisites: linear algebra, multivariable calculus, some probability.
\item Basic Julia programming, and the ability to translate from math to programming and back.
\item Statistics, algorithms, and data structures to the level of the course prerequisites.
\item Some basic LaTeX skills so that you can typeset equations and submit your assignments.
\end{itemize}
The purpose of this assignment is to make sure you are prepared for this course. I anticipate that each
of you will have different strengths and weaknesses, so don't be worried if you struggle with \emph{some} aspects
of the assignment. But if you find this assignment
to be very difficult overall, that is an early warning sign that you may not be prepared to take CPSC 340
at this time. Future assignments will be more difficult than this one.
\textbf{IMPORTANT!!!!! Before proceeding, please carefully read the homework instructions}:\\ \url{www.cs.ubc.ca/~schmidtm/courses/340-F17/assignments.pdf}.
You may also want to read the answers to this Quora question as motivation:\\
{\scriptsize \url{https://www.quora.com/Why-should-one-learn-machine-learning-from-scratch-rather-than-just-learning-to-use-the-available-libraries}}
We use \blu{blue} to highlight the deliverables that you must answer/do/submit with the assignment.
\section{Linear Algebra Review}
For these questions you may find it helpful to review these notes on linear algebra:\\
\url{http://www.cs.ubc.ca/~schmidtm/Documents/2009_Notes_LinearAlgebra.pdf}
\subsection{Basic Operations}
Use the definitions below,
\[
\alpha = 2,\quad
x = \left[\begin{array}{c}
0\\
1\\
2\\
\end{array}\right], \quad
y = \left[\begin{array}{c}
3\\
4\\
5\\
\end{array}\right],\quad
z = \left[\begin{array}{c}
1\\
2\\
-1\\
\end{array}\right],\quad
A = \left[\begin{array}{ccc}
3 & 2 & 2\\
1 & 3 & 1\\
1 & 1 & 3
\end{array}\right],
\]
and use $x_i$ to denote element $i$ of vector $x$.
\blu{Evaluate the following expresions} (you do not need to show your work).
\enum{
\item $\sum_{i=1}^n x_iy_i$ (inner product).
\item $\sum_{i=1}^n x_iz_i$ (inner product between orthogonal vectors).
\item $\alpha(x+y)$ (vector addition and scalar multiplication).
\item $\norm{x}$ (Euclidean norm of $x$).
\item $x^T$ (vector tranpose).
\item $A^T$ (matrix transpose).
\item $Ax$ (matrix-vector multiplication).
}
\subsection{Matrix Algebra Rules}
Assume that $\{x,y,z\}$ are $n \times 1$ column vectors and $\{A,B,C\}$ are $n \times n$ real-valued matrices. \blu{State whether each of the below is true in general} (you do not need to show your work).
\begin{enumerate}
\item $x^Ty = \sum_{i=1}^n x_iy_i$.
\item $x^Tx = \norm{x}^2$.
\item $x^T(y+z) = z^Tx + x^Ty$.
\item $x^T(y^Tz) = (x^Ty)^Tz$.
\item $AB=BA$.
\item $A(B + C) = AB + AC$.
\item $(AB)^T = A^TB^T$.
\item $x^TAy = y^TA^Tx$.
\item $\det A \neq 0 \iff A$ is invertible.
\end{enumerate}
\subsection{Special Matrices}
\blu{In one sentence, write down the defining properties of the following special types of matrices}:
\enum{
\item Symmetric matrix.
\item Identity matrix.
\item Orthogonal matrix.
}
\section{Probability Review}
\subsection{Rules of probability}
\blu{Answer the following questions.} You do not need to show your work.
\begin{enumerate}
\item You flip 4 fair coins. What is the probability of observing 3 heads?
\item You are offered the opportunity to play the following game: your opponent rolls 2 regular 6-sided dice. If the difference between the two rolls is at least 3, you win \$12. Otherwise, you get nothing. What is a fair price for a ticket to play this game once? In other words, what is the expected value of playing the game?
\item Consider two events $A$ and $B$ such that $\Pr(A, B)=0$. If $\Pr(A) = 0.4$ and $\Pr(A \cup B) = 0.95$, what is $\Pr(B)$? Note: $p(A, B)$ means
``probability of $A$ and $B$'' while $p(A \cup B)$ means ``probability of $A$ or $B$''. It may be helpful to draw a Venn diagram.
\end{enumerate}
\subsection{Bayes Rule and Conditional Probability}
\blu{Answer the following questions.} You do not need to show your work.
Suppose a drug test produces a positive result with probability $0.95$ for drug users, $P(T=1|D=1)=0.95$. It also produces a negative result with probability $0.99$ for non-drug users, $P(T=0|D=0)=0.99$. The probability that a random person uses the drug is $0.0001$, so $P(D=1)=0.0001$.
\begin{enumerate}
\item What is the probability that a random person would test positive, $P(T=1)$?
\item In the above, do most of these positive tests come from true positives or from false positives?
\item What is the probability that a random person who tests positive is a user, $P(D=1|T=1)$?
\item Are your answers from part 2 and part 3 consistent with each other?
\item What is one factor you could change to make this a more useful test?
\end{enumerate}
\subsection{Bayes Rule and Independence}
On a game show, a contestant is told the rules as
follows:
\begin{quote}
There are three doors, labelled 1, 2, 3. A single
prize has been hidden behind one of
them. You get to select one door. Initially your chosen door will {\em not\/}
be opened. Instead, the gameshow host will open one of the other two doors,
and {\em he will do so in such a way as not to reveal the prize.}
For example, if you first
choose door 1, he will then open {one\/} of doors 2 and 3, and it
is guaranteed that he will choose which one to open so that
the prize will not be revealed.
At this point, you will be given a fresh choice of door:
you can either stick with your first choice,
or you can switch to the other
closed door. All the doors will then be opened and
you will receive whatever is behind your final
choice of door.
\end{quote}
Imagine that the contestant chooses door 1 first; then the gameshow host
opens door 3, revealing nothing behind the door, as promised.
\blu{Should the contestant (a) stick with door 1, or (b)
switch to door 2, or (c) does it make no difference?}
Assume that initially the prize is equally likely to be
behind any of the 3 doors, that the host always opens a door that doesn't contain a prize, and that if the prize is behind the selected door that the host is equally likely to choose door 2 or 3.
\section{Calculus Review}
\subsection{One-variable derivatives}
\blu{Answer the following questions.} You do not need to show your work.
\begin{enumerate}
\item Find the minimum value of the function $f(x) = 3x^2 -2x + 5$ for $x \in \R$.
\item Find the maximum value of the function $f(x) = x(1-x)$ for $x$ in the interval $[0,1]$.
\item Find the minimum value of the function $f(x) = x(1-x)$ for $x$ in the interval $[0,1]$.
\item Let $p(x) = \frac{1}{1+\exp(-x)}$ for $x \in \R$. Compute the derivative of the function $f(x) = -\log(p(x))$ and simplify it by using the function $p(x)$.
\end{enumerate}
Remember that in this course we will $\log(x)$ to mean the ``natural'' logarithm of $x$, so that $\log(\exp(1)) = 1$. Also, obseve that $p(x) = 1-p(-x)$ for the final part.
\subsection{Multi-variable derivatives}
\blu{Compute the gradient $\nabla f(x)$ of each of the following functions.}
\begin{enumerate}
% \item $f_1(x) = \sin(x)$
\item $f(x) = x_1^2 + \exp(x_2)$ where $x \in \R^2$.
\item $f(x) = \exp(x_1 + x_2x_3)$ where $x \in \mathbb{R}^3$.
\item $f(x) = a^Tx$ where $x \in \R^2$ and $a \in \R^2$.
\item $f(x) = x^\top A x$ where $A=\left[ \begin{array}{cc}
2 & -1 \\
-1 & 1 \end{array} \right]$ and $x \in \mathbb{R}^2$.
\item $f(x) = \frac{1}{2}\norm{x}^2$ where $x \in \R^d$.
\end{enumerate}
Hint: it is helpful to write out the linear algebra expressions in terms of summations.
\subsection{Derivatives of code}
The zip file \texttt{a0.zip} contains a Julia file named \texttt{grads.jl} which defines several functions. \blu{Complete the functions \texttt{grad1}, \texttt{grad2}, and \texttt{grad3} (which compute the gradients of \texttt{func1}, \texttt{func2}, and \texttt{func3})}. Include the code in PDF file for this section, and also in your zip file.
Hint: for many people it's easiest to first understand on paper what the code is doing, then compute
the gradient, and then translate this gradient back into code. We've given you \texttt{func0} and \texttt{grad0} as an example. Also, we've provided the function \texttt{numGrad} which approximates the gradient numerically. Below is an example of using these functions:\\
%\fig{.6}{grads.png}
Note: do not worry about the distinction between row vectors and column vectors here.
For example, if the correct answer is a vector of length 5, we'll accept vectors of size $5 \times 1$ or $1 \times 5$. In future assignments we will be more careful to always use column vectors.
\section{Algorithms and Data Structures Review}
\subsection{Trees}
\blu{Answer the following questions} You do not need to show your work.
\begin{enumerate}
\item What is the maximum number of \emph{leaves} you could have in a binary tree of depth $l$?
\item What is the maximum number of \emph{internal nodes} (excluding leaves) you could have in a binary tree of depth $l$?
\end{enumerate}
\subsection{Common Runtimes}
\blu{Answer the following questions using big-$O$ notation} You do not need to show your work.
\begin{enumerate}
\item What is the cost of running the mergesort algorithm to sort a list of $n$ numbers?
\item What is the cost of finding the third-largest element of an unsorted list of $n$ numbers?
\item What is the cost of finding the smallest element greater than 0 in a \emph{sorted} list with $n$ numbers.
\item What is the cost of computing the matrix-vector product $Ax$ when $A$ is $n \times d$ and $x$ is $d \times 1$.
\item How does the answer to the previous question change if $A$ has only $z$ non-zeroes.
\end{enumerate}
\subsection{Running times of code}
Included in \texttt{a0.zip} is file named \texttt{bigO.jl}, which defines several functions
that take an integer argument $n$. For each function, \blu{state the running time as a function of $n$, using big-O notation}.
\end{document}