CS Theses & Dissertations 2020

For 2020 graduation dates (in alphabetical order by last name):

Dara the explorer: coverage based exploration for model checking of distributed systems in Go
Anand, Vaastav

DOI : 10.14288/1.0392970
URI : http://hdl.handle.net/2429/75692
Degree : Master of Science – MSc
Graduation Date : 2020-11
Supervisor : Dr. Ivan Beschastnikh

Practical optimization methods for machine learning models
Babanezhad Harikandeh, Reza

DOI : 10.14288/1.0387209
URI : http://hdl.handle.net/2429/72845
Degree : Doctor of Philosophy – PhD
Graduation Date : 2020-05
Supervisor : Dr. Mark Schmidt

This work considers optimization methods for large-scale machine learning (ML). Optimization is a crucial ingredient in the training stage of ML models, and optimization methods in this setting need a cheap iteration cost. First-order methods are known to have reasonably low iteration costs. A notable recent class of stochastic first-order methods leverages variance reduction techniques to improve convergence speed. This group includes stochastic average gradient (SAG), stochastic variance reduced gradient (SVRG), and stochastic average gradient amélioré (SAGA). The SAG and SAGA approaches to variance reduction use additional memory. SVRG, on the other hand, does not need additional memory but requires occasional full-gradient evaluations. We first introduce variants of SVRG that require fewer gradient evaluations. We then present the first linearly convergent stochastic gradient method to train conditional random fields (CRFs) using SAG. Our method addresses the memory requirements of SAG and proposes a better non-uniform sampling (NUS) technique. The third part of this work extends the applicability of SAGA to Riemannian manifolds. We modify SAGA with operations existing in the manifold to improve the convergence speed of SAGA in these new spaces. Finally, we consider the convergence of classic stochastic gradient methods, based on mirror descent (MD), in the non-convex setting. We analyse MD with a more general divergence function and show its application to variational inference models.
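
For readers unfamiliar with SVRG, the trade-off mentioned in the abstract (no per-example gradient memory, but an occasional full-gradient pass) can be sketched as follows. This is a minimal textbook-style illustration on a toy scalar problem, not the variants developed in the thesis; the function name `svrg`, the toy losses, and all parameter values are assumptions chosen for the example.

```python
import random


def svrg(grad_i, n, w0, step, epochs, inner_steps, seed=0):
    """Basic SVRG for minimizing (1/n) * sum_i f_i(w) over a scalar w.

    grad_i(w, i) returns the gradient of the i-th loss at w.
    """
    rng = random.Random(seed)
    w = w0
    for _ in range(epochs):
        snapshot = w
        # Occasional full-gradient evaluation at the snapshot point.
        full_grad = sum(grad_i(snapshot, i) for i in range(n)) / n
        for _ in range(inner_steps):
            i = rng.randrange(n)
            # Variance-reduced estimate: unbiased for the full gradient at w,
            # with variance that shrinks as w and snapshot near the minimizer.
            g = grad_i(w, i) - grad_i(snapshot, i) + full_grad
            w -= step * g
    return w


# Toy problem (hypothetical): f_i(w) = 0.5 * c_i * (w - a_i)^2.
curvatures = [1.0, 2.0, 1.0, 2.0]
targets = [1.0, 2.0, 3.0, 6.0]


def toy_grad(w, i):
    return curvatures[i] * (w - targets[i])


# Closed-form minimizer of the average loss, for comparison.
w_exact = sum(c * a for c, a in zip(curvatures, targets)) / sum(curvatures)
w_svrg = svrg(toy_grad, len(targets), w0=0.0, step=0.1, epochs=30, inner_steps=50)
```

Only the snapshot point and its full gradient are stored between inner steps, which is the memory advantage over SAG and SAGA noted above.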

Schedule data, not code
Best, Micah

DOI : 10.14288/1.0394747
URI : http://hdl.handle.net/2429/76292
Degree : Doctor of Philosophy – PhD
Graduation Date : 2020-11
Supervisor : Dr. Arvind Gupta

Parallel programming is hard and programmers still struggle to write code for shared memory multicore architectures that is both free of concurrency errors and efficient. Tools have advanced, but for tasks that are not embarrassingly parallel, or suitable for a limited model such as map/reduce, there is little help. We aim to address some major aspects of this still underserved area. We construct a model for parallelism, Data not Code (DnC), by starting with the observation that a majority of performance and problems in parallel programming are rooted in the manipulation of data, and that a better approach is to schedule data, not code. Data items don’t exist in a vacuum but are instead organized into collections, so we focus on concurrent access to these collections from both task and data parallel operations. These concepts are already embraced by many programming models and languages, such as map/reduce, GraphLab and SQL. We seek to bring the excellent principles embodied in these models, such as declarative data-centric syntax and the myriad of optimizations that it enables, to conventional programming languages, like C++, making them available in a larger variety of contexts. To make this possible, we define new language constructs and augment proven techniques from databases for accessing arbitrary parts of a collection in a familiar and expressive manner. These not only provide the programmer with constructs that are easy to use and reason about, but simultaneously allow us to better extract and analyze programmer intentions to automatically produce code with complex runtime optimizations. We present Cadmium, a proof of concept DnC language to demonstrate the effectiveness of our model. We implement a variety of programs and show that, without explicit parallel programming, they scale well on multicore architectures. 
We show performance competitive with, and often superior to, fine-grained locks, the most widely used method of preventing error-inducing data access in parallel operations.

Integrators for elastodynamic simulation with stiffness and stiffening
Chen, Yu Ju

DOI : 10.14288/1.0384559
URI : http://hdl.handle.net/2429/72045
Degree : Doctor of Philosophy – PhD
Graduation Date : 2020-05
Supervisors : Dr. Dinesh Pai, Dr. Uri Ascher

The main goal of this thesis is to develop effective numerical algorithms for stiff elastodynamic simulation, a key procedure in computer graphics applications. To enable such simulations, the governing differential system is discretized in 3D space using a finite element method (FEM) and then integrated forward in discrete time steps. To perform such simulations at a low cost, coarse spatial discretization and large time steps are desirable. However, using a coarse spatial mesh can introduce numerical stiffening that impedes visual accuracy. Moreover, to enable large time steps while maintaining stability, the semi-implicit backward Euler method (SI) is often used; but this method causes uncontrolled damping and makes the simulation appear less lively. To improve the dynamic consistency and accuracy as the spatial mesh resolution is coarsened, we propose and demonstrate, for both linear and nonlinear force models, a new method called EigenFit. This method applies a partial spectral decomposition, solving a generalized eigenvalue problem in the leading mode subspace and then replacing the first several eigenvalues of the coarse mesh by those of the fine one at rest. We show its efficacy on a number of objects with both homogeneous and heterogeneous material distribution. To develop efficient time integrators, we first demonstrate that an exponential Rosenbrock-Euler (ERE) integrator can avoid excessive numerical damping while being relatively inexpensive to apply for moderately stiff elastic material. This holds even in challenging circumstances involving non-convex elastic energies. Finally, we design a hybrid, semi-implicit exponential integrator, SIERE, that allows SI and ERE to each perform what they are good at. To achieve this we apply ERE in a small subspace constructed from the leading modes in the partial spectral decomposition, and the remaining system is handled (i.e., effectively damped out) by SI. 
We show that the resulting method maintains stability and produces lively simulations at a low cost, regardless of the stiffness parameter used.
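
The numerical damping that the abstract attributes to backward Euler can be seen on a single unit-mass linear spring, x'' = -k x. The sketch below is an illustration under that assumption only: for a linear force, the semi-implicit step coincides with fully implicit backward Euler, and none of the thesis's methods (EigenFit, ERE, SIERE) are reproduced here. The names `backward_euler_step` and `energy` and the values of `k` and `h` are chosen for the example.

```python
def backward_euler_step(x, v, k, h):
    """One backward Euler step for x'' = -k * x with unit mass.

    Solves x_new = x + h * v_new and v_new = v - h * k * x_new exactly.
    """
    x_new = (x + h * v) / (1.0 + h * h * k)
    v_new = v - h * k * x_new
    return x_new, v_new


def energy(x, v, k):
    """Kinetic plus elastic energy of the unit-mass spring."""
    return 0.5 * v * v + 0.5 * k * x * x


k, h, steps = 100.0, 0.1, 10  # stiff spring, large time step
x, v = 1.0, 0.0
e0 = energy(x, v, k)
for _ in range(steps):
    x, v = backward_euler_step(x, v, k, h)
e_final = energy(x, v, k)

# For this linear problem each step scales the energy by 1 / (1 + h^2 * k),
# even though the exact solution conserves energy.
predicted = e0 / (1.0 + h * h * k) ** steps
```

With these values the energy roughly halves on every step, so the damping rate is set by the step size and stiffness rather than by any physical model.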

Comparing haptic application design communities: characterizing differences and similarities for future design knowledge sharing
Chun, Matthew (Jungho)

DOI : 10.14288/1.0389974
URI : http://hdl.handle.net/2429/74163
Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisor : Dr. Karon Maclean

Team LSTM: player trajectory prediction in basketball games using graph-based LSTM networks
Cohan, Setareh

DOI : 10.14288/1.0388468
URI : http://hdl.handle.net/2429/73403
Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisors : Dr. Jim Little, Dr. Leonid Sigal

PolyFit: perception-aligned vectorization of raster clip-art via intermediate polygonal fitting
Dominici, Edoardo Alberto

DOI : 10.14288/1.0389752
URI : http://hdl.handle.net/2429/73924
Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisor : Dr. Alla Sheffer

Somatic mutation analysis for the study of clonal evolution in cancer
Dorri, Fatemeh

DOI : 10.14288/1.0390286
URI : http://hdl.handle.net/2429/74261
Degree : Doctor of Philosophy – PhD
Graduation Date : 2020-05
Supervisors : Dr. Anne Condon, Dr. Sohrab Shah (BC Cancer Foundation)

Financial knowledge graph construction
Elhammadi, Sarah Habashi

DOI : 10.14288/1.0392614
URI : http://hdl.handle.net/2429/75344
Degree : Master of Science – MSc
Graduation Date : 2020-11
Supervisor : Dr. Laks Lakshmanan

Visual grounding through iterative refinement
Fan, Zicong

DOI : 10.14288/1.0391964
URI : http://hdl.handle.net/2429/74769
Degree : Master of Science – MSc
Graduation Date : 2020-11
Supervisors : Dr. Jim Little, Dr. Leonid Sigal

An indexed type system for faster and safer WebAssembly
Geller, Adam Timothy

DOI : 10.14288/1.0392977
URI : http://hdl.handle.net/2429/75703
Degree : Master of Science – MSc
Graduation Date : 2020-11
Supervisors : Dr. Ivan Beschastnikh, Dr. William Bowman

Graph-based food ingredient detection
Ghotbi, Borna

DOI : 10.14288/1.0387154
URI : http://hdl.handle.net/2429/72773
Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisor : Dr. Leonid Sigal

Is your time well spent? Reflecting on knowledge work more holistically
Guillou, Hayley

DOI : 10.14288/1.0389624
URI : http://hdl.handle.net/2429/73797
Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisors : Dr. Joanna McGrenere, Dr. Thomas Fritz

Predicting landslides using contour aligning convolutional neural networks
Hajimoradlou, Ainaz

DOI : 10.14288/1.0385548
URI : http://hdl.handle.net/2429/72317
Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisor : Dr. David Poole

Measurement and estimation of material parameters of real garments
Hansen, Jan

DOI : 10.14288/1.0394092
URI : http://hdl.handle.net/2429/75798
Degree : Master of Science – MSc
Graduation Date : 2020-11
Supervisor : Dr. Dinesh Pai

Sufficiency condition for output-oblivious chemical reaction networks and run-time analysis
Hashemi, Hooman

DOI : 10.14288/1.0385125
URI : http://hdl.handle.net/2429/72223
Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisor : Dr. Anne Condon

Designing an eyes-reduced document skimming app for situational impairments
Khan, Taslim Arefin

DOI : 10.14288/1.0384519
URI : http://hdl.handle.net/2429/72100
Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisors : Dr. Joanna McGrenere, Dr. Dongwook Yoon

Designers characterize naturalness in voice user interfaces: their goals, practices, and challenges
Kim, Yelim

DOI : 10.14288/1.0389688
URI : http://hdl.handle.net/2429/73856
Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisors : Dr. Dongwook Yoon, Dr. Joanna McGrenere

Where are the objects? Weakly supervised methods for counting, localization and segmentation
Laradji, Issam Hadj

DOI : 10.14288/1.0390386
URI : http://hdl.handle.net/2429/74356
Degree : Doctor of Philosophy – PhD
Graduation Date : 2020-05
Supervisor : Dr. Mark Schmidt

In 2012, deep learning made a major comeback. Deep learning started breaking records in many machine learning benchmarks, especially those in the field of computer vision. By integrating deep learning, machine learning methods have become more practical for many applications like object counting, detection, or segmentation. Unfortunately, in the typical supervised learning setting, deep learning methods might require a lot of labeled data that are costly to acquire. For instance, in the case of acquiring segmentation labels, the annotator has to label each pixel in order to draw a mask over each object and get the background regions. In fact, each image in the CityScapes dataset took around 1.5 hours to label. Further, to achieve high accuracy, we might need millions of such images. In this work, we propose four weakly supervised methods. They only require labels that are cheap to collect, yet they perform well in practice. We devised an experimental setup for each proposed method. In the first setup, the model needs to learn to count objects from point annotations. In the second setup, the model needs to learn to segment objects from point annotations. In the third setup, the model needs to segment objects from image-level annotations. In the final setup, the model needs to learn to detect objects using counts only. For each of these setups the proposed method achieves state-of-the-art results in its respective benchmark. Interestingly, our methods are not much worse than fully supervised methods. This is despite their training labels being significantly cheaper to acquire than for the fully supervised case. In fact, when fixing the time budget for collecting annotations, our models performed much better than fully supervised methods. This suggests that carefully designed models can effectively learn from data labeled with low human effort.

Optimal algorithms for experts and mixtures of Gaussians
Liaw, Christopher Vui

DOI : 10.14288/1.0392911
URI : http://hdl.handle.net/2429/75663
Degree : Doctor of Philosophy – PhD
Graduation Date : 2020-11
Supervisor : Dr. Nick Harvey

This thesis makes contributions to two problems in learning theory: prediction with expert advice and learning mixtures of Gaussians. The problem of prediction with expert advice can be cast as a sequential game between an algorithm and an adversary as follows. At each time step, an algorithm chooses one of n options (or experts) and the adversary sets a cost for each expert. The algorithm's goal is to minimize its regret, i.e. its cost relative to the best expert in hindsight. The celebrated multiplicative weights algorithm is known to be optimal if the game is terminated at a fixed, known time and the number of experts is large. Optimal algorithms are also known when the number of experts is 2, 3, or 4. If the game does not terminate at a known time or is run indefinitely, the optimal algorithm is not known for any number of experts. We contribute to this problem by giving the optimal algorithm when there are 2 experts. Our algorithm is designed by considering a continuous analogue of the problem, which is solved using ideas from stochastic calculus. In the second part of the thesis, we look at distribution learning, which is a fundamental task in statistics that has been studied for over a century. We consider such a problem where the distribution is a mixture of k Gaussians in d dimensions. The objective is density estimation: given i.i.d. samples from the unknown distribution, produce a distribution whose total variation from the unknown distribution is within some desired accuracy. We contribute to this problem by designing an algorithm with near-optimal sample complexity.
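
The fixed-horizon setting described in the abstract can be sketched with the standard multiplicative weights (Hedge) update. This is a generic illustration, not the thesis's algorithm for the unknown-horizon case; it assumes costs in [0, 1], tracks the algorithm's expected cost under its randomized choice, and the function name `hedge_regret` and the toy cost sequence are assumptions made for the example.

```python
import math


def hedge_regret(cost_rounds, eta):
    """Run multiplicative weights on a list of per-round cost vectors.

    Returns the regret: the algorithm's expected cumulative cost minus the
    cumulative cost of the best single expert in hindsight.
    """
    n = len(cost_rounds[0])
    weights = [1.0] * n
    cumulative = [0.0] * n
    expected_cost = 0.0
    for costs in cost_rounds:
        total = sum(weights)
        probs = [w / total for w in weights]
        # Expected cost of sampling an expert from the current distribution.
        expected_cost += sum(p * c for p, c in zip(probs, costs))
        for j, c in enumerate(costs):
            cumulative[j] += c
            # Downweight each expert exponentially in the cost it incurred.
            weights[j] *= math.exp(-eta * c)
    return expected_cost - min(cumulative)


# Toy adversary (hypothetical): two experts with alternating costs.
T, n = 100, 2
rounds = [[1.0, 0.0] if t % 2 == 0 else [0.0, 1.0] for t in range(T)]

# Learning rate tuned to the known horizon T.
eta = math.sqrt(8.0 * math.log(n) / T)
regret = hedge_regret(rounds, eta)

# Standard guarantee for this tuning with costs in [0, 1].
bound = math.sqrt(T * math.log(n) / 2.0)
```

The learning rate depends on T, which is why the known-horizon and unknown-horizon versions of the problem are treated separately in the abstract.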

Shortest paths in line arrangements
Likhtarov, Anton

DOI : 10.14288/1.0389809
URI : http://hdl.handle.net/2429/74003
Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisor : Dr. Will Evans

Learning through exploration: how children, adults, and older adults interact with a new feature-rich application
Mahmud, Shareen

DOI : 10.14288/1.0384824
URI : http://hdl.handle.net/2429/72137
Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisor : Dr. Joanna McGrenere

Stochastic Second-Order Optimization for Over-parameterized Machine Learning Models
Meng, Si Yi

DOI : 10.14288/1.0394117
URI : http://hdl.handle.net/2429/75763
Degree : Master of Science – MSc
Graduation Date : 2020-11
Supervisor : Dr. Mark Schmidt

Interpolation, growth conditions, and stochastic gradient descent
Mishkin, Aaron Philip

DOI : 10.14288/1.0394494
URI : http://hdl.handle.net/2429/76150
Degree : Master of Science – MSc
Graduation Date : 2020-11
Supervisor : Dr. Mark Schmidt

Prediction and anomaly detection in water quality with explainable hierarchical learning through parameter sharing
Mohammad Mehr, Ali

DOI : 10.14288/1.0394253
URI : http://hdl.handle.net/2429/75910
Degree : Master of Science – MSc
Graduation Date : 2020-11
Supervisor : Dr. David Poole

On amortized inference in large-scale simulators
Naderiparizi, Saeid

DOI : 10.14288/1.0388330
URI : http://hdl.handle.net/2429/73351
Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisor : Dr. Frank Wood

Investigating the impact of normalizing flows on latent variable machine translation
Przystupa, Michael Vincent

DOI : 10.14288/1.0388739
URI : http://hdl.handle.net/2429/73621
Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisors : Dr. Muhammad Abdul-Mageed, Dr. Mark Schmidt

Toward XAI for Intelligent Tutoring Systems: A Case Study
Putnam, Vanessa Ann

DOI : 10.14288/1.0389817
URI : http://hdl.handle.net/2429/73996
Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisor : Dr. Cristina Conati

Understanding the role of averaging in non-smooth stochastic gradient descent
Randhawa, Sikander

DOI : 10.14288/1.0392916
URI : http://hdl.handle.net/2429/75626
Degree : Master of Science – MSc
Graduation Date : 2020-11
Supervisor : Dr. Nick Harvey

[no title]
Rashtchian, Arya

Degree : Master of Science – MSc
Graduation Date : 2020-11
Supervisors : Dr. Joanna McGrenere, Dr. Leonid Sigal

Synchronizer analysis and design tool: an application to automatic differentiation
Reiher, Justin James

DOI : 10.14288/1.0388329
URI : http://hdl.handle.net/2429/73349
Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisor : Dr. Mark Greenstreet

Designing CAST: a computer-assisted shadowing trainer for self-regulated foreign language listening practice
Reza, Mohi

DOI : 10.14288/1.0394090
URI : http://hdl.handle.net/2429/75794
Degree : Master of Science – MSc
Graduation Date : 2020-11
Supervisor : Dr. Dongwook Yoon

Measuring, modelling, simulating, and predicting human tissue properties
Rothwell, Austin Caulfield

DOI : 10.14288/1.0387124
URI : http://hdl.handle.net/2429/72777
Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisor : Dr. Dinesh Pai

Automatic identification and description of software developers tasks
Satterfield, Christopher David

DOI : 10.14288/1.0390001
URI : http://hdl.handle.net/2429/74189
Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisor : Dr. Gail Murphy

Rapid mold prototyping: creating complex 3D castables from 2D cuts
Shakeri, Hanieh

DOI : 10.14288/1.0392978
URI : http://hdl.handle.net/2429/75706
Degree : Master of Science – MSc
Graduation Date : 2020-11
Supervisors : Dr. Karon Maclean, Dr. Robert Xiao

Biscotti - a ledger for private and secure peer to peer machine learning
Shayan, Muhammad

DOI : 10.14288/1.0387042
URI : http://hdl.handle.net/2429/72699
Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisor : Dr. Ivan Beschastnikh

Simulation of incompressible elastic material using zonal volume constraints
Sheen, Seung Heon

DOI : 10.14288/1.0394823
URI : http://hdl.handle.net/2429/76377
Degree : Master of Science – MSc
Graduation Date : 2020-11
Supervisor : Dr. Dinesh Pai

A neural architecture for detecting user confusion in eye-tracking data
Sims, Shane

DOI : 10.14288/1.0394251
URI : http://hdl.handle.net/2429/75900
Degree : Master of Science – MSc
Graduation Date : 2020-11
Supervisor : Dr. Cristina Conati

Spatio-temporal relational reasoning for video question answering
Singh, Gursimran

DOI : 10.14288/1.0384578
URI : http://hdl.handle.net/2429/72033
Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisor : Dr. Jim Little

Personal data curation in the cloud age: individual differences and design opportunities
Vitale, Francesco

DOI : 10.14288/1.0392427
URI : http://hdl.handle.net/2429/75184
Supplementary material available at: http://hdl.handle.net/2429/77281
Degree : Doctor of Philosophy – PhD
Graduation Date : 2020-11
Supervisor : Dr. Joanna McGrenere

People are creating and storing a growing amount of personal data, from photos and documents to messages and applications, on a growing number of devices. Storage space, often in the cloud, is cheap or free. But previous research shows that a degree of selectivity and curation is necessary to build personal archives that have value over time. In this dissertation, we ask: How do different people decide what personal data to keep or discard? What drives their decisions? And how can data management tools better support individual preferences? We used a qualitative and design-based approach to conduct four studies consisting of 64 interviews in total and a survey (n=349). First, we identified a spectrum of tendencies that informed how participants (n=23) decided what to keep or discard, with two extremes: “hoarding” (keeping most data), and “minimalism” (keeping as little as possible). We extended this spectrum with a set of five behavioral styles that capture contextual curation patterns: taking a casual approach to data, feeling overwhelmed, collecting data, purging data, and trying to be frugal. This model of behaviors (based on the 64 interviews) highlights a key role for data curation: what people keep or discard informs how they think about their own identity. We used these insights to map a design space for data curation and create five design concepts for different user needs, exploring automation and other key design dimensions. Participants’ reactions (n=16) varied: some welcomed technology and automation, others opposed it, with context informing their reactions. Inspired by these results and using a taxonomy of data types and decluttering criteria based on the survey (n=349), we designed Data Dashboard, a tool that aggregates data from a user’s multitude of devices and cloud platforms, providing customizable functions for different goals. 
We evaluated a prototype of the system with 18 participants and found that a personalized approach to data curation is promising, so long as it respects users’ boundaries. Our work outlines key design directions and opportunities that can help envision new tools, prioritize user needs, and redefine our relationship with personal data in a world full of it.

Extracting Synthetic RGB-D from Video Games
Woo, Kevin Wai-Mun

Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisor : Dr. Jim Little

Consistent multiple sequence decoding
Xu, Bicheng

DOI : 10.14288/1.0392691
URI : http://hdl.handle.net/2429/75418
Degree : Master of Science – MSc
Graduation Date : 2020-11
Supervisor : Dr. Leonid Sigal

[no title]
Yin, Zixuan

Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisor : Dr. Margo Seltzer

Dynamic race detection for non-coherent accelerators
Young, May

DOI : 10.14288/1.0390303
URI : http://hdl.handle.net/2429/74290
Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisor : Dr. Alan Hu

Generative adversarial networks for pose-guided human video generation
Zablotskaia, Polina

DOI : 10.14288/1.0389697
URI : http://hdl.handle.net/2429/73869
Supplementary material available at: http://hdl.handle.net/2429/77282
Degree : Master of Science – MSc
Graduation Date : 2020-05
Supervisor : Dr. Leonid Sigal

Scheduling queries to moving entities to certify many are distant from a region
Zheng, Da Wei

DOI : 10.14288/1.0392883
URI : http://hdl.handle.net/2429/75616
Degree : Master of Science – MSc
Graduation Date : 2020-11
Supervisor : Dr. Will Evans

Regret bounds without Lipschitz continuity: online learning with relative-Lipschitz losses
Zhou, Yihan

DOI : 10.14288/1.0394127
URI : http://hdl.handle.net/2429/75799
Degree : Master of Science – MSc
Graduation Date : 2020-11
Supervisor : Dr. Nick Harvey

Facilitating user interaction with data
Zolaktaf Zadeh, Zeinab

DOI : 10.14288/1.0387205
URI : http://hdl.handle.net/2429/72858
Degree : Doctor of Philosophy – PhD
Graduation Date : 2020-05
Supervisor : Dr. Rachel Pottinger

In many domains, users interact with data stored in large, and often structured, data sources. This thesis addresses three phases of user interaction: (1) data exploration, (2) query composition, and (3) query answer analysis. It provides methods to assist in each of these phases, though, of course, no single thesis could be broad enough to cover all possible user interaction in these phases. The first part of the thesis focuses on improving data exploration with recommender systems. Standard recommendation models are biased toward popular items in their suggestions. Our approach is to analyze past interaction logs to estimate user preference for exploration and novelty. We present a generic framework that increases the novelty of recommendations based on each user's novelty preference. The next part of the thesis examines ways of facilitating query composition. We study models that analyze past query logs to model and estimate query properties, such as answer size or error type. By predicting these properties prior to query execution, we can help the user tune and optimize their query. Empirical results show that the data-driven machine learning models can accurately perform several of the prediction tasks. The final part of this thesis studies methods for improving the analysis of large or conflicting query answers. This problem is common in integration contexts where data is segmented across several sources with overlapping and conflicting data values. Depending on which combination of sources and values is used, a simple query can have an overwhelming number of correct and conflicting answers. The approach presented is based on efficiently estimating a query answer distribution. Further, it offers a suite of methods for extracting statistics that convey meaningful information about the answer set. Overall, the solutions developed in this thesis aim to increase the efficiency and decision quality of users. 
Empirical results on real-world datasets show that the proposed problems and solutions are important steps in the general direction of making information easily accessible to users.