Previous Undergrad Research Awards

Previous Projects

Summer 2023

Dongwook Yoon

Meta-humanoid: Evaluating the Effects of Realistic AR Avatar in Social Interaction with Humanoid Robot

Meta-humanoid is a novel telepresence application that synthesizes a suite of cutting-edge technologies, including humanoid robots, lifelike avatars, and augmented reality, to reduce the user's perceptual discordance between the robot's body and the human teleoperator's identity. In Meta-humanoid, the user wears an AR HMD that overlays the teleoperator’s lifelike avatar on a humanoid robot; the avatar moves and animates in sync with the humanoid’s motion, which reduces this discordance.

In canonical robot-mediated telepresence applications, a teleoperator controls a robot to offer physical services to a user. However, the machine-like appearance of the robot is a critical barrier to the user perceiving the robot as the embodied self of the teleoperator. The user’s perceptual discordance between the actor’s embodiment (i.e., the robot’s body) and its identity (i.e., the human teleoperator) leads to diminished quality of emotional and cognitive experiences, such as trust, likeability, and presence. This gap hampers the potential of telepresence robots in domains like telemedicine and tele-training, where emotional and perceptual quality is critical.

In this research, we aim to evaluate the impact of the lifelike avatar overlay on the perceptual quality of social interaction in Meta-humanoid applications. We leverage a Wizard of Oz approach, in which the user is immersed in a fully virtual environment where a confederate researcher plays the role of the humanoid robot controlled by the teleoperator.

Keywords: Augmented Reality, Telepresence Robot, Social Interaction, Usability Evaluation

Qualifications:

  • (optional) HCI evaluation skills or successful completion of the introductory HCI courses (e.g., CPSC 344)
  • (optional) Experience or strong interest in UX evaluation methods (e.g., user study, survey, interview)

  • (optional) Successful completion of the Computer Graphics course (e.g., CPSC 314)

 

Dongwook Yoon

Augmenting Zoom Meeting Experiences with an AI Moderator

Zoom meetings have become a widespread form of communication, but the video conferencing system itself offers little more than the transmission of pixels. The aim of this project is to create an "AI moderator" agent that aids meeting participants in browsing, comprehending, and annotating video content from Zoom meetings. The intern will work on constructing the system by utilizing a combination of AI techniques such as computer vision, speech recognition, and natural language processing.

Qualifications:

  • Strong technical skills including OOP, data structures, and algorithms
  • Experience in building computer vision, speech processing, and/or natural language processing applications

  • (optional) Experience in web front-end development

 

Kevin Leyton-Brown

Automatically synthesizing efficient algorithms for solving SAT

Please contact Chris Cameron at cchris13@cs.ubc.ca if interested and to set up a time to talk.

Designing efficient algorithms for solving the Boolean satisfiability (SAT) problem is crucial for practical applications such as scheduling, planning, and software testing. Machine learning is an important tool in algorithm design for SAT; however, SAT practitioners have yet to exploit the representational power of deep learning. We are working on training neural networks to learn to make variable ordering decisions for the widely used DPLL-style SAT solvers. DPLL SAT solvers search for a solution in a tree-like fashion, where a "branching policy" dictates how to iteratively partition the search space. Existing heuristics for deciding how to partition the search space don't use machine learning to leverage the structure of the internal solver state. Inspired by AlphaGo, we developed Monte Carlo Forest Search, a reinforcement learning approach for learning branching policies that lead to small proof trees of unsatisfiability.
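For readers unfamiliar with how a branching policy drives a DPLL-style search, here is a minimal, illustrative Python sketch (not the group's solver or training code); the `branch` argument is the pluggable policy that a learned model would replace.

```python
# Minimal DPLL sketch with a pluggable branching policy. Illustrative only:
# CNF is a list of clauses, each clause a list of non-zero ints
# (positive = variable, negative = negated variable).

def simplify(cnf, lit):
    """Assign `lit` true: drop satisfied clauses, remove the negation elsewhere."""
    out = []
    for clause in cnf:
        if lit in clause:
            continue                      # clause satisfied
        reduced = [l for l in clause if l != -lit]
        if not reduced:
            return None                   # empty clause -> conflict
        out.append(reduced)
    return out

def dpll(cnf, branch):
    """`branch(cnf)` is the branching policy; a learned policy would replace it."""
    # Unit propagation.
    units = [c[0] for c in cnf if len(c) == 1]
    while units:
        cnf = simplify(cnf, units.pop())
        if cnf is None:
            return False                  # conflict under current assignment
        units = [c[0] for c in cnf if len(c) == 1]
    if not cnf:
        return True                       # all clauses satisfied
    lit = branch(cnf)                     # partition the search space on this literal
    for choice in (lit, -lit):
        reduced = simplify(cnf, choice)
        if reduced is not None and dpll(reduced, branch):
            return True
    return False                          # both branches refuted (UNSAT subtree)

# A hand-written policy: pick the literal that appears in the most clauses.
most_frequent = lambda cnf: max((l for c in cnf for l in c),
                                key=lambda l: sum(l in c for c in cnf))

print(dpll([[1, 2], [-1, 3], [-2, -3], [1, -3]], most_frequent))  # True (satisfiable)
```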

To date, we have achieved a working implementation with performance gains on small-size SAT distributions. The student will join in to help us scale up our method to work on much larger and more difficult SAT distributions that are more typical in industrial applications. We see two main bottlenecks to achieving this. The first is memory requirements for model training. A few ideas we would like to explore with the student are parallelizing across GPUs, memory-efficient gradient approximations, and smaller state representations. The second challenge is exploring an action space that is many orders of magnitude larger. We think a promising direction is to learn an action representation and reduce the action space by collapsing similar actions together. To manage the challenge of jumping from a few hundred variables to a few million, we plan to build a curriculum-learning pipeline to gradually increase problem size and iteratively leverage policies from smaller problems.

The student will work closely with Leyton-Brown's graduate students. They will have primary responsibility for well-defined tasks (usually implementing aspects of a system, running computational experiments, and analyzing the results) but will also often get involved in problem (re)formulation and generating new research ideas.

Qualifications:

The student should have a basic understanding of machine learning concepts and should be competent in C++ (for the SAT solver) and Python (for ML). Knowledge of deep learning (PyTorch), cluster computing, statistics, and CDCL SAT solvers will be an asset.

Student responsibilities:

The student will learn how to (1) design large computational experiments, (2) leverage neural networks (specifically graph neural nets), and (3) modify and implement Monte Carlo tree search. Neural nets are ubiquitous in industry, and Monte Carlo Tree Search is becoming a major method with many potential applications in the coming decade. The student will also get exposure to the research process, which will be beneficial for any future graduate program they might attend. The student will have close interaction with Leyton-Brown and his graduate students and learn research methodology, including how to come up with research questions, design experiments to test hypotheses, and effectively communicate research arguments.

Work Setting: Mostly on campus with some online.

 

Thomas Pasquier

Understanding the performance of graph-based anomaly detection

The history of a system execution can be represented as a directed graph. Nodes in the graph represent states of system objects (e.g., processes, files, sockets, etc.) and edges represent interactions between those objects (e.g., read, write, fork, etc.). This graph can be analyzed to understand how the states of two objects relate. In practice, each implementation of this concept represents system execution history differently, leading to graphs with very different properties.
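As a concrete (and entirely invented) illustration of this representation, a small provenance-style graph can be assembled with networkx; note that the choice of edge direction (event order versus information flow) is exactly the kind of representational decision that differs between implementations.

```python
# Minimal sketch of a provenance-style graph: nodes are (versioned) system
# objects, edges are interactions between them. Events are invented, and
# edges here point in the direction of information flow.
import networkx as nx

G = nx.MultiDiGraph()

events = [
    ("file:/etc/passwd_v1", "read", "process:bash_v1"),      # bash reads the file
    ("process:bash_v1", "fork", "process:curl_v1"),          # bash forks curl
    ("process:curl_v1", "write", "socket:10.0.0.5:443_v1"),  # curl writes a socket
    ("process:curl_v1", "write", "file:/tmp/payload_v1"),    # curl writes a file
]
for src, relation, dst in events:
    G.add_edge(src, dst, label=relation)

# How do the states of two objects relate? For example: is there an
# information-flow path from the sensitive file to the network socket?
print(nx.has_path(G, "file:/etc/passwd_v1", "socket:10.0.0.5:443_v1"))  # True
```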

Over the last ten years, many papers have looked at machine learning techniques to analyze those graphs and automatically identify anomalous patterns. However, comparatively little attention has been paid to the underlying graph semantics and how they affect detection performance.

In this project, you will systematically study the interplay of the graph representation semantics and the detection performance. The ideal student will be comfortable with applied machine learning techniques as well as systems development and management. Time permitting, you will leverage this knowledge to design a better anomaly detection system.

 

Alexander Summers

Proof Construction from Rust Compiler Analyses

The Rust programming language incorporates a unique type system for governing, in a fine-grained way, the management of aliases, mutation and allocations, forcing programmers to provide much richer information about their intentions than in other mainstream languages. This type system is supported by a number of complex compiler analyses: a combination of classical program analyses such as liveness information with Rust-specific analyses such as borrow and reference lifetime tracking. The Prusti project (along with many other subsequent works) observed that this information could be leveraged to simplify formal reasoning about what Rust programs actually do, and in particular to query the compiler for information that helps to construct a mathematically precise translation of a Rust program into another language (Viper) and toolset suitable for program verification.

While this research direction has been successful, Prusti’s current approach intertwines the modelling of Rust into Viper with the information we need to extract from the Rust compiler. This makes the code difficult to develop, and means that the current techniques cannot be easily used to experiment with other back-end encodings and verification techniques. In recent work, we have been exploring the possibility of defining a fully-elaborated form of a Rust program enriched with information from compiler analyses, explaining its interactions with the type system in a way which is fully decoupled from our eventual verification language and goals.

In this internship, we plan to explore three main objectives. Firstly, we will extend this elaboration of a Rust program to support the interaction between struct type definitions and lifetime parameters in a general way, which is not possible in our current approach. Secondly, we will develop an encoding to reconnect this elaborated representation to the Viper verification infrastructure (ideally supplanting the existing codebase). Thirdly, we will consider at least one alternative/variant encoding of the Rust program into Viper, to demonstrate the flexibility of our new model and its independence from the backend approach.

The project affords opportunities to learn and practice Rust programming, and to gain familiarity with type system features, compiler design and program verification. Prusti is a collaboration between several faculty and students, and this project will include exposure to the wider research directions of the project and the chance to collaborate with and present ideas to a large research team.

Experience with Rust and (ideally) formal reasoning or proofs is desirable.

 

Alexander Summers

Automated Testing of Haskell Compiler Rewrite Rules

Optimizing compilers employ a number of techniques that transform code to yield the same result with better runtime characteristics. To support such optimizations, the Haskell programming language (as implemented in the Glasgow Haskell Compiler, GHC) allows library authors to implement their own compile-time optimizations in the form of rewrite rules. Many well-known Haskell libraries make heavy use of rewrite rules to achieve better runtime performance. 

However, incorrect rewrite rules (those that define invalid program transformations) are difficult to debug and can change program behaviour in unexpected ways. Unfortunately, in general, there is no easy way to test Haskell rewrite rules for correctness.
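The actual tool targets GHC rewrite rules in Haskell, but the idea of testing a rule for correctness can be illustrated with a tiny property-based test; the sketch below uses Python's hypothesis library and the standard map-fusion rule purely as an analogy.

```python
# Illustration only: the map-fusion rule "map f (map g xs) ==> map (f . g) xs"
# is correct iff both sides always produce the same result. A property-based
# test searches for counterexamples automatically (here in Python/hypothesis;
# Rulecheck generates analogous tests for real Haskell rules).
from hypothesis import given, strategies as st

f = lambda x: x * 2
g = lambda x: x + 1

@given(st.lists(st.integers()))
def test_map_fusion(xs):
    lhs = list(map(f, map(g, xs)))          # left-hand side of the rule
    rhs = [f(g(x)) for x in xs]             # right-hand side after fusion
    assert lhs == rhs                       # an incorrect rule would fail here

test_map_fusion()                           # hypothesis runs many random inputs
```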

In recent work we developed a first prototype tool, Rulecheck, that automatically generates and executes property-based tests for Haskell rewrite rules, with the goal of identifying incorrect rules in real-world Haskell libraries. Although we have developed a proof-of-concept, there is still much more work to be done; in particular, we have the following next aims for this research internship:

1. Extend Rulecheck to support a wider range of Haskell type system features

To test a rewrite rule, Rulecheck must generate well-typed Haskell expressions that match the arguments of the rule. Due to the expressiveness of Haskell's type system, this is not trivial. Our prototype tool only supports a subset of Haskell types, and therefore cannot generate expressions for some rewrite rules (for example, those operating on expressions with higher-kinded types). Extending Rulecheck to support more of Haskell's type system will enable more rewrite rules to be testable, and improve Rulecheck's ability to find bugs.

2. Reduce the rate of false positives in generated tests

The tests generated by our current implementation of Rulecheck sometimes fail, even when the underlying rewrite rule is correct. For example, some test failures occur because Rulecheck automatically generates expressions that would never actually be used in the application of a rule. Manual analysis is necessary to determine whether test failures correspond to actual issues. We would like to reduce the amount of manual intervention required by generating better tests in the first place.

3. Evaluate Rulecheck against real-world Haskell libraries

Rulecheck is intended to test the rewrite rules in real-world Haskell libraries; as yet, we have only used Rulecheck on a handful of libraries. However, Hackage (the Haskell package repository) contains over 100 libraries that define rewrite rules. To prove the efficacy of our technique, we would like to apply Rulecheck to these libraries, identifying incorrect rewrite rules and reporting any problematic rules to library authors. Doing so will likely require extending Rulecheck to support a wider range of Haskell language features.

The project affords opportunities to learn and practice Haskell programming, and to gain familiarity with property-based testing, advanced type system features and compiler design. In addition, this project will provide experience working on a research project in collaboration with other faculty and students.

Ideal candidates for this project should have an interest in functional programming. Previous experience with Haskell and QuickCheck is a plus.

 

Alexander Summers

Reengineering a Debugger for Quantifier Reasoning in SMT

SMT solvers have a wide variety of applications across Computer Science, including program analysis and synthesis tools, automated planning and constraint solving, optimisation problems and software verification. Advanced tools such as program verifiers are often built around SMT encodings of their problems. However, designing these encodings to perform reliably and fast is a challenging task. In previous work, we developed the Axiom Profiler tool to serve as a first debugging tool for quantifier-rich SMT problems. While vastly more useful than trying to debug by hand, this tool has a number of limitations in practice, and requires a substantial redesign and reimplementation with the following objectives:

  1. Build an application with a modern and OS-independent GUI, capable of combining simple button-press interactions with more sophisticated functionality for navigating graph visualisations.
  2. Design a modular, maintainable codebase in a suitable programming language, so that the tool can be robustly developed in the future.
  3. Understand and reimplement the core analysis algorithms which support user navigation and explanation of the information visualised.
  4. (optionally) interact with modern proof logging formats as generated by newer versions of the Z3 SMT solver.
  5. (optionally) consider how the profiler can be generalised to debug runs of other SMT solvers (if possible).

The project affords opportunities to learn about formal reasoning and the automation of logical proofs using state-of-the-art SMT solvers. Prior experience with GUI application development, algorithms, SMT Solving and/or formal reasoning would all be advantageous but not necessarily essential; good analytical skills and expertise with imperative programming of some kind are needed.
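To make "quantifier-rich SMT problem" concrete, the toy z3py snippet below (an invented example, unrelated to any particular verifier encoding) shows the kind of quantified axioms whose instantiations such a profiler must record and explain.

```python
# Toy quantifier-rich SMT problem in z3py (invented example). Each ForAll
# is an axiom; Z3 proves the goal by instantiating the quantifiers, and a
# profiler's job is to reconstruct and explain those instantiations.
from z3 import Int, Function, IntSort, ForAll, Solver, Not

f = Function("f", IntSort(), IntSort())
g = Function("g", IntSort(), IntSort())
x, a = Int("x"), Int("a")

s = Solver()
s.add(ForAll([x], f(x) >= x))          # axiom 1
s.add(ForAll([x], g(x) == f(f(x))))    # axiom 2
s.add(Not(g(a) >= a))                  # negate the goal g(a) >= a

print(s.check())                       # expected: unsat (goal follows from the axioms)
```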

 

Alexander Summers

Applying the Prusti Verifier to a Rust model of an AWS File System

The Prusti project (a collaboration between UBC and ETH Zurich) aims to provide modular, code-level verification tools for Rust, a modern systems programming language. The combination of Rust's advanced ownership type system and user-defined specifications makes it possible to provide scalable and efficient formal verification for a wide variety of program properties, including allowing programmers to express their intentions and check that their code will always live up to these.

Through discussions with collaborators at AWS, we have obtained a greatly simplified model of a file system they are developing. As part of their software engineering process, they build models of the eventual system at various degrees of accuracy and complexity, all in Rust: each version of the models is designed to behave similarly to the previous one, but with more efficiency and real-world concerns added. At the simplest end, there is a sequential implementation of various core data structures and algorithms relevant for the file system’s unique design. Verifying desirable properties even of this simplest model is potentially of value in such a design process, since any bugs in these simplified models would likely persist in the more-complex variants of the software.

In this project, we will investigate the application of Prusti to this file system model as a case study. We will consider a variety of desirable properties, ranging from simple panic (crash) freedom, to data structure invariants, and potentially even to crash-consistency properties that show that data can be recovered after unexpected failures. Depending on the outcomes of our first experiences using Prusti on this codebase, we may develop additional features in the Prusti tool itself to make this kind of verification more direct, easier or more efficient. If we are successful at verifying one model, we might also consider the challenges involved in working with a more accurate software model including concurrency, and how to relate the two versions.

This project offers experience with system design and formal verification, as well as experience with Rust and modern verification tools. Prior experience with Rust and/or formal reasoning about programs are advantageous but not essential; expertise with imperative programming of some kind is needed.

 

Alexander Summers

A DSL and Query Engine for Large-Scale Analysis of Open-Source Rust Code

Rust is a rapidly-growing systems language attracting ever-growing interest from engineers and researchers alike. There is already a huge corpus of open-source Rust software available via the crates.io standard repository. This collection of real-world software is a rich source of information about how developers are using this new language in the wild. For example, in a recent paper, “How do Programmers use Unsafe Rust?”, we performed an empirical evaluation of the entirety of this corpus to answer a number of questions about the “unsafe” features of the Rust language, and how they are actually employed in the wild.

To perform this analysis, we developed infrastructure called “qrates” for assembling a database of facts related to the available online repositories and presenting these facts for analysis in Jupyter notebooks. However, the information extracted into this database was specific to the empirical study in question, and changing (or even refining slightly) the information extracted over the corpus of software projects is currently a manual process. Rerunning this database extraction process is also cumbersome and resource-intensive, since all information is temporarily stored in memory, making it difficult to scale this work up to the increasingly huge amount of open-source Rust code available online.

In this project, we will aim for a generalisation and reengineering of the qrates approach and tools, with the following objectives:

  1. Design a query DSL suitable for defining the desired facts to be assembled into a database, as well as the selection of which Rust projects (crates) should be included. This should make it possible to assemble multiple different databases without making manual changes to the qrates code itself.
  2. Reengineer the underlying qrates framework to support pluggable queries via your newly-developed DSL, and to avoid the current need to hold all data in memory; supporting concurrency in the assembly of the database would also be interesting to explore.
  3. Evaluate possible technical solutions for deploying the qrates framework via either cloud computing or dedicated server solutions, so that evaluation can be performed remotely by users of typical computing hardware (for example, we might set up a specific machine to run these corpus evaluation workloads).
  4. Demonstrate the usability and flexibility of the new framework by both rerunning prior evaluations and running new ones over the current corpus of Rust crates.

The project affords opportunities in language design and implementation, remote computing and concurrency in a practical setting, as well as programming experience in Rust. Prior experience with Rust and database / query languages or other data analysis frameworks would be advantageous.

 

Helge Rhodin

3D Gaze Estimation for View-Dependent Display

Eye contact and gaze are significant components of in-person interactions which are difficult to simulate virtually. Often, a monocular RGB video stream is limited in portraying the dimensionality of a 3D object: the viewer can only observe the object from a fixed position, and changing the viewer's head position and orientation will not result in varying perspectives of the object being viewed.

By using a head-mounted display (HMD)'s built-in tracker or other means of tracking head position and orientation, one could calculate the view direction of the user and simulate view-dependent rendering by varying the virtual camera's position and orientation. However, this would require the purchase and possession of such devices, and wearing them may be an intrusive barrier in virtual communications.

In this project, the student will aim to create a deep-learning-based solution for gaze estimation from a consumer-level depth camera, in particular the 3D calibration of the camera, display, and user position. Building upon previous works on 3D gaze estimation and view-dependent rendering, the goal is to build a real-time 3D view-dependent display based on an RGBD camera input. The software will serve as a non-intrusive addition to telecommunication and aid in an immersive 3D interaction experience.
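To make "varying the virtual camera's position and orientation" concrete, the following numpy sketch (a simplified look-at model with invented units, not the planned deep-learning pipeline) maps a tracked head position to a per-frame virtual camera pose.

```python
# Minimal sketch: place a virtual camera at the tracked head position and
# aim it at the display centre (simplified look-at model, invented units).
import numpy as np

def look_at(eye, target, up=np.array([0.0, 1.0, 0.0])):
    """Build a 4x4 view matrix for a camera at `eye` looking at `target`."""
    forward = target - eye
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, up)
    right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)
    view = np.eye(4)
    view[0, :3], view[1, :3], view[2, :3] = right, true_up, -forward
    view[:3, 3] = -view[:3, :3] @ eye          # translate world into camera space
    return view

display_centre = np.array([0.0, 0.0, 0.0])     # metres, display at the origin
head_position = np.array([0.12, 0.05, 0.60])   # e.g., estimated from the RGBD tracker

print(look_at(head_position, display_centre))  # virtual camera pose for this frame
```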

 

Nick Harvey

Design and analysis of randomized algorithms

Many of the tasks performed by modern computers involve randomization. Examples include estimating properties of a data set by sampling, or privatizing aspects of a data set by adding random noise. 

One of the simplest examples of such a task is drawing a random sample from an array without replacement. The standard approaches for this task either require modifying the array, or require an auxiliary hash table data structure. The former is undesirable in many settings, such as multi-threading environments; the latter can involve quite a lot of overhead and may lack provable guarantees. In this research, we aim to explore simple, efficient approaches to this problem that still have provable guarantees. 
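To make the trade-off concrete, the sketch below shows the two standard approaches described above (a partial Fisher-Yates shuffle, which modifies the array, and rejection sampling, which needs an auxiliary hash set); it is illustrative only, not the method this project seeks.

```python
# Two standard ways to draw k samples from an array without replacement,
# illustrating the drawbacks described above. Sketch only, not the
# approach this project aims to develop.
import random

def sample_by_shuffle(arr, k):
    """Partial Fisher-Yates shuffle: O(k) time, but modifies the array."""
    n = len(arr)
    for i in range(k):
        j = random.randrange(i, n)
        arr[i], arr[j] = arr[j], arr[i]      # destructive swap
    return arr[:k]

def sample_by_rejection(arr, k):
    """Leaves the array untouched, but needs an auxiliary hash set, and
    the number of retries is only bounded probabilistically."""
    seen, out = set(), []
    while len(out) < k:
        i = random.randrange(len(arr))
        if i not in seen:                    # reject repeated indices
            seen.add(i)
            out.append(arr[i])
    return out

data = list(range(100))
print(sample_by_shuffle(list(data), 5))      # pass a copy to protect `data`
print(sample_by_rejection(data, 5))
```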

This project offers experience with mathematical proofs, tools from probability theory, reading relevant research papers, and an opportunity to write a research paper.

Strong performance in CPSC 436R (Introduction to Randomized Algorithms) and honours mathematics courses is required.

Summer 2022

Margo Seltzer

Think Like an Edge or Think Like a Vertex?

Graphs are near-ubiquitous for representing information and there's a pressing need to create systems that can store and process larger graphs more quickly. Most such systems are based on the Bulk Synchronous Parallel (BSP) computing paradigm and the many processing paradigms derived from it. In most of these models, the overall graph computation (e.g., performing BFS or computing PageRank) is decomposed into many steps, each of which might be separated by a barrier phase. Each step performs a single iteration of the algorithm on a subset of the graph; in the barrier phase, the results of these sub-computations are collated and propagated to the next subset of the graph. This process is repeated several times until the results converge.

Several graph processing systems define superstep computations in terms of "Vertex Programs": computations performed on each vertex of the graph. Other systems define these computations in terms of "Edge Programs": computations performed on each edge of the graph. These two approaches are interchangeable in terms of the correctness of the decomposition involved, but can have a significant impact on the time needed for completion. Most systems commit to either a vertex-centric or an edge-centric approach deep in the design of the system, which means that it is practically impossible to fairly compare them. More importantly, there is no (as far as we know) heuristic to decide which paradigm might perform better for a given workload (algorithm) and dataset.
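The following deliberately simplified, single-threaded Python sketch (no partitioning or barriers, and not the system under study) contrasts the two styles for one superstep of a PageRank-like update.

```python
# One superstep of a PageRank-style update, written two ways.
# Simplified single-threaded sketch: real systems run these per partition
# of the graph between barrier phases.
edges = [(0, 1), (0, 2), (1, 2), (2, 0)]            # toy directed graph
out_deg = {0: 2, 1: 1, 2: 1}
rank = {0: 1.0, 1: 1.0, 2: 1.0}
DAMP = 0.85

def vertex_centric_step(rank):
    """Vertex program: each vertex gathers contributions from its in-neighbours."""
    new = {}
    for v in rank:
        incoming = [u for (u, w) in edges if w == v]
        new[v] = (1 - DAMP) + DAMP * sum(rank[u] / out_deg[u] for u in incoming)
    return new

def edge_centric_step(rank):
    """Edge program: stream over the edges, scattering contributions."""
    new = {v: (1 - DAMP) for v in rank}
    for (u, v) in edges:
        new[v] += DAMP * rank[u] / out_deg[u]
    return new

v_res, e_res = vertex_centric_step(rank), edge_centric_step(rank)
# Same result (up to floating-point rounding), very different access pattern.
print(all(abs(v_res[v] - e_res[v]) < 1e-9 for v in rank))
```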

This project is about deciding which style of computation (edge-centric or vertex-centric) performs better for a given workload and dataset. We have developed a system that lets us experiment with different styles of processing large graphs, and we want to conduct an empirical study to develop guidelines for which processing model is better suited to different scenarios.

This project provides a good opportunity to learn about parallel software architecture (i.e., if you really liked CPSC 418 by Prof. Greenstreet or really want to take it) and how to design large scale data processing systems.

Margo Seltzer

Introspection in Dynamic Graph Processing

Dynamic graphs are graphs that are subjected to the addition and deletion of edges and vertices. These systems are important, as the graphs in most real-world applications, such as social networks or web graphs, are constantly changing. There are many different systems for processing dynamic graphs, but it is difficult to figure out which of these systems is better for any particular task. More fundamentally, it is nearly impossible to figure out why one system is faster/slower than another system for a given workload. Results in publications are inconsistent!

This project is about developing tools, techniques, and evaluation platforms that provide deep introspection of these systems. We want to answer research questions such as:

  1. Is there any way to estimate the performance of analytic tasks from a small set of micro-benchmark results or graph metrics, e.g., get_common_neighbours(x,y), get_neighbours(x), or betweenness centrality? (a rough sketch of such a micro-benchmark follows this list)
  2. Can we construct an interpretable model for identifying the best system for a particular algorithm, dataset, and hardware configuration?
  3. Can we generate add/delete queries based on behaviours observed on real-world dynamic graphs (e.g., the Twitter graph)?
  4. How do different systems utilize hardware resources in their computation (memory footprint, cache usage)?
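The following rough sketch (using networkx and invented parameters, not one of the systems under study) illustrates the kind of micro-benchmark referred to in question 1.

```python
# Rough micro-benchmark sketch: time a few graph primitives on a random
# graph. Parameters are invented; the systems under study are not networkx.
import time
import networkx as nx

G = nx.gnm_random_graph(10_000, 50_000, seed=0)

def bench(label, fn, repeat=100):
    start = time.perf_counter()
    for _ in range(repeat):
        fn()
    print(f"{label}: {(time.perf_counter() - start) / repeat * 1e6:.1f} us/op")

bench("get_neighbours(x)", lambda: list(G.neighbors(42)))
bench("get_common_neighbours(x, y)", lambda: list(nx.common_neighbors(G, 42, 43)))
bench("betweenness centrality (sampled)",
      lambda: nx.betweenness_centrality(G, k=16, seed=0), repeat=3)
```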

The research will involve both theoretical and empirical work.

Margo Seltzer

Hardware modeling meets an SMT Solver

Translation hardware units (such as MMUs) provide isolation and protection to software; these components are security critical! The Velosiraptor project’s goal is to formally describe translation hardware in a domain specific language, from which we can automatically synthesize software and/or hardware.

Instead of manually writing (potentially buggy) code to interface between software and translation hardware, programmers write a specification of the translation semantics. We have developed a tool chain that then automatically produces the corresponding code. While this works well in some cases, we want to determine whether other approaches using more domain knowledge could lead to better results.

In this project, we wish to remove one tool from our tool chain and instead translate directly from our domain specific language into a format that can then be fed into the Z3 theorem prover. This should give us more control over the synthesis process. We will begin by exploring how to encode our specifications and express constraints that can be interpreted by Z3.
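As a flavour of what feeding a translation specification into Z3 can look like, here is an invented toy example of a segment-style translation unit in z3py; it is not the Velosiraptor DSL or its actual encoding.

```python
# Toy z3py encoding of a segment-style translation unit (invented example).
# The "specification" says: every virtual address below SIZE must translate
# into the physical window [PA_LO, PA_HI). Z3 is asked to synthesize a base
# register value that satisfies the specification.
from z3 import BitVec, BitVecVal, Solver, ForAll, Implies, And, ULT, UGE, sat

va = BitVec("va", 64)
base = BitVec("base", 64)            # free parameter for Z3 to pick

SIZE  = BitVecVal(0x10000, 64)       # 64 KiB segment
PA_LO = BitVecVal(0x80000000, 64)
PA_HI = BitVecVal(0x80100000, 64)

s = Solver()
s.add(ForAll([va],
             Implies(ULT(va, SIZE),
                     And(UGE(base + va, PA_LO),    # pa = base + va stays
                         ULT(base + va, PA_HI))))) # inside the physical window

if s.check() == sat:                 # Z3 should report sat for this toy spec
    print(hex(s.model()[base].as_long()))          # one admissible base address
```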

Margo Seltzer

The Velosiraptors in the Kernel

Translation hardware units (such as MMUs) provide isolation and protection to software; these components are security critical! The Velosiraptor project’s goal is to formally describe translation hardware in a domain specific language, from which we can automatically synthesize software and/or hardware.

Instead of manually writing (potentially buggy) code to interface between software and translation hardware, programmers write a specification of the translation semantics. We have developed a tool chain that then automatically produces the corresponding code. In this project we wish to generate code that can be integrated into an operating system. This involves adapting the code generation backend to the architecture-specific parts of the OS kernel (e.g., Linux or Barrelfish) that interface with translation hardware (e.g., the MMU or IOMMU). The goal is that we can replace parts of the OS code with the synthesized variants.

Margo Seltzer

Diving into Device Drivers

Device drivers are one of the largest sources of bugs in today’s systems! This project is part of a larger effort to assemble device drivers from a collection of small building blocks. Specifically, we're trying to identify and classify the interactions of existing drivers with the rest of the operating system/kernel: what subsystem functions are used and how. The goal is then to have a model that expresses device drivers in terms of these functions and, ultimately, to make it possible to easily port a driver from one system to another.

Margo Seltzer

Dafny, meet GOSDT; GOSDT, meet Dafny.

GOSDT is an algorithm and implementation for producing a provably optimal decision tree with respect to a regularized loss function and data set. GOSDT is based on a (large) collection of mathematical theorems, but we have no proof that the code correctly implements each of these theorems. Dafny is a verification-aware programming language that makes it possible to prove that an implementation correctly implements a specification. If we could re-implement GOSDT in Dafny, we could directly connect the mathematical theorems to the implementation. This would be groundbreaking work!

  • Students will be developing software in Dafny.
  • Students will be expected to read and discuss research papers, with our guidance and support.
  • Students will be part of the Systopia lab, which holds weekly reading groups, skills development sessions, and social events as public health allows.
  • Students will have at least one group meeting per week with the faculty advisor and will have a direct graduate student or postdoc mentor.
  • Students will also participate in a Zoom-based weekly meeting with collaborators at Duke University.

Dongwook Yoon

Enhancing Pull Request Interface with Interactive Referencing Features

Figure: A design concept of typed-referencing in our Pull Request tool.

The productivity of software engineering practices is increasingly dependent on the efficacy of cooperative work between multiple stakeholders. The Pull Request (PR) is an important software development lifecycle tool that these stakeholders use to contribute changes, feedback, and suggestions to their shared software codebase. Discussing a PR involves referring to the code changes within the PR itself and also the underlying context (e.g., who made the changes, for what, associated documentation, system logs, etc.). It is time-consuming and error-prone for developers to refer to and understand such contextual system elements embedded in the textual discussion threads. To save developer time and prevent potential miscommunication, we need a better PR interface that helps the stakeholders create and access the references around the PR. Our approach is to make these references interactive and automatically recommended. To this end, we will design and build a new traceable PR discussion interface as a browser plugin.

Qualifications:

  • Strong technical skills including OOP, data structures, and algorithms.
  • Experience in web front-end development
  • Interest in studying human collaboration in software development (e.g., conducting interviews, analyzing data about software development projects, etc.)
  • (optional) HCI design skills or successful completion of the introductory HCI courses (e.g., CPSC 344)

Dongwook Yoon

Designing for Fair Payment and Reducing Invisible Labor in Online Freelance Work

Online freelance platforms, such as Upwork, support millions of business tasks to be completed by workers from around the globe. While Upwork provides freelancers with flexibility in managing their schedule and making supplemental income, freelancers also experience a large amount of unpaid work which is often invisible to their clients. This is particularly the case for novice freelancers, who need more guidance in securing fair work while building reputation. To support novice freelancers, based on our qualitative insights, we design interfaces with novice freelancers to 1) identify jobs and practices that lead to excessive unpaid work; and 2) communicate the unpaid work to the client while maintaining the freelancer's relationship with the client and their reputation. We evaluate the designed prototype with clients and freelancers to understand its efficacy in supporting freelancers in reducing unpaid work, and its acceptance by clients.

Qualifications:

  • Strong HCI design skills or successful completion of the introductory HCI courses (e.g., CPSC 344)
  • Strong verbal and written communication skills and ability to independently conduct user interviews and design evaluations
  • (Optional) Experience in web development

Dongwook Yoon

AI-augmented Video Interfaces

This project aims to build and evaluate an AI-augmented interface to improve viewers’ experience in video-based learning. Despite the abundance of educational content on online video platforms, such as YouTube, the platforms fall short of providing ways for learners to regulate their learning process, such as semantic exploration of videos, an agent-based learning guide, and mediated social interaction between learners/instructors. We will employ an ensemble of AI techniques, such as computer vision, speech recognition, and natural language processing, to develop intelligent support for navigating, browsing, authoring, and annotating video content.

Qualifications:

  • Strong technical skills including OOP, data structures, and algorithms.
  • Experience in building computer vision, speech processing, and/or natural language processing applications
  • (optional) Experience in web front-end development

Andrew Roth

Cancer is a complex disease driven by genetic mutations that follow an evolutionary process. This evolutionary process produces genetically distinct populations of cells which can be analyzed over time to evaluate treatment efficacy or detect minimal residual disease (MRD). Liquid biopsies are an emerging, minimally invasive sampling method that allows for serial sampling. They are samples of biological fluid (e.g., a blood sample) that contain DNA fragments known as cell-free DNA (cfDNA). Importantly, liquid biopsies of cancer patients contain tumour-derived DNA fragments, called circulating tumour DNA (ctDNA), which yield genetic information about the cancer itself and have the potential to direct personalized cancer care.

The successful applicant will work alongside Dr. Roth and his graduate students on a Bayesian model to both estimate the relative proportions of existing clones and discover the emergence of novel clones over time. This model will elucidate the temporal heterogeneity of clonal populations, enabling practitioners to make informed treatment decisions to combat resistance, metastasis, or recurrence in cancer.

The student will help implement bioinformatic pipelines for processing high throughput sequencing data and gain familiarity with state-of-the-art Bayesian modeling techniques based on Markov Chain Monte Carlo (MCMC) and Variational Inference (VI) utilizing Probabilistic Programming Languages (PPLs) such as Pyro and PyMC3. Students with a strong computational background in Computer Science and/or Statistics would be great candidates; at least one course in probability (e.g. STAT 302) and basic programming (CPSC 210) are required. Experience in bioinformatics and machine learning with Python and R would be valuable assets.
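The following is a minimal sketch of the kind of model involved, written with PyMC3-style syntax over invented read counts; the actual project model is richer, tracks clones over time, and allows novel clones to emerge.

```python
# Minimal sketch of estimating clone proportions from ctDNA read counts.
# Invented data and a deliberately simplified model (no time component,
# no novel-clone discovery), written with PyMC3-style syntax.
import numpy as np
import pymc3 as pm

# Reads attributed to each of K known clones in one liquid-biopsy sample
# (invented numbers).
read_counts = np.array([120, 45, 8, 27])
K = len(read_counts)

with pm.Model() as clone_model:
    # Prior over the relative proportions of the K clones.
    proportions = pm.Dirichlet("proportions", a=np.ones(K))
    # Observed reads are multinomial draws given those proportions.
    pm.Multinomial("reads", n=read_counts.sum(), p=proportions,
                   observed=read_counts)
    # Posterior inference by MCMC; variational inference (pm.fit) is the
    # other option mentioned above.
    trace = pm.sample(1000, tune=1000, chains=2, progressbar=False)

print(trace["proportions"].mean(axis=0))   # posterior mean clone proportions
```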

Jiarui Ding 

Developing Effective Visualization Methods for Large-Scale High-Dimensional Data

Visualizing large-scale high-dimensional data is essential for developing better machine learning algorithms and for data science. Currently, t-SNE and UMAP are widely used for visualizing high-dimensional data. Although the t-SNE objective function is widely used, it may not be ideal for visualizing cluster structure. The research objective of this study is to investigate different objective functions for better visualizations. For testing, we will use classic datasets such as MNIST and large datasets from the single-cell genomics field, e.g., scRNA-seq data. The student will learn currently widely used data-visualization tools for machine learning and data science, get hands-on experience using GPUs for research, and will be well-equipped for further study and research in the field of dimension reduction and single-cell genomics.
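As a baseline for comparison, a minimal t-SNE visualization can be produced with scikit-learn (shown here on the small digits dataset rather than full MNIST or scRNA-seq data).

```python
# Baseline t-SNE visualization sketch on scikit-learn's small digits set.
# The project compares alternative objective functions against this kind
# of baseline on MNIST and scRNA-seq data.
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)          # 1797 samples, 64 dimensions

emb = TSNE(n_components=2, perplexity=30, init="pca",
           random_state=0).fit_transform(X)  # 2-D embedding

plt.scatter(emb[:, 0], emb[:, 1], c=y, s=5, cmap="tab10")
plt.title("t-SNE embedding of the digits dataset")
plt.show()
```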

Jiarui Ding

Graph Neural Network to Analyze Spatial Transcriptomics Data

In modern machine learning and data science, we have datasets in different formats. One example is spatial transcriptomics data, which can be naturally described as a graph. The problem is how to develop more tailored graph neural networks for such data. The student will have the opportunity to explore the frontier of single-cell genomics and graph neural networks for knowledge discovery.
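A minimal sketch of the first step, with invented coordinates and counts: turning spatial spots into the neighbourhood graph a GNN would consume.

```python
# Sketch: build a spatial neighbourhood graph from spot coordinates.
# Coordinates and expression values are invented; in practice they come
# from a spatial transcriptomics assay, and the edge list feeds a GNN.
import numpy as np
from sklearn.neighbors import kneighbors_graph

rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(500, 2))      # spot positions on the slide
expression = rng.poisson(2.0, size=(500, 2000))  # gene counts per spot

# Connect each spot to its 6 nearest spatial neighbours.
adj = kneighbors_graph(coords, n_neighbors=6, mode="connectivity")
edge_index = np.vstack(adj.nonzero())            # 2 x E array of (src, dst)

print(edge_index.shape)   # node features = `expression`, graph structure = `edge_index`
```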

Jiarui Ding

Single-Cell Studying of Esophageal Conditions 

If you want to develop computational algorithms to better understand human diseases, this is a perfect project for you. Here we will study esophageal cancer, Barrett’s esophagus, and chronic acid reflux using single-cell genomics. The student will learn best practices for single-cell RNA-seq data processing and use/extend powerful deep generative models for integrating data from different conditions/sources and/or modalities to better understand the diseased tissue microenvironment.

Jiarui Ding

Learning Disentangled Representations 

If you have a strong background in statistical modeling and latent variable models and want to look for some challenging problems, this is a project for you. Although you may have heard of disentangled representations, this project is different: we will explore some special properties of our data and also other approaches that have not been explored for learning disentangled representations.

Karon MacLean

Predictive Models of Real-time Affective Touch 

Building a computational model of human emotions is difficult, and raises philosophical and engineering questions about what emotions are and how we can capture them through behavioural and biometric signals. Current efforts in computational emotion modeling rely on simplistic machine learning classification schemes (e.g., recognizing “happy” or “sad”), use poor behaviour classification schemes, and ignore context (e.g., a smile could imply friendliness or sarcasm depending on context). Our goal is to create more complex emotion models using machine learning techniques on affective touch. We target (1) identifying transitions between emotion states; and (2) incorporating contextual data into personalized models to inform classification schemes. In conjunction with a team of grad students, the student researcher will help to develop a process for building dynamic personalized machine learning models. These models will be built into a system that classifies emotion transitions from touch data collected through an emotion recall task.

Qualifications:

  • Experience with real-time machine learning applications
  • Strong interest in/knowledge about the projects currently being conducted at SPIN lab. 
  • Strong technical skills including OOP, data structures, and algorithms.
  • Strong verbal and written communication skills and ability to independently conduct user interviews and design evaluations
  • (optional) HCI design skills or successful completion of the introductory HCI courses (e.g., CPSC 344)

Contact: Laura Cang; Rubia Guerra


Karon MacLean

Web application for Predicting and Interpreting Real-time Affect

Figure 1. Screenshots of the final design of the interface in low fidelity.

UBC SPIN lab is examining affective, responsive touch in contexts including haptic robots and handheld objects such as pliable, actuated mobile devices. We have already established validated evidence of recognizable affective patterns in how people touch objects.

We are currently working to recognize patterns in users' touch to drive affective interaction design, with sensing that is less intrusive than physiological sensors. Our work centers around a custom haptic therapy robot where applications include pediatric and adult therapy for anxiety, autism and other emotion disorders; and more broadly, designing responsiveness into touched objects. In order to provide useful emotion information to emoters and/or caregivers, we are building a system that allows for dynamic prediction editing and label correction. In conjunction with a team of grad students, the student researcher will develop a web-based application that allows study participants to visualize results of a real time prediction engine and edit machine-generated emotion labels as defined by a pre-validated interface. This work is proceeding in coordination with ongoing graduate research on personalized real-time touch classification using advanced machine learning techniques, and robot prototyping.
 

Qualifications:

  • Experience in front-end web development 
  • Strong technical skills including OOP, data structures, and algorithms.
  • Strong verbal and written communication skills
  • (optional) Experience in end-to-end application development. Framework flexible but preference for experience with HTML/CSS/Javascript, git (or any version control repository), C++, Python, or Java 
  • (optional) HCI design skills or successful completion of the introductory HCI courses (e.g., CPSC 344)

Contact: Laura Cang; Rubia Guerra


Kevin Leyton-Brown

Designing Efficient Algorithms

Designing efficient algorithms for solving the Boolean satisfiability (SAT) problem is crucial for practical applications such as scheduling, planning, and software testing. Machine learning is an important tool in algorithm design for SAT; however, SAT practitioners have yet to exploit the representational power of deep learning. We propose to train a neural network to learn to make variable ordering decisions for the widely used CDCL-style SAT solvers.

CDCL SAT solvers search for a solution in a tree-like fashion, where decisions are made to iteratively partition the search space. Existing heuristics for deciding how to partition the search space don't use machine learning to leverage the structure of the internal solver state. We propose to use Monte-Carlo Tree Search (MCTS) reinforcement learning to train a neural network policy to map from internal solver state to partitioning decisions in order to minimize the number of steps required to prove SAT/UNSAT.  

To date, we have achieved a working implementation with promising performance gains on small-size SAT distributions. The student will join in to help us scale up our method to work on larger and more difficult SAT distributions. They will work on (1) improving the efficiency of Monte Carlo tree search for generating training data by sharing data across similar instances; (2) training neural networks to learn policies for a CDCL SAT solver; (3) amortizing the cost of queries to neural nets over many consecutive decisions; and (4) leveraging existing solvers for small subproblems where a neural network solution is too slow.

The student will work closely with Leyton-Brown's graduate students. They will have primary responsibility for well-defined tasks (usually implementing aspects of a system, running computational experiments, and analyzing the results) but will also often get involved in problem (re)formulation and generating new research ideas.

Qualifications:

The student should have a basic understanding of machine learning concepts and should be competent in C++ (for the SAT solver) and Python (for ML). Knowledge of deep learning (PyTorch), cluster computing, statistics, and CDCL SAT solvers will be an asset.

Please contact Chris Cameron at cchris13@cs.ubc.ca if interested and to set up a time to talk.

Alexander Summers 

Specification Resolution for Polymorphic Features of the Rust Type System

The Prusti project (a collaboration between UBC and ETH Zurich) aims to provide modular, code-level verification tools for Rust, a modern systems programming language. The combination of Rust's advanced ownership type system and user-defined specifications makes it possible to provide scalable and efficient formal verification for a wide variety of program properties, including allowing programmers to express their intentions and check that their code will always live up to these. Prusti supports equipping Rust program functions with specifications (formal pre- and post-conditions) as annotations attached to a function declaration; those same specifications (along with Prusti’s translation of the ownership aspects of the Rust type system) provide a means of reasoning modularly about calls to the function (reasoning in terms of the specification rather than the function’s body).

However, in the presence of traits, multiple specifications can be relevant for a particular function call: those attached to its declaration in known-implemented traits, as well as those attached to its implementation. Resolving an appropriate effective specification for such calls becomes correspondingly trickier, although initial support for such scenarios is implemented. Once one incorporates Rust’s (bounded) polymorphic types, type members, and alternative implementations, the appropriate set of rules is harder still to pin down. In this project, we aim to develop a clear set of rules for resolving usages of functions (and other program members) to corresponding specifications. Furthermore, we will design and implement a clean, separate phase of the Prusti verifier which implements these rules and (via appropriately designed programmatic representations) assembles the information required to make specification resolution straightforward for later stages of the verifier’s implementation.

The project affords opportunities to learn and practice Rust programming, and to gain familiarity with type system features, compiler design and program verification. Prusti is a collaboration between several faculty and students, and this project will include exposure to the wider research directions of the project and the chance to collaborate with and present ideas to a large research team.

Alexander Summers

Enriched Rust Program Representations for Program Analysis and Verification

The Prusti project (a collaboration between UBC and ETH Zurich) aims to provide modular, code-level verification tools for Rust, a modern systems programming language. The combination of Rust's advanced ownership type system and user-defined specifications makes it possible to provide scalable and efficient formal verification for a wide variety of program properties, including allowing programmers to express their intentions and check that their code will always live up to these. Prusti performs verification of Rust programs via translation from a Rust compiler intermediate representation (MIR) into a program in a different intermediate verification language (Viper). This translation is performed after type and borrow-checking, and heavily depends on and exploits Rust’s ownership type system: in practice Prusti extracts information from a number of static analyses and interfaces from the compiler.

However, Prusti’s translation currently intertwines Rust-level concerns (extracting compiler analysis information) with Viper-level concerns (how Rust language features are ultimately embedded into Viper, modelled, and reasoned about). This makes the conceptual model behind the verification difficult to explain and formalize, makes extending the translation more challenging than it needs to be, and makes the code impact of (frequent) changes to unstable compiler APIs much greater than necessary. In this project, we’ll collaborate with other members of the Prusti team to design a new intermediate representation of a Rust program, elaborated with full information from the compiler to explain why and how type and borrow-checking were successful. We’ll implement new translation stages which cleanly separate the Rust/compiler-related aspects from the underlying Viper translation, enabling a new core verification path for the tool.

The project affords opportunities to learn and practice Rust programming, and to gain familiarity with type system features, compiler design and program verification. Prusti is a collaboration between several faculty and students, and this project will include exposure to the wider research directions of the project and the chance to collaborate with and present ideas to a large research team.

Alexander Summers

Enhanced Verification Support for Rust Slices and Pointwise Specifications

The Prusti project (a collaboration between UBC and ETH Zurich) aims to provide modular, code-level verification tools for Rust, a modern systems programming language. The combination of Rust's advanced ownership type system and user-defined specifications makes it possible to provide scalable and efficient formal verification for a wide variety of program properties, including allowing programmers to express their intentions and check that their code will always live up to these. Prusti includes basic support for reasoning natively about Rust’s array and slice types, the latter of which provides a type-representation for a partial version of the underlying data structure.

Verification of programs manipulating such slices must combine the ability to specify and track values potentially modified via a slice type, while preserving for free (“framing”) information about other disjoint portions of the underlying data. Encoding these verification requirements in ways which can be automated with decent performance raises many challenges, due to the prevalence of logical quantifiers in the underlying conditions. In addition, many useful functional specifications of such programs require comprehensions of data structure contents (sums, sets, sortedness properties, and the like), which present further challenges for specification and automation. In this project we will design and implement several extensions to Prusti’s existing support for slices, and apply the new support to verify important algorithmic examples such as sorting functions.
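To make the flavour of these pointwise specifications concrete, here is the kind of quantified postcondition and frame condition involved (illustrative logical notation, not Prusti's actual specification syntax):

```latex
% Illustrative notation only, not Prusti's concrete specification syntax.
% Postcondition of a sorting function over a slice s:
\forall i, j.\ 0 \le i \le j < |s| \implies s[i] \le s[j]
% Frame condition for a function that only modifies a[lo..hi]:
\forall k.\ (k < lo \lor k \ge hi) \implies a[k] = \mathrm{old}(a[k])
```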

The project affords opportunities to learn and practice Rust programming, and to gain familiarity with type system features, compiler design and program verification. Prusti is a collaboration between several faculty and students, and this project will include exposure to the wider research directions of the project and the chance to collaborate with and present ideas to a large research team.

Summer 2021

Ian M. Mitchell

Project 1. Software development, motion modeling, sensor integration, collaborative control scheme design, and testing for a highly dynamic ground robot

There are many types of vehicles and applications for which it may be desirable to share control between a human and robot.  The Verification, Control and Robotics (VCR) lab has worked with a shared control powered wheelchair platform for many years, but is now developing a testbed based on small, fast autonomous race cars (specifically the NVIDIA Jetracer).  The goal of this project is to continue development of this testbed, which will (a) reduce the overhead of running user studies on new shared control strategies compared to the cumbersome wheelchair platform and (b) allow us to compare and contrast the requirements and possible solutions to these two very different shared control use cases.

The student will learn ROS (the Robot Operating System) and the basics of human subject trials (including the TCPS2 CORE). The student will work with one or more graduate students to identify and/or learn a model of racer motion; integrate sensors into the system; build a robust and flexible version of the shared control software and/or design a shared control interface. Many of the subprojects could be completed remotely if the pandemic situation makes lab access infeasible. The faculty supervisor has extensive experience training undergraduate student researchers (including 23 in the past five years), and the primary doctoral student supervisor previously held an NSERC USRA in the VCR lab before starting his graduate studies.
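As a flavour of what shared control software can look like, here is a minimal rospy sketch of one common scheme, linear blending of the human and autonomous commands; the topic names and blending weight are invented, it is written in Python for brevity, and the lab's actual strategies differ.

```python
# Minimal rospy sketch of one common shared-control scheme: linearly blend
# the human's command with an autonomous command. Topic names and the
# blending weight are invented.
import rospy
from geometry_msgs.msg import Twist

ALPHA = 0.6                      # weight on the human command (invented value)
user_cmd, auto_cmd = Twist(), Twist()

def on_user(msg):  global user_cmd; user_cmd = msg
def on_auto(msg):  global auto_cmd; auto_cmd = msg

rospy.init_node("shared_control_blender")
rospy.Subscriber("/user/cmd_vel", Twist, on_user)
rospy.Subscriber("/autonomy/cmd_vel", Twist, on_auto)
pub = rospy.Publisher("/jetracer/cmd_vel", Twist, queue_size=1)

rate = rospy.Rate(20)            # 20 Hz control loop
while not rospy.is_shutdown():
    blended = Twist()
    blended.linear.x = ALPHA * user_cmd.linear.x + (1 - ALPHA) * auto_cmd.linear.x
    blended.angular.z = ALPHA * user_cmd.angular.z + (1 - ALPHA) * auto_cmd.angular.z
    pub.publish(blended)         # send the blended command to the vehicle
    rate.sleep()
```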

Applicants must be comfortable with C++, and will be expected to learn and use ROS (robot operating system) to control the vehicles.  Applicants should be prepared to work independently, to express themselves clearly in discussions, and to give at least two brief presentations to the research group.

Familiarity with machine learning (such as CPSC 330 or 340), computer vision (such as CPSC 425), human-computer interaction (such as CPSC 344), computational optimization (such as CPSC 406), parallel programming (such as CPSC 418), system identification, electronics, mechatronics or autonomous vehicles would be useful for some but not all potential subprojects.  It is expected that applicants may know few if any of these optional topics.

Update: The position has been filled.

Aastha Mehta

Project 1. PanCast: Listening to Bluetooth Beacons for Epidemic Risk Mitigation

Contact tracing is an important part of an epidemic mitigation strategy. During the ongoing COVID-19 pandemic, there have been burgeoning efforts around the world to develop and deploy smartphone apps to expedite contact tracing and risk notification. These apps rely on various technologies ranging from QR-code scanning to GPS tracking and pairwise Bluetooth exchanges between individuals in prolonged proximity of each other. Unfortunately, these apps have not yet proven to be sufficiently effective due to low adoption rates, privacy concerns, and low accuracy in predicting infection risk. Indeed, B.C. has not adopted the Canadian national COVID Alert app due to some of these very reasons.

We propose PanCast, a privacy-preserving and inclusive system for epidemic risk assessment and notification that scales gracefully with adoption rates, utilizes location and environmental information to increase utility without tracking its users, and can also be effective in identifying superspreading events. To this end, PanCast utilizes Bluetooth encounters between beacons placed in strategic locations (e.g., where superspreading events are most likely to occur) and inexpensive, zero-maintenance, small devices that users can attach to their keyring. PanCast allows healthy individuals to use the system in a purely passive "radio" mode, and can assist and benefit from other digital and manual contact tracing systems. Finally, PanCast can be gracefully dismantled at the end of the pandemic, minimizing abuse from any malevolent government or entity.

To learn more about the project, visit https://pancast.mpi-sws.org/. You can participate in this project by getting involved in building a prototype of the PanCast system using Bluetooth devices and a deployment of the system in a small part of the university campus. If you are interested, reach out to aasthakm@cs.ubc.ca.

 

Alex Summers

Project 1. Extensible parsing and type-checking for Prusti

The Prusti project (a collaboration between UBC and ETH Zurich) aims to provide modular, code-level verification tools for Rust, a modern systems programming language. The combination of Rust's advanced ownership type system and user-defined specifications makes it possible to provide scalable and efficient formal verification for a wide variety of program properties, including allowing programmers to express their intentions and check that their code will always live up to these.

Prusti’s specification language includes a large subset of Rust itself, but adds additional features such as logical quantifiers and connectives. This creates a tension in the handling of parsing and type-checking of specifications: to avoid duplicating unnecessary code we would like to reuse the compiler’s existing parser and type-checker, but this cannot be applied to specifications in general. The currently implemented solution leads to ad hoc restrictions and can cause poor error reporting for incorrect inputs.

Tasks: In this project you will develop and implement an approach for parsing and type-checking a generalisation of Prusti’s current specification language, allowing for inputs which mix and match both Rust statements and Prusti-specific features while simultaneously reusing the Rust compiler’s existing codebase for parsing and type-checking the Rust aspects of the input. If successful, this will provide a significant improvement to the generality and usability of the Prusti verifier.

Qualifications: Prior experience with Rust and/or compiler implementation is advantageous but not essential; expertise with imperative programming of some kind is needed.

Project 2. Automated Debugging for SMT Solving

SMT solvers have a wide variety of applications across Computer Science, including program analysis and synthesis tools, automated planning and constraint solving, optimisation problems and software verification. Advanced tools such as program verifiers are often built around SMT encodings of their problems. However, designing these encodings to perform reliably and fast is a challenging task. In recent work, we developed the Axiom Profiler tool to serve as a debugging tool for quantifier-rich SMT problems.
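For readers unfamiliar with such encodings, the small Z3 (z3py) sketch below shows a quantifier-rich problem of the kind the Axiom Profiler helps analyse; the axiom, trigger, and ground facts are purely illustrative:

```python
# A minimal z3py sketch of a quantifier-rich encoding. The axiom below is
# instantiated by the solver via pattern (trigger) matching; poorly chosen
# triggers in larger encodings can cause matching loops and performance cliffs,
# which is exactly what the Axiom Profiler helps diagnose.
from z3 import Int, IntSort, Function, ForAll, Solver

x = Int('x')
f = Function('f', IntSort(), IntSort())

s = Solver()
# Axiom: f is idempotent, with f(f(x)) as the instantiation trigger.
s.add(ForAll([x], f(f(x)) == f(x), patterns=[f(f(x))]))
# Ground facts that contradict the axiom once it is instantiated at x = 0.
s.add(f(0) == 1, f(1) == 2)

print(s.check())  # expected: unsat
```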

Tasks: In this project, you will work to extend the Axiom Profiler to provide new automated techniques for helping a user to zoom in on, diagnose and solve inadequacies in their SMT encodings.

Qualifications: Prior experience with SMT Solvers, visual tools and/or formal reasoning is an advantage, but is not essential; expertise with imperative programming of some kind is needed.

Project 3. Sufficient preconditions for panic-freedom of Rust functions

The Rust programming language is a relatively new systems language, designed to be a more robust alternative to C++ and similar languages. The Prusti project (a collaboration between UBC and ETH Zurich) aims to provide modular, code-level verification tools for Rust, allowing programmers to attach specifications to functions describing e.g. how they should be called (preconditions). An important property of Rust code is panic-freedom: a “panic” in Rust is a runtime exception, and one would typically like to rule out such program behaviours at compile time. Prusti offers the possibility to statically prove that, *for calls to a function satisfying its precondition*, the function’s body cannot cause panics. In general, however, this requires annotating functions with preconditions strong enough to rule out potential panics.

Tasks: In this project, we will first analyse existing codebases (published on crates.io) to identify how many functions can already be proven panic-free by Prusti with no preconditions (for example, because if-conditions in the code conservatively check for problematic cases before a panic can be caused). For the remaining functions, we will investigate how complex and difficult (or not) it is to provide sufficient specifications for Prusti to be able to show panic-freedom. As a result, we should better understand how rich a function’s specification needs to be in practice to at least eliminate these basic kinds of runtime errors.

Qualifications: Prior experience with Rust and/or compiler implementation is advantageous but not essential; expertise with imperative programming of some kind is needed.

Project 4. A standard library for Rust verification

The Prusti project (a collaboration between UBC and ETH Zurich) aims to provide modular, code-level verification tools for Rust, a modern systems programming language. The combination of Rust's advanced ownership type system and user-defined specifications makes it possible to provide scalable and efficient formal verification for a wide variety of program properties, including allowing programmers to express their intentions and check that their code will always live up to these. Prusti’s verification is function-modular, meaning that calls to other functions are reasoned about via their specifications; for this reason, it’s necessary to provide good specifications for commonly-used standard library functions.

Tasks: In this project, we will work on equipping as much of the Rust standard library as possible with suitable Prusti specifications. This will exercise some relatively new features of the verifier, and will likely uncover the need for further extensions. Furthermore, we’ll investigate whether it’s clear that a single specification is suitable/convenient for all clients of a library, or whether we need support for specifications of different levels of complexity, depending on the usage and application.

Qualifications: Prior experience with Rust and/or formal reasoning about programs is advantageous but not essential; expertise with imperative programming of some kind is needed.

 

Dongwook Yoon

Project 1. Ethics in Human Surrogates and Virtual Humans

Human surrogates (HSs, a.k.a. virtual humans), once a sci-fi fantasy, are becoming a reality thanks to advances in deepfake, speech synthesis, and robotics technology. A virtual reunion of a South Korean mother with a virtual recreation of her deceased daughter was a particularly dramatic event. While many of these efforts are well intended, there are unforeseen blind spots that could cause dire consequences; a recent example is the fake Obama speech video created using AI video tools. While these advances push the boundaries of technology, we know very little about their unintended and long-term implications for individuals and society. It is essential to engage with the ethical issues of HS research by asking critical research questions: How do experts and public stakeholders perceive the costs and benefits of HSs? How do we define the boundaries and off-limits areas of ethical issues in HSs? How do these users and stakeholders think about the use and adoption of HSs? Through these questions, we aim to provide an ethical framework and guidelines to inform ethical considerations for future research in HSs. We will (1) read books and research papers on technology ethics and (2) run a qualitative study interviewing researchers and practitioners in AI, Robotics, and Computer Graphics.

Qualifications:
- Passionate about ethics and equity issues in socio-technical systems.
- Ability to engage deeply with academic literature and to articulate ideas clearly in presentations
- (optional) Experience or strong interest in qualitative methods, such as conducting semi-structured interviews

Please apply by January 27. For application details please check the following link: https://docs.google.com/document/d/1iRUvK5wGmIZY7ZKeWg-XWW3P1w1qF5rfMQhnbw2IQ7Y/edit.

Project 2. Learning to Play a Song with Online Videos: Strategies and Challenges of Musical Instrument Learners

Learning to play a musical instrument by watching YouTube videos is becoming increasingly popular. Especially in situations such as the COVID-19 pandemic, playing a musical instrument is a meaningful way to spend time in isolation, as it promotes self-development and helps reduce anxiety and depression. These videos offer easy access to tutorials by expert musicians (or even the original artists: the famous guitarist Carlos Santana teaches his own songs!) and diverse teaching styles (you can find an instructor who matches your taste). In this project, we will study how people learn musical expressions on YouTube and what their specific challenges are in doing so. Our findings will establish a basis for designing an intelligent interface (e.g., a browser plugin for YouTube) to help learners find videos that are suitable for their learning styles and goals, utilize the videos effectively, and monitor their progress systematically.

Qualifications
- Successful completion of the HCI courses (e.g., CPSC 344 or 544)
- (optional) Interests in musical instruments and musical experience
- (optional) Experience in front-end web development

Please apply by January 27. For application details please check the following link: https://docs.google.com/document/d/1iRUvK5wGmIZY7ZKeWg-XWW3P1w1qF5rfMQhnbw2IQ7Y/edit.

Project 3. Enhancing Pull Request Interface with Interactive Referencing Features

The productivity of software engineering practice is increasingly dependent on the efficacy of cooperative work between multiple stakeholders. The Pull Request (PR) is an important software development lifecycle tool that these stakeholders use to contribute changes, feedback, and suggestions to their shared codebase. Discussing a PR involves referring to the code changes within the PR itself and also to the underlying context (e.g., who made the changes, for what purpose, associated documentation, system logs, etc.). It is time-consuming and error-prone for developers to refer to and understand such contextual system elements embedded in the textual discussion threads. To save developer time and prevent potential miscommunication, we need a better PR interface that helps stakeholders create and access the references around a PR. Our approach is to make these references interactive and to recommend them automatically. To this end, we will: (1) build a taxonomy of references in PRs by qualitatively analyzing diverse PR datasets available online, (2) apply the taxonomy to a large number of PRs and identify patterns in how code elements are referenced, and (3) design a new traceable PR discussion interface based on our empirical findings.

Qualifications
- Strong technical skills including OOP, data structures, and algorithms.
- (optional) Deep experience in front-end web development
- (optional) Successful completion of the introductory HCI courses (e.g., CPSC 344 or 544)

Please apply by January 27. For application details please check the following link: https://docs.google.com/document/d/1iRUvK5wGmIZY7ZKeWg-XWW3P1w1qF5rfMQhnbw2IQ7Y/edit.

 

Helge Rhodin

Project 1. Egocentric motion capture

In recent years, there has been tremendous progress in video-based 6D object pose and human 3D pose estimation, even from head-mounted cameras [Related work]. Both can now be done in real time, but not yet at the level of accuracy that would allow capturing how people interact with other people and with objects, which is a crucial component of modeling the world in which we live. For example, when someone grasps an object or shakes someone else’s hand, the position of their fingers with respect to what they are interacting with must be precisely recovered if the resulting models are to be used in see-through AR devices, such as the HoloLens or consumer-level video see-through versions.
Key to this project is the accurate modeling of contact points and the resulting physical forces between interacting hands and feet and their surroundings. The hardware setup will be a mobile head-mounted camera, building upon our egocentric motion capture work (EgoCap) [Related work]. The goal of this project is to use inertial measurement units (IMUs) and deep learning to sense ground contact and relative positions.
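As a rough sketch of the sensing component (not the project's actual architecture), the following PyTorch snippet shows how a small 1D convolutional network could classify ground contact from a short window of 6-channel IMU readings; all shapes and names are hypothetical:

```python
# A minimal sketch (PyTorch, hypothetical shapes) of a model that classifies
# ground contact from a short window of IMU readings
# (3-axis accelerometer + 3-axis gyroscope = 6 channels).
import torch
import torch.nn as nn

class ContactNet(nn.Module):
    def __init__(self, channels: int = 6, window: int = 50):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(channels, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(32, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
            nn.Flatten(),
            nn.Linear(32, 1),  # logit for "foot in contact with the ground"
        )

    def forward(self, imu_window: torch.Tensor) -> torch.Tensor:
        # imu_window: (batch, channels, window)
        return self.net(imu_window)

model = ContactNet()
batch = torch.randn(8, 6, 50)               # 8 windows of 50 IMU samples each
contact_prob = torch.sigmoid(model(batch))  # (8, 1) contact probabilities
```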

Related work:
EgoCap: Egocentric Marker-less Motion Capture with Two Fisheye Cameras. Helge Rhodin, Christian Richardt, Dan Casas, Eldar Insafutdinov, Mohammad Shafiei, Hans-Peter Seidel, Bernt Schiele, and Christian Theobalt. SIGGRAPH Asia 2016

Project 2. Gravity for scale estimation in multi-view reconstruction

This project aims at reconstructing objects and camera geometry by exploiting Newton's equations of motion and gravity. Estimating metric scale from image or video recordings is a fundamental problem in computer vision and is important for determining distances in forensics, autonomous driving, person re-identification, and structure-from-motion (SfM). In general, object size and distance cancel in perspective projection, which makes the problem ill-posed.
The main idea is to use the omnipresent gravity on Earth as a reference 'object'. Newton's second law dictates that the trajectory of a free-falling object is a parabola: a function of time, its initial speed, and its position, with the curvature determined by the acceleration induced by constant external forces. This project builds upon our earlier work on estimating a person’s height by relating acceleration in the image to gravity on Earth [Related work]. Here, the aim is to use similar principles for multi-view reconstruction, structure from motion, and automatic camera calibration.
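The toy NumPy sketch below illustrates the core idea on synthetic data (assuming a roughly fronto-parallel, distortion-free view and a known frame rate; the actual method is more general): fit a parabola to a tracked vertical pixel trajectory and compare its curvature with g = 9.81 m/s^2 to recover a metric scale:

```python
# Toy sketch of the core idea (not the paper's method): the vertical pixel
# trajectory of a free-falling point is a parabola whose curvature equals g
# expressed in pixels/s^2; comparing it with g = 9.81 m/s^2 yields metres/pixel.
import numpy as np

G = 9.81           # m/s^2
fps = 60.0         # camera frame rate (assumed known)

# Hypothetical tracked vertical pixel coordinates of a falling object.
t = np.arange(30) / fps
true_scale = 0.002                       # 2 mm per pixel (ground truth for the toy)
y_px = 100.0 + 0.5 * (G / true_scale) * t**2 + np.random.normal(0, 0.5, t.size)

# Fit y(t) = a*t^2 + b*t + c; the curvature term 2*a is g in px/s^2.
a, b, c = np.polyfit(t, y_px, deg=2)
g_px = 2.0 * a                           # pixel-space acceleration
metres_per_pixel = G / g_px
print(f"recovered scale: {metres_per_pixel:.4f} m/px (true {true_scale} m/px)")
```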

Related work:
Gravity as a Reference for Estimating a Person's Height from Video. Didier Bieler, Semih Günel, Pascal Fua, and Helge Rhodin. ICCV 2019

Project 3. Learning anthropometric constraints for monocular human motion capture

In recent years, image-based human motion capture (MoCap) has progressed immensely, with diverse applications in, e.g., movie production, sports analytics, virtual reality systems, games, human-computer interaction, and medical examinations. Nowadays, marketable software in these fields requires sophisticated motion capture studios and expensive measurement devices, which strongly limits its applicability. This project aims to achieve human MoCap using only a single RGB camera, to enable MoCap in the wild.
Since a video taken by a single camera contains no depth information, additional assumptions about the scene need to be made. Fortunately, the human body satisfies several constraints: bone lengths are limited to specific ranges, opposing body parts are symmetric, joint angles have limits, etc. Depending on your interests, different machine learning approaches will be applied to learn these constraints from image data.
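As one possible (and purely illustrative) formulation, anthropometric constraints can be expressed as differentiable penalty terms added to a pose-estimation loss, as in the PyTorch sketch below; the skeleton indices and reference lengths are hypothetical:

```python
# A minimal sketch (hypothetical skeleton) of anthropometric constraints as
# penalty terms: bone lengths should stay near reference values, and
# left/right limbs should have symmetric lengths.
import torch

BONES = [(0, 1), (1, 2), (0, 3), (3, 4)]          # hypothetical (joint, joint) pairs
SYMMETRIC = [((0, 1), (0, 3)), ((1, 2), (3, 4))]  # left vs right counterparts
REF_LENGTHS = torch.tensor([0.45, 0.42, 0.45, 0.42])  # metres, assumed priors

def bone_lengths(pose: torch.Tensor, bones) -> torch.Tensor:
    # pose: (num_joints, 3) 3D joint positions
    return torch.stack([(pose[i] - pose[j]).norm() for i, j in bones])

def anthropometric_loss(pose: torch.Tensor) -> torch.Tensor:
    lengths = bone_lengths(pose, BONES)
    length_term = ((lengths - REF_LENGTHS) ** 2).mean()
    sym_term = torch.stack([
        (bone_lengths(pose, [l]) - bone_lengths(pose, [r])) ** 2
        for l, r in SYMMETRIC
    ]).mean()
    return length_term + sym_term

pose = torch.randn(5, 3, requires_grad=True)   # 5 hypothetical joints
loss = anthropometric_loss(pose)
loss.backward()                                # usable as a regularizer during training
```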

Related work:
RepNet: Weakly Supervised Training of an Adversarial Reprojection Network for 3D Human Pose Estimation, Bastian Wandt, Bodo Rosenhahn, CVPR 2019

 

Ivan Beschastnikh

Project 1. Privacy-preserving ML on health data

To train multi-party ML models from user-generated data, users must provide and share their training data, which can be expensive or privacy-violating. We are exploring ways to augment state-of-the-art approaches, like federated learning, with better security/privacy.

We are also developing brand-new distributed ML approaches that do away with centralization. We are collaborating with researchers in the UBC medical school and VGH, and with patient groups in the city, to come up with technologies that are sensitive to users' privacy constraints and solve real problems. Our work is open source and aims to provide practical alternatives to today's systems, which provide minimal privacy guarantees to patients. We are looking for 1-2 students who have a background in ML, databases, networks, or distributed systems. If you're interested in technologies to improve healthcare, then this project is for you! To learn more about our work, visit https://leap-project.github.io/.
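For concreteness, the NumPy sketch below shows a minimal federated-averaging (FedAvg) baseline of the kind we build on: clients train locally on private data and share only model weights, which the server averages. Real deployments add secure aggregation, differential privacy, and robustness mechanisms:

```python
# Minimal FedAvg sketch with NumPy: clients fit a linear model on local data
# and only share model weights, never raw (health) records.
import numpy as np

rng = np.random.default_rng(0)

def local_update(w, X, y, lr=0.1, epochs=5):
    """A few steps of local least-squares gradient descent on one client."""
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

# Three hypothetical clients with private data drawn from the same true model.
true_w = np.array([1.0, -2.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(100, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=100)
    clients.append((X, y))

w_global = np.zeros(2)
for round_ in range(20):
    local_weights = [local_update(w_global.copy(), X, y) for X, y in clients]
    w_global = np.mean(local_weights, axis=0)   # server-side averaging

print("estimated weights:", w_global)            # should approach [1, -2]
```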

 

Kevin Leyton-Brown

Project 1. Deep learning for branch prediction in SAT solvers

Designing efficient algorithms for solving the Boolean satisfiability (SAT) problem is crucial for practical applications such as scheduling, planning, and software testing. Machine learning is an important tool in algorithm design for SAT; however, SAT practitioners have yet to exploit the representational power of deep learning. We propose to train a neural network to learn to make variable-ordering decisions for the widely used CDCL-style SAT solvers.

CDCL SAT solvers search for a solution in a tree-like fashion, where decisions are made to iteratively partition the search space. Existing heuristics for deciding how to partition the search space don't use machine learning to leverage the structure of the internal solver state. We propose to use Monte-Carlo Tree Search (MCTS) reinforcement learning to train a neural network policy to map from the internal solver state to partitioning decisions, in order to minimize the number of steps required to prove SAT/UNSAT.
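As a purely illustrative sketch (not the project's actual architecture), a learned branching policy can be viewed as a network that scores each unassigned variable from features summarizing the solver state and branches on the argmax:

```python
# A minimal sketch (hypothetical features) of a neural branching policy:
# score each unassigned variable from a per-variable feature vector
# (e.g., activity, clause occurrences, recent conflict counts).
import torch
import torch.nn as nn

class BranchingPolicy(nn.Module):
    def __init__(self, num_features: int = 16, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(num_features, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, var_features: torch.Tensor) -> torch.Tensor:
        # var_features: (num_unassigned_vars, num_features) -> (num_unassigned_vars,) scores
        return self.mlp(var_features).squeeze(-1)

policy = BranchingPolicy()
features = torch.randn(200, 16)          # 200 unassigned variables
scores = policy(features)
branch_var = int(torch.argmax(scores))   # variable index to branch on next
```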

Most of the student's work will involve software development and setting up large computational experiments for:

- Improving efficiency of Monte-Carlo tree search for generating training data
- Training neural networks to learn policies for the CDCL SAT solver
- Amortizing the cost of queries to neural nets over many consecutive decisions

The student should have a basic understanding of machine learning concepts and should be competent in C++ (for the SAT solver) and Python (for ML). Knowledge of deep learning (PyTorch), cluster computing, statistics, and CDCL SAT solvers will be an asset.

Please contact Chris Cameron at cchris13@cs.ubc.ca if interested and to set up a time to talk.

Summer 2020

Alex Summers

Project 1. Prusti - Deductive Verification Tools for Rust

The Prusti project (a collaboration between UBC and ETH Zurich) aims to provide modular, code-level verification tools for Rust, a modern systems programming language. The combination of Rust's advanced ownership type system and user-defined specifications makes it possible to provide scalable and efficient formal verification for a wide variety of program properties, including allowing programmers to express their intentions and check that their code will always live up to these. We are constantly extending Prusti with support for a richer variety of language features: in this project you will contribute to these efforts by designing and/or implementing verification techniques for advanced features such as closures and lifetime constraints. Prior experience with Rust and/or formal reasoning about programs is an advantage, but is not essential; expertise with imperative programming of some kind is needed.

Project 2. Automated Debugging for SMT Solving

SMT solvers have a wide variety of applications across Computer Science, including program analysis and synthesis tools, automated planning and constraint solving, optimisation problems and software verification. Advanced tools such as program verifiers are often built around SMT encodings of their problems. However, designing these encodings to perform reliably and fast is a challenging task. In recent work, we developed the Axiom Profiler tool to serve as a debugging tool for quantifier-rich SMT problems. In this project, you will work to extend this tool to provide new automated techniques for helping a user to zoom in on, diagnose and solve inadequacies in their SMT encodings.

Project 3. Static Program Analysis for Liveness Properties of Message-Passing Programs

Static program analysis techniques typically aim to identify guaranteed program behaviours fully automatically. Program analysers typically tackle *safety* properties, e.g. prescribing that certain program points will only be reached when certain conditions hold. In the context of programs which heavily employ asynchronous messaging (e.g. actor-based programs), *liveness* properties, e.g. stating that servers always eventually respond to requests, are also of central importance. In this project, we will explore the feasibility of a modular static analysis which combines inference of both safety and liveness properties for asynchronous programs.

 

David Poole

Project 1. AIspace2

AIspace (see http://aispace.org) has been developed by USRA students and grad students over a number of years. Recently, USRA students created a Python-JavaScript version (https://aispace2.github.io/AISpace2/install.html) based on open-source AI algorithms in Python (http://aipython.org) created by David Poole and Alan Mackworth. The aim is to integrate the code with interactive visualizations that are easy for students to extend and that allow students to modify the AI algorithms. There are three aspects to the project: the first is to make the current AIspace2 code more modifiable and user-friendly; the second is to translate interactive demos (https://artint.info/demos/) to Python; the third is to develop similar tools for the rest of the AIPython code. Skills required: knowledge of the content of CPSC 322, proficiency in JavaScript and Python, and the ability to write clear and simple code.

 

Dongwook Yoon

Please apply by January 25. For application details please check the following link: https://docs.google.com/document/d/1iRUvK5wGmIZY7ZKeWg-XWW3P1w1qF5rfMQhnbw2IQ7Y/edit.

Project 1. Augmented Reality Interfaces for Asynchronous Collaboration in 3D Environments

This study aims to design and build a collaborative augmented reality (AR) system for annotating a 3D environment with recordings of multimodal interactions (e.g., speech, gesture, gaze), drawing on human-computer interaction approaches. Annotations are basic building blocks of asynchronous collaboration in a shared workspace (e.g., a game director giving feedback to a level designer on a 3D map by commenting on it). However, existing AR annotation interfaces rely primarily on static content (e.g., text, mid-air drawing), which is not as nuanced nor as expressive as in-person communication where people can talk, gaze, and gesture. To enrich and expand communicative capacities of AR annotations, I envisage an AR counterpart of email or Google Docs, where collaborators can record their multimodal performances (e.g., voice, view changes, and hand movements) in a 3D environment and share such rich media-based messages back and forth with other parties. The challenges are as follows: (1) developing an easy-to-use interface for creating and editing the recorded multimodal annotation, (2) designing lightweight interactions for browsing and skimming multimodal recordings, and (3) helping users overcome psychological barriers in recording multimodal inputs (e.g., speech anxiety).

Qualifications
- Successful completion of the introductory computer graphics courses (e.g., CPSC 314)
- (optional) Successful completion of the introductory HCI courses (e.g., CPSC 344 or 544)

Project 2. Computational Text Analysis and Crowd-sourced Annotation of Research Documents for Identifying Gender Bias of Human Subject Sampling

The empirical basis of HCI studies draws on user conditions and needs collected from 'human subjects' (i.e., informants or participants of the study). However, researchers often ignore gender bias in the sample population in pursuit of expediency and convenience during the sampling process. This underrepresentation can result in theories and guidelines for designing technologies that fail or even harm women and non-binary people. To theorize, validate, and address gender bias in HCI research, we will examine claims of gender imbalance by establishing a robust evidentiary basis through a computational, data-driven meta-analysis of the HCI literature (~16K research papers) at scale. To do so, we will integrate computational text analysis into our novel crowd-sourced labeling framework. This study proceeds in three phases: (Phase 1) crawling the ~16K HCI publication documents and structuring them in a machine-readable format; (Phase 2) building a text analysis engine that takes a research paper and automatically identifies paragraphs and sentences containing human subject descriptions (e.g., "Participants" sections) using document layout analysis (e.g., X-Y cut) and supervised text classification techniques (e.g., Naive Bayes); and (Phase 3) building and testing a crowdsourcing framework in which the subjects' gender data are extracted from the text snippets from Phase 2 and then verified by multiple workers, following an Identify-Encode-Verify workflow.
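As a minimal illustration of the Phase 2 classification step (with a tiny, made-up dataset), a bag-of-words Naive Bayes model in scikit-learn can separate human-subject descriptions from other paragraphs:

```python
# Minimal sketch of Phase-2 style text classification: label paragraphs as
# "participants description" vs. "other". The inline dataset is illustrative.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

paragraphs = [
    "We recruited 24 participants (12 female, 12 male) aged 19 to 34.",
    "Participants were compensated with a $15 gift card for the one-hour session.",
    "The system renders annotations using a scene graph built on WebGL.",
    "Our parser achieves state-of-the-art F1 on the benchmark corpus.",
]
labels = [1, 1, 0, 0]   # 1 = human-subject description, 0 = other

clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)), MultinomialNB())
clf.fit(paragraphs, labels)

test = ["Twelve participants were recruited from the undergraduate subject pool."]
print(clf.predict(test))  # expected: [1]
```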

Qualifications
- Strong technical skills including OOP, data structures, and algorithms.
- (optional) Successful completion of AI courses (e.g., CPSC 322, 422, or equivalent)

Project 3. Natural User Interactions for Video Interfaces

This project aims to build and study novel interaction techniques for browsing and skimming online videos. As we watch videos daily on YouTube, MOOCs, and SNS, video has become a central medium for education, entertainment, and social interactions.  However, the way we interact with videos has remained the same for decades. How can we go beyond a slider-bar and thumbnails? To support dynamic, seamless, and semantic interactions for browsing, searching, and skimming online videos, we will (1) develop novel interaction metaphors, (2) leverage speech and video recognition techniques, and (3) employ natural interaction capacities of modern interactive devices (e.g., touch and gesture of tablets).

Qualifications
- Strong technical skills including OOP, data structures, and algorithms.
- (optional) Successful completion of the introductory HCI courses (e.g., CPSC 344 or 544)

Project 4. Improving Language Education with a Speech and Gesture Commenting System

The goal of this project is to improve foreign language speaking practice with a speech and gesture commenting tool. In face-to-face instruction, speech vanishes into the air without leaving a trace. Due to the transient nature of speech, language instructors don’t have time to provide in-depth feedback on students’ speaking performances, and students miss the opportunity to reflect on their mistakes. To fill this gap, we will build a speech commenting tool, based on an existing rich commenting system called RichReview, through which students can submit speech recordings and instructors can give speech feedback on students’ submissions. This tool has two beneficial features: (1) an animated visual pointer to refer to part of the audio content (e.g., “You are using the wrong inflection HERE.”), and (2) an efficient browsing feature to replay multiple speech clips quickly and effortlessly. For evaluation, we will pilot the tool in two foreign language courses at UBC.

Qualifications
- Web development skills and experiences (full-stack and node.js preferred)
- (optional) Successful completion of the introductory HCI courses (e.g., CPSC 344 or 544)

 

Helge Rhodin

Project 1. Egocentric motion capture

In recent years, there has been tremendous progress in video-based 6D object pose and human 3D pose estimation, even from head-mounted cameras [Related work]. Both can now be done in real time, but not yet at the level of accuracy that would allow capturing how people interact with other people and with objects, which is a crucial component of modeling the world in which we live. For example, when someone grasps an object or shakes someone else’s hand, the position of their fingers with respect to what they are interacting with must be precisely recovered if the resulting models are to be used in see-through AR devices, such as the HoloLens or consumer-level video see-through versions.
Key to this project is the accurate modeling of contact points and the resulting physical forces between interacting hands and feet and their surroundings. The hardware setup will be a mobile head-mounted camera, building upon our egocentric motion capture work (EgoCap) [Related work]. The goal of this project is to use inertial measurement units (IMUs) and deep learning to sense ground contact and relative positions.

Related work:
EgoCap: Egocentric Marker-less Motion Capture with Two Fisheye Cameras. Helge Rhodin, Christian Richardt, Dan Casas, Eldar Insafutdinov, Mohammad Shafiei, Hans-Peter Seidel, Bernt Schiele, and Christian Theobalt. SIGGRAPH Asia 2016

Project 2. Gravity for scale estimation in multi-view reconstruction

This project aims at reconstructing objects and camera geometry by exploiting Newton's equations of motion and gravity. Estimating metric scale from image or video recordings is a fundamental problem in computer vision and is important for determining distances in forensics, autonomous driving, person re-identification, and structure-from-motion (SfM). In general, object size and distance cancel in perspective projection, which makes the problem ill-posed.
The main idea is to use the omnipresent gravity on Earth as a reference 'object'. Newton's second law dictates that the trajectory of a free-falling object is a parabola: a function of time, its initial speed, and its position, with the curvature determined by the acceleration induced by constant external forces. This project builds upon our earlier work on estimating a person’s height by relating acceleration in the image to gravity on Earth [Related work]. Here, the aim is to use similar principles for multi-view reconstruction, structure from motion, and automatic camera calibration.

Related work:
Gravity as a Reference for Estimating a Person's Height from Video. Didier Bieler, Semih Günel, Pascal Fua, and Helge Rhodin. ICCV 2019

Project 3. Learning anthropometric constraints for monocular human motion capture

In recent years, image-based human motion capture (MoCap) has progressed immensely, with diverse applications in, e.g., movie production, sports analytics, virtual reality systems, games, human-computer interaction, and medical examinations. Nowadays, marketable software in these fields requires sophisticated motion capture studios and expensive measurement devices, which strongly limits its applicability. This project aims to achieve human MoCap using only a single RGB camera, to enable MoCap in the wild.
Since a video taken by a single camera contains no depth information, additional assumptions about the scene need to be made. Fortunately, the human body satisfies several constraints: bone lengths are limited to specific ranges, opposing body parts are symmetric, joint angles have limits, etc. Depending on your interests, different machine learning approaches will be applied to learn these constraints from image data.

Related work:
RepNet: Weakly Supervised Training of an Adversarial Reprojection Network for 3D Human Pose Estimation, Bastian Wandt, Bodo Rosenhahn, CVPR 2019

 

Ivan Beschastnikh

Project 1. Privacy-preserving ML on health data

To train multi-party ML models from user-generated data, users must provide and share their training data, which can be expensive or privacy-violating. We are exploring ways to augment state-of-the-art approaches, like federated learning, with better security/privacy. We are also developing brand-new distributed ML approaches that do away with centralization. We are collaborating with researchers in the UBC medical school and VGH, and with patient groups in the city, to come up with technologies that are sensitive to users' privacy constraints and solve real problems. Our work is open source and aims to provide practical alternatives to today's systems, which provide minimal privacy guarantees to patients. We are looking for 1-3 students who have a background in ML, databases, networks, or distributed systems. If you're interested in technologies to improve healthcare, then this project is for you!

Project 2. Better resource scheduling in the cloud

We rely on cloud services for most of our daily activities on and off the web. A challenge for cloud providers is to efficiently utilize their data centers, which house hundreds of thousands of servers. A common technique to multiplex server resources across multiple users is to isolate each user's compute requirement as a Virtual Machine (VM). Thus, the cloud resource allocation challenge is equivalent to placing VMs on servers based on some objective, such as maximizing a datacenter's utilization. In this project we are developing new algorithms and systems to improve resource scheduling in the cloud. Some algorithms rely on training ML models to predict the lifetime of a user's VM; others rely on heuristics that pack related VMs closer to each other. Our work is open source and we are working with cloud providers to deploy our algorithms into production data centers. For this project we are looking for 1-2 students who have a background in algorithms, ML, and networks or distributed systems. A longer description of the project is posted here. If you're interested in algorithms and cloud computing, then this project is for you!
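As a toy illustration only (not our production algorithm), the sketch below places each VM on the feasible server with the tightest fit, with a secondary preference for servers whose resident VMs have similar predicted lifetimes so that machines empty out together:

```python
# Toy best-fit VM placement sketch; the scoring rule is illustrative only.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class VM:
    cores: int
    predicted_lifetime_h: float

@dataclass
class Server:
    capacity: int
    vms: List[VM] = field(default_factory=list)

    def free(self) -> int:
        return self.capacity - sum(vm.cores for vm in self.vms)

def place(vm: VM, servers: List[Server]) -> Optional[Server]:
    feasible = [s for s in servers if s.free() >= vm.cores]
    if not feasible:
        return None
    def score(s: Server) -> tuple:
        lifetimes = [v.predicted_lifetime_h for v in s.vms]
        mismatch = abs(sum(lifetimes) / len(lifetimes) - vm.predicted_lifetime_h) if lifetimes else 0.0
        return (s.free(), mismatch)          # tightest fit first, then lifetime affinity
    best = min(feasible, key=score)
    best.vms.append(vm)
    return best

servers = [Server(capacity=32) for _ in range(4)]
for cores, life in [(8, 2.0), (16, 48.0), (4, 2.5), (8, 50.0)]:
    place(VM(cores, life), servers)
print([s.free() for s in servers])   # remaining cores per server
```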

 

Ian Mitchell

All projects below list a small number of required skills and a larger number of "useful" skills.  Do not be discouraged by the latter -- applicants are not required to have any of those skills, and most applicants will have no more than a few.

For all positions applicants must be prepared to work independently, to express themselves clearly in discussions, and to give at least two brief presentations to the research group.

If you are interested in applying for one or more of these positions, please email ian.mitchell@ubc.ca with your resume/CV and a transcript.  The subject should be "USRA application".  Please state in your email which project(s) you are interested in (numerical software, wheelchair and/or racer).

Project 1. Numerical software for analyzing cyber-physical systems

Cyber-physical systems are those which involve interaction between computers and the external world, and include many safety critical systems such as aircraft, cars, and robots.  Analysis of these systems typically uses differential equation models for the physical component of the system, because its state evolves in continuous time and space.  Reachability algorithms can be used to verify -- or even synthesize controllers to ensure -- the correct behavior of dynamic systems, and a variety of such algorithms have been designed for differential equation models.  The goal of this project is to demonstrate a new example on, improve the user interface of, validate the implementation of, parallelize and/or add features to one of several software packages used for approximating sets of solutions in order to demonstrate the correctness of robotic or cyber-physical systems.  The Toolbox of Level Set Methods [http://www.cs.ubc.ca/~mitchell/ToolboxLS] is a locally developed example, but others include JuliaReach [https://github.com/JuliaReach], CORA [http://www6.in.tum.de/Main/SoftwareCORA] and SpaceEx [http://spaceex.imag.fr/].  Applicants should be comfortable with numerical ODE solvers (for example, CPSC 303 or Math 405) and a numerical computing language (such as Matlab, SciPy, or Julia).  Familiarity with computational optimization (such as CPSC 406), parallel programming (such as CPSC 418), machine learning (such as CPSC 330 or 340), or numerical partial differential equations (such as Math 405) would be useful for some but not all potential subprojects.
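As a crude illustration of the problem setting (and not a substitute for the rigorous set-based methods above), one can approximate a reachable set by sampling initial conditions and integrating the differential equation forward, as in the SciPy sketch below:

```python
# Crude sampling-based picture of a reachable set: integrate many initial
# conditions forward and inspect where the trajectories end up. Level-set
# methods such as the Toolbox of Level Set Methods compute this rigorously.
import numpy as np
from scipy.integrate import solve_ivp

def dynamics(t, x):
    # Simple 2D example: a lightly damped rotation.
    return [x[1], -x[0] - 0.1 * x[1]]

rng = np.random.default_rng(0)
initial_box = rng.uniform(low=[-1.0, -1.0], high=[1.0, 1.0], size=(500, 2))

T = 2.0
endpoints = np.array([
    solve_ivp(dynamics, (0.0, T), x0, rtol=1e-8).y[:, -1]
    for x0 in initial_box
])
# A bounding box of the sampled endpoints gives a rough picture of the
# reachable set at time T.
print(endpoints.min(axis=0), endpoints.max(axis=0))
```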

Project 2. Collaborative control scheme design, simulation and testing for a smart wheelchair

As part of the AGE-WELL Network Center of Excellence [http://www.agewell-nce.ca] I have a project investigating techniques which would allow elderly individuals with mild cognitive and/or sensory impairments to better use powered wheelchairs.  While it is relatively easy to implement a system in which either the user or the robotic planner chooses the motion of the wheelchair, it is much more challenging to blend these two inputs in real-time and in a manner which is both safe and non-threatening to a cognitively impaired user.  As part of this process, the team runs user studies with the target population and their therapists in long term care centers.  Potential goals for this summer's project include ongoing prototype development and evaluation of collaboration and training interfaces and control policies, development and evaluation of learning methods for predicting behavior of the chair and/or user, data collection and analysis from real-world or virtual trials, or setting up a virtual reality workstation for trials of collaboration control policies.  Applicants should be comfortable with C++, Matlab and/or Python, and will be expected to learn and use ROS (robot operating system) to program the wheelchair(s).  Familiarity with human-computer interaction (such as CPSC 344), computational optimization (such as CPSC 406), machine learning (such as CPSC 330 or 340), computer vision (such as CPSC 425) or electronics would be useful for some but not all potential subprojects.
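As a simple illustration of the blending problem (a common baseline, not the project's final policy), the sketch below mixes the user's and planner's velocity commands, shifting authority toward the planner as the estimated collision risk grows:

```python
# Minimal shared-control sketch: linear blending of user and planner commands,
# with authority shifting toward the planner as estimated risk grows.
import numpy as np

def blend(user_cmd: np.ndarray, planner_cmd: np.ndarray, risk: float) -> np.ndarray:
    """user_cmd, planner_cmd: [linear_velocity, angular_velocity]; risk in [0, 1]."""
    alpha = 1.0 - np.clip(risk, 0.0, 1.0)       # authority retained by the user
    return alpha * user_cmd + (1.0 - alpha) * planner_cmd

user = np.array([0.8, 0.0])      # user wants to drive straight ahead
planner = np.array([0.3, 0.5])   # planner wants to slow down and turn away
print(blend(user, planner, risk=0.7))   # mostly the planner's command
```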

Project 3. Software development, motion modeling, sensor integration, collaborative control scheme design, and testing for a highly dynamic ground robot

The wheelchair project focuses on a collaborative control problem where the vehicle is large but slow moving, the environment is unconstrained, the user is untrained, and the interface options are limited.  At the other end of the spectrum, autonomous racing vehicles are small and fast, the environment is more constrained, users can be highly trained, and it is possible to design richer interfaces. Potential platforms include the NVIDIA Jetracer [https://github.com/NVIDIA-AI-IOT/jetracer] or F1tenth [http://f1tenth.org/].  Goals for this project include characterizing the racer(s) physics, integrating sensors, designing a collaborative control scheme, designing an interface, and testing the results.  Applicants should be comfortable with C++, and will be expected to learn and use ROS (robot operating system) to control the vehicles.  Familiarity with machine learning (such as CPSC 330 or 340), computer vision (such as CPSC 425), human-computer interaction (such as CPSC 344), computational optimization (such as CPSC 406), parallel programming (such as CPSC 418), system identification, electronics, mechatronics or autonomous vehicles would be useful for some but not all potential subprojects.

 

Khanh Dao Duc

Project 1. Building an interface to classify and cluster ribosome components from the Protein Data Bank (PDB)

My group is looking for a CS student to use bioinformatics tools and implement computational methods for analyzing molecular structures found by cryo-EM or X-ray crystallography. In the context of my recent research, there is a need to study a family of these structures, called ribosomes. Ribosomes are the molecular machines that mediate protein translation, one of the most fundamental processes underlying life. Since the ribosome is made of many different proteins, a key question is to understand differences in composition across species, and to have the tools to automate such comparisons. Over the past few years, many new ribosome structures have been discovered and publicly shared through the Protein Data Bank. The goal will be to develop the tools and an interface that allow the user to compare and visualize all these structures and the proteins (~80 for each structure) that constitute a ribosome, to identify homologous proteins and clusters of proteins that are close in space, to label them according to their position, and to implement general methods for geometric comparison.
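As a rough sketch of one building block (with a placeholder file name and threshold), the snippet below uses Biopython and SciPy to compute a centroid per chain of a ribosome structure and group chains that are close in space:

```python
# Minimal sketch of the spatial-clustering step: parse a ribosome structure
# (mmCIF), compute a centroid per chain, and group chains that sit close
# together in space. The file name and distance threshold are placeholders.
import numpy as np
from Bio.PDB import MMCIFParser
from scipy.cluster.hierarchy import fcluster, linkage

parser = MMCIFParser(QUIET=True)
structure = parser.get_structure("ribosome", "example_ribosome.cif")  # hypothetical file

chain_ids, centroids = [], []
for chain in structure[0]:                      # first model
    coords = np.array([atom.coord for atom in chain.get_atoms()])
    if len(coords):
        chain_ids.append(chain.id)
        centroids.append(coords.mean(axis=0))

# Agglomerative clustering of chain centroids: chains whose centroids are
# within ~60 Angstroms end up in the same spatial cluster.
Z = linkage(np.array(centroids), method="average")
labels = fcluster(Z, t=60.0, criterion="distance")
for cid, lab in zip(chain_ids, labels):
    print(cid, lab)
```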

I am currently working on an invited review on the heterogeneity of cryo-EM structures, in collaboration with Dr. Frederic Poitevin (Stanford), to be submitted in spring, so some results from this project will be included in the review (with the student added as a co-author). We also expect this project to lead to a bioinformatics tool and a subsequent paper in bioinformatics, describing how the tool can be used to quantitatively study ribosome structures.

Reference: Dao Duc et al., 2019, Nucleic Acids Research, https://academic.oup.com/nar/article/47/8/4198/5364857

 

Kevin Leyton-Brown

Project 1. Improving efficiency and student satisfaction in peer grading systems

Peer grading has the potential to improve educational outcomes in three main ways: (i) it makes educational systems more scalable by offloading some grading work to students, (ii) it provides students with faster and more detailed feedback, and (iii) it helps students to learn better through thinking critically about the work of others. Mechanical TA2 (MTA2) is a web-based application that facilitates peer grading. The newest version is just now receiving its first uses but is still a work in progress. The initial use of MTA2 provided evidence that while students benefited from doing the peer reviews, they doubted the quality of the grades they received from their peers. The focus of this project is to study the effectiveness of the current MTA2 design and to improve it as much as possible.

Student’s role in the project: 

1. Investigating the data gathered through MTA2’s initial use and searching for the sources of inefficiency that caused student dissatisfaction in the course (e.g., the way that peer grades are aggregated, common trends in the way students grade, or common patterns in the grades they received).

2. Coming up with solutions to address the inefficiencies in the system’s design (e.g., suggesting a better grade aggregation method).

3. Implementing new features in MTA2 based on the obtained solutions to improve the system’s efficiency. 

Skills required for the project: data analysis, programming, and the ability to work independently.

 

Margo Seltzer

Project 1.

Reproducibility is a crucial tenet of research. Scientists can freely share their code and data to help increase the repeatability of their experiments. However, even with the scripts used by the original authors, executing them is not guaranteed to produce the same results. Provenance, a history of the execution of a script, can assist in increasing the reproducibility of these analyses. Previous work, containR, used provenance with the R language to bundle scripts and data into containers, improving their repeatability. This work will build on containR for Python scripts. Using provenance collected at the source code level, the student will assess analyses and build a tool to improve the reproducibility of new and existing experiments.

Project 2.

In CPSC 313, we use the y86 language to teach about machine architecture.

It would be convenient to have a compiler that generates y86 code. In theory, developing a backend for LLVM should be straightforward. In practice, doing so is not entirely trivial and will require some creativity.  If you would like an opportunity to build a production compiler backend, this project is for you!

 

Reid Holmes

Project 1. Evaluating Quality Impact of Static Analysis Tools

Software systems are complex entities built by teams of engineers. Teams frequently require that source code be written according to a set of pre-defined rules to increase its consistency across team members. Automated tools such as lint and checkstyle are commonly used to statically check that these rules are followed, and changes are usually disallowed from being added to version control unless they comply. While the primary goal of these rules is to improve the overall understandability of the system, often-stated tangential goals are to improve the quality and evolvability of the system. While the consistency enforced by style conventions clearly makes it easier to understand a system that has been created by a large team, the evidence for quality and evolutionary benefits has not been clearly established.

This project will investigate whether the introduction of common linting rules aimed at improving quality (such as maximum file or method length) has any impact on the overall evolvability or quality of a system. The project will examine both OSS projects as well as locally-available educational resources for which we have a rich collection of lint-related projects.

 

Andy Roth

Project 1. Improving computational workflows for the analysis of spatial transcriptomic data

Emerging technologies are allowing biologists to generate increasingly large and complex datasets. One exciting new area of technology is spatial omics methods, which leverage advanced micro-fluidics and high-resolution microscopy to measure the abundance of RNA and proteins in cells in 2D. The Roth lab is working with other scientists at BC Cancer to develop computational methods to analyze this data. This project will focus on the analysis of data generated by a protocol called “Multiplexed error-robust fluorescence in situ hybridization” (MERFISH).

MERFISH analysis generates a large number of high-resolution microscopy images, which go through a sophisticated computational analysis pipeline. Key steps in this pipeline include alignment and stitching of images, identification of spots representing RNA, segmentation of cell boundaries, and “decoding” of the RNA barcodes. The Roth lab has implemented a Python-based pipeline based on the original MERFISH paper. The successful applicant will be responsible for maintaining this pipeline and implementing new features under the supervision of Dr. Roth and his graduate students. Knowledge of best practices in software engineering, such as unit testing, version control, and continuous integration, is required. Previous experience with image analysis and machine learning, and familiarity with Python, would all be assets.
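As a minimal illustration of one pipeline step (RNA spot detection) on synthetic data, the scikit-image sketch below applies a Laplacian-of-Gaussian blob detector to a single field of view; the parameters are placeholders, and the real pipeline also handles registration, segmentation, and barcode decoding:

```python
# Minimal spot-detection sketch with scikit-image's Laplacian-of-Gaussian
# blob detector on a synthetic field of view. Parameters are placeholders.
import numpy as np
from skimage.feature import blob_log

# Hypothetical 512x512 field of view with a few bright diffraction-limited spots.
rng = np.random.default_rng(0)
image = rng.normal(100, 5, size=(512, 512))
for y, x in [(100, 120), (300, 310), (400, 50)]:
    image[y - 2:y + 3, x - 2:x + 3] += 300.0

# Normalize to [0, 1] before detection.
image = (image - image.min()) / (image.max() - image.min())

spots = blob_log(image, min_sigma=1, max_sigma=3, num_sigma=5, threshold=0.1)
print(spots[:, :2])   # (row, col) coordinates of detected spots
```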

Project 2. Non-parametric Bayesian models for cancer stratification

It is increasingly apparent that cancer is a collection of related diseases, caused by cells with differing mutational and gene expression profiles. Understanding these differences and stratifying cancers into subtypes has important implications for the treatment of patients. Current approaches which apply high-throughput genomics data to stratify subtypes have been restricted to analysing cancers from a single anatomical region. However, there is now a large quantity of data from multiple types of cancer which could be analysed jointly. A major impediment to this task is that the cancer samples tend to cluster together primarily by their cell or anatomical region of origin. Thus a joint cluster analysis is not significantly more informative than considering each type of cancer separately.

This project will explore the use of feature allocation models instead of clustering models to perform molecular stratification of cancers across multiple types. This should allow for latent features that explain the dominant signal from the tissue of origin, with additional features explaining important alterations causing malignancy. By allowing these features to be shared across cancers from multiple types, we expect to simultaneously gain statistical power as well as identify molecular subtypes which span multiple cancer types.

The successful applicant will work with Dr. Roth and his graduate students developing a generative model based on the non-parametric Bayesian Indian Buffet Process (IBP) for analysing gene expression data from multiple cancer types. The student will help implement inference procedures for the model using Markov Chain Monte Carlo and potentially variational methods. In addition to developing experience in statistical modelling and computation, the student will also gain a basic knowledge of high throughput genetic sequencing data and its impact in oncology and clinical cancer genomics. The student will apply the model to real world data from the International Cancer Genome Consortium, which has generated gene expression data for ~9,000 samples from 16 types of cancer. A strong mathematical background with at least one course in probability is required. Students who have taken courses in computational statistics and machine learning would be good candidates. Familiarity with Python and numerical libraries such as numpy, scipy, numba and pandas would be an asset. A basic knowledge of biology is also required (i.e., what is RNA?).
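For intuition about the prior involved, the sketch below samples a binary feature-allocation matrix from the Indian Buffet Process: customer n takes each existing dish k with probability m_k/n and then tries Poisson(alpha/n) new dishes. This is only the generative process, not the inference procedure the student would implement:

```python
# Minimal sketch of sampling a binary feature-allocation matrix Z from the
# Indian Buffet Process prior (rows = samples, columns = latent features).
import numpy as np

def sample_ibp(num_samples: int, alpha: float, rng=None) -> np.ndarray:
    rng = rng or np.random.default_rng()
    Z = np.zeros((0, 0), dtype=int)
    for n in range(1, num_samples + 1):
        counts = Z.sum(axis=0)                        # m_k for existing features
        old = (rng.random(Z.shape[1]) < counts / n).astype(int)
        new = np.ones(rng.poisson(alpha / n), dtype=int)
        row = np.concatenate([old, new])
        Z = np.pad(Z, ((0, 0), (0, len(new))))        # add columns for new features
        Z = np.vstack([Z, row])
    return Z

Z = sample_ibp(num_samples=10, alpha=2.0, rng=np.random.default_rng(0))
print(Z.shape, Z.sum(axis=0))   # samples x latent features, and feature popularity
```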

Project 3. Single cell multi-omics data

Single cell sequencing technologies are revolutionising cancer research. Biologists are now able to measure different features such as DNA, RNA and methylation of 100s-1000s of cells. However, we can typically only make one type of measurement per cell as the measurement process is destructive. As a result it is challenging to relate measurements of different features. Below are two potential projects related to this field.

Project 3a: Analysis of multi-omics data

The Roth lab is working with collaborators at BC Cancer who have generated extensive single cell datasets. This sub-project will focus on the analysis of the data. As such, it requires a strong biological background and would be ideal for a student interested in pursuing graduate studies in Bioinformatics. The successful applicant will work with Dr. Roth and his graduate students to analyse single cell datasets and integrate the different measurements to develop biological hypotheses. A background in statistics or machine learning is required. In particular, familiarity with dimensionality reduction techniques such as PCA, as well as regression techniques, is required. The applicant must also be familiar with data science workflows in either R or Python. Previous bioinformatics experience is desirable.

Project 3b: Methods for integrating multi-omics data

The current standard for integrating multi-omics data is driven by independent analysis of each data type and subsequent manual integration by domain experts. This approach is laborious due to the scale of the datasets, and lacks statistical power as each type of data is treated independently. This project will focus on developing statistical and machine learning methods for automating the integration of single cell multi-omics data. A strong mathematical background with at least one course in probability is required. Students who have taken courses in computational statistics and machine learning would be good candidates. Familiarity with Python and numerical libraries such as numpy, scipy, numba and pandas would be an asset. A basic knowledge of biology is also required (i.e., what is RNA?).

Summer 2019

Giuseppe Carenini

Project 1. Interpreting and Comparing NLP models.

The student will be involved in a project at the intersection of natural language processing (NLP) and information visualization (InfoVis). In my research group, we are working on several NLP frameworks for tasks like discourse parsing, text summarization and topic modelling. While developing such systems, it is often useful to compare the output of different methods to assess which one is performing better and why. The student will contribute to the design, implementation and testing of one or more components of an intelligent interface to support the interpretation and comparison of the output of our NLP models.

Project 2. Design and Evaluate Models for Discourse Parsing

The student will be assigned to a research project within the areas of Natural Language Processing (NLP) and Deep Learning (DL). He/She will be mainly working on the open research problem of large-scale discourse parsing, extending recent research in my group. In the NLP group, we are working on several diverse NLP tasks such as text summarization, topic modelling as well as the fundamental task of discourse parsing, which has shown to be beneficial for many downstream tasks such as machine translation and sentiment analysis. To design, implement and evaluate models for discourse parsing, the student will use his/her working knowledge of probabilistic and connectionist machine learning methodologies to help realize systems based on deep neural networks and probabilistic frameworks. The student will be required to work as part of the research team and justify design decisions, test his/her implementations and critically reflect on embedding literature.

Cristina Conati

Project 1. Delivering adaptive interventions for personalization of intelligent user interfaces based on user's eye-tracking patterns

The overall goal of this project is to devise intelligent user interfaces that can track their users’ gaze and use it to predict user states (e.g., user attention level, confusion, engagement, cognitive abilities) relevant to automatically personalizing the interaction. We have developed in our research lab a platform that can track a user's gaze in real time, while the user is processing the interface, and trigger personalized interventions based on the user's gaze behaviors.

We are looking for an undergraduate research assistant to implement personalized interventions in visualization-based user interfaces, using our platform. Specifically, the student will implement graphical interventions that can be delivered to users during specific interaction tasks, based on their gaze behaviors. The student may also assist with implementing functionalities for processing user gaze information, and with training machine learning algorithms to predict specific user states.
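As a purely illustrative sketch (synthetic data, hypothetical features), predicting a user state such as confusion from summary gaze features can be set up as a standard classification problem:

```python
# Minimal sketch of gaze-based user-state prediction on synthetic data;
# the features, labels, and model choice are all hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 200
fixation_duration = rng.normal(250, 60, n)     # mean fixation duration (ms)
fixation_rate = rng.normal(3.0, 0.8, n)        # fixations per second
region_transitions = rng.poisson(12, n)        # transitions between interface regions

X = np.column_stack([fixation_duration, fixation_rate, region_transitions])
# Synthetic labels: long fixations plus many transitions loosely indicate confusion.
confused = (fixation_duration + 8 * region_transitions + rng.normal(0, 40, n)) > 350

clf = RandomForestClassifier(n_estimators=100, random_state=0)
print(cross_val_score(clf, X, confused, cv=5).mean())   # cross-validated accuracy
```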

The student will work closely with PhD student Dereck Toker and Postdoctoral Fellow Sebastien Lalle.

Qualifications:

  • 3rd or 4th year undergraduate student in computer science
  • knowledgeable in Python and JavaScript
  • can document and maintain code on Github

Charlotte F. Fischer

Project 1. Atomic Structure Software

In quantum theory, every observable can be computed given the wave function, which is the solution of a partial differential equation of high dimensionality. For Uranium, with 92 electrons, the number of space variables is 276. Variational methods are used for determining approximate solutions. Software for atomic structure calculations based on non-relativistic theory (ATSP) was started at UBC in the 1960s, and software based on fully relativistic Dirac theory (GRASP) in the 1980s. The programming language for these codes is now FORTRAN77, and the only data structures used are vectors, matrices (arrays), and lists. The codes are published and used worldwide. But “modern” FORTRAN has many new features that make possible user-defined data types and totally new memory management procedures for more object-oriented software. For heavy elements that make excessive demands on both memory and computation, parallel versions are essential. This project will explore the development of user-defined data structures for atomic physics, leading to more readable and more efficient research software. Applicants should be sufficiently familiar with numerical computing and programming languages to be able to learn FORTRAN quickly, and be able to use the Git version control system.

Ian Mitchell

Project 1. Exploring Mutation Testing as a Method of Numerical Software Verification

A key challenge when testing numerical software is that exact oracles (against which to test) do not exist: even if one can construct a test case free of approximation, discretization, and truncation error, floating-point error is present and typically difficult to quantify for all but the simplest of calculations.  Common workarounds typically involve thresholds for "close enough", but choosing the threshold is a somewhat arbitrary process.  The goal of this project is to explore whether ideas from mutation testing might yield more rigorous criteria with which to build confidence in a code.  In the normal application of mutation testing, "target code" which passes all existing tests is subject to random source modification; if the resulting "mutant code" executes and is not caught by at least one test case, then the test coverage is incomplete.  This project will explore whether an alternative measure of the mutants might usefully be deployed to catch real bugs in publicly released scientific computing codes.  Applicants should be familiar with numerical computing (for example, CPSC 302, 303, 406 or Math 307, 405) and the use of the git version control system.  Familiarity with one or more open-source scientific computing libraries and/or software testing methods is useful but not required.
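The toy sketch below illustrates the tension described above: hand-written "mutants" of a trapezoid-rule integrator are killed by a tolerance-based test when the tolerance is tight, but survive when it is loose. (Real mutation testing generates mutants automatically from the source.)

```python
# Toy mutation-testing loop for a numerical routine; purely illustrative.
import math

def trapezoid(f, a, b, n):
    """Target code: composite trapezoid rule."""
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))

# Hand-written mutants of the target.
def mutant_drop_half(f, a, b, n):          # forgets the 1/2 endpoint weights
    h = (b - a) / n
    return h * (f(a) + sum(f(a + i * h) for i in range(1, n)) + f(b))

def mutant_off_by_one(f, a, b, n):         # skips the last interior node
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n - 1)) + 0.5 * f(b))

def test(impl, tol):
    """Integrate exp on [0, 1]; the exact answer is e - 1."""
    return abs(impl(math.exp, 0.0, 1.0, 1000) - (math.e - 1.0)) <= tol

for tol in (1e-5, 1e-2):
    killed = [not test(m, tol) for m in (mutant_drop_half, mutant_off_by_one)]
    print(f"tol={tol}: target passes={test(trapezoid, tol)}, mutants killed={killed}")
# With the tight tolerance both mutants are killed; with the loose one both survive.
```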

Project 2. Numerical Software Development for Differential Equations

Cyber-physical systems are those which involve interaction between computers and the external world, and include many safety critical systems such as aircraft, cars, and robots.  Analysis of these systems typically uses differential equation models for the physical component of the system, because its state evolves in continuous time and space.  Reachability algorithms can be used to verify -- or even synthesize controllers to ensure -- the correct behavior of dynamic systems, and a variety of such algorithms have been designed for differential equation models.  The goal of this project is to demonstrate a new example on, improve the user interface of, validate the implementation of, parallelize and/or add features to one of several software packages used for approximating sets of solutions in order to demonstrate the correctness of robotic or cyber-physical systems.  The Toolbox of Level Set Methods [http://www.cs.ubc.ca/~mitchell/ToolboxLS] is a locally developed example, but others include SpaceEx [http://spaceex.imag.fr/] or CORA [http://www6.in.tum.de/Main/SoftwareCORA].  Applicants should be familiar with numerical ODE solvers (for example, CPSC 303 or Math 405) and Matlab (or SciPy or Julia).  Familiarity with computational optimization (such as CPSC 406) and/or parallel programming (such as CPSC 418) would be useful for some but not all potential subprojects.

Project 3. Collaborative control scheme design, simulation and testing for a smart wheelchair

As part of the AGE-WELL Network Center of Excellence [http://www.agewell-nce.ca] I have a project investigating techniques which would allow elderly individuals with mild cognitive and/or sensory impairments to better use powered wheelchairs.  While it is relatively easy to implement a system in which either the user or the robotic planner chooses the motion of the wheelchair, it is much more challenging to blend these two inputs in real-time and in a manner which is both safe and non-threatening to a cognitively impaired user.  As part of this process, the team runs user studies with the target population and their therapists in long term care centers.  Potential goals for this summer's project include ongoing prototype development and evaluation of collaboration and training interfaces and control policies, development and evaluation of learning methods for predicting behavior of the chair and/or user, data collection and analysis from real-world or virtual trials, or setting up a virtual reality workstation for trials of collaboration control policies.  Applicants should be familiar with C++, Matlab and/or Python, and will be expected to learn and use ROS (robot operating system) to program the wheelchair(s).

Project 4. Design of wheelchair control system

As part of the smart wheelchair project we have developed a prototype embedded system to interface with the wheelchair and allow remote and computer moderated control.  This summer project is focused on redesigning this prototype to reduce complexity and replace commercial products that are no longer available.  The current system uses an Arduino, a small custom analog interface board, a laptop and the old Playstation Move's navigation controller.  The goal of this project is to identify a wireless one-handed controller to replace the Move, identify a microcontroller and/or embedded microprocessor which can replace the current Arduino and accomplish the basic tasks currently performed on the laptop, and possibly replace the custom analog board, as well as packaging the solution robustly enough to survive extended user trials.  Applicants should have experience with digital and/or analog electronics and embedded platforms (such as Arduino, BeagleBoard, RaspberryPi, ...), and will be expected to learn and use ROS (robot operating system) on the software side.

David Poole

Project 1. AIspace2 

AIspace (see http://aispace.org) has been developed by USRA students and grad students over a number of years. Recently, USRA students created a Python/JavaScript version (https://aispace2.github.io/AISpace2/install.html) based on open-source AI algorithms in Python (https://artint.info/AIPython/) created by David Poole and Alan Mackworth. The aim is to integrate the code with interactive visualizations that are easy for students to extend and allow the students to modify the AI algorithms. There are three aspects of the project: the first is to make the current AIspace2 code more modifiable and user-friendly. The second is to translate interactive demos (https://artint.info/demos/) to Python. The third is to develop similar tools for the rest of the AIPython code. Skills required: knowledge of the content of CPSC 322, proficiency in JavaScript and Python, the ability to write clear and simple code.

Margo Seltzer

Project 1. A Parallel Implementation of Certifiably Optimal Decision Trees

Rule lists are a kind of interpretable machine learning model. They can be viewed as a sequence of cascading if statements. For example, there is a widely used data set for predicting whether an individual is likely to commit a crime within the next two years. The following rule list makes that prediction: if (age = 18-20) and (gender = male) then predict yes, else if (age = 21-23) and (priors = 2-3) then predict yes, else if (priors > 3) then predict yes, else predict no. Not only is this a real predictive model, but we have also proven that it is the best model that can be constructed, given a regularized loss function. In prior work we presented an algorithm and implementation to produce such optimal rule lists.
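
Written out as code, that rule list is literally a chain of cascading if statements; the following is a direct transcription (with a simplified input encoding invented for illustration).

    # Direct transcription of the rule list described above.
    # Inputs are simplified for illustration: age and priors are integers,
    # gender is a lowercase string.
    def predict(age, gender, priors):
        if 18 <= age <= 20 and gender == "male":
            return "yes"
        elif 21 <= age <= 23 and 2 <= priors <= 3:
            return "yes"
        elif priors > 3:
            return "yes"
        else:
            return "no"

    print(predict(age=19, gender="male", priors=0))    # yes (first rule fires)
    print(predict(age=40, gender="female", priors=1))  # no (falls through to the default)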

A rule list is a specific kind of a more general model called a decision tree. In more recent work, we have developed an algorithm to produce similarly optimal decision trees. However, as the search space of decision trees is significantly larger, this is a much more challenging problem. Our goal for this research project is to develop a parallel implementation of an optimal decision tree algorithm that will allow us to find optimal models for an increasingly large number of features.

Dongwook Yoon

Project 1. Mixed-Reality Interfaces for Asynchronous Collaboration in 3D Environments

This study aims to build a virtual/augmented reality (VR/AR) system for annotating a 3D environment with recordings of multimodal interactions (e.g., speech, bodily gesture, gaze), drawing on human-computer interaction approaches. Annotations are basic building blocks of asynchronous collaboration in a shared workspace (e.g., a game director giving feedback to a level designer on a 3D map by commenting on it). However, existing AR annotation interfaces rely primarily on static content (e.g., text, mid-air drawing), which is not as nuanced nor as expressive as in-person communication where people can talk, gaze, and gesture. To enrich and expand communicative capacities of AR annotations, I envisage an AR counterpart of email or Google Docs, where collaborators can record their multimodal performances (e.g., voice, view changes, and hand movements) in a 3D environment and share such rich media-based messages back and forth with other parties. The challenges are as follows: (1) developing an easy-to-use interface for creating and editing the recorded multimodal annotation, (2) designing lightweight interactions for browsing and skimming multimodal recordings, and (3) helping users overcome psychological barriers in recording multimodal inputs (e.g., speech anxiety).

Qualifications

  • Successful completion of the introductory computer graphics courses (e.g., CPSC 314)
  • (optional) Successful completion of the introductory HCI courses (e.g., CPSC 344 or 544)

Project 2. Natural User Interactions for Video Interfaces

This project aims to build and study novel interface techniques for interacting with videos. As we watch videos daily on MOOCs, YouTube, and SNS, video has become a central medium for education, entertainment, and social interactions.  However, the way we interact with videos has remained the same for decades. How can we go beyond a slider-bar and thumbnails? To support dynamic, semantic, and visual interactions for video browsing, searching, and skimming, we will (1) develop novel interaction metaphors, (2) leverage speech and video recognition techniques, and (3) employ natural interaction capacities of modern interactive devices (e.g., touch and gesture of tablets).

Qualifications

  • Strong technical skills including OOP, data structures, and algorithms.
  • (optional) Successful completion of the introductory HCI courses (e.g., CPSC 344 or 544)

Summer 2018 

Ivan Beschastnikh

Project 1: faster and stronger data center systems

You are curious about where your gmail and facebook messages are stored. You have taken core systems courses. And, you don't mind spending the summer learning about how data centers work and how to make them more reliable and how to make them go faster. You like performance and you are (appropriately) angry at Intel and co. about Spectre and Meltdown.

Project 2: reliable and correct distributed systems

You have come to realize that most software uses the network and that means it is distributed, relying on parts unknown, and always communicating. You have a hunch that distribution means complexity and complexity means bugs. You don't like bugs. You want to understand the pain points and help improve the life of future developers by empowering them with tools and methodologies. You like the sound of software engineering for software engineers.

Project 3: distributed and private machine learning

Cristina Conati and Dr. Lalle

Project: Predicting relevant user states via eye-tracking in intelligent user interfaces

The overall goal of this project is to devise intelligent user interfaces that can track their users’ gaze and use it to infer user states (e.g., user attention level, confusion, engagement, cognitive abilities) relevant to automatically personalizing the interaction.

Kevin Leyton-Brown

Chris Cameron at cchris13@cs.ubc.ca 

Project: Advances in Empirical Algorithms

Designing efficient algorithms for solving NP-complete problems is crucial for practical applications such as scheduling, planning, and spectrum auctions. Machine learning is becoming an important tool in algorithm design, automating design decisions that previously required extensive manual work by domain experts. The student will work on advancing automated algorithm design by contributing to one or both of two projects:

  1. Developing more “robust” models for algorithm selection: Algorithm selection is the problem of learning to select among a set of complementary heuristic algorithms based on historical runtime data. This project is inspired by recent observations that machine learning models for algorithm selection tend to capture “artificial” relationships in data. The models often perform very well on standard benchmarks but appear to exploit sampling bias and learn the benchmark generation process rather than the underlying structure determining algorithm performance. By investigating ideas from machine learning and game theory, the student will work on developing more “robust” and generalizable models. This may involve optimizing against adversarial changes to the data, altering the data generation process for benchmarks, and/or looking for causal relationships in the data. (A minimal sketch of the basic selection setup appears after this list.)
  2. Building a new algorithm configuration system: Algorithm configuration is the problem of learning high-performance algorithm parameters over a distribution of problem instances. This project will build on a recent theoretical model for algorithm configuration that has provable optimality guarantees and is potentially much more efficient in practice than state-of-the-art techniques. The student will be involved in thinking about practical considerations for software implementation of this theoretical work and run large-scale experiments to empirically investigate these ideas on real data.
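
As referenced in item 1, the basic algorithm-selection setup can be sketched as follows: fit one runtime model per solver on historical (instance features, runtime) pairs, then pick the solver with the smallest predicted runtime for a new instance. This is a toy sketch with synthetic data and invented feature names, not the group's actual system.

    # Toy algorithm selection: one learned runtime model per solver, then pick the
    # solver with the lowest predicted runtime. All data here is synthetic.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)
    n_instances, n_features = 500, 10
    X = rng.normal(size=(n_instances, n_features))   # made-up instance features

    # Synthetic runtimes for two hypothetical solvers with complementary strengths.
    runtimes = {
        "solver_a": np.exp(X[:, 0] + 0.1 * rng.normal(size=n_instances)),
        "solver_b": np.exp(-X[:, 0] + 0.1 * rng.normal(size=n_instances)),
    }

    models = {
        name: RandomForestRegressor(n_estimators=50, random_state=0).fit(X, np.log(y))
        for name, y in runtimes.items()
    }

    def select_solver(features):
        """Return the solver whose predicted (log) runtime is smallest."""
        preds = {name: m.predict(features.reshape(1, -1))[0] for name, m in models.items()}
        return min(preds, key=preds.get)

    print(select_solver(rng.normal(size=n_features)))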

What experience is necessary?

The student should have a basic understanding of machine learning concepts and should be competent in Java and Python. Knowledge of cluster computing, statistics, and game theory will be an asset.

Alan Mackworth and David Poole

Project: AIspace2 with Alan Mackworth and David Poole

AIspace (see http://aispace.org) has been developed by USRA students and grad students over a number of years. Last year's USRA student created a Python/JavaScript version (https://aispace2.github.io/AISpace2/install.html) based on open-source AI algorithms in Python (http://aipython.org) that we created. The aim is to integrate the code with interactive visualizations that are easy for students to extend and allow the students to modify the AI algorithms. There are two aspects of the project: the first is to make the current AIspace2 code more modifiable and user-friendly.  The second is to develop similar tools for the rest of the AIPython code. Skills required: proficiency in JavaScript and Python, the ability to write clear and simple code.

Karon MacLean and Paul Bucci

spin-info@cs.ubc.ca

Project 1: Haptic Database

Many force-feedback devices are proposed in the haptic (i.e., related to touch) literature. We are interested in collecting a list of these devices, extracting their properties, and establishing the links between these devices (e.g., ancestry links). This project has many aspects and can be broken down into a narrower project depending on a research student's interests. For example, we need to analyze a large corpus of 25 years of haptic literature, some of which are scanned documents, which could be good for someone interested in either Natural Language Processing or literature review.

Project 2: Haptic Pen

A "haptic pen" is a device that acts like a combination of a stylus and a force-feedback device. It's a pen that you can push around—but it pushes back. We need someone to extend an already-existing haptic I/O library/API to control this pen such that it can simulate interaction forces with a virtual environment. Some knowledge of mechatronics, machine learning, and virtual haptic environments is useful but certainly not necessary. C++ and Python are needed.

Project 3: Low-DOF robot construction

The CuddleBits are low-DOF furry robots designed for emotional interaction. Students will build robots and/or design emotionally-evocative behaviours. Especially needed are people who are skilled in the visual and performance arts, i.e., any of 3D design, graphic design, sculpture, clothing design, sewing, puppeteering, theatrical performance, voice acting, etc. Traditional mechatronics and programming skills are not needed but are welcome.

Project 4: Interactive biometric emotion modelling

Using biometric sensing technologies such as electroencephalography (EEG), heart rate monitoring, skin conductance, etc., we can develop models of human emotion states. We are currently looking for students who are interested in learning how to (a) use an EEG system and related biometrics to gather data; (b) run human subjects in emotional situations (e.g., playing a video game, interacting with robots); (c) build machine learning models that relate emotion states to biometric data. Expertise in either machine learning or human subjects studies is required (i.e., CPSC 340 or equivalent OR CPSC 344/444 or equivalent, but both not necessary).

Ian Mitchell

Project 1: Numerical Software Development for Differential Equations

Cyber-physical systems are those which involve interaction between computers and the external world, and include many safety critical systems such as aircraft, cars, and robots.  Analysis of these systems typically requires differential equation models for the physical component of the system, because its state evolves in continuous time and space.  Reachability algorithms can be used to verify -- or even synthesize controllers to ensure -- the correct behavior of dynamic systems, and a variety of such algorithms have been designed for differential equation models.  The goal of this project is to demonstrate on a new example, improve the user interface of, validate the implementation of, parallelize and/or add features to one of several software packages used for approximating sets of solutions in order to demonstrate the correctness of robotic or cyber-physical systems.  The Toolbox of Level Set Methods [/~mitchell/ToolboxLS/] is a locally developed example, but others include SpaceEx [http://spaceex.imag.fr/] or CORA [https://tumcps.github.io/CORA/].  Applicants should be familiar with numerical ODE solvers (for example, CPSC 303 or Math 405) and Matlab (or SciPy).  Familiarity with computational optimization (such as CPSC 406) and/or parallel programming (such as CPSC 418) would be useful for some but not all potential subprojects.

Project 2: Collaborative control scheme design, simulation and testing for a smart wheelchair

As part of the AGE-WELL Network Center of Excellence [http://www.agewell-nce.ca] I have a project investigating techniques which would allow elderly individuals with mild cognitive and/or sensory impairments to better use powered wheelchairs.  While it is relatively easy to implement a system in which either the user or the robotic planner chooses the motion of the wheelchair, it is much more challenging to blend these two inputs in real-time and in a manner which is both safe and non-threatening to a cognitively impaired user.  As part of this process, the team runs user studies with the target population and their therapists in long term care centers.  Potential goals for this summer's project include ongoing prototype development and evaluation of collaboration and training interfaces and control policies, development and evaluation of learning methods for predicting behavior of the chair and/or user, data collection and analysis from real-world or virtual trials, or setting up a virtual reality workstation for trials of collaboration control policies.  Applicants should be familiar with C++, Matlab and/or Python, and will be expected to learn and use ROS (robot operating system) to program the wheelchair(s).

Project 3: Design of wheelchair control system

As part of the smart wheelchair project we have developed a prototype embedded system to interface with the wheelchair and allow remote and computer moderated control.  This summer project is focused on redesigning this prototype to reduce complexity and replace commercial products that are no longer available.  The current system uses an Arduino, a small custom analog interface board, a laptop and the old Playstation Move's navigation controller.  The goal of this project is to identify a wireless one-handed controller to replace the Move, identify a microcontroller and/or embedded microprocessor which can replace the current Arduino and accomplish the basic tasks currently performed on the laptop, and possibly replace the custom analog board, as well as packaging the solution robustly enough to survive extended user trials.  Applicants should have experience with digital and/or analog electronics and embedded platforms (such as Arduino, BeagleBoard, RaspberryPi, ...), and will be expected to learn and use ROS (robot operating system) on the software side.

Dongwook Yoon

Project 1: Mixed-Reality Interfaces for Asynchronous Collaboration in 3D Environments

This study aims to build a virtual/augmented reality (VR/AR) system for annotating a 3D environment with recordings of multimodal interactions (e.g., speech, bodily gesture, gaze), drawing on human-computer interaction approaches. Annotations are basic building blocks of asynchronous collaboration in a shared workspace (e.g., a game director giving feedback to a level designer on a 3D map by commenting on it). However, existing AR annotation interfaces rely primarily on static content (e.g., text, mid-air drawing), which is not as nuanced nor as expressive as in-person communication where people can talk, gaze, and gesture. To enrich and expand communicative capacities of AR annotations, I envisage an AR counterpart of email or Google Docs, where collaborators can record their multimodal performances (e.g., voice, view changes, and hand movements) in a 3D environment and share such rich media-based messages back and forth with other parties. The challenges are as follows: (1) developing an easy-to-use interface for creating and editing the recorded multimodal annotation, (2) designing lightweight interactions for browsing and skimming multimodal recordings, and (3) helping users overcome psychological barriers in recording multimodal inputs (e.g., speech anxiety).

Qualifications

  • Successful completion of the introductory computer graphics courses (e.g., CPSC 314)
  • (optional) Successful completion of the introductory HCI courses (e.g., CPSC 344 or 544)

Project 2: Natural User Interactions for Video Interfaces

This project aims to build and study novel interface techniques for interacting with videos. As we watch videos daily on MOOCs, YouTube, and SNS, video has become a central medium for education, entertainment, and social interactions.  However, the way we interact with videos has remained the same for decades. How can we go beyond a slider-bar and thumbnails? To support dynamic, semantic, and visual interactions for video browsing, searching, and skimming, we will (1) develop novel interaction metaphors, (2) leverage speech and video recognition techniques, and (3) employ natural interaction capacities of modern interactive devices (e.g., touch and gesture of tablets).

Qualifications

  • Strong technical skills including OOP, data structures, and algorithms
  • (optional) Successful completion of the introductory HCI courses (e.g., CPSC 344 or 544)

Summer 2017

Ivan Beschastnikh

Project 1: Analysis of distributed systems

Mutation testing is a strategy for evaluating the efficacy of a test suite. You change, or mutate, the subject program and then see if the test suite finds the change (by failing). We are developing a mutation testing framework for distributed systems. If you like program analysis and/or distributed systems, then this project is for you.

Project 2: Cloud computing

OpenStack is a cloud platform that allows enterprises to run a private cloud. We are extending OpenStack to include several low-level features such as bandwidth guarantees between allocated VMs.

If you want to hack on cloud computing infrastructure, then this project is for you.

Project 3: System visualization tools

We are building a variety of visualization tools to help developers better understand the behavior of their systems. For example, TSViz is a tool for visualizing traces from multi-threaded systems: https://bitbucket.org . We are working on tools for performance debugging, improving comprehension, etc. If you enjoy data visualization/program analysis and want to contribute to open source tools to help developers, then this project is for you.

Giuseppe Carenini with David Poole

Further developing an interface to support group decision making

Group ValueCharts is a visual tool intended to support groups in making preferential choice decisions (i.e., select the best alternative out of a set, with respect to some criteria). This project will focus on extending Group ValueCharts to make it more robust and usable by exploring different visual encodings, layouts and interactive techniques. Revisions of the current prototype will be driven by feedback from current users and by A/B user studies that will need to be designed and run.

Cristina Conati

Improving and Testing an Eye Tracking Data Processing Library

Supervisor: Prof. Cristina Conati   - conati@cs.ubc.ca

The overall goal of this project is to devise intelligent user interfaces that can track their users’ gaze and use it to infer user states relevant to personalize the interaction (e.g., user attention level, confusion, engagement, cognitive abilities).

EMDAT (Eye Movement Data Analysis Toolkit) is a python library being developed in our research laboratory for analyzing user eye gaze and pupil data in terms of a wide variety of features that can be leveraged for predicting user states. We are looking for a USRA who will help  test and further develop EMDAT. EMDAT has dozens of configuration parameters which dictate how raw eye tracking data is processed. Given this complexity, we require help implementing extensive testing regarding the correctness of EMDAT functionality across many different parameter configurations, as well as compiling results on the impact of the range of each parameter setting within or between datasets. The student will then be involved in enhancements of the current EMDAT library,  based on the results of the testing process. The student will work closely with PhD student Dereck Toker (dtoker@cs.ubc.ca) and Postdoctoral Fellow Sebastien Lalle (lalles@cs.ubc.ca).
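
One plausible shape for that testing work is parameterized tests that sweep over configurations and assert invariants that should hold under every setting; the sketch below is purely illustrative, and the function and parameter names are hypothetical stand-ins rather than EMDAT's real API.

    # Hypothetical parameterized test sketch (run with pytest). The function
    # 'process_segment' and its parameters are invented stand-ins; real tests
    # would import the corresponding EMDAT entry points instead.
    import itertools
    import pytest

    def process_segment(samples, max_gap_ms, min_fixation_ms):
        """Placeholder for an EMDAT-style feature computation over gaze samples."""
        valid = [s for s in samples if s is not None]
        return {"num_valid": len(valid), "valid_ratio": len(valid) / max(len(samples), 1)}

    CONFIGS = list(itertools.product([50, 100, 200], [60, 100]))

    @pytest.mark.parametrize("max_gap_ms,min_fixation_ms", CONFIGS)
    def test_invariants_hold_for_all_configs(max_gap_ms, min_fixation_ms):
        samples = [1.0, None, 2.0, 3.0]
        features = process_segment(samples, max_gap_ms, min_fixation_ms)
        # Invariants that should hold regardless of the configuration.
        assert 0.0 <= features["valid_ratio"] <= 1.0
        assert features["num_valid"] <= len(samples)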

Michael Friedlander

Project 1: Collaborative Resistance

How do we measure the "distance" between two people in a network? We might borrow a notion from electrical networks: the electrical resistance between two components in a circuit can serve as a measure of how closely related two people are in a social network. This project aims to apply these ideas to compute the similarity between academics on the network implied by co-authorship of academic papers. The work involved in this research requires several steps: assembling a data set, implementing numerical algorithms to compute the resistance distances between authors on the network, and developing visualization tools that can be used to explore the results.
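
Concretely, one standard way to compute such resistance distances works through the graph Laplacian L: the resistance between nodes i and j is (e_i - e_j)^T L^+ (e_i - e_j), where L^+ is the Moore-Penrose pseudoinverse. A small NumPy sketch on a made-up toy graph (Python here only for illustration; the project itself suggests Julia or Matlab):

    # Resistance distance on a tiny toy graph via the Laplacian pseudoinverse.
    import numpy as np

    # Adjacency matrix of a made-up 4-author co-authorship graph
    # (edge weight = number of co-authored papers).
    A = np.array([[0, 2, 1, 0],
                  [2, 0, 1, 0],
                  [1, 1, 0, 3],
                  [0, 0, 3, 0]], dtype=float)

    L = np.diag(A.sum(axis=1)) - A   # weighted graph Laplacian
    L_pinv = np.linalg.pinv(L)       # Moore-Penrose pseudoinverse

    def resistance(i, j):
        e = np.zeros(len(A))
        e[i], e[j] = 1.0, -1.0
        return e @ L_pinv @ e        # (e_i - e_j)^T L^+ (e_i - e_j)

    print(resistance(0, 1))          # close collaborators: small distance
    print(resistance(0, 3))          # only indirectly connected: larger distance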

Applicants should have familiarity with a numerical-computing language (e.g., Julia or Matlab), a background in linear algebra (e.g., CPSC 302 or 402), and some experience with web development.

Project 2: Numerical algorithms in Julia

Software implementation of numerical algorithms is one of the very best ways of transferring mathematical ideas into practice. For software implementations to be useful, they need to be efficient and robust. The aim of this project is to assemble a library of numerical algorithms in Julia — a high-level, high-performance technical computing language (https://julialang.org/). This project requires someone already familiar with a high-level language (such as Python or Matlab), and interest in numerical algorithms and their implementation.

Thomas Fritz 

Project 1: Sensing Code Difficulty

Biometric sensor technology offers the opportunity to measure physiological features of a person, such as the pupil dilation or brain wave activity, that can then be linked to the person's cognitive and emotional states. Initial results from previous studies show that biometric data can be used to predict the code and task difficulty a software developer experiences while working. In this project, you will develop an approach that collects, processes and analyzes biometric data captured with an eye-tracker in real-time to predict the difficulty of the code elements a software developer is working with.

Project 2: Awareness of Interactions at Work

Interruptions of knowledge workers are common and can cause a high cost if they happen at inopportune moments. In-person interruptions are among the most costly kinds of interruption due to their high frequency and immediate nature. One way to reduce the high cost of in-person interruptions is to provide awareness to the knowledge workers. In this project, you will develop an approach that (a) uses a microphone to capture and identify in-person interruptions of a knowledge worker and (b) provides a visualization of the number of interruptions and the length of the interactions to the knowledge worker.

Alan Mackworth with David Poole

AIspace2 with Alan Mackworth and David Poole. AIspace (see http://aispace.org) has been developed by USRA students and grad students over a number of years. There are two aspects of the job. The first is to fix some of the bugs in the existing tools (written in Java). We have written many of the AI algorithms in open-source Python, and the second task is to integrate the code with interactive visualizations (e.g., in D3) that are easy for students to extend and allow the students to modify the AI algorithms. Skills required: proficiency in Java and Python, the ability to write clear and simple code.

Ian Mitchell

Project 1: Numerical Software Development for Differential Equations

Research on robotics and cyber-physical systems often involves approximating the solution of differential equations which model the physical system being studied. The goal of this project would be to demonstrate on a new example, improve the user interface of, validate the implementation of and/or add features to one of several software packages used for approximating sets of solutions in order to demonstrate the correctness of robotic or cyber-physical systems. The Toolbox of Level Set Methods is a locally developed example, but others include SpaceEx or CORA.  Applicants should be familiar with Matlab and numerical ODE solvers (for example, CPSC 303 or Math 405).

Project 2: Collaborative control scheme design, simulation and testing for a smart wheelchair

Among the projects of the CanWheel collaboration and AGE-WELL Network Center of Excellence is investigating techniques which would allow elderly individuals with mild cognitive and/or sensory impairments to better use powered wheelchairs.  While it is relatively easy to implement a system in which either the user or the robotic planner chooses the motion of the wheelchair, it is much more challenging to blend these two inputs in real-time and in a manner which is both safe and non-threatening to a cognitively impaired user.  As part of this process, the team runs user studies with the target population and their therapists in long term care centers. Goals for this summer's project include setting up a virtual reality workstation for trials of collaboration control policies, data collection and analysis from real-world and virtual trials, and ongoing prototype development and evaluation of collaboration and training interfaces and control policies.  Applicants should be familiar with C++, Matlab and/or Python, and will be expected to learn and use ROS (robot operating system) to program the wheelchair(s).

Michiel van de Panne

Deep Reinforcement Learning for Physics-based Models of Movement

In our research group, we are developing interactive and responsive physics-based simulations of human and animal movement. The primary challenge is that of learning motor skills, i.e., how to drive the movement of the joints over time so as to achieve skilled movement. This can be used to drive digital humans (and other animals) for computer animation, as well as providing relevant insights for robotics and biomechanics. Over the past two years, we have developed highly-capable solutions based on deep reinforcement learning. We are looking to further build on our methods and results over the summer. Directions include: (a) developing web-based demonstrations for these state-of-the-art models; (b) developing learned predictive (kinematic) models of movement that capture the combined effect of the learned control and the physics-based simulation; and (c) learning control models that are parameterized for anthropometry, i.e., that are suitable for characters having a wide range of proportions.  Useful skills include knowledge of machine learning, deep learning, reinforcement learning, physics-based simulation, C++, and experience in using compute clusters. A passion and appreciation for skilled movements of all kinds (simulated and real) is also hugely helpful.

Summer 2016

Ivan Beschastnikh
Project 1: Securing data access with ARM TrustZone
The ARM architecture includes a new standard called TrustZone that can be used to run two operating systems side-by-side: a trusted OS, and a normal/untrusted OS. In this project you will contribute towards an ongoing large systems project whose goal is to design a new trusted OS and extend the file system with per-file security policies that are evaluated in the trusted OS. The resulting system will support scenarios in which file access can be controlled through a rich set of security policies. For example, the device GPS location can determine file access permission. An ideal candidate would have taken operating systems and is interested in reading and writing low-level systems code.
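
As a purely hypothetical illustration of what a per-file policy keyed on device context might look like (all names, coordinates, and the policy structure below are invented; the real project defines and evaluates its policies inside the trusted OS):

    # Hypothetical sketch of per-file security policies evaluated against device context.
    # Everything here is invented for illustration.
    from dataclasses import dataclass

    @dataclass
    class DeviceContext:
        lat: float
        lon: float

    def within_geofence(ctx, center, radius_deg):
        """Crude geofence check using squared coordinate distance."""
        return (ctx.lat - center[0]) ** 2 + (ctx.lon - center[1]) ** 2 <= radius_deg ** 2

    POLICIES = {
        # Only readable while the device is (roughly) inside a small geofence.
        "/secure/payroll.db": lambda ctx: within_geofence(ctx, (49.26, -123.25), 0.02),
        # Always readable.
        "/secure/readme.txt": lambda ctx: True,
    }

    def allow_read(path, ctx):
        policy = POLICIES.get(path, lambda c: False)   # default deny
        return policy(ctx)

    print(allow_read("/secure/payroll.db", DeviceContext(49.261, -123.249)))  # True
    print(allow_read("/secure/payroll.db", DeviceContext(48.0, -120.0)))      # False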

Project 2: Managing assertions in complex systems
Assert statements are used extensively in large systems codebases, such as Linux. However, today's developers manually maintain thousands of assert statements without much tool support. In this project you will contribute towards a project whose goal is to develop a suite of tools to help systems developers manage assert locations and assert predicates in large codebases. This project will span multiple research areas, including software engineering, software analysis, compilers, and operating systems.

Project 3: Refactoring state out of network middleboxes
The virtualization of network functions is prompting a re-think of traditional network architectures. Middleboxes, such as NAT boxes and firewalls, long considered "architectural barnacles", have received significant attention. In this project you will contribute towards ExMB, a new architecture in which middlebox state is maintained in an external special-purpose data store. ExMB virtualizes middlebox state, which means that middleboxes can now be deployed as virtual instances managed by a unified control plane. An ideal candidate would have experience with networks and/or operating systems.

Project 4: Mining temporal program properties without templates
Specification inference tools infer a specification, like a finite state machine or a logical formula, from a set of executions of a program (i.e., positive examples). In this project the student will extend an existing temporal specification mining tool called Texada ( https://bitbucket.org/bestchai/texada/ ) with the ability to mine properties without relying on templates. The core of the project will combine genetic algorithms with Texada's existing temporal property mining algorithms. An ideal student would have a strong discrete math/algorithms background.
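
For a sense of what a templated temporal property looks like, here is a toy checker (not Texada's implementation) for the common pattern "x is always followed by y" over a single finite event trace; mining with templates means instantiating such patterns with concrete events and keeping the instances that hold on every trace.

    # Toy checker for the templated temporal property "x is always followed by y"
    # over one finite, linear event trace. Illustrative only; not Texada's code.
    def always_followed_by(trace, x, y):
        for i, event in enumerate(trace):
            if event == x and y not in trace[i + 1:]:
                return False
        return True

    trace = ["open", "read", "close", "open", "close"]
    print(always_followed_by(trace, "open", "close"))  # True: every open sees a later close
    print(always_followed_by(trace, "close", "read"))  # False: the last close has no later read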

Michael Friedlander
Numerical algorithms in Julia
Software implementation of numerical algorithms is one of the very best ways of transferring mathematical ideas into practice. For software implementations to be useful, they need to be efficient and robust. The aim of this project is to assemble a library of numerical algorithms in Julia, which is a high-level, high-performance technical computing language (https://julialang.org/). This project requires someone already familiar with a high-level language (such as Python or Matlab), and interest in numerical algorithms and their implementation.

Jim Little
This research project supervised by Prof. Jim Little in the Laboratory for Computational Intelligence in Computer Science at UBC is directed at understanding human activities, specifically understanding broadcast sports videos, particularly, ice hockey and basketball. (See https://www.cs.ubc.ca/~shervmt/) We aim to build a system that allows coaches to analyze games, sports viewers to replay their own view of the game, selecting the players and sequences of types that interest them, and in general allows people to understand the game at many semantic levels. We aim moreover to identify players, their roles, and how the players interact over time and space. The technical challenges include handling moving cameras, low resolution imagery, and rapidly moving players. Several fundamental problems must be addressed: rectification of images taken by panning, tilting, and zooming cameras; motion recognition; player tracking; player identification. Our recent work builds on strong advances in tracking and recognition and targets general semantic understanding of the game. The project includes many aspects of vision: projective geometry, machine learning, online estimation, modeling of geometry and appearance. We build systems that solve concrete tracking and identification problems via strong implementations and exploratory theories. The software for the project is implemented in several languages, including C++, Matlab, and increasingly Python. Specific language experience is desirable but not necessary, however programming maturity is essential. A student working in the project will acquire both software development experience and modeling skills, in an active group of collaborating graduate students researching all aspects of the project. We have weekly meetings to discuss topics related to video understanding, and other weekly meetings on robotics and vision (/labs/lci/robuds/).

Karon MacLean
Project 1: Macaron - software (and hardware) development
Supervisor: Prof. Karon MacLean (SPIN Lab) - maclean@cs.ubc.ca
Grad Mentor(s):  Oliver Schneider <oschneid@cs.ubc.ca>

Description: Macaron (hapticdesign.github.io/macaron) is an online haptic editor, enabling artists and designers to create vibrations for wearables. Macaron is currently in beta, but we hope to reach an initial release in 2016. This project is to support production and release of a fully web-deployed Macaron editor.

In the future, there is room for involvement beyond the web development - interaction design, user studies and feedback, hardware development for demos, and extension to other devices like the CuddleBit or Arduino.

USRA: Student will help develop the application, design and implement improvements to the user interface, and may help with studies and testing.

Required skills: Web/javascript, git, software engineering.
Optional skills: User study design/execution (CPSC 344, 444); sound/audio, basic hardware (soldering, simple sewing)

Other:

  • Comfortable with hardware tinkering - can’t be afraid of it.
  • Good work management and problem solving skills; able to handle working with a very strong, engaged but distributed team.

Project 2: DIY Fabric Touch Sensor Design: Requirements, Construction, Test
Supervisor: Prof. Karon MacLean (SPIN Lab) - maclean@cs.ubc.ca
Grad Mentor(s):  Laura Cang <cang@cs.ubc.ca>

1-liner: Construct fabric touch sensors of varying resolution, out of available electrically conductive striped fabric samples; compare with existing sensor models, with collection of user touch data.

Description: This project relates to SPIN Lab’s ongoing research in creating and understanding “emotionally intelligent” social robots. Our immediate objective is to support affectively (emotionally) realistic interactive behaviours with realtime touch gesture sensing and interpretation on flexible, moving surfaces, with appropriate robot responses.

Here: Optimal resolution for flexible affective-touch gesture recognition is hard to specify a priori, because of the many different types of touches that people use in this context. We have acquired 2 different types of electrically striped (“zebra”) fabrics which could be made into flexible fabric touch sensors of 3 different resolutions. Experimentation is required to determine how these different sensors compare to our current configuration.  The lab has built several such sensors in the past, and has considerable experience in their construction. Note: this project is ideal for someone who’s a bit crafty! Sewing machine involved.

USRA: A one-term undergraduate project will involve one or more of the following, depending on skills, interest, and stage we’ve gotten to by the time the project starts: Following sensor design practices already developed in the lab, detail-design and physically construct  three new sensors from zebra fabric. Collect gestural touch data. Record polling time, resolution changes, etc, and analyze against specifications.

Requirements:
1. Familiar with sewing and/or related hands-on, craft-type skills
2. Arduino/C - for building sensor and collecting touch data
3. HCI: For data collection stage, strong performance in CPSC 344 or equivalent; 444 preferred (quantitative study involved). Interest and experience in simple user study design important
4. Matlab: for analysis of results
5. Optional: ML, Weka: higher order analysis of touch data
6. Other: Good work management and problem solving skills; able to handle working with a very strong, engaged but distributed team.

Other: 

  • Comfortable with hardware tinkering - can’t be afraid of it.
  • Good work management and problem solving skills; able to handle working with a very strong, engaged but distributed team.

Project 3: Extract Emotional Intent from Touch-sensed Data - Data collection and/or ML Analysis
Supervisor: Prof. Karon MacLean (SPIN Lab) - maclean@cs.ubc.ca
Grad Mentor(s):  Laura Cang <cang@cs.ubc.ca>, Paul Bucci <pbucci@cs.ubc.ca>

1-liner: Design a simple study to collect touch data from participants expressing emotional intent (i.e. "make the robot feel anger”), then analyze the data using ML techniques. Project may include one or both of these two elements.

Description: This project relates to SPIN Lab’s ongoing research in creating and understanding “emotionally intelligent” social robots. Our immediate objective is to support affectively (emotionally) realistic interactive behaviours with realtime touch gesture sensing and interpretation on flexible, moving surfaces, with appropriate robot responses.

Here, we wish to compare machine recognition with human recognition of emotional intent (Hertenstein 2006). Following up on a number of studies in the SPIN lab where we examined human touch of a sensor-covered furry robot, in this study variant we wish to look specifically at the case where individuals are trying to generate a particular emotion in the robot.

USRA: A one-term undergraduate project will involve one, or ideally two or three, of the following, depending on skills, interest, and the stage we’ve gotten to by the time the project starts: (a) designing a data collection method adapted from the Hertenstein paper, (b) collecting the data, and (c) analyzing it.

Requirements:
1. HCI: For data collection stage, strong performance in CPSC 344 or equivalent. Interest and experience in simple user study design important
2. Machine Learning, Weka: For analysis stage, strong performance in at least an introductory ML course is required.
3. Statistics: Knowledge of R valuable for analysis.
4. Other: Good work management and problem solving skills; able to handle working with a very strong, engaged but distributed team.

Ian Mitchell
Project 1: Numerical Software Development for Differential Equations
Research on robotics and cyber-physical systems often involves approximating the solution of differential equations which model the physical system being studied.  The goal of this project would be to demonstrate on a new example, improve the user interface of, validate the implementation of and/or add features to one of several software packages used in the lab; for example, the Toolbox of Level Set Methods [https://www.cs.ubc.ca/~mitchell/ToolboxLS/] and the Ellipsoidal Toolbox [http://systemanalysisdpt-cmc-msu.github.io/ellipsoids/].  Applicants should be familiar with Matlab and numerical ODE solvers (for example, CPSC 303 or Math 405).

Project 2: Collaborative control scheme design, simulation and testing for a smart wheelchair
Among the projects of the CanWheel collaboration [https://www.canwheel.ca/] and AGE-WELL Network Center of Excellence [http://agewell-nce.ca] is investigating techniques which would allow elderly individuals with mild cognitive and/or sensory impairments to better use powered wheelchairs.  While it is relatively easy to implement a system in which either the user or the robotic planner chooses the motion of the wheelchair, it is much more challenging to blend these two inputs in real-time and in a manner which is both safe and non-threatening to a cognitively impaired user.  As part of this process, the team runs user studies with the target population and their therapists in long term care centers.  Goals for this summer's project include data collection from trials, interfacing the control software with a virtual reality wheelchair simulator, and ongoing prototype development and evaluation of collaboration and training interfaces and control policies.  Applicants should be familiar with C++, Matlab and/or Python, and will be expected to learn and use ROS (robot operating system) to program the wheelchair(s).

Project 3: Serious Games for Wheelchair Training
Many older adults are prescribed manual wheelchairs due to mobility impairments, but few receive any training in basic wheelchair skills.  We have already designed an Android tablet app which provides training videos to novice wheelchair users and monitors their practice time; however, user motivation quickly deteriorates because of the repetitive nature of the training tasks.  In this project we are investigating how to "gamify" the training process: how could we take advantage of common tricks from the video game industry to increase motivation levels for participants?  Last summer we designed two prototype games that are currently undergoing user trials; this summer we will use results from those trials to improve the game experience, add features (such as the ability to detect wheelies) and/or perform additional trials.  Applicants should be familiar with Java and will be expected to work on the Android mobile platform.

David Poole
Preference elicitation for computational sustainability with Giuseppe Carenini, Gunilla Öberg (IRES) and David Poole. One of the most challenging aspects of societal decision making (e.g., to create a more sustainable society) is to trade off the competing objectives of multiple stakeholders. We have a research prototype for exploring preferences that we need to make more user-friendly and self explanatory. Skills required: good programming and design skills; familiarity with human-computer interaction.

AIspace2 with Alan Mackworth and David Poole. AIspace (see http://aispace.org) has been developed by USRA students and grad students over a number of years. There are two aspects of the job. The first is to fix some of the bugs in the existing tools (written in Java). We have written many of the AI algorithms in open-source Python, and the second task is to make the code more interactive and clear (without making it more complicated). Skills required: proficiency in Java and Python, the ability to write clear and simple code.

Summer 2015

Ivan Beschastnikh
Project 1
Logs generated by distributed systems are complex and unwieldy as it can be challenging to piece together logs generated at multiple hosts. Even if these logs contain ordering information, they are often still too complex for a developer to understand. In this project the student will contribute to a tool for visualizing distributed system logs that contain vector timestamps (based on a previously developed log-instrumentation tool). The current tool visualizes a single execution as a time-space diagram of multiple hosts. The student will extend this tool to visualize multiple executions (e.g., hundreds) in a concise and comprehensible manner.

Project 2
Test case generation is a program analysis technique that is used to generate new unit/integration tests of an application based on an existing test suite, the source code of the program, and other information. Recently, random test case generation has seen a resurgence in popularity. In this project, the student will explore ways of augmenting random test case generation with mined program invariants. For this, the student will extend Randoop, a popular random test case generation tool.

Giuseppe Carenini with Raymond Ng
The student will contribute to our work on mining, summarizing and visualizing written conversation (e.g., email, blog). An initial step may consist of combining techniques we have developed for topic modeling and rhetorical parsing to effectively perform query-based summarization. Next, we will test our framework on different conversational corpora including a blog corpus that has been recently annotated in the context of a large EU project. The project will require system development as well as running experiments and analyzing results.

Ron Garcia
Since the early days of programming, there have been dynamic languages (like LISP, Python, and BASH) that are good for rapid prototyping, and static languages (like Scala, F#, and ADA), that provide strong type systems to catch errors well before a program is deployed.  Gradual Typing is an approach to designing programming languages that blend the strengths of both styles of languages and support a seamless path from dynamic prototypes to static production code.  The goal of this project is to develop conceptual underpinnings for Gradual Typing that can help language designers evaluate how gradual their languages are, and how to gradualize languages that represent the wide range of programming paradigms.  The ideal candidate is comfortable with mathematical reasoning and has an excellent grasp of programming language design concepts, as demonstrated by strong performance in CPSC 311 for instance.

Kevin Leyton-Brown
In the next 1-2 years, the US Federal Communications Commission will conduct an innovative auction in which television stations sell their broadcast rights, remaining TV stations are repacked into a smaller range of channels, and the freed radio spectrum is sold to mobile phone companies. This auction will be a big deal--it's forecast to net the government $20 billion. It also represents a very difficult computational and economic problem, chiefly because the auction design depends on repeatedly solving NP-complete problems to reason about station repackings. Our group is studying variations in the auction design and improvements to the repacking algorithm; it's extremely likely that our work will affect the practice of the real auction. We're looking for an exceptional undergraduate student to complement our team over the summer. Such a student will need a background in programming (ideally Java), machine learning, and statistics. Daily work will involve writing code, running experiments on our 160-CPU cluster, analyzing data, and modifying algorithm and auction designs as a result to achieve desired objectives.

Jim Little
This research project supervised by Prof. Jim Little in the Laboratory for Computational Intelligence in Computer Science at UBC is directed at understanding human activities, specifically understanding broadcast sports videos, particularly, ice hockey and basketball. We aim to build a system that allows coaches to analyze games, sports viewers to replay their own view of the game, selecting the players and sequences of types that interest them, and in general allows people to understand the game at many semantic levels. We aim moreover to identify players, their roles, and how the players interact over time and space. The technical challenges include handling moving cameras, low resolution imagery, and rapidly moving players. Several fundamental problems must be addressed: rectification of images taken by panning, tilting, and zooming cameras; motion recognition; player tracking; player identification. Our recent work builds on strong advances in tracking and recognition and targets general semantic understanding of the game. The project includes many aspects of vision: projective geometry, machine learning, online estimation, modeling of geometry and appearance. We build systems that solve concrete tracking and identification problems via strong implementations and exploratory theories. The software for the project is implemented in several languages, including C++, Matlab, and increasingly Python. Specific language experience is desirable but not necessary, however programming maturity is essential. A student working in the project will acquire both software development experience and modeling skills, in an active group of collaborating graduate students researching all aspects of the project. We have weekly meetings to discuss topics related to video understanding, and other weekly meetings on robotics and vision (/labs/lci/robuds/).

Karon MacLean with Ron Garcia
CyberHap MiniMooc for learning Physics via Haptics
This NSF-funded project’s aim is to learn whether the embodied, tangible aspects of an engaged haptic (force feedback) experience can contribute to learning of physical concepts, such as force-motion relationships as taught in high school physics courses. In collaboration with Stanford robotics and education experts, we will use a low-cost 1-dimensional haptic display to evaluate this question in an educational context – initially, students from high schools in a low-income neighbourhood in the Bay Area, eventually in controlled hypothesis-driven evaluations.

SPIN Lab’s contribution will be graphical programming tools. Learners who are not expert programmers will be too deeply distracted by the arduous task of constructing haptic simulations to actually learn. Instead, we will build a visual programming interface to support the learning goals and create a transparent, hands-on connection to the underlying hardware experience. This larger task involves prototyping several variants of a front-end of a graphical programming environment, based on requirements that will be drawn from a set of team-designed lesson plans and a pool of target student users and their teachers; implementing one or more best candidates, in collaboration with programming language expert Dr. Ron Garcia, and evaluating them with students and with our Stanford team.

USRA: A one-term undergraduate project will involve a responsible role (as part of a team) in the larger task described above, with specifics somewhat dependent on individual strengths and the larger project’s stage when the project begins.

Requirements:
1. Programming: Strong and extendable abilities; experience in a variety of systems
  - Experience with Arduino, and/or with introductory programming language principles (e.g. CPSC 311) a plus.
2. HCI: Strong performance in CPSC 344 or equivalent. Interest and experience in user requirements gathering, prototype development, and prototype evaluation with users is essential.
3. Physics: A good grasp of basic concepts.

Other:

  • Comfortable with hardware tinkering - can’t be afraid of it.
  • Good work management and problem solving skills; able to handle working with a very strong, engaged but distributed team.

 

Ian Mitchell
Numerical Software Development for Differential Equations
Research on robotics and cyber-physical systems often involves approximating the solution of differential equations which model the physical system being studied.  The goal of this project would be to demonstrate on a new example, improve the user interface of, validate the implementation of and/or add features to one of several software packages used in the lab; for example, the Toolbox of Level Set Methods [https://www.cs.ubc.ca/~mitchell/ToolboxLS] and the Ellipsoidal Toolbox [http://systemanalysisdpt-cmc-msu.github.io/ellipsoids/].

Applicants should be familiar with Matlab and numerical ODE solvers (for example, CPSC 303 or Math 405).

Collaborative control scheme design, testing and data publication for a smart wheelchair
Among the projects of the CanWheel collaboration [http://www.canwheel.ca] and AGE-WELL Network Center of Excellence [http://www.agewell-nce.ca] is investigating techniques which would allow elderly individuals with mild cognitive and/or sensory impairments to better use powered wheelchairs.  While it is relatively easy to implement a system in which either the user or the robotic planner chooses the motion of the wheelchair, it is much more challenging to blend these two inputs in real-time and in a manner which is both safe and non-threatening to a cognitively impaired user.  As part of this process, the team runs user studies with the target population and their therapists in long term care centers.  Goals for this summer's project include visualization and analysis of data from previous trials and ongoing prototype development and evaluation of collaboration and training interfaces and control policies.

Applicants should be familiar with C++ or Python, and will be expected to learn and use ROS (robot operating system) to program the wheelchair(s).

Serious Games for Wheelchair Training
Many older adults are prescribed manual wheelchairs due to mobility impairments, but few receive any training in basic wheelchair skills.

We have already designed an Android tablet app which provides training videos to novice wheelchair users and monitors their practice time; however, user motivation quickly deteriorates because of the repetitive nature of the training tasks.  In this project we would like to investigate how to "gamify" the training process: how could we take advantage of common tricks from the video game industry to increase motivation levels for participants, and what sensors might be needed to provide feedback from the physical environment?  Applicants should be familiar with an imperative language (eg: Python, Java, C/C++, Matlab) and will be expected to work on the Android mobile platform.

Summer 2014

Ivan Beschastnikh
Project 1
A major challenge when debugging distributed systems is analyzing execution logs generated at multiple hosts. In this project the student will work on a tool that automatically generates logs of distributed system activity with vector timestamps (a type of logical clock). The current version of the tool works on Java applications and relies on the AspectJ compiler. The student will generalize the tool to work with Python and C programs.
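
For background, a vector timestamp attaches one logical-clock counter per host to every logged event: a host increments its own entry on local events and takes an element-wise maximum when it receives a message. A minimal illustrative sketch (not the project's instrumentation):

    # Minimal vector-clock sketch: each host tracks a counter per known host,
    # increments its own entry on local events, and merges clocks on receipt.
    class VectorClock:
        def __init__(self, host, hosts):
            self.host = host
            self.clock = {h: 0 for h in hosts}

        def local_event(self):
            self.clock[self.host] += 1
            return dict(self.clock)      # timestamp to attach to the log line

        def on_receive(self, sender_clock):
            for h, t in sender_clock.items():
                self.clock[h] = max(self.clock.get(h, 0), t)
            return self.local_event()

    a = VectorClock("A", ["A", "B"])
    b = VectorClock("B", ["A", "B"])
    msg_ts = a.local_event()             # A logs an event and sends its clock with the message
    print(b.on_receive(msg_ts))          # {'A': 1, 'B': 1}: B's event happens after A's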

Project 2
Logs generated by distributed systems are complex and unwieldy as it can be challenging to piece together logs generated at multiple hosts. Even if these logs contain ordering information, they are often still too complex for a developer to understand. In this project the student will contribute to a tool for visualizing distributed system logs that contain vector timestamps (based on the tool from project #1). The current tool visualizes a single execution as a time-space diagram of multiple hosts. The student will extend this tool to visualize multiple executions (e.g., hundreds) in a concise and comprehensible manner.

Project 3
Test case generation is a program analysis technique that is used to generate new unit/integration tests of an application based on an existing test suite, the source code of the program, and other information. Recently, random test case generation has seen a resurgence in popularity. In this project, the student will explore ways of augmenting random test case generation with mined program invariants. For this, the student will extend Randoop, a popular random test case generation tool.

Project 4
Model inference tools infer a model, like a finite state machine, from a set of executions of a program (i.e., positive examples). Many of these tools require the mining of the executions to extract temporal patterns that can be used to generalize the observations to derive a more general, and compact, model. In this project, the student will develop and implement a miner for a set of tertiary temporal patterns that will be added to an existing model inference framework.

Michael Friedlander
Linear operators are at the core of many of the most basic algorithms for signal and image processing. Matlab's high-level, matrix-based language allows us to express naturally many of the underlying matrix operations---e.g., computation of matrix-vector products and manipulation of matrices---and is thus a powerful platform on which to develop concrete implementations of these algorithms. Many of the most useful operators, however, don't lend themselves to the explicit matrix representations that Matlab provides.
This project intends to continue the development of the Spot Toolbox (https://www.cs.ubc.ca/labs/scl/spot/), which aims to bring the expressiveness of Matlab's built-in matrix notation to problems for which explicit matrices are not practical. It will involve adding online documentation, and adding features to the toolbox.
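The underlying idea, an operator defined by the action of a function rather than by an explicit matrix, can be sketched in Python with SciPy's LinearOperator (an analogous facility, not the Spot API):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator

# Expose the (unitary) FFT as a "matrix" without ever forming its dense representation.
n = 1024
F = LinearOperator(
    (n, n),
    matvec=lambda x: np.fft.fft(x) / np.sqrt(n),     # forward "matrix-vector product"
    rmatvec=lambda y: np.fft.ifft(y) * np.sqrt(n),   # adjoint
    dtype=complex,
)

x = np.random.randn(n)
y = F.matvec(x)          # acts like a matrix, but costs O(n log n) time and O(n) memory
x_back = F.rmatvec(y)    # adjoint applied the same way
```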

Kevin Leyton-Brown
Meta-algorithmic techniques for designing high-performance algorithms
Algorithms for solving difficult computational problems play a key role in many applications. In many cases, provably efficient algorithms are unlikely to exist, and heuristic methods are the key to solving these problems effectively. However, the design of effective heuristic algorithms, particularly for computationally hard problems, is a difficult task that requires considerable expertise. State-of-the-art heuristic algorithms are traditionally constructed in an iterative, manual process in which the designer gradually introduces or modifies components or mechanisms whose performance is then tested by empirical evaluation on one or more sets of benchmark problems. The end use of existing state-of-the-art algorithms is often complicated by the fact that these algorithms are highly parameterized; end users often need to experiment with various combinations of settings in order to achieve satisfactory performance. Our research aims to design tools for automating both of these processes. In particular, we have focused on automatic algorithm design, automatic parameter tuning, and the automatic construction of algorithm portfolios that outperform their constituents. A student joining our group needs a background in programming, machine learning, and statistics. Daily work involves writing code, running experiments, analyzing data, and modifying algorithm designs accordingly to achieve desired objectives.
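As a toy illustration of automatic parameter tuning, and not of the group's actual configuration procedures, the sketch below searches a hypothetical parameter space by random sampling:

```python
import random

# Hypothetical solver parameters and a synthetic cost, purely for illustration.
PARAM_SPACE = {
    "restart_interval": [50, 100, 500, 1000],
    "decay": [0.75, 0.85, 0.95, 0.99],
    "branching": ["activity", "random"],
}

def run_solver(instance, config):
    """Stand-in for running a parameterized solver on one instance;
    here a synthetic cost so the sketch runs end to end."""
    return abs(config["decay"] - 0.9) + instance / config["restart_interval"]

def random_search(instances, budget=100):
    """Sample random configurations and keep the one with the lowest total cost."""
    best_config, best_cost = None, float("inf")
    for _ in range(budget):
        config = {k: random.choice(v) for k, v in PARAM_SPACE.items()}
        cost = sum(run_solver(inst, config) for inst in instances)
        if cost < best_cost:
            best_config, best_cost = config, cost
    return best_config

print(random_search(instances=[120, 480, 960], budget=50))
```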

Karon MacLean
HCI Design Project:  Customization of Tactile Feedback Notifications for Applications in Sports Training, Physical Therapy and Emoticons
The Sensory Perception and Interaction (SPIN) Research Group is developing human-computer interaction (HCI) applications in which tactile notifications provide specific information to users. In collaboration with the startup Haptok Inc, we are investigating user customization techniques for several case studies.

These involve applications where, for example,
1) a 'smart system' with sensors (e.g., real-time interpretation of motion, developed in our lab) assists a user in achieving a desired physical behavior, such as sports interval training with fine-grained, usable feedback on pace, cadence, heart rate, and even finer details of motion, or for parameters relating to accurate execution of physical therapy activity prescriptions; or

2) a visual medium needs to be enriched with qualitative elements, such as haptically expressed emoticons in communications.

The USRA project is to assist a PhD researcher in human-computer-interaction design activities relating to *customization* of these applications: developing engaging and usable ways of choosing from extensive alternatives so that the results fit a particular individual's preferences and needs.

The ideal candidate will have taken CPSC 344 (Introduction to User Interface Design) or its equivalent. You will be comfortable with a range of visual mockup tools and techniques, principles of HCI design, and running evaluations in support of design.

This work will be conducted in collaboration with Haptok Inc., a startup emerging from SPIN, which has interest in this problem. A student who does well on this project may have potential for further involvement with this emerging company.

Joanna McGrenere
The following projects are available with Joanna McGrenere’s research team. An undergrad student could work on one (possibly two) of these projects this summer. The ideal candidate would have knowledge of user interface design, as well as some experience running user studies (CPSC 344 or equivalent) and doing quantitative analysis (CPSC 444 or equivalent).

Project 1: Cognitive Testing on a Computer
Cognitive Testing on a Computer (C-TOC) is a computerized screening test for cognitive impairments that older individuals can take independently at home without the presence of a health professional. Currently in Canada, the average wait time for a consultation in a clinical setting regarding cognitive concerns ranges between 6 and 24 months during which cognitive performance can degrade further. C-TOC would assist in detecting early signs of cognitive impairments for test-takers at home so that they can acquire medical attention faster.

We are looking for a student to help with various aspects of C-TOC, which includes but is not limited to porting a desktop version of C-TOC (operated by the mouse) to one that runs on a tablet (touch based). We will explore touch interaction for older adults, both by reading the literature and by working towards user studies with older adults that evaluate touch-based gestures. The goal will be to redesign and re-implement C-TOC for this platform and interaction modality. We also have a number of ongoing user studies with older adults, including one that is looking at the impact of interruptions (e.g., a phone call) on C-TOC performance. A student would assist in running the study and analyzing the data collected.

This project involves working with a multi-disciplinary team from Computer Science and Medicine.

Project 2: Touch Interaction on Mobile Devices for Users Across the Adult Lifespan
As the use of mobile devices grows, there is a critical need to understand how users touch and type on touchscreens under common problematic physical scenarios such as walking or riding in a moving vehicle. This project's goal is to explore touch interaction on mobile devices with the hope of creating adaptive interfaces that improve touch accuracy for users of all ages, skill levels, and physical contexts. We are looking for a student who can help us design and run user studies that explore touch interaction for users spanning the adult lifespan. Additionally, the student would assist in data management and analysis as a necessary precursor to the design of interfaces that adapt to the user and their environment.

Project 3: Personalization in Personal Task Management
Personalizable systems provide mechanisms through which users can make changes to the system (customize it) to better fit their purposes and needs. These mechanisms include choosing between a set of options, using a form of construction set to build new functionality, and scripting or directly changing the source code. The first approach is limited mainly by the difficulty of predicting, at design time, the alternative options that users might need in the future. The last approach requires the user to have some knowledge of programming. We are exploring different aspects of the construction set approach in the task management domain to address questions such as: What elements should the construction set include? What are the mechanisms through which users can assemble these elements to create new functionality? What communication mechanisms should the system provide to facilitate sharing of personalizations that are performed with this approach? A student would assist in implementing some of the mechanisms that we have prototyped and/or in running a user study to assess how well users are able to use such mechanisms to make their desired changes to a task management system.

Ian Mitchell
Automated parameter discovery for the Toolbox of Level Set Methods
The Toolbox of Level Set Methods [www.cs.ubc.ca/~mitchell/ToolboxLS/] is a collection of Matlab routines for dynamic implicit surfaces and approximating the solution of Hamilton-Jacobi partial differential equations (PDEs).  New users often have trouble running their first simulations because there are a large number of parameters that must be correctly set; for example, the extent and resolution of the computational domain.  The goal of this project is to automatically determine appropriate settings for these parameters and construct a simple graphical interface by which new users could propose systems, run simulations and see how the parameters are set.  Applicants should be familiar with Matlab and numerical integrators for ODEs; familiarity with PDEs is not necessary.
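One plausible heuristic, shown purely as a hypothetical sketch (in Python rather than Matlab, and not the ToolboxLS interface), is to derive the grid extent from a padded bounding box of the initial implicit surface and the resolution from a target node count:

```python
import numpy as np

def suggest_grid(points, pad_fraction=0.25, nodes_per_dim=101):
    """Hypothetical heuristic: pad the bounding box of sample points on the initial
    surface, then derive grid spacing from a target number of nodes per dimension."""
    points = np.asarray(points)                     # shape (num_samples, dim)
    lo, hi = points.min(axis=0), points.max(axis=0)
    pad = pad_fraction * (hi - lo)
    g_min, g_max = lo - pad, hi + pad
    dx = (g_max - g_min) / (nodes_per_dim - 1)
    return {"min": g_min, "max": g_max, "N": nodes_per_dim, "dx": dx}

# Example: samples on a unit circle centred at the origin.
theta = np.linspace(0, 2 * np.pi, 64)
print(suggest_grid(np.column_stack([np.cos(theta), np.sin(theta)])))
```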

Control synthesis from nonsmooth interpolants under uncertainty
There are many approaches to solving path planning and related optimal control problems for robots that involve discretizations of the state space; for example, structured grids or rapidly exploring random trees.  Interpolation is required when it comes time to implement the resulting path because the robot's state never precisely aligns with a sample in the discretization.  Simplistic interpolation, such as nearest neighbour, is often used because more accurate techniques lead too easily to numerical instability.  The goal of this project is to construct an accurate but stable interpolation scheme for such problems, hopefully one which also takes into account cases where the robot's state is uncertain.  Applicants should be familiar with Matlab, Python and/or C++, and will learn a variety of path planning and simulation algorithms for robots.
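The gap between simplistic and more accurate interpolation shows up even in one dimension; the following hypothetical sketch compares nearest-neighbour and linear interpolation of a nonsmooth value function sampled on a grid:

```python
import numpy as np

xs = np.linspace(0.0, 1.0, 11)          # grid of sampled states
values = np.abs(xs - 0.37)              # nonsmooth value function: distance to a goal

def nearest(x):
    """Nearest-neighbour interpolation: value at the closest grid point."""
    return values[np.argmin(np.abs(xs - x))]

def linear(x):
    """Piecewise-linear interpolation between neighbouring grid points."""
    return np.interp(x, xs, values)

for x in (0.33, 0.41):
    print(f"x={x}: nearest={nearest(x):.3f}, linear={linear(x):.3f}, exact={abs(x - 0.37):.3f}")
```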

Collaborative control scheme design, testing and data publication for a smart wheelchair
The CanWheel collaboration [http://www.canwheel.ca] is investigating techniques which would allow elderly individuals with mild cognitive and/or sensory impairments to better use powered wheelchairs.  As part of this process, the team ran a "Wizard of Oz" study in which such individuals interacted with a power wheelchair while a human (the wizard) simulated a variety of collaborative control schemes that the chair might eventually use -- such a study allows us to identify promising interface approaches without having to implement all the sensory and planning software infrastructure in advance.  The goal of this summer's project is to help implement and test automated prototypes of the most promising interfaces identified by the earlier trials, as well as organize and publish sensor data from those and newer trials.  Applicants should be familiar with C++ or Python, and will be expected to learn and use ROS (robot operating system) to program the wheelchair.
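To give a flavour of what programming the chair with ROS involves, here is a minimal ROS 1 (rospy) node that streams velocity commands; the topic name and rate are assumptions for illustration, not the CanWheel chair's actual interface:

```python
#!/usr/bin/env python
# Minimal rospy sketch: publish velocity commands to a mobile base.
# The /cmd_vel topic and 10 Hz rate are assumptions, not the CanWheel interface.
import rospy
from geometry_msgs.msg import Twist

def drive_forward_slowly():
    rospy.init_node('wheelchair_demo')
    pub = rospy.Publisher('/cmd_vel', Twist, queue_size=10)
    rate = rospy.Rate(10)                 # 10 Hz command stream
    cmd = Twist()
    cmd.linear.x = 0.2                    # m/s forward
    cmd.angular.z = 0.0                   # no turning
    while not rospy.is_shutdown():
        pub.publish(cmd)
        rate.sleep()

if __name__ == '__main__':
    try:
        drive_forward_slowly()
    except rospy.ROSInterruptException:
        pass
```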

Serious Games for Wheelchair Training
Many older adults are prescribed manual wheelchairs due to mobility impairments, but few receive any training in basic wheelchair skills.  We have already designed an Android tablet app which provides training videos to novice wheelchair users and monitors their practice time; however, user motivation quickly deteriorates because of the repetitive nature of the training tasks.  In this project we would like to investigate how to "gamify" the training process: how could we take advantage of common tricks from the video game industry to increase motivation levels for participants, and what sensors might be needed to provide feedback from the physical environment?  Applicants should be familiar with an imperative language (eg: Python, Java, C/C++, Matlab) and will be expected to work on the Android mobile platform.

Raymond Ng
Topic segmentation and labeling of conversation data. The key milestones are: (i) the development of robust topic segmentation for multimodal conversations; and (ii) abstractive topic labeling for conversations.
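A highly simplified sketch of lexical-cohesion-based segmentation (in the spirit of TextTiling) is shown below; the project's actual methods for multimodal conversations are considerably more sophisticated:

```python
from collections import Counter
import math

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    num = sum(a[w] * b[w] for w in a)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def segment(utterances, window=2, threshold=0.1):
    """Place a topic boundary wherever lexical similarity between adjacent windows drops."""
    bags = [Counter(u.lower().split()) for u in utterances]
    boundaries = []
    for i in range(window, len(bags) - window + 1):
        left = sum(bags[i - window:i], Counter())
        right = sum(bags[i:i + window], Counter())
        if cosine(left, right) < threshold:
            boundaries.append(i)
    return boundaries

print(segment(["we should fix the bug", "the bug is in the parser",
               "lunch plans anyone", "sushi sounds good"]))   # boundary before utterance 2
```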

Rachel Pottinger
IDEASS (https://ideass.civil.ubc.ca/) is an interdisciplinary project to support better building and community planning by understanding how the various planning processes interact and what data each part of the process involves.  In this project, the student will work alongside graduate students to create a test suite of data that will help the researchers determine how to better support building designers and regional planners.

Desirable skills include an understanding of SQL.  Some understanding of spatial data is a plus but not required.

Summer 2013

Bill Aiello and Robert Bridson

Dynamic tetrahedral meshing with acute lattices
One of the most successful classes of algorithms for generating quality tetrahedral meshes overlays a regular lattice (typically from a grid or octree) over the domain of interest and, with careful cutting and warping, fits it to the boundary of the domain. We look to extend this approach in two ways: first, to robustly match both sharp and smooth parts of the domain boundary appropriately (past methods have concentrated on just one or the other), and second, to build an adaptive (octree-like) extension of an existing acute tetrahedral tiling, which offers significantly better quality output meshes in our preliminary results. Furthermore, we wish to exploit this capability by dynamically remeshing a simulated elastic solid undergoing fracture so that it precisely conforms to the desired crack surfaces without compromising element quality, a trade-off previous methods have been forced to make. As the scope is fairly broad, the research will be in collaboration with an MSc student.

Giuseppe Carenini
The student will contribute to our work on mining, summarizing and visualizing written conversations (e.g., email, blogs). An initial step may consist of improving the performance of our existing techniques for topic modeling, opinion mining and conversational analysis. Next, we will focus on developing a framework for integrating the different mining tasks, so that they can be performed simultaneously and interdependently. To achieve this goal, we will explore Markov Logic and possibly other AI formalisms.

Will Evans and David Kirkpatrick
Will Evans and David Kirkpatrick are interested in jointly sponsoring/supervising a highly motivated student interested in gaining research experience in algorithm design and analysis. We have a variety of concrete topics that would be suitable for a summer research project. However, we are also very willing to work out the details of a project that will incorporate the particular interests of the research student.

The following list is incomplete, but should provide an indication of the scope of possible research topics:
1) Design and analysis of algorithms whose inputs are imprecise; specifically competitive analysis under the assumption that input precision is an expensive resource.
2) Design and analysis of algorithms involving graph-theoretic and geometric models of sensor networks.
3) Graph theoretic characterization of geometric graphs such as visibility graphs.
4) Space-constrained computations and time-space tradeoffs.
5) Combinatorial aspects of certain coloured Escher tilings.
6) Algorithmic issues related to bounded-curvature motion planning.

Michael Friedlander
Linear operators are at the core of many of the most basic algorithms for signal and image processing. Matlab's high-level, matrix-based language allows us to express naturally many of the underlying matrix operations---e.g., computation of matrix-vector products and manipulation of matrices---and is thus a powerful platform on which to develop concrete implementations of these algorithms. Many of the most useful operators, however, don't lend themselves to the explicit matrix representations that Matlab provides.

This project intends to continue the development of the Spot Toolbox (https://www.cs.ubc.ca/labs/scl/spot/), which aims to bring the expressiveness of Matlab's built-in matrix notation to problems for which explicit matrices are not practical. It will involve adding online documentation, improving the installation and distribution mechanism, and adding features to the toolbox.

Karon MacLean
Over the past thirty years, robotic technology has been integrated into the manufacturing industry for the purpose of improving efficiency and reducing worker ergonomic stress and workload, but an evolving industry now requires more effective communication and collaboration between human and robot. The broad goal of our research is to develop methods of human-robot interaction (HRI) that will facilitate close cooperation between humans and robots on industrial tasks.

We are seeking an undergraduate researcher to assist with the development of HRI methods using multiple communication channels including visual signals, audio cues, and touch-based interaction. The student will design ways of communicating various ideas between a human and a robot, test these communication methods in user studies, and compare different communication methods in task-based experiments. Programming skills are required, and experience with circuit design and human-robot or human-computer interaction is desired.

Joanna McGrenere
The following projects are available with Joanna McGrenere’s research team. An undergrad student could work on one or more of these projects this summer. The ideal candidate would have knowledge of user interface design, as well as some experience running user studies (CPSC 344 or equivalent) and doing quantitative analysis (CPSC 444 or equivalent).

Cognitive Testing on a Computer (C-TOC) is a computerized screening test for cognitive impairments that older individuals can take independently at home without the presence of a health professional. Currently in Canada, the average wait time for a consultation in a clinical setting regarding cognitive concerns ranges between 6 and 24 months during which cognitive performance can degrade further. C-TOC would assist in detecting early signs of cognitive impairments for test-takers at home so that they can acquire medical attention faster.

In the first project, we are specifically looking into cross-cultural design in the context of C-TOC, given that it has the potential of becoming a cognitive assessment tool for older adults of various cultural backgrounds.  How should we change the interface to fit different cultural contexts? Would changing the interface culturally affect performance, satisfaction or anxiety of different cultural groups?  We have already run a study with 36 older adults in Vancouver from European and East Asian backgrounds. As a next step we are looking for a candidate to take charge of running an equivalent study on a larger scale using Amazon’s Mechanical Turk. Mechanical Turk is a crowdsourcing platform where a large pool of participants is accessible and can perform various tasks, including usability studies. The project will consist of deploying the experiment prototype on Mechanical Turk, monitoring the experiment, and collecting, organizing and analyzing data.  This will be a great opportunity to oversee the running of an HCI usability study from beginning to end, as well as to become familiar with Amazon Mechanical Turk, a platform for usability studies whose use is burgeoning in the HCI community. We hope to gain insight into the relationship between culture and HCI, as well as into how interface design might change across cultural boundaries.

In the second project, we are looking to port C-TOC from a desktop application (operated by the mouse) to one that runs on a tablet (touch based). We will explore touch interaction for older adults, both by reading the literature and by working towards user studies with older adults that evaluate touch-based gestures. The goal will be to redesign and re-implement C-TOC for this platform and interaction modality.

Students on these two projects will work with a multi-disciplinary team from Computer Science and Medicine.

Ian Mitchell
Project 1: Benchmarks for comparing optimal control and/or verification software
Research in optimal control studies the problem of finding the best way to guide a system, such as a robot, to accomplish a given task.
Verification is a related research area which seeks to demonstrate that a given system satisfies a given property.  The research literature is full of proposed algorithms in both domains, but all too often the authors use trivial and/or incomparable examples to demonstrate their techniques.  The goal of this project is to construct an online repository containing a variety of benchmark problems in one or both fields which researchers could use to compare their approaches more quantitatively.  Applicants should be familiar with Matlab, Python and/or C++, and will learn about optimal control, path planning and/or verification.

Project 2: Automated parameter discovery for the Toolbox of Level Set Methods
The Toolbox of Level Set Methods [www.cs.ubc.ca/~mitchell/ToolboxLS/] is a collection of Matlab routines for dynamic implicit surfaces and approximating the solution of Hamilton-Jacobi partial differential equations (PDEs).  New users often have trouble running their first simulations because there are a large number of parameters that must be correctly set; for example, the extent and resolution of the computational domain.  The goal of this project is to automatically determine appropriate settings for these parameters and construct a simple graphical interface by which new users could propose systems, run simulations and see how the parameters are set.  Applicants should be familiar with Matlab, and will learn how to numerically approximate PDEs.

Project 3:  Control synthesis from nonsmooth interpolants
There are many approaches to solving path planning and related optimal control problems for robots that involve discretizations of the state space; for example, structured grids or rapidly exploring random trees.  Interpolation is required when it comes time to implement the resulting path because the robot's state never precisely aligns with a sample in the discretization.  Simplistic interpolation, such as nearest neighbour, is often used because more accurate techniques lead too easily to numerical instability.  The goal of this project is to construct an accurate but stable interpolation scheme for such problems, hopefully one which also takes into account cases where the robot's state is uncertain.  Applicants should be familiar with Matlab, Python and/or C++, and will learn a variety of path planning and simulation algorithms for robots.

Project 4:  Collaborative control scheme testing for a smart wheelchair
The CanWheel collaboration [http://www.canwheel.ca] is investigating techniques which would allow elderly individuals with mild cognitive and/or sensory impairments to better use powered wheelchairs.  As part of this process, the team is preparing a "Wizard of Oz" study in which such individuals will interact with a power wheelchair while a human (the wizard) simulates a variety of collaborative control schemes that the chair might eventually use -- such a study allows us to identify promising interface approaches without having to implement all the sensory and planning software infrastructure in advance.  The goal of this project is to help with development and debugging of the wizard's interface to control the robotic wheelchair, and then to help run the user trials (likely as the wizard) and collect and organize the resulting technical data from sensors like the joysticks, Kinect cameras, odometers, etc.  Applicants should be familiar with C++, and will be expected to learn and use ROS (robot operating system) to program the wheelchair.

Project 5:  Smartphone/Tablet Physical Activity Visualization and Classification
As part of the CanWheel collaboration [http://www.canwheel.ca] we are investigating methods of tracking the activity of study participants using smartphone technology.  We have access to a variety of data, including click logs, accelerometry, GPS and indoor localization.  The goal of this project is to help process, integrate, visualize and classify participant activity using these data streams.  The results would be used in publications and presentations for a number of related studies, and the software would be released publicly.
Applicants should be familiar with an imperative language (eg: Python, Java, C/C++, Matlab) and will be expected to learn and use Python packages for manipulation and visualization of the data on desktop computers and/or Java packages for data collection on Android phones.
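As a hypothetical starting point, and not the study's actual pipeline, coarse activity levels could be estimated by thresholding the variance of accelerometer magnitude over short windows:

```python
import math

def magnitude(sample):
    """Euclidean magnitude of one (x, y, z) accelerometer reading."""
    x, y, z = sample
    return math.sqrt(x * x + y * y + z * z)

def classify_window(samples):
    """Label a short window of readings by the variance of its magnitude.
    Thresholds and labels are illustrative only."""
    mags = [magnitude(s) for s in samples]
    mean = sum(mags) / len(mags)
    var = sum((m - mean) ** 2 for m in mags) / len(mags)
    if var < 0.05:
        return "stationary"
    elif var < 0.5:
        return "light movement"
    return "active"

# Example: a 1-second window of (x, y, z) readings in g.
window = [(0.01, 0.02, 1.00), (0.00, 0.01, 0.99), (0.02, 0.00, 1.01)]
print(classify_window(window))   # "stationary"
```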

Andrew Warfield
Understanding Desktop Storage Workloads
Many organizations are in the process of deploying "virtual desktops" for their employees.  In this sort of environment, the computer on your desk is just a monitor, mouse, and keyboard, while the operating system and applications run in a central datacenter.  Virtual desktop (or VDI) deployments have had some fairly large uptake in industrial environments, with real-world deployments of more than ten thousand desktops.

VDI introduces some very big challenges for storage system design: humans tend to follow similar schedules, arriving at work and leaving for lunch at the same times of the day.  Worse, they are incredibly sensitive to latency and become irritated when their computers seem slow.  We are interested in understanding how large a role storage systems play in the performance of VDI workloads, and what changes can be made to improve them.  To help achieve this, the project will involve the analysis of storage trace data collected from a real-world VDI deployment.  The data analysis will aim to answer questions about the timing, interrelatedness, and content of information that is accessed from storage devices.
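One such analysis pass might look like the sketch below; the trace column layout (timestamp, operation, offset, length) is a hypothetical format, not the schema of the collected data:

```python
import csv
from collections import Counter

def summarize(trace_path):
    """Count requests and bytes by operation, and find the busiest minutes,
    assuming rows of the form: timestamp_s, op, offset, length."""
    ops = Counter()
    bytes_by_op = Counter()
    per_minute = Counter()
    with open(trace_path, newline="") as f:
        for ts, op, offset, length in csv.reader(f):
            ops[op] += 1
            bytes_by_op[op] += int(length)
            per_minute[int(float(ts)) // 60] += 1
    return {
        "requests": dict(ops),
        "bytes": dict(bytes_by_op),
        "busiest_minutes": per_minute.most_common(3),
    }

# summarize("vdi_trace.csv") might reveal, e.g., read-heavy bursts around 9 a.m.
```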