One of the most valuable research experiences for an undergraduate student is to be a research assistant. Each year, the department receives a number of research awards that help provide funding for an undergrad student to spend 16 weeks over the summer working full time in one of the department’s research labs, often with the opportunity to publish their work. (See this page for previous projects and supervisors.) This kind of research experience is highly sought after by graduate programs.
All applicants are required to confirm their eligibility both to apply and to work, and to ensure they have all necessary requirements prepared (e.g., Social Insurance Number and permits).
International students must have a valid Social Insurance Number and be eligible to work on campus for the duration of the award (e.g., in the summer). Students will be required to provide any necessary details and documentation upon accepting the award (SURE or WLIUR). This is necessary for processing and payment. Students who are offered awards but who do not meet these criteria will not be able to accept. For questions about eligibility, please speak with an International Student Advisor.
The positions are available to 2nd, 3rd, and 4th year students with strong academic records. More information, including eligibility requirements, can be found at the links provided on this page. Watch for in-class and email announcements from the department for details and deadlines.
Please see the pages linked below for important information, including eligibility:
NSERC USRA - NSERC Undergraduate Student Research Award
See the following links for more details:
How to Apply
Deadline: February 11, 2020 at 4:00 PM
New Application Requirement: all applicants are required to apply with a confirmed supervisor. The list of projects and supervisors will be posted on this page, and you can use this to approach any of the supervisors listed. However, you aren't limited to the projects and supervisors listed below. We encourage you to directly contact professors you would like to work with to find a match. Many professors will be happy to talk to you about the opportunity to hire students at a subsidized wage. You can find our faculty directory here. For some additional tips, please see the UBC Careers page.
- Read the details above and the information at the links on this page
- Determine which awards you are eligible for
- Contact potential supervisors from the Projects and Supervisors list (see below) or by approaching Computer Science faculty members you would like to work with
- Once you have a confirmed supervisor, submit the online application webform by the stated deadline (the webform will become available before the deadline):
- Before submitting, please ensure that you have read over the online guidelines, eligibility requirements, and webform instructions carefully
- Make sure to read all of the instruction text in the webform; there may be important details noted below each field
- When the department is informed of how many awards are available, a departmental adjudication committee will rank the applications. All applicants will then receive a decision. If you are selected for an award, you will then receive an email with instructions to submit a new webform to provide additional information/documentation:
- When you are applying: please read the webform carefully to ensure you are prepared to accept by providing these details
- (Your supervisor will also be contacted for required information)
Please note that we cannot provide a timeline for a decision on your application or provide any additional details and therefore will not respond to requests for such information.
- All students should complete the NSERC Form 202 on the NSERC USRA website by clicking "On-line System Login" or, if you are a first-time user, "Register"
- All students should upload a PDF of the completed NSERC Form 202 to the online application webform
- DO NOT submit the application on the NSERC website until you have been accepted for the award and instructed to do so (in the end, only students awarded an NSERC USRA will submit the application on the NSERC website)
- Instructions on how to complete the forms can be found on the NSERC USRA website
Questions? For further details, please visit the UBC Student Services website or the NSERC USRA website and review the information and links provided, as these will likely give you the answers to your questions. If you would still like additional assistance, please see our Advising Webform instructions to see if you are eligible to submit a webform request.
Projects and Supervisors: Summer 2020
Project 1. Improving computational workflows for the analysis of spatial transcriptomic data
Emerging technologies are allowing biologists to generate increasingly large and complex datasets. One exciting new area of technology is spatial omics methods, which leverage advanced micro-fluidics and high resolution microscopy to measure the abundance of RNA and proteins in cells in 2D. The Roth lab is working with other scientists at BC Cancer to develop computational methods to analyze this data. This project will focus on the analysis of data generated by a protocol called “Multiplexed error-robust fluorescence in situ hybridization” (MERFISH).
MERFISH analysis generates a large number of high resolution microscopy images, which go through a sophisticated computational analysis pipeline. Key steps in this pipeline include alignment and stitching of images, identification of spots representing RNA, segmentation of cell boundaries, and “decoding” the RNA barcodes. The Roth lab has currently implemented a Python-based pipeline based on the original MERFISH paper. The successful applicant will be responsible for maintaining this pipeline and implementing new features under the supervision of Dr. Roth and his graduate students. Knowledge of best practices in software engineering such as unit testing, version control and continuous integration is required. Previous experience with image analysis, machine learning and familiarity with Python would all be assets.
Project 2. Non-parametric Bayesian models for cancer stratification
It is increasingly apparent that cancer is a collection of related diseases, caused by cells with differing mutational and gene expression profiles. Understanding these differences, and stratifying cancers into subtypes, has important implications for the treatment of patients. Current approaches which apply high throughput genomics data to stratify subtypes have been restricted to analysing cancers from a single anatomical region. However, there is now a large quantity of data from multiple types of cancer which could be analysed jointly. A major impediment to this task is that the cancer samples tend to cluster together primarily by their cell or anatomical region of origin. Thus a joint cluster analysis is not significantly more informative than considering each type of cancer separately.
This project will explore the use of feature allocation models instead of clustering models to perform molecular stratification of cancers across multiple types. This should allow for latent features that explain the dominant signal from the tissue of origin, with additional features explaining important alterations causing malignancy. By allowing these features to be shared across cancers from multiple types, we expect to simultaneously gain statistical power as well as identify molecular subtypes which span multiple cancer types.
The successful applicant will work with Dr. Roth and his graduate students developing a generative model based on the non-parametric Bayesian Indian Buffet Process (IBP) for analysing gene expression data from multiple cancer types. The student will help implement inference procedures for the model using Markov Chain Monte Carlo and potentially variational methods. In addition to developing experience in statistical modelling and computation, the student will also gain a basic knowledge of high throughput genetic sequencing data and its impact in oncology and clinical cancer genomics. The student will apply the model to real world data from the International Cancer Genome Consortium, which has generated gene expression data for ~9,000 samples from 16 types of cancer. A strong mathematical background with at least one course in probability is required. Students who have taken courses in computational statistics and machine learning would be good candidates. Familiarity with Python and numerical libraries such as numpy, scipy, numba and pandas would be an asset. A basic knowledge of biology is also required (e.g., knowing what RNA is).
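For applicants unfamiliar with the Indian Buffet Process, the sketch below draws a binary sample-by-feature matrix from the standard IBP prior: sample i reuses each existing feature with probability proportional to how many earlier samples use it, then adds a Poisson(alpha / i) number of new features. This is only an illustration of the prior, not the lab's model or inference code; the function names, the parameter `alpha`, and the use of Knuth's Poisson sampler are assumptions made for this example.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw from Poisson(rate) using Knuth's multiplication method (fine for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_samples, alpha, rng):
    """Return a binary sample-by-feature matrix drawn from an IBP prior (hypothetical helper)."""
    feature_counts = []  # feature_counts[k] = number of earlier samples using feature k
    rows = []            # rows[i] = set of feature indices used by sample i
    for i in range(1, num_samples + 1):
        # Reuse each existing feature with probability (its popularity) / i.
        chosen = {k for k, m in enumerate(feature_counts) if rng.random() < m / i}
        for k in chosen:
            feature_counts[k] += 1
        # Then introduce Poisson(alpha / i) brand-new features.
        num_new = sample_poisson(alpha / i, rng)
        chosen.update(range(len(feature_counts), len(feature_counts) + num_new))
        feature_counts.extend([1] * num_new)
        rows.append(chosen)
    width = len(feature_counts)
    return [[1 if k in row else 0 for k in range(width)] for row in rows]


if __name__ == "__main__":
    for row in sample_ibp(num_samples=5, alpha=2.0, rng=random.Random(0)):
        print(row)
```

Unlike a clustering prior, each row may switch on several features at once, which is what lets one feature absorb a tissue-of-origin signal while others capture shared alterations.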
Project 3. Single cell multi-omics data
Single cell sequencing technologies are revolutionising cancer research. Biologists are now able to measure different features such as DNA, RNA and methylation of 100s-1000s of cells. However, we can typically only make one type of measurement per cell as the measurement process is destructive. As a result it is challenging to relate measurements of different features. Below are two potential projects related to this field.
Project 3a: Analysis of multi-omics data
The Roth lab is working with collaborators at BC Cancer who have generated extensive single cell datasets. This sub-project will focus on the analysis of the data. As such it requires a strong biological background and would be ideal for a student interested in pursuing graduate studies in Bioinformatics. The successful applicant will work with Dr. Roth and his graduate students to analyse single cell datasets and integrate the different measurements to develop biological hypotheses. A background in statistics or machine learning is required. In particular familiarity with dimensionality reduction techniques such as PCA and also regression techniques is required. The applicant must also be familiar with data science workflows in either R or Python. Previous bioinformatics experience is desirable.
Project 3b: Methods for integrating multi-omics data
The current standard for integrating multi-omics data is driven by independent analysis of each data type and subsequent manual integration by domain experts. This approach is laborious due to the scale of the datasets, and lacks statistical power as each type of data is treated independently. This project will focus on developing statistical and machine learning methods for automating the integration of single cell multi-omics data. A strong mathematical background with at least one course in probability is required. Students who have taken courses in computational statistics and machine learning would be good candidates. Familiarity with Python and numerical libraries such as numpy, scipy, numba and pandas would be an asset. A basic knowledge of biology is also required (e.g., knowing what RNA is).
The Prusti project (a collaboration between UBC and ETH Zurich) aims to provide modular, code-level verification tools for Rust, a modern systems programming language. The combination of Rust's advanced ownership type system and user-defined specifications makes it possible to provide scalable and efficient formal verification for a wide variety of program properties, including allowing programmers to express their intentions and check that their code will always live up to these. We are constantly extending Prusti with support for a richer variety of language features: in this project you will contribute to these efforts by designing and/or implementing verification techniques for advanced features such as closures and lifetime constraints. Prior experience with Rust and/or formal reasoning about programs is an advantage, but is not essential; expertise with imperative programming of some kind is needed.
SMT solvers have a wide variety of applications across Computer Science, including program analysis and synthesis tools, automated planning and constraint solving, optimisation problems and software verification. Advanced tools such as program verifiers are often built around SMT encodings of their problems. However, designing these encodings to perform reliably and fast is a challenging task. In recent work, we developed the Axiom Profiler tool to serve as a debugging tool for quantifier-rich SMT problems. In this project, you will work to extend this tool to provide new automated techniques for helping a user to zoom in on, diagnose and solve inadequacies in their SMT encodings.
Static program analysis techniques typically aim to identify guaranteed program behaviours fully automatically. Program analysers typically tackle *safety* properties, e.g. prescribing that certain program points will only be reached when certain conditions hold. In the context of programs which heavily employ asynchronous messaging (e.g. actor-based programs), *liveness* properties, e.g. stating that servers always eventually respond to requests, are also of central importance. In this project, we will explore the feasibility of a modular static analysis which combines inference of both safety and liveness properties for asynchronous programs.
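To make the safety/liveness distinction above concrete, here is a small sketch that checks both kinds of property on a finite, recorded message trace. It is a runtime check on one trace, not a static analysis, and on a finite trace "eventually" can only mean "before the trace ends"; the event names `request` and `response` and both function names are assumptions made for this example.

```python
def responses_only_after_requests(trace):
    """Safety-style check: a response never appears before its matching request."""
    seen_requests = set()
    for kind, request_id in trace:
        if kind == "request":
            seen_requests.add(request_id)
        elif kind == "response" and request_id not in seen_requests:
            return False  # a bad state was reached; no extension of the trace can fix it
    return True


def every_request_answered(trace):
    """Liveness-style check (finite approximation): each request is answered by the end."""
    pending = set()
    for kind, request_id in trace:
        if kind == "request":
            pending.add(request_id)
        elif kind == "response":
            pending.discard(request_id)
    return not pending


if __name__ == "__main__":
    trace = [("request", 1), ("request", 2), ("response", 1)]
    print(responses_only_after_requests(trace))  # True: nothing bad has happened
    print(every_request_answered(trace))         # False: request 2 is still waiting
```

A safety violation is witnessed by a finite prefix, whereas the unanswered request could still be served later if the trace continued, which is why liveness needs different reasoning.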
Please apply by January 25. For application details please check the following link: https://bit.ly/2ZS74DD.
Project 1. Augmented Reality Interfaces for Asynchronous Collaboration in 3D Environments
This study aims to design and build a collaborative augmented reality (VR/AR) system for annotating a 3D environment with recordings of multimodal interactions (e.g., speech, gesture, gaze), drawing on human-computer interaction approaches. Annotations are basic building blocks of asynchronous collaboration in a shared workspace (e.g., a game director giving feedback to a level designer on a 3D map by commenting on it). However, existing AR annotation interfaces rely primarily on static content (e.g., text, mid-air drawing), which is not as nuanced nor as expressive as in-person communication where people can talk, gaze, and gesture. To enrich and expand communicative capacities of AR annotations, I envisage an AR counterpart of email or Google Docs, where collaborators can record their multimodal performances (e.g., voice, view changes, and hand movements) in a 3D environment and share such rich media-based messages back and forth with other parties. The challenges are as follows: (1) developing an easy-to-use interface for creating and editing the recorded multimodal annotation, (2) designing lightweight interactions for browsing and skimming multimodal recordings, and (3) helping users overcome psychological barriers in recording multimodal inputs (e.g., speech anxiety).
- Successful completion of the introductory computer graphics courses (e.g., CPSC 314)
- (optional) Successful completion of the introductory HCI courses (e.g., CPSC 344 or 544)
Project 2. Computational Text Analysis and Crowd-sourced Annotation of Research Documents for Identifying Gender Bias of Human Subject Sampling
The empirical basis of HCI studies draws on user conditions and needs collected from 'human subjects' (i.e., informants or participants of the study). However, researchers often ignore gender bias in the sample population, in pursuit of expediency and convenience during the sampling process. This underrepresentation can result in theories and guidelines for designing technologies that fail or even harm women and non-binary people. To theorize, validate, and address gender bias in HCI research, we will integrate computational text analysis into our novel crowd-sourced labeling framework. We will then examine the claims of gender imbalance by establishing a robust evidentiary basis using a computational, data-driven meta-analysis of the HCI literature at scale, including ~16K research papers. This study has three phases: (Phase 1) ‘Data crawling’ of the ~16K HCI publication documents and structuring them in a machine-readable format; (Phase 2) Building a text analysis engine that takes a research paper and automatically identifies paragraphs and sentences containing human subject descriptions (e.g., "Participants" chapters) by using document layout analysis (e.g., X-Y cut) and supervised text classification techniques (e.g., Naive Bayes); and (Phase 3) Building and testing the crowdsourcing framework where the subject’s gender data is extracted from the text snippets from Phase 2 and then verified by multiple workers, following an Identify-Encode-Verify workflow.
- Strong technical skills including OOP, data structures, and algorithms.
- (optional) Successful completion of AI courses (e.g., CPSC 322, 422, or equivalent)
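As a rough illustration of the supervised text classification step in Phase 2, the sketch below trains a tiny multinomial Naive Bayes classifier with add-one smoothing to separate sentences that describe human subjects from other sentences. The training sentences, the labels `participants` and `other`, and the class name are invented for this example; a real engine would need a labeled corpus and more careful tokenization.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


class NaiveBayesClassifier:
    """Minimal multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of training texts
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            score = math.log(self.doc_counts[label] / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration only.
TEXTS = [
    "we recruited twelve participants aged 18 to 30",
    "participants were eight women and four men",
    "the system renders the scene using a graphics pipeline",
    "results show a significant effect of the interface",
]
LABELS = ["participants", "participants", "other", "other"]

if __name__ == "__main__":
    clf = NaiveBayesClassifier()
    clf.fit(TEXTS, LABELS)
    print(clf.predict("sixteen participants were recruited"))
```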
Project 3. Natural User Interactions for Video Interfaces
This project aims to build and study novel interaction techniques for browsing and skimming online videos. As we watch videos daily on YouTube, MOOCs, and SNS, video has become a central medium for education, entertainment, and social interactions. However, the way we interact with videos has remained the same for decades. How can we go beyond a slider-bar and thumbnails? To support dynamic, seamless, and semantic interactions for browsing, searching, and skimming online videos, we will (1) develop novel interaction metaphors, (2) leverage speech and video recognition techniques, and (3) employ natural interaction capacities of modern interactive devices (e.g., touch and gesture of tablets).
- Strong technical skills including OOP, data structures, and algorithms.
- (optional) Successful completion of the introductory HCI courses (e.g., CPSC 344 or 544)
Project 4. Improving Language Education with a Speech and Gesture Commenting System
The goal of this project is to improve foreign language speaking practice with a speech and gesture commenting tool. In face-to-face instruction, speech vanishes into the air without leaving a trace. Due to the transient nature of speech, language instructors don’t have time to provide in-depth feedback on students’ speaking performances, and students miss the opportunity to reflect on their mistakes. To fill this gap, we will build a speech commenting tool, based on an existing rich commenting system called RichReview, through which students can submit speech recordings and instructors can give speech feedback on students’ submissions. This tool has two beneficial features: (1) an animated visual pointer to refer to part of audio content (e.g., “You are using wrong inflection HERE.”), and (2) an efficient browsing feature to replay multiple speech clips quickly and effortlessly. For evaluation, we will pilot the tool in two foreign language courses at UBC.
- Web development skills and experience (full-stack and node.js preferred)
- (optional) Successful completion of the introductory HCI courses (e.g., CPSC 344 or 544)
Project 1. Egocentric motion capture
In recent years, there has been tremendous progress in video-based 6D object pose and human 3D pose estimation, even from head-mounted cameras [Related work]. Both can now be done in real time, but not yet at the level of accuracy needed to capture how people interact with other people and with objects, which is a crucial component of modeling the world in which we live. For example, when someone grasps an object or shakes someone else’s hand, the position of their fingers with respect to what they are interacting with must be precisely recovered if the resulting models are to be used in see-through AR devices, such as the Hololens or consumer-level video see-through versions.
Key to this project is the accurate modeling of contact points and the resulting physical forces between hands and feet and their surroundings. The hardware setup will be a mobile head-mounted camera, building upon our egocentric motion capture work (EgoCap) [Related work]. The goal of this project is to use inertial measurement units (IMUs) and deep learning to sense ground contact and relative positions.
EgoCap: Egocentric Marker-less Motion Capture with Two Fisheye Cameras. Helge Rhodin, Christian Richardt, Dan Casas, Eldar Insafutdinov, Mohammad Shafiei, Hans-Peter Seidel, Bernt Schiele, and Christian Theobalt. SIGGRAPH Asia 2016
Project 2. Gravity for scale estimation in multi-view reconstruction
This project aims at reconstructing objects and camera geometry by exploiting Newton's equations of motion and gravity. Estimating metric scale from image or video recordings is a fundamental problem in computer vision and important for determining distances in forensics, autonomous driving, person re-identification, and structure-from-motion (SfM). In general, object size and distance cancel in perspective projection, which makes the problem ill-posed.
The main idea is to use the omnipresent gravity on earth as a reference 'object'. Newton's second law of motion dictates that the trajectory of an object is a parabola: a function of time and of its initial speed and position, with the curvature determined by the acceleration induced by constant external forces. This project builds upon our earlier work on estimating a person’s height by relating acceleration in the image to gravity on earth [Related work]. By contrast, the aim of this project is to use similar principles for multi-view reconstruction, for structure from motion and automatic camera calibration.
Gravity as a Reference for Estimating a Person's Height from Video. Didier Bieler, Semih Günel, Pascal Fua, and Helge Rhodin. ICCV 2019
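The core idea of using gravity as a scale reference can be sketched in a few lines. Under strong simplifying assumptions (a free-falling object moving parallel to the image plane, uniform frame timing, no air drag, and no perspective effects), the acceleration measured in image units must correspond to 9.81 m/s², which fixes the metres-per-unit scale. This is a toy illustration rather than the method of the cited paper; the function names and the finite-difference estimator are assumptions made for this example.

```python
G = 9.81  # m/s^2, standard gravitational acceleration near Earth's surface


def acceleration_from_samples(positions, dt):
    """Estimate constant acceleration from uniformly sampled positions.

    Uses the average second finite difference, which is exact for a parabola.
    """
    if len(positions) < 3:
        raise ValueError("need at least three samples")
    second_diffs = [
        positions[i - 1] - 2 * positions[i] + positions[i + 1]
        for i in range(1, len(positions) - 1)
    ]
    return sum(second_diffs) / len(second_diffs) / dt**2


def metres_per_unit(vertical_positions, dt):
    """Scale factor that maps image units to metres, assuming pure free fall."""
    accel_in_units = acceleration_from_samples(vertical_positions, dt)
    return G / abs(accel_in_units)


if __name__ == "__main__":
    # Synthetic check: simulate a fall observed at 30 frames per second
    # where one image unit corresponds to 0.02 m.
    true_scale, dt = 0.02, 1 / 30
    ys = [5.0 + 3.0 * (k * dt) + 0.5 * (G / true_scale) * (k * dt) ** 2 for k in range(10)]
    print(metres_per_unit(ys, dt))  # approximately 0.02
```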
Project 3. Learning anthropometric constraints for monocular human motion capture
In recent years image-based human motion capture (MoCap) has progressed immensely with varying applications in e.g. movie production, sports analytics, virtual reality systems, games, human computer interaction, and medical examinations. Nowadays marketable software in these fields requires sophisticated motion capture studios and expensive measurement devices which strongly limits its applicability. This project aims to achieve human MoCap using only a single RGB camera to enable MoCap in the wild.
Since a video taken by a single camera contains no depth information, additional assumptions on the scene need to be made. Fortunately, the human body satisfies several constraints: bone lengths are limited to specific values, opposing body parts are symmetric, joint angles have limits, etc. Depending on your interests, different machine learning approaches will be applied to learn these constraints from image data.
RepNet: Weakly Supervised Training of an Adversarial Reprojection Network for 3D Human Pose Estimation, Bastian Wandt, Bodo Rosenhahn, CVPR 2019
Project 1. Privacy-preserving ML on health data
To train multi-party ML models from user-generated data, users must provide and share their training data, which can be expensive or privacy-violating. We are exploring ways to augment state-of-the-art approaches, like federated learning, with better security/privacy. We are also developing brand-new distributed ML approaches that do away with centralization. We are collaborating with researchers in the UBC medical school and VGH, and patient groups in the city to come up with technologies that are sensitive to users' privacy constraints and solve real problems. Our work is open source and aims to provide practical alternatives to today's systems that provide minimal privacy guarantees to patients. We are looking for 1-3 students who have a background in ML, databases, networks, or distributed systems. If you're interested in technologies to improve healthcare, then this project is for you!
Project 2. Better resource scheduling in the cloud
We rely on some cloud for most of our daily activities on and off the web. A challenge for cloud providers is to efficiently utilize their data-centers that house hundreds of thousands of servers. A common technique to multiplex server resources across multiple users is to isolate each user's compute requirement as a Virtual Machine (VM). Thus, the cloud resource allocation challenge is equivalent to placing VMs on servers based on some objective, such as maximizing a datacenter's utilization. In this project we are developing new algorithms and systems to improve resource scheduling in the cloud. Some algorithms rely on training ML models to predict the lifetime of a user's VM, others rely on heuristics that pack related VMs closer to each other. Our work is open source and we are working with cloud providers to deploy our algorithms into production data centers. For this project we are looking for 1-2 students who have a background in algorithms, ML, and networks or distributed systems. A longer description of the project is posted here. If you're interested in algorithms and cloud computing, then this project is for you!
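Viewed abstractly, placing VMs on servers resembles bin packing. The sketch below uses the classic first-fit-decreasing heuristic on a single resource dimension to show the flavour of the problem. It is not one of the project's algorithms: the function name, the single integer "size" per VM, and the uniform server capacity are assumptions made for this example.

```python
def first_fit_decreasing(vm_sizes, server_capacity):
    """Pack VMs onto as few servers as the heuristic manages.

    Returns a list of servers, each a list of the VM sizes placed on it.
    """
    servers = []    # servers[i] = VM sizes placed on server i
    free = []       # free[i] = remaining capacity of server i
    for size in sorted(vm_sizes, reverse=True):  # largest VMs first
        if size > server_capacity:
            raise ValueError(f"VM of size {size} does not fit on any server")
        for i, remaining in enumerate(free):
            if size <= remaining:                # first server with enough room
                servers[i].append(size)
                free[i] -= size
                break
        else:                                    # no existing server fits: open a new one
            servers.append([size])
            free.append(server_capacity - size)
    return servers


if __name__ == "__main__":
    print(first_fit_decreasing([4, 8, 1, 4, 2, 1], server_capacity=10))
    # [[8, 2], [4, 4, 1, 1]]
```

Real schedulers must also handle multiple resource types, VMs that arrive and depart over time, and placement constraints, which is where lifetime prediction and locality heuristics come in.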
All projects below list a small number of required skills and a larger number of "useful" skills. Do not be discouraged by the latter -- applicants are not required to have any of those skills, and most applicants will have no more than a few.
For all positions applicants must be prepared to work independently, to express themselves clearly in discussions, and to give at least two brief presentations to the research group.
If you are interested in applying for one or more of these positions, please email email@example.com with your resume/CV and a transcript. The subject should be "USRA application". Please state in your email which project(s) you are interested in (numerical software, wheelchair and/or racer).
Project 1. Numerical software for analyzing cyber-physical systems
Cyber-physical systems are those which involve interaction between computers and the external world, and include many safety critical systems such as aircraft, cars, and robots. Analysis of these systems typically uses differential equation models for the physical component of the system, because its state evolves in continuous time and space. Reachability algorithms can be used to verify -- or even synthesize controllers to ensure -- the correct behavior of dynamic systems, and a variety of such algorithms have been designed for differential equation models. The goal of this project is to demonstrate a new example on, improve the user interface of, validate the implementation of, parallelize and/or add features to one of several software packages used for approximating sets of solutions in order to demonstrate the correctness of robotic or cyber-physical systems. The Toolbox of Level Set Methods [http://www.cs.ubc.ca/~mitchell/ToolboxLS] is a locally developed example, but others include JuliaReach [https://github.com/JuliaReach], CORA [http://www6.in.tum.de/Main/SoftwareCORA] and SpaceEx [http://spaceex.imag.fr/]. Applicants should be comfortable with numerical ODE solvers (for example, CPSC 303 or Math 405) and a numerical computing language (such as Matlab, SciPy, or Julia). Familiarity with computational optimization (such as CPSC 406), parallel programming (such as CPSC 418), machine learning (such as CPSC 330 or 340), or numerical partial differential equations (such as Math 405) would be useful for some but not all potential subprojects.
Project 2. Collaborative control scheme design, simulation and testing for a smart wheelchair
As part of the AGE-WELL Network Center of Excellence [http://www.agewell-nce.ca] I have a project investigating techniques which would allow elderly individuals with mild cognitive and/or sensory impairments to better use powered wheelchairs. While it is relatively easy to implement a system in which either the user or the robotic planner chooses the motion of the wheelchair, it is much more challenging to blend these two inputs in real-time and in a manner which is both safe and non-threatening to a cognitively impaired user. As part of this process, the team runs user studies with the target population and their therapists in long term care centers. Potential goals for this summer's project include ongoing prototype development and evaluation of collaboration and training interfaces and control policies, development and evaluation of learning methods for predicting behavior of the chair and/or user, data collection and analysis from real-world or virtual trials, or setting up a virtual reality workstation for trials of collaboration control policies. Applicants should be comfortable with C++, Matlab and/or Python, and will be expected to learn and use ROS (robot operating system) to program the wheelchair(s). Familiarity with human-computer interaction (such as CPSC 344), computational optimization (such as CPSC 406), machine learning (such as CPSC 330 or 340), computer vision (such as CPSC 425) or electronics would be useful for some but not all potential subprojects.
Project 3. Software development, motion modeling, sensor integration, collaborative control scheme design, and testing for a highly dynamic ground robot
The wheelchair project focuses on a collaborative control problem where the vehicle is large but slow moving, the environment is unconstrained, the user is untrained, and the interface options are limited. At the other end of the spectrum, autonomous racing vehicles are small and fast, the environment is more constrained, users can be highly trained, and it is possible to design richer interfaces. Potential platforms include the NVIDIA Jetracer [https://github.com/NVIDIA-AI-IOT/jetracer] or F1tenth [http://f1tenth.org/]. Goals for this project include characterizing the racer(s) physics, integrating sensors, designing a collaborative control scheme, designing an interface, and testing the results. Applicants should be comfortable with C++, and will be expected to learn and use ROS (robot operating system) to control the vehicles. Familiarity with machine learning (such as CPSC 330 or 340), computer vision (such as CPSC 425), human-computer interaction (such as CPSC 344), computational optimization (such as CPSC 406), parallel programming (such as CPSC 418), system identification, electronics, mechatronics or autonomous vehicles would be useful for some but not all potential subprojects.
Project 1. Building an interface to classify and cluster ribosome components from the Protein Data Bank (PDB)
My group is looking for a CS student to use bioinformatic tools and implement computational methods for analyzing molecular structures found by cryo-EM or X-ray crystallography. In the context of my recent research, there is a need for studying a family of these structures, called ribosomes. Ribosomes are the molecular machines that mediate protein translation, one of the most fundamental processes underlying life. Since the ribosome is made of many different proteins, a key question is to understand differences in composition for different species, and have the tools to automate such comparison. For the past few years, many new ribosome structures have been discovered and publicly shared through the Protein Data Bank (https://www.rcsb.org). The goal will be to develop the tools and an interface that allows the user to compare and visualize all these structures and the proteins (~80 for each structure) that constitute a ribosome, to identify homologous proteins, clusters of proteins close in space, label them according to their position, and implement general methods for geometric comparison.
I am currently working on writing an invited review on the heterogeneity of cryo-EM structures, in collaboration with Dr Frederic Poitevin (Stanford), to be submitted in spring, so some results for this project will be added in the review (with the student added as co-author). We also expect this project to lead to a bioinformatic tool and subsequent paper in bioinformatics, describing how it can be used to quantitatively study ribosome structures.
Reference: Dao Duc et al., 2019, Nucleic Acids Research, https://academic.oup.com/nar/article/47/8/4198/5364857
Project 1. Improving efficiency and student satisfaction in peer grading systems
Peer grading has the potential to improve educational outcomes in three main ways: (i) it makes educational systems more scalable by offloading some grading work to students, (ii) it provides students with faster and more detailed feedback, and (iii) it helps students to learn better through thinking critically about the work of others. Mechanical TA2 (MTA2) is a web-based peer grading application that facilitates peer grading. The newest version is just now receiving its first uses but is still a work in progress. The initial use of MTA2 provided evidence that while students benefited from doing the peer reviews, they doubted the quality of the grades that they got through their peers. The focus of this project is to study the effectiveness of the current MTA2's design and try to improve it as much as possible.
Student’s role in the project:
1. Investigating the data gathered through MTA2’s initial use and searching for the sources of inefficiencies that caused student dissatisfaction in the course (e.g., the way that peer grades are aggregated, common trends in the way students grade or common patterns in the grades they received).
2. Coming up with solutions to improve the inefficiencies in the system's design (e.g., suggesting a better grade aggregation method).
3. Implementing new features in MTA2 based on the obtained solutions to improve the system’s efficiency.
Skills required for the project: Data analysis, programming, and the ability to work independently.
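As a small illustration of why the choice of grade aggregation method matters, the sketch below compares the mean and the median of a set of peer grades when one reviewer gives an outlying score. It says nothing about how MTA2 actually aggregates grades; the function name and the example grades are assumptions made for this example.

```python
from statistics import mean, median


def aggregate(peer_grades, method="median"):
    """Combine several peer grades for one submission into a single grade."""
    if not peer_grades:
        raise ValueError("at least one peer grade is required")
    if method == "mean":
        return mean(peer_grades)
    if method == "median":
        return median(peer_grades)
    raise ValueError(f"unknown aggregation method: {method}")


if __name__ == "__main__":
    grades = [80, 82, 78, 20]  # three consistent reviewers and one outlier
    print(aggregate(grades, "mean"))    # 65
    print(aggregate(grades, "median"))  # 79.0
```

The median is far less sensitive to a single careless or adversarial reviewer, which is one kind of design question the data analysis in this project could inform.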
Reproducibility is a crucial tenet of research. Scientists can freely share their code and data to help increase the repeatability of their experiments. However, even with the scripts used by the original authors, executing them is not guaranteed to produce the same results. Provenance, a history of the execution of a script, can assist in increasing the reproducibility of these analyses. Previous work, containR, used provenance with the R language to bundle scripts and data into containers, improving their repeatability. This work will build on containR for Python scripts. Using provenance collected at the source code level, the student will assess analyses and build a tool to improve the reproducibility of new and existing experiments.
In CPSC 313, we use the y86 language to teach about machine architecture.
It would be convenient to have a compiler that generates y86 code. In theory, developing a backend for LLVM should be straightforward. In practice, doing so is not entirely trivial and will require some creativity. If you would like an opportunity to build a production compiler backend, this project is for you!
Project 1. Evaluating Quality Impact of Static Analysis Tools
Software systems are complex entities built by teams of engineers. Teams frequently enforce rules that require source code be written according to a set of pre-defined rules to increase its consistency across team members. Automated tools such as lint and checkstyle are commonly used to statically check that these rules are followed, and usually disallow changes being added to version control unless they are followed. While the primary goal of these rules is to improve the overall understandability of the system, often-stated tangential goals are to improve the quality and evolvability of the system. While the consistency enforced by style conventions clearly makes it easier to understand a system that has been created by a large team, the evidence for quality and evolutionary benefits has not been clearly established.
This project will investigate whether the introduction of common linting rules aimed at improving quality (such as maximum file or method length) has any impact on the overall evolvability or quality of a system. The project will examine both OSS projects and locally-available educational resources for which we have a rich collection of lint-related projects.
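To give a sense of what a "maximum method length" rule checks, here is a minimal sketch that uses Python's standard `ast` module to flag functions longer than a threshold. It is not how lint or checkstyle are implemented and is unrelated to the project's tooling; the function name and the threshold are assumptions made for this example (it requires Python 3.8 or later for `end_lineno`).

```python
import ast


def functions_exceeding(source, max_lines):
    """Return (name, length) for every function in `source` longer than `max_lines`."""
    tree = ast.parse(source)
    offenders = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Length counts from the `def` line to the last line of the body.
            length = node.end_lineno - node.lineno + 1
            if length > max_lines:
                offenders.append((node.name, length))
    return offenders


EXAMPLE = (
    "def short():\n"
    "    return 1\n"
    "\n"
    "def long_one():\n"
    "    a = 1\n"
    "    b = 2\n"
    "    c = 3\n"
    "    return a + b + c\n"
)

if __name__ == "__main__":
    print(functions_exceeding(EXAMPLE, max_lines=3))  # [('long_one', 5)]
```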