CS Theses & Dissertations 2015

For 2015 graduation dates (in alphabetical order by last name):

A single subject participatory action design method for powered wheelchairs providing automated back-in parking assistance to cognitively impaired older adults : a pilot study
Adhikari, Bikram
DOI : 10.14288/1.0167654
URI : http://hdl.handle.net/2429/51861
Degree : Master of Science - MSc
Graduation Date : 2015-05
Supervisors : Dr. Mackworth, Dr. Mitchell

An exploratory study of socio-technical congruence in an ecosystem of software developers
Azad, Deepak
DOI : 10.14288/1.0166086
URI : http://hdl.handle.net/2429/51506
Degree : Master of Science - MSc
Graduation Date : 2015-05
Supervisor : Dr. G. Murphy

Leveraging distributed explicit-state model checking for practical verification of liveness in hardware protocols
Bingham, Bradley Donald
DOI : 10.14288/1.0166129
URI : http://hdl.handle.net/2429/52670
Degree : Doctor of Philosophy - PhD
Graduation Date : 2015-05
Supervisor : Dr. Greenstreet, Dr. Pattabiraman (Electrical & Computer Engineering)

Protocol verification is a key component to hardware and software design. The proliferation of concurrency in modern designs stresses the need for accurate protocol models and scalable verification tools. Model checking is an approach for automatically verifying properties of designs, the main limitation of which is state-space explosion. As such, automatic verification of these designs can quickly exhaust the memory of a single computer. This thesis presents PReach, a distributed explicit-state model checker, designed to robustly harness the aggregate computing power of large clusters. The initial version verified safety properties, which hold if no error states can be reached. PReach has been demonstrated to run on hundreds of machines and explore state space sizes up to 90 billion, the largest published to date. Liveness is an important class of properties for hardware system correctness which, unlike safety, expresses more elaborate temporal reasoning. However, model checking of liveness is more computationally complex, and exacerbates scalability issues as compared with safety. The main thesis contribution is the extension of PReach to verify two key liveness-like properties of practical interest: deadlock-freedom and response. Our methods leverage the scalability and robustness of PReach and strike a balance between tractable verification for large models and catching liveness violations. Deadlock-freedom holds if from all reachable system states, there exists a sequence of actions that will complete all pending transactions. We find that checking this property is only a small overhead as compared to safety checking. We also provide a technique for establishing that deadlock-freedom holds of a parameterized system -- a system with a variable number of entities. Response is a stronger property than deadlock-freedom and is the most common liveness property of interest. In practical cases, fairness must be imposed on system models when model checking response to exclude those execution traces deemed inconsistent with the expected underlying hardware. We implemented a novel twist on established model checking algorithms, to target response properties with action-based fairness. This implementation vastly out-performs competing tools. This thesis shows that tractable verification of interesting liveness properties in large protocol models is possible.

Applications of machine learning in sensorimotor control
Bradley, Susanne Michelle
DOI : 10.14288/1.0166596
URI : http://hdl.handle.net/2429/54570
Degree : Master of Science - MSc
Graduation Date : 2015-11
Supervisor : Dr. Pai

Continuous ply minimization
Busto, Daniel Silvano
DOI : 10.14288/1.0166358
URI : http://hdl.handle.net/2429/54019
Degree : Master of Science - MSc
Graduation Date : 2015-11
Supervisors : Dr. Evans, Dr. Kirkpatrick

Interference mitigation and detection in wifi networks under congestion
Cai, Kan
DOI : 10.14288/1.0166312
URI : http://hdl.handle.net/2429/53933
Degree : Doctor of Philosophy - PhD
Graduation Date : 2015-11
Supervisor : Dr. Feeley

In IEEE 802.11, nodes regulate access to the airspace they share in a decentralized fashion using CSMA/CA. The goal of this approach is to share the common airspace fairly and efficiently without requiring centralized channel administration or direct coordination among peer nodes. However, it is well known that strong interference, as consequence of this de-centralized coordination scheme, can lead to extremely unfair network bandwidth allocation between competing devices. Interference detection and mitigation has posed great challenges. The cause of interference is complicated, involving many networking factors such as topology and traffic, and the interference relationship changes all the time. This thesis addresses these challenges by proposing a throttling based interference mitigation system (Shaper) and an online passive interference detection system (VOID). The main contribution of this thesis is to point out the correlated relationship between interference and congestion. First, this thesis provides a more thorough analysis on the impact of node topology, traffic type and signal strength on wireless performance. We came up with 9 UDP models and 10 TCP models just for two competing flow scenarios. The outcome of wireless interference can get harder to predict, however, as we introduce more factors into the interference model such as more competing nodes, sending rate, signal propagation model, etc. On the other hand, this thesis identifies the immediate cause to the unfair bandwidth distribution under interference: 802.11 network congestion. We observed and explained that all competing devices are able to perform well regardless of topology or traffic class, as long as there is sufficiently more bandwidth than the aggregate throughput demands. Therefore, we propose to trade the aggregate throughput to mitigate the impact of interference and prove its effectiveness through simulation and emulation. Finally, the key to successful addressing the interference is an accurate and fast interference detection mechanism and this thesis proposes such a system called VOID. It deploys the correlation between congestion and interference to infer the interference relationship from the ip-layer throughput variations. It is fast, accurate and more importantly, very easy to deploy in existing WiFi networks.

FindFS : adding tag-based views to a hierarchical filesystem
Chou, Jason Truman
DOI : 10.14288/1.0166678
URI : http://hdl.handle.net/2429/54703
Degree : Master of Science - MSc
Graduation Date : 2015-11
Supervisor : Dr. Feeley

CMCMPI : Compose-Map-Configure MPI
Chung, Ryan Ki Sing
DOI : 10.14288/1.0165563
URI : http://hdl.handle.net/2429/51185
Degree : Master of Science - MSc
Graduation Date : 2015-05
Supervisor : Dr. Wagner

A camera-based approach to remote pointing interactions in the classroom
Escalona Gonzalez, Francisco Javier
DOI : 10.14288/1.0166754
URI : http://hdl.handle.net/2429/54880
Degree : Master of Science - MSc
Graduation Date : 2015-11
Supervisor : Dr. Booth

Design and integration of controllers for simulated characters
Firmin, Michael Cameron
DOI : 10.14288/1.0167045
URI : http://hdl.handle.net/2429/51128
Degree : Master of Science - MSc
Graduation Date : 2015-05
Supervisor : Dr. van de Panne

Accurate smooth pursuit eye movements lead to more accurate manual interceptions
Fooken, Jolande
DOI : 10.14288/1.0166559
URI : http://hdl.handle.net/2429/54482
Degree : Master of Science - MSc
Graduation Date : 2015-11
Supervisor : Dr. Pai, Dr. Spering (Ophthalmology & Visual Sciences)

Do developers respond to code stability warnings?
Foss, Sylvie Loretta
DOI : 10.14288/1.0166189
URI : http://hdl.handle.net/2429/52807
Degree : Master of Science - MSc
Graduation Date : 2015-05
Supervisor : Dr. G. Murphy

Augmentation of coarse meshes with wrinkles
Gillette, Russell
DOI : 10.14288/1.0166683
URI : http://hdl.handle.net/2429/54687
Degree : Master of Science - MSc
Graduation Date : 2015-11
Supervisor : Dr. Sheffer

Constructing user models from eye gaze data in the domain of information visualization
Gingerich, Matthew Jungyun
DOI : 10.14288/1.0166236
URI : http://hdl.handle.net/2429/52993
Degree : Master of Science - MSc
Graduation Date : 2015-11
Supervisor : Dr. Conati

Applications of inverse problems in fluids and imaging
Gregson, James Peter
DOI : 10.14288/1.0166394
URI : http://hdl.handle.net/2429/54081
Degree : Doctor of Philosophy - PhD
Graduation Date : 2015-11
Supervisor : Dr. Heidrich

Three applications of inverse problems relating to fluid imaging and image deblurring are presented. The first two, tomographic reconstruction of dye concentration fields from multi-view video and deblurring of photographs, are addressed by a stochastic optimization scheme that allows a wide variety of priors to be incorporated into the reconstruction process within a straightforward framework. The third, estimation of fluid velocities from volumetric dye concentration fields, highlights a previously unexplored connection between fluid simulation and proximal algorithms from convex optimization. This connection allows several classical imaging inverse problems to be investigated in the context of fluids, including optical flow, denoising and deconvolution. The connection also allows inverse problems to be incorporated into fluid simulation for the purposes of physically-based regularization of optical flow and for stylistic modifications of fluid captures. Through both methods and all three applications the importance of incorporating domain-specific priors into inverse problems for fluids and imaging is highlighted.

A physics-based model for wrinkling skin
Harrison, Darcy Alexander
DOI : 10.14288/1.0166733
URI : http://hdl.handle.net/2429/54853
Degree : Master of Science - MSc
Graduation Date : 2015-11
Supervisor : Dr. Pai

Algorithms for prediction of RNA pseudoknotted secondary structures
Jabbari, Hosna
DOI : 10.14288/1.0167140
URI : http://hdl.handle.net/2429/52396
Degree : Doctor of Philosophy - PhD
Graduation Date : 2015-05
Supervisor : Dr. Condon

RNA molecules are crucial in different levels of cellular function, and their functions largely depend on their structures. Since experimental methods for determining RNA structure are expensive, computational methods for prediction of RNA secondary structure (the set of base pairs) are valuable. One such approach is based on the Minimum Free Energy (MFE) folding hypothesis, which states an RNA molecule folds into the structure with the minimum free energy. Another approach is based on the hierarchical folding hypothesis, which posits that an RNA molecule first folds into a psuedoknot- free structure (i.e., no crossing base pairs), then additional base pairs are added that may form pseudoknots (structures with crossing base pairs) to lower the structure’s free energy. The overarching goal of this thesis is to overcome some limitations of the previous methods, namely (1) slow running time, (2) poor prediction accuracy (i.e., low number of correctly predicted base pairs), (3) limited in types of pseudoknots they can predict, and (4) limited opportunity to provide structural information that can guide prediction. We propose two algorithms and study different variations of each method. We propose a relaxed version of the hierarchical folding hypothesis and present Iterative HFold, which is based on this hypothesis. Furthermore, we propose the CCJ algorithm, an MFE-based algorithm that significantly expands the structures handled in O(n⁵) time. Employing a sparsification technique, we propose a sparse version of CCJ that improves its space usage. While CCJ’s and Iterative HFold’s performance are not significantly different on our large data set, Iterative HFold is considerably less expensive to run than CCJ. In addition, Iterative HFold provides the user with ability to incorporate structural information as input constraint. Both CCJ and Iterative HFold predict significantly more accurate structures on key data sets than two of the best performing computational methods currently available (i.e., HotKnots V2.0 and IPknot). Of the folding hypotheses that we have studied, it seems that the relaxed hierarchical folding hypothesis has better potential to predict RNA secondary structure, at least with respect to the energy model used in this work

Exploring machine learning design options in discourse parsing
Liao, Weicong
DOI : 10.14288/1.0167698
URI : http://hdl.handle.net/2429/52985
Degree : Master of Science - MSc
Graduation Date : 2015-11
Supervisor : Dr. Ng

Compositional compression of deep image features using stacked quantizers
Martinez-Covarrubias, Julieta
DOI : 10.14288/1.0166087
URI : http://hdl.handle.net/2429/51511
Degree : Master of Science - MSc
Graduation Date : 2015-05
Supervisors : Dr. Little, Dr. Hoos

Low-stretch trees for network visualization
McKnight, Rebecca Linda
DOI : 10.14288/1.0166557
URI : http://hdl.handle.net/2429/54490
Degree : Master of Science - MSc
Graduation Date : 2015-11
Supervisor : Dr. Harvey

Storage system tracing and analysis
Meyer, Dutch Thomassen
DOI : 10.14288/1.0166682
URI : http://hdl.handle.net/2429/54694
Degree : Doctor of Philosophy - PhD
Graduation Date : 2015-11
Supervisor : Dr. Warfield

File system studies are critical to the accurate configuration, design, and continued evolution of storage systems. However, surprisingly little has been published about the use and behaviour of file systems under live deployment. Most published file system studies are years old, or are extremely limited in size and scope. One reason for this deficiency is that studying clusters of storage systems operating at scale requires large data repositories that are difficult to collect, manage, and query. In effect, the demands introduced by performing detailed file system traces creates new and interesting storage challenges in and of itself. This thesis describes a methodology to facilitate the collection, analysis, and publication of large traces of file system activity and structure so that organizations can perform regular analysis of their storage systems and retain that analysis to answer questions about their system's behaviour. To validate this methodology I investigate the use and performance of several large storage deployments. I consider the impact of the observed system usage and behaviour on file system design, and I describe the processes by which the collected traces can be efficiently processed and manipulated. I report on several examples of long standing incorrect assumptions, efficient engineering alternatives, and new opportunities in storage system design.

Policies, practices, and potentials for computer-supported scholarly peer review
Nobarany, Syavash
DOI : 10.14288/1.0166341
URI : http://hdl.handle.net/2429/53979
Degree : Doctor of Philosophy - PhD
Graduation Date : 2015-11
Supervisor : Dr. Booth

The scholarly peer-review process has been one of the cornerstones of science for centuries, but it has also been the subject of criticism for decades. The peer-review process is increasingly supported by computer systems; however, computer support for peer review has been mostly limited to facilitating traditional peer-review processes and remedying scalability issues. We took a holistic approach to understanding the peer-review process with the goal of devising computer-supported interactions, mechanisms, and processes for improving peer review. We conducted a series of studies to investigate various aspects of the peer-review process, including procedural fairness, anonymity and transparency, reviewing motivations, politeness of reviews, and opinion measurement. In the study of fairness, we learned about researchers’ attitudes and concerns about the fairness of the peer-review process. In the study of anonymity and transparency, we learned about the diversity of anonymity policies used by various publication venues. In the study of reviewing motivations, we learned the many reasons reviewers consider reviewing as part of their academic activities and what makes a review request more likely to be accepted. In the study of the use of politeness strategies, we learned about reviewers’ use of language for mitigating criticisms in a non-blind reviewing environment. In the study of opinion measurement we iteratively designed opinion measurement interfaces that can enhance elicitation of quantitative subjective opinions. Through these five studies, we expanded the understanding of challenges and opportunities for designing better peer-review processes and systems to support them, and we presented various ways through which computer support for peer review can be enhanced to address the identified challenges.

Combining SMT with theorem proving for AMS verification : analytically verifying global convergence of a digital PLL
Peng, Yan
DOI : 10.14288/1.0166267
URI : http://hdl.handle.net/2429/52978
Degree : Master of Science - MSc
Graduation Date : 2015-11
Supervisor : Dr. Greenstreet

A method for real-time dynamic cloth wrinkling
Peters, Craig Garrett
DOI : 10.14288/1.0165787
URI : http://hdl.handle.net/2429/54681
Degree : Master of Science - MSc
Graduation Date : 2015-11
Supervisor : Dr. Sheffer

[no title]
Phan-Ba, Michael Duy-Nam
Degree : Master of Science - MSc
Graduation Date : 2015-05
Supervisor : Dr. Beschastnikh

Advances in meta-algorithmic software libraries for distributed automated algorithm configuration
Ramage, Stephen Edward Andrew
DOI : 10.14288/1.0167184
URI : http://hdl.handle.net/2429/52809
Degree : Master of Science - MSc
Graduation Date : 2015-05
Supervisors : Dr. Leyton-Brown, Dr. Hoos

Learning periorbital soft tissue motion
Ranjan, Anurag
DOI : 10.14288/1.0166703
URI : http://hdl.handle.net/2429/54784
Degree : Master of Science - MSc
Graduation Date : 2015-11
Supervisor : Dr. Pai

Randomized algorithms for solving large scale nonlinear least squares problems
Roosta-Khorasani, Farbod
DOI : 10.14288/1.0166123
URI : http://hdl.handle.net/2429/52663
Degree : Doctor of Philosophy - PhD
Graduation Date : 2015-05
Supervisor : Dr. Ascher

This thesis presents key contributions towards devising highly efficient stochastic reconstruction algorithms for solving large scale inverse problems, where a large data set is available and the underlying physical systems is complex, e.g., modeled by partial differential equations (PDEs). We begin by developing stochastic and deterministic dimensionality reduction methods to transform the original large dimensional data set into the one with much smaller dimensions for which the computations are more manageable. We then incorporate such methods in our efficient stochastic reconstruction algorithms. In the presence of corrupted or missing data, many of such dimensionality reduction methods cannot be efficiently used. To alleviate this issue, in the context of PDE inverse problems, we develop and mathematically justify new techniques for replacing (or filling) the corrupted (or missing) parts of the data set. Our data replacement/completion methods are motivated by theory in Sobolev spaces, regarding the properties of weak solutions along the domain boundary. All of the stochastic dimensionality reduction techniques can be reformulated as Monte-Carlo (MC) methods for estimating the trace of a symmetric positive semi-definite (SPSD) matrix. In the next part of the present thesis, we present some probabilistic analysis of such randomized trace estimators and prove various computable and informative conditions for the sample size required for such Monte-Carlo methods in order to achieve a prescribed probabilistic relative accuracy. Although computationally efficient, a major drawback of any (randomized) approximation algorithm is the introduction of “uncertainty” in the overall procedure, which could cast doubt on the credibility of the obtained results. The last part of this thesis consists of uncertainty quantification of stochastic steps of our approximation algorithms presented earlier. As a result, we present highly efficient variants of our original algorithms where the degree of uncertainty can easily be quantified and adjusted, if needed. The uncertainty quantification presented in the last part of the thesis is an application of our novel results regarding the maximal and minimal tail probabilities of non-negative linear combinations of gamma random variables which can be considered independently of the rest of this thesis.

IR-MetaOCaml : (re)implementing MetaOCaml
Roubinchtein, Evgeny
DOI : 10.14288/1.0166800
URI : http://hdl.handle.net/2429/55111
Degree : Master of Science - MSc
Graduation Date : 2015-11
Supervisor : Dr. Garcia

Qualitative repository analysis with RepoGrams
Rozenberg, Daniel
DOI : 10.14288/1.0166565
URI : http://hdl.handle.net/2429/54495
Degree : Master of Science - MSc
Graduation Date : 2015-11
Supervisor : Dr. Beschastnikh

Modelling human cell protein phosphorylation networks
Safaei Mehranpour, Javad
DOI : 10.14288/1.0166161
URI : http://hdl.handle.net/2429/52983
Degree : Doctor of Philosophy - PhD
Graduation Date : 2015-11
Supervisors : Dr. Gupta, Dr. Pelech (Medicine), Dr. Manuch and Dr. Stacho (SFU)

Defects in cell signalling networks are linked to over 400 human diseases. My thesis research aimed to model these networks in more detail to facilitate understanding of their architecture and operations under normal and pathological conditions. The various protein levels in diverse normal human cell and tissue types were inferred from their mRNA expressions, and their up/down-regulation was also investigated in about 300 human cancer cell lines and 50 types of human cancers. This was based on meta-analyses of gene microarray measurements deposited in the USA National Center for Biotechnology Information's Gene Expression Omnibus database. I identified proteins that were commonly or uniquely expressed in normal and cancerous human cells and tissues. The co-expression patterns of proteins were used to predict potential interactions, but there was not a strong correlation between high co-expression and actual direct protein-protein interactions documented in the scientific literature. With respect to the post-translational regulation of proteins, my research efforts primarily targeted protein phosphorylation, which is the most predominant type of reversible covalent modification of proteins. Complex protein phosphorylation networks emerge through the interplay of protein kinases, protein phosphatases, phosphorylation site-dependent binding proteins, and their phosphoprotein substrates. I modelled the interactions of protein kinases with substrate proteins and inhibitory compounds. Nearly a million human phosphosites were predicted, and each of these was tested in silico as substrates for 500 human protein kinases. The interactions of over 550 known protein kinase inhibitory drugs with the 500 protein kinases were also tested. These predicted interactions were compared with empirical data from other on-line protein-protein and protein-drug interaction databases. The human phosphosites were also analysed with respect to their protein conservation in over 20 other diverse species, and it was found that threonine phosphosites in protein kinases that were activatory were particularly well conserved in evolution. Finally, probabilistic graphical models were developed to model the most probable structure of substrate phosphosites for specific protein kinases. The discussed probabilistic graphical model, gave more theory justifications for the protein-protein interaction modelling that was presented in the earlier parts of the thesis.

Whose cache line is it anyway : automated detection of false sharing
Spear, Mark Anthony
DOI : 10.14288/1.0166652
URI : http://hdl.handle.net/2429/54677
Degree : Master of Science - MSc
Graduation Date : 2015-11
Supervisor : Dr. Warfield

Transport-level transactions : simple consistency for complex scalable low-latency cloud applications
Tayarani Najaran, Mahdi
DOI : 10.14288/1.0166572
URI : http://hdl.handle.net/2429/54520
Degree : Doctor of Philosophy - PhD
Graduation Date : 2015-11
Supervisor : Dr. Hutchinson

The classical move from single-server applications to scalable cloud services is to split the application state along certain dimensions into smaller partitions independently absorbable by a separate server in terms of size and load. Maintaining data consistency in the face of operations that cross partition boundaries imposes unwanted complexity on the application. While for most applications many ideal partitioning schemes readily exist, First-Person Shooter (FPS) games and Relational Database Management Systems (RDBMS) are instances of applications whose state can’t be trivially partitioned. For any partitioning scheme there exists an FPS/RDBMS workload that results in frequent cross-partition operations. In this thesis we propose that it is possible and effective to provide unpartitionable applications with a generic communication infrastructure that enforces strong consistency of the application’s data to simplify cross-partition communications. Using this framework the application can use a sub-optimal partitioning mechanism without having to worry about crossing boundaries. We apply our thesis to take a head-on approach at scaling our target applications. We build three scalable systems with competitive performances, used for storing data in a key/value datastore, scaling fast-paced FPS games to epic sized battles consisting of hundreds of players, and a scalable full-SQL compliant database used for storing tens of millions of items.

The positronic economist : a computational system for analyzing economic mechanisms
Thompson, David Robert Martin
DOI : 10.14288/1.0166200
URI : http://hdl.handle.net/2429/52868
Degree : Doctor of Philosophy - PhD
Graduation Date : 2015-05
Supervisor : Dr. Leyton-Brown

A mechanism is a formal protocol for collective decision making among self-interested agents. Mechanisms model many important social processes from auctions to elections. They are also widely studied in computer science: the participants in real-world mechanisms are often autonomous software systems (e.g., algorithmic bidding and trading agents), and algorithmic problems (e.g., job scheduling) give rise to mechanisms when users have competing interests. Strategic behavior (or "gaming") poses a major obstacle to understanding mechanisms. Although real-world mechanisms are often fairly simple functions (consider, e.g., plurality voting), a mechanism's outcome depends on both this function and on agents' strategic choices. Game theory provides a principled means for analyzing such choices. Unfortunately, game theoretic analysis is a difficult process requiring either human effort or very large amounts of computation. My thesis is that mechanism analysis can be done computationally, due to recent advances in compact representations of games. Compact representation is possible when a game's description has a suitable independence structure. Exploiting this structure can exponentially decrease the space required to represent games, and exponentially decrease the time required to analyze them. The contributions of my thesis revolve around showing that the relevant kinds of structure (specifically, the structure exploited by action-graph games) are present in important mechanisms of interest. Specifically, I studied two major classes of mechanisms, position auctions (as used for internet advertising) and voting rules. In both cases, I was able to provide exponential improvements in the space and time complexity of analysis, and to use those improvements to address important open questions in the literature. I also introduced a new algorithm for analyzing action-graph games, with faster empirical performance and additional benefits over the previous state-of-the-art. Finally, I created the Positronic Economist system, which consists of a python-based descriptive language for mechanism games, with automatic discovery of computationally useful structure.

Inferring user cognitive abilities from eye-tracking data
Wu, Ming-An
DOI : 10.14288/1.0165831
URI : http://hdl.handle.net/2429/55086
Degree : Master of Science - MSc
Graduation Date : 2015-11
Supervisor : Dr. Conati

Composite recommendation : semantics and efficiency
Xie, Min
DOI : 10.14288/1.0135685
URI : http://hdl.handle.net/2429/52406
Degree : Doctor of Philosophy - PhD
Graduation Date : 2015-05
Supervisor : Dr. Lakshmanan

Classical recommender systems provide users with a list of recommendations where each recommendation consists of a single item, e.g., a book or DVD. However, many applications can benefit from a system which is capable of recommending packages of items. Sample applications include travel planning, e-commerce, and course recommendation. In these contexts, there is a need for a system that can recommend the most relevant packages for the user to choose from. In this thesis we highlight our research achievements for the composite recommendation problem. We first consider the problem of composite recommendation under hard constraint, e.g., budget. It is clear that this is a very common paradigm for the composite recommendation problem. In Chapter 3, we first discuss how given a fixed package schema, we can efficiently find the top-k most relevant packages with hard constraints. The proposed algorithm is shown to be instance optimal, which means that no algorithm in a reasonable class can perform more than a constant times better, for some fixed constant. And we also propose relaxed solutions based on probabilistic reasoning. In Chapter 4, we lift the constraint on the package schema, and discuss how efficient algorithms can be derived to solve the more general problem with a flexible package schema. For this problem, again we propose both instance optimal algorithm and heuristics-based solution which have been verified to be effective and efficient through our extensive empirical study. Then in Chapter 5, motivated by the fact that hard constraints sometimes might lead to unfavorable results, and following the recent paradigm on “softening” the constraints, we study the problem of how to handle top-k query processing with soft constraints. Finally, in Chapter 6, we discuss a general performance tuning solution based on cached views which can be leveraged to further optimize the various algorithms proposed in this thesis.

Performance modelling and automated algorithm design for NP-hard problems
Xu, Lin
DOI : 10.14288/1.0135602
URI : http://hdl.handle.net/2429/51175
Degree : Doctor of Philosophy - PhD
Graduation Date : 2015-05
Supervisors : Dr. Leyton-Brown, Dr. Hoos

In practical applications, some important classes of problems are NP-complete. Although no worst-case polynomial time algorithm exists for solving them, state-of-the-art algorithms can solve very large problem instances quickly, and algorithm performance varies significantly across instances. In addition, such algorithms are rather complex and have largely resisted theoretical average-case analysis. Empirical studies are often the only practical means for understanding algorithms’ behavior and for comparing their performance. My thesis focuses on two types of research questions. On the science side, the thesis seeks a in better understanding of relations among problem instances, algorithm performance, and algorithm design. I propose many instance features/characteristics based on instance formulation, instance graph representations, as well as progress statistics from running some solvers. With such informative features, I show that solvers’ runtime can be predicted by predictive performance models with high accuracy. Perhaps more surprisingly, I demonstrate that the solution of NP-complete decision problems (e.g., whether a given propositional satisfiability problem instance is satisfiable) can also be predicted with high accuracy. On the engineering side, I propose three new automated techniques for achieving state-of-the-art performance in solving NP-complete problems. In particular, I construct portfolio-based algorithm selectors that outperform any single solver on heterogeneous benchmarks. By adopting automated algorithm configuration, our highly parameterized local search solver, SATenstein-LS, achieves state-of- the-art performance across many different types of SAT benchmarks. Finally, I show that portfolio-based algorithm selection and automated algorithm configuration could be combined into an automated portfolio construction procedure. It requires significant less domain knowledge, and achieved similar or better performance than portfolio-based selectors based on known high-performance candidate solvers. The experimental results on many solvers and benchmarks demonstrate that the proposed prediction methods achieve high predictive accuracy for predicting algorithm performance as well as predicting solutions, while our automatically constructed solvers are state of the art for solving the propositional satisfiability problem (SAT) and the mixed integer programming problem (MIP). Overall, my research results in more than 8 publications including the 2010 IJCAI/JAIR best paper award. The portfolio-based algorithm selector, SATzilla, won 17 medals in the international SAT solver competitions from 2007 to 2012.

[no title]
Zhu, Ben
Degree : Master of Science - MSc
Graduation Date : 2015-11
Supervisor : Dr. Wagner

What to learn next: recommending commands in a feature-rich environment
Zolaktaf, Sedigheh
DOI : 10.14288/1.0166527
URI : http://hdl.handle.net/2429/54362
Degree : Master of Science - MSc
Graduation Date : 2015-11
Supervisor : Dr. G. Murphy