For NP-complete problems such as SAT, even the best known algorithms have worst-case running times that grow exponentially with problem size. In practice, however, many large instances of NP-hard problems can still be solved within a reasonable amount of time. To understand this phenomenon, much effort has been invested in studying the "empirical hardness" of such problems. Our approach uses linear basis function regression to build empirical hardness models that can predict the time required for an algorithm to solve a given SAT instance. These models can be used to gain insight into the factors responsible for an algorithm's performance, to induce distributions of problem instances that are challenging for a given algorithm, or to select among several different algorithms for solving a given problem instance. Empirical hardness models have also proven very useful for combinatorial auctions, a prominent NP-hard optimization problem. The procedure for building an empirical hardness model is as follows.

- Identify a set of instance characteristics (features) that may correlate with the algorithm's performance.
- Draw sample instances from the target distribution to serve as training data.
- Compute the instance features and measure the solver's runtime on each training instance.
- Use machine learning techniques (e.g., ridge linear regression) to train an empirical hardness model on the collected data.
- For a new instance, simply compute its features and feed them into the model to obtain a runtime prediction.
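The steps above can be sketched in a few lines of numpy. This is a minimal illustration, not the actual feature set or data from our papers: the features and runtimes below are synthetic stand-ins, and the model is closed-form ridge regression on (log) runtime with the raw features plus a bias term as basis functions.

```python
import numpy as np

# Hypothetical training data: one row of instance features per training
# instance (in practice, characteristics such as the clauses-to-variables
# ratio); the target is the log of the solver's measured runtime.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                  # 200 instances, 5 features
true_w = np.array([0.8, -0.3, 0.0, 0.5, 0.1])  # synthetic ground truth
log_runtime = X @ true_w + rng.normal(scale=0.1, size=200)

# Linear basis functions: here simply a bias term plus the raw features.
Phi = np.hstack([np.ones((200, 1)), X])

# Ridge regression in closed form: w = (Phi^T Phi + lambda*I)^-1 Phi^T y
lam = 1.0
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(Phi.shape[1]),
                    Phi.T @ log_runtime)

def predict_log_runtime(features):
    """Predicted log runtime for a new instance's feature vector."""
    return np.concatenate([[1.0], features]) @ w

print(predict_log_runtime(np.zeros(5)))
```

Predicting the logarithm of runtime (rather than runtime itself) is the usual choice here, since solver runtimes vary over many orders of magnitude.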

*Note that empirical hardness models can also be used to predict measures of
algorithm performance other than runtime. For example, we have used empirical
hardness models to predict a solver's score in the SAT competition.*

Previous research in the SAT domain has shown that better prediction accuracy and simpler models can be obtained when models are trained separately on satisfiable and unsatisfiable instances. We extend this work by training separate hardness models for each class, predicting the probability that a novel instance belongs to each class, and using these predictions to build a hierarchical hardness model via a mixture-of-experts approach. By studying a wide range of instance distributions and SAT solvers, we demonstrated that a classifier can distinguish between satisfiable and unsatisfiable instances with surprisingly high (though not perfect) accuracy. We also showed how such a classifier can be combined with conditional hardness models into a hierarchical hardness model using a mixture-of-experts approach. Our results show that in cases where we achieved high classification accuracy, the resulting hierarchical models always offered substantial improvements over an unconditional model. When the classifier was less accurate, our hierarchical models did not offer a substantial improvement over the unconditional model; however, they were never significantly worse. Notably, our hierarchical models come at virtually no additional computational cost, as they depend on the same features used for the individual regression models.
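The prediction step of such a hierarchical model can be sketched as follows. All weight vectors here are illustrative stand-ins (not values from our papers), assuming three already-trained components: a logistic classifier estimating the probability that an instance is satisfiable, and two conditional ridge models trained on satisfiable-only and unsatisfiable-only data.

```python
import numpy as np

# Illustrative, pre-"trained" weights over [bias, feature_1, feature_2].
w_sat = np.array([0.2, 0.9, -0.1])    # conditional model: satisfiable class
w_unsat = np.array([1.5, 0.3, 0.4])   # conditional model: unsatisfiable class
w_clf = np.array([-0.5, 2.0, -1.0])   # logistic classifier weights

def hierarchical_prediction(phi):
    """Mixture-of-experts prediction: weight each conditional model's
    output by the classifier's estimated probability of its class."""
    p_sat = 1.0 / (1.0 + np.exp(-(w_clf @ phi)))  # P(satisfiable | features)
    return p_sat * (w_sat @ phi) + (1.0 - p_sat) * (w_unsat @ phi)

phi = np.array([1.0, 0.4, 0.7])  # bias term plus two instance features
print(hierarchical_prediction(phi))
```

Because the classifier and both conditional models consume the same feature vector, this combined prediction requires no feature computation beyond what the unconditional model already needs.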

People in our lab:

- Frank Hutter, Ph.D. student (UBC)
- Lin Xu, Ph.D. student (UBC)
- Holger Hoos, Associate Professor (UBC)
- Kevin Leyton-Brown, Assistant Professor (UBC)

Co-authors from other institutions:

- Alex Devkar
- Youssef Hamadi (Microsoft Research Cambridge)
- Eugene Nudelman
- Yoav Shoham, Professor of Computer Science (Stanford University)

Publications:

- **Learning the Empirical Hardness of Optimization Problems: The Case of Combinatorial Auctions.** K. Leyton-Brown, E. Nudelman, Y. Shoham. Appeared in Principles and Practice of Constraint Programming (CP-02), Ithaca, 2002. [PDF]
- **Understanding Random SAT: Beyond the Clauses-to-Variables Ratio.** E. Nudelman, K. Leyton-Brown, H. Hoos, A. Devkar, Y. Shoham. Appeared in Principles and Practice of Constraint Programming (CP-04), Toronto, 2004. [PDF]
- **Empirical Hardness Models for Combinatorial Auctions.** K. Leyton-Brown, E. Nudelman, Y. Shoham. Chapter 19 in Combinatorial Auctions, P. Cramton, Y. Shoham, R. Steinberg (eds.), MIT Press, 2006. [chapter PDF]
- **Performance Prediction and Automated Tuning of Randomized and Parametric Algorithms.** F. Hutter, Y. Hamadi, H. Hoos, K. Leyton-Brown. Appeared in Principles and Practice of Constraint Programming (CP-06), Nantes, 2006. [PDF]
- **Hierarchical Hardness Models for SAT.** L. Xu, H. Hoos, K. Leyton-Brown. Appeared in Principles and Practice of Constraint Programming (CP), Providence, 2007. [PDF]

For the paper *Performance Prediction and Automated Tuning of Randomized and Parametric Algorithms*:

For the paper *Hierarchical Hardness Models for SAT*:

Useful code for computing features and collecting runtime data

- **Uniform-random 3-SAT with 400 variables from the solubility phase transition** (clauses-to-variables ratio 4.26; 20,000 instances, 184MB)
- **Uniform-random 3-SAT with 400 variables and clauses-to-variables ratios randomly selected from 3.26 to 5.26** (20,000 instances, 184MB)
- **Random quasi-group completion** (30,620 SAT-encoded instances, 1,525MB)
- **Graph-coloring on small-world graphs** (20,000 SAT-encoded instances, 787MB)

Please send your feedback to xulin730@cs.ubc.ca.