# Uniform Random-3-SAT

Uniform Random-3-SAT is a family of SAT problems distributions obtained by randomly generating 3-CNF formulae in the following way: For an instance with n variables and k clauses, each of the k clauses is constructed from 3 literals which are randomly drawn from the 2n possible literals (the n variables and their negations) such that each possible literal is selected with the same probability of 1/2n. Clauses are not accepted for the construction of the problem instance if they contain multiple copies of the same literal or if they are tautological (i.e., they contain a variable and its negation as a literal). Each choice of n and k thus induces a distribution of Random-3-SAT instances. Uniform Random-3-SAT is the union of these distributions over all n and k.

#### Forced vs. unforced generation

Random-3-SAT instances can be satisfiable or unsatisfiable. Since the behaviour of SAT algorithms can be expected to differ significantly between satisfiable and unsatisfiable instances, we seperated these into different test-sets using a complete algorithm. As complete SAT algorithms have usually considerable higher run-times than (incomplete) SLS algorithms, this approach for generating benchmark problems is computationally very expensive, especially when dealing with larger problem sizes. Therefore, especially when dealing with satisfiable instances, it would be preferable to generate the instances in a way such that their satisfiability (or unsatisfiability) is guaranteed by construction.

For generating satisfiable instances, one possibility for achieving this is to randomly determine a variable assignment B at the beginning of the generation process. Then, the clauses are generated as before, but only clauses which are compatible with B (i.e., clauses which have B as a model) are accepted for the formula F to be constructed. Thus, F has at least one model (namely B). The problem distributions thus obtained are called ``forced'' uniform Random-3-SAT problems; earlier literature on SLS algorithms for SAT used these for evaluating algorithms, until it was noticed that these problems tend to be much easier to solve for SLS algorithms then ``unforced'' problems obtained as described before. It has been speculated for some time that the reason for this phenomenon might be that in forced formulae, the clause are somehow guiding the SLS algorithm into the right direction.

#### Phase transitions

One particularly interesting property of uniform Random-3-SAT is the occurence of a phase transition phenomenon [CKT91], i.e., a rapid change in solubility which can be observed when systematically increasing (or decreasing) the number of k clauses for fixed problem size n. More precisely, for small k almost all formulae are satisfiable; at some critical k=k', the probability of generating a satisfiable instance drops sharply to almost zero. Beyond k', almost all instances are unsatisfiable. Intuitively, k' characterises the transition between a region of underconstrained instances which are almost certainly soluble, to overconstrained instanced which are mostly insoluble. For Random-3-SAT, this phase transition occurs approximately at k' = 4.26 n for large n; for smaller n, the critical clauses/variable ratio k'/n is slightly higher. Furthermore, for growing n the transition becomes increasingly sharp. The phase transition would not be very interesting in the context of evaluating SLS algorithms, didn't empirical analyses show that problems from the phase transition region of uniform Random-3-SAT tend to be particularly hard for both systematic SAT solvers and SLS algorithms . Striving for testing their algorithms on hard problem instances, many researchers used test-sets sampled from the phase transition region of uniform Random-3-SAT (see for some examples). Although, similar phase transition phenomena have been observed for other subclasses of SAT, including uniform Random-k-SAT with k >= 4, but these have never gained the popularity of uniform Random-3-SAT . Maybe, one of the reasons for this is the the prominent role of 3-SAT as a prototypical and syntactically particularly simple NP-complete problem.

#### The benchmark test-sets

The test-sets provided here are sampled from the phase transition region of uniform Random-3-SAT. The instances were generated in an unforced way and were separated into test-sets of satisfiable and unsatisfiable instances . The characteristics of these test-sets are shown in Table 1. The smallest problems (20 variables) are typically used for studying scaling properties. Due to limited computational resources, only the test-sets for n=20,50,100 contain 1000 instances, while for larger problem sizes each test-set contains 100 instances.

 test-set instances clause-len vars clauses uf20-91 1000 3 20 91 uf50-218 / uuf50-218 1000 3 50 218 uf75-325 / uuf75-325 100 3 75 325 uf100-430 / uuf100-430 1000 3 100 430 uf125-538 / uuf125-538 100 3 125 538 uf150-645 / uuf150-645 100 3 150 645 uf175-753 / uuf175-753 100 3 175 753 uf200-860 / uuf200-860 100 3 200 860 uf225-960 / uuf225-960 100 3 225 960 uf250-1065 / uuf250-1065 100 3 250 1065

Table 1: Uniform Random-3-SAT test-sets; test-sets uf* contain only satisfiable instances, uuf* only unsatisfiable instances.

#### Bibliography

 [CKT91] Peter Cheeseman and Bob Kanefsky and William M. Taylor Where the really hard problems are. Proc. IJCAI-91, pages 331--337, 1991