CpSc 538G: The Quest for Efficient SAT Solvers

September 26 & 28, 2016
  0.  The big picture
    A.  Review DPLL
    B.  Look at each piece and describe the implementation trade-offs
      1.  Unit propagation is the critical step
      2.  The difference between random and structured problems
        a.  Real-world problems tend to have structure
        b.  SAT solvers can exploit this structure to solve large problems
        c.  This is connected to the observation in the "SMT: Introduction
              and Applications" paper: the formulas are large, but
              "shallow".
    C.  Describe a few extensions to the DPLL framework
    D.  If time permits:
      1.  Questions about HW 1.
      2.  Discuss Carl Seger's presentation
                   
  I.  DPLL review
    A.  Recursive description of DPLL (Fig. 1)
      1.  Make deductions based on current variable assignments
      2.  If a satisfying assignment has been found, return it.
      3.  Else if a conflict has been detected, return "CONFLICT"
      4.  Else
        a.  choose a decision variable
          *   Recurse with decision variable set to 1
                If satisfying assignment found, return it
          *   Recurse with decision variable set to 0
                Return result from the recursive call.
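      5.  A minimal, runnable Python sketch of this recursion (my own
            illustration with a naive decision heuristic, not code from the
            paper).  Clauses are frozensets of integer literals: variable x
            is written x, and NOT x is written -x.

              def unit_propagate(clauses):
                  """Apply unit clauses until none remain.  Returns
                  (clauses, implied literals), or None on a conflict."""
                  implied = []
                  while True:
                      unit = next((c for c in clauses if len(c) == 1), None)
                      if unit is None:
                          return clauses, implied
                      (lit,) = unit
                      implied.append(lit)
                      new_clauses = []
                      for c in clauses:
                          if lit in c:           # clause satisfied: drop it
                              continue
                          c = c - {-lit}         # literal falsified: shrink it
                          if not c:              # empty clause: conflict
                              return None
                          new_clauses.append(c)
                      clauses = new_clauses

              def dpll(clauses, assignment=()):
                  result = unit_propagate(clauses)      # step 1: deductions
                  if result is None:
                      return "CONFLICT"                 # step 3: conflict
                  clauses, implied = result
                  assignment = tuple(assignment) + tuple(implied)
                  if not clauses:
                      return assignment                 # step 2: satisfied
                  lit = next(iter(clauses[0]))          # step 4a: naive choice
                  for decision in (lit, -lit):          # try 1, then 0
                      result = dpll(clauses + [frozenset({decision})], assignment)
                      if result != "CONFLICT":
                          return result
                  return "CONFLICT"

              # (x1 OR x2) AND (NOT x1 OR x2) AND (NOT x2 OR x3)
              print(dpll([frozenset({1, 2}), frozenset({-1, 2}), frozenset({-2, 3})]))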
    B. An iterative version (Fig. 2)
            Refactor the algorithm to highlight the main operations:
      1.  preprocess()
        a.  an extension of DPLL
        b.  as we'll discuss below, implementations of the main loop focus
              on fast unit propagation.
        c.  here the solver can apply some more expensive techniques, "just
              once", to reformulate the problem into an easier-to-solve
              version.
      2.  decide_next_branch()
            Different implementations use different methods for choosing a good
              decision variable.
      3.  deduce()
        a.  Perform logical deductions that follow from the branch.
        b.  This can result in
          * finding a satisfying assignment,
          * discovering a conflict, or
          * perhaps some further simplifications of the formula, but we
              still don't know whether or not it's satisfiable
      4.  If a conflict is encountered
        a.  blevel = analyze_conflict()
          *  blevel is the decision level to which the algorithm backtracks
          *  may backtrack to a level earlier than the most recent decision.
          *  if blevel == 0, then analyze_conflict() has determined that
               there is no backtracking that can resolve the conflict.
               The original formula is unsatisfiable.
          *  otherwise, analyze_conflict() may also learn new clauses
            +  These clauses are implied by the original formula
            +  But adding them allows the solver to avoid reconstructing
                 variations of this conflict in the future.
        b.  backtrack(blevel)
      5.  else if a satisfying assignment has been found,
            return it
      6.  else go to step 2 and choose the next decision variable
        a.  Hey!  Isn't this a recursive algorithm?
        b.  Yes, but the paper presents it as a while loop.
        c.  But, the implementation maintains a data structure that keeps
              track of the choices of decision variables.
        d.  This data structure takes the role of the stack; so
          *  The algorithm is recursive, but
          *  the implementation uses a while-loop and a programmer-
               implemented data structure
        e.  Why do this?
          *   To tune the implementation for maximum efficiency.


 II.  Each step in detail
    A.  preprocess():  The paper doesn't say much, but Section 3.5 says
          that reference [45] provides a survey of preprocessing
          techniques.

    B.  decide_next_branch()
      1.  The main trade-offs
        a.  Choose a variable that will quickly lead to finding a solution
              or refuting the formula.
        b.  Don't spend too long making the decision; time spent deciding
              adds directly to the total solver time.
      2.  Heuristics to maximize impact
        a.  Try to maximize the number of implications generated or clauses
              satisfied
        b.  E.g. if choosing the variable satisfies a lot of clauses, that's good.
        c.  If it produces a lot of unit clauses, that's good.
        d.  Note that unit clauses are often created by unit propagation
          *  It's not enough just to count how many 2-literal clauses the
               variable is in,
          *  But there's a trade-off between carefully analysing the impact
               and spending too much time to decide.
        e.  A very simple measure of "impact" is the number of clauses in
              which the variable or its complement appear.
          *  But this changes as clauses are satisfied.
          *  Updating the counters slows down this approach.
      3.  Heuristics based on decaying sums
        a.  VSIDS: Variable State Independent Decaying Sum
          *  Initially, compute number of occurrences of each literal
            +  separate scores for x and for NOT x
          *  When a clause is added, increase scores of variables in that clause
          *  Periodically, but not with every decision, divide all the scores by a
               constant.
          *  For example, if the problem has m variables, and we divide all the
               scores by 2 every m decisions, then the time for scaling is a small
               part of the total time for the algorithm.
          *  Likewise, the time to adjust the scores of variables in the learned
               clause is a small part of the total time.
          *  In practice, VSIDS works quite well (a sketch of the
               bookkeeping follows below)
            +  Similar number of branches compared with other heuristics
            +  Time for VSIDS is a small fraction of the total execution time.
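          *  A Python sketch of this bookkeeping (illustrative names and
               decay period, not code from any particular solver):

                 from collections import defaultdict

                 class VSIDS:
                     def __init__(self, clauses, decay_period=256):
                         self.scores = defaultdict(float)  # per-literal scores
                         self.num_decisions = 0
                         self.decay_period = decay_period
                         for clause in clauses:            # initial occurrence counts
                             for lit in clause:
                                 self.scores[lit] += 1.0

                     def on_learned_clause(self, clause):
                         for lit in clause:                # bump literals in the new clause
                             self.scores[lit] += 1.0

                     def pick(self, free_literals):
                         self.num_decisions += 1
                         if self.num_decisions % self.decay_period == 0:
                             for lit in self.scores:       # periodic decay, not every decision
                                 self.scores[lit] /= 2.0
                         return max(free_literals, key=lambda l: self.scores[l])

             +  Some later solvers (e.g. MiniSat) get a similar effect more
                  cheaply by growing the per-bump increment over time instead
                  of periodically rescaling every score.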
        b.  The BerkMin SAT solver
          *  Branching heuristic similar to VSIDS
          *  Increment scores of variables that occur in clauses that led to the
                 conflict
            +  Not just those in the conflicting clause
            +  Note that a single conflict can increase the score of a variable by
                 more than one.
            +  The BerkMin authors claim this increases the sensitivity of BerkMin
                 to finding the important variables in a SAT problem.
          *  Periodically divide all the scores as in VSIDS.
      4.  Other approaches
        a.  satz uses a "look-ahead" heuristic
          *   couldn't find an online copy of the paper
          *   Found a related paper:
                "Parallelizing Satz Using Dynamic Workload Balancing",
                B. Jurkowiak, C-M Li, G. Utard,
                LICS 2001 Workshop on Theory and Applications of
                Satisfiability Testing.
                http://dx.doi.org/10.1016/S1571-0653(04)00321-X
          *   satz tries *every* variable as a possible branching choice
            +   Then, it tries some choices for the next branch.
            +   This is the "look-ahead" part.
            +   From these explorations, it chooses a branching variable.
          *   satz doesn't perform clause learning or non-chronological
                  backtracking
            +   The focus is on random SAT instances, and the claim is that
                  clause learning and non-chronological backtracking offer
                  little benefit for random instances
        b.  cnfs uses a "backbone directed heuristic"
                A quick scan of the paper suggests:
          *  Looks at "k-SAT" formulas (e.g. 3-SAT)
          *  Notes phase transition based on ratio of number of clauses to
                 number of variables
            +  For random 3-SAT problems,
                 if #clauses < alpha_crit*(#variables),
                 then problem is very likely to be satisfiable
                 else the problem is very likely to be unsatisfiable.
            +  Empirically, alpha_crit is about 4.25 for 3-SAT (a quick
                 z3py experiment illustrating this is sketched below)
            +  If a problem is satisfiable, it is likely to have an
                 exponentially large number of satisfying assignments.
          *  The backbone appears to be a set of variables that play a key
                 role in this phase transition.
            +  I looked at [23] which cites
                  "Determining computational complexity from characteristic
                    ‘phase transitions’", R. Monasson, R. Zecchina, et al.,
                    Nature, pages 133-137, 1999.
            +  Maybe I'll add more later
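          *  A quick z3py experiment (using z3py the same way as the
               resolution proof in section D below) to estimate the
               fraction of satisfiable random 3-SAT instances around the
               transition; the instance sizes here are illustrative:

                 import random
                 from z3 import Bool, Or, Not, Solver, sat

                 def random_3sat(n_vars, n_clauses):
                     xs = [Bool('x%d' % i) for i in range(n_vars)]
                     return [Or([x if random.random() < 0.5 else Not(x)
                                 for x in random.sample(xs, 3)])
                             for _ in range(n_clauses)]

                 def frac_sat(alpha, n_vars=50, trials=20):
                     hits = 0
                     for _ in range(trials):
                         s = Solver()
                         s.add(*random_3sat(n_vars, int(alpha * n_vars)))
                         hits += (s.check() == sat)
                     return hits / trials

                 for alpha in (3.0, 4.25, 5.5):  # below, near, above alpha_crit
                     print(alpha, frac_sat(alpha))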
      5.  Structured vs. random problems: as noted in the big picture,
            satz and cnfs target random instances, where clause learning
            and non-chronological backtracking offer little benefit;
            VSIDS-style heuristics are aimed at structured, real-world
            problems.

    C.  deduce()
      1.  Unit propagation is used by nearly all modern SAT solvers
        a.  Unit propagation is fast
        b.  More sophisticated schemes may make more inferences,
              but, the extra runtime generally isn't worth it.
      2.  Counters  (e.g. Grasp)
        a.  The "obvious" way to do it
          *  The data structures:
            +  Each clause has a counter for the number of literals that have
                 been decided in a way that doesn't satisfy the clause.  This
                 is called the "zero counter".
            +  Each clause has a counter for the number of literals that have
                 been decided in a way that does satisfy the clause.  This is
                 called the "one counter".
            +  Each variable has a list of all clauses in which that variable
                 appears
          *  When a variable is "decided":
            +  update the counters for the clauses that contain that variable
            +  if zeroCounter(clause) == #literals(clause) - 1
                   and oneCounter(clause) == 0,
                 mark the clause as a unit clause
               else if zeroCounter(clause) == #literals(clause),
                 the clause is a conflicting clause:
                 analyse the conflict and backtrack
               (see the sketch in (c) below)
        b.  Time complexity:
          *  Consider a problem with m clauses, n variables, and,
               on average, l literals per clause.
          *  The average variable appears in ml/n clauses.
            +  O(ml/n) work per decision or unit propagation
            +  If we learn lots of clauses (i.e. m is large),
                 and note that learned clauses often have a large number of
                 variables, then we see that this ml/n term dominates the
                 runtime.
          *  The work to restore the counters when backtracking is similar.
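        c.  A Python sketch of the counter updates in (a) (illustrative
              data layout: clauses are represented by indices; clauses_of
              maps each literal to the clauses containing it):

                def on_assign_true(lit, clauses_of, one, zero, num_lits):
                    """Literal lit was just made true.  Update the counters
                    of every clause containing lit's variable, and report
                    which clauses became unit or conflicting."""
                    units, conflicts = [], []
                    for c in clauses_of[lit]:     # lit satisfies these clauses
                        one[c] += 1
                    for c in clauses_of[-lit]:    # -lit is falsified in these
                        zero[c] += 1
                        if one[c] == 0:
                            if zero[c] == num_lits[c] - 1:
                                units.append(c)       # one free literal left
                            elif zero[c] == num_lits[c]:
                                conflicts.append(c)   # every literal false
                    return units, conflicts

              On backtracking, a matching routine would decrement the same
              counters.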
      3.  Head-tail lists (e.g. SATO)
        a.  Key observation: we don't need to update the counters every time
               a variable is assigned.
          *  We're really only interested in the questions:
            +  Is the clause satisfied?
            +  Is there one free literal left that could satisfy the clause?
            +  Are there no free literals left that could satisfy the clause?
          *  If we let f be the number of free literals left for a clause,
               the only "interesting" values for f are 0, 1, and "many".
        b.  Idea:
          *  Assume some ordering of literals in the clause (e.g by variable
               index)
          *  Keep track of the lowest-indexed free literal, and the
               highest-indexed free literal
          *  When one of these changes, we check the status of the clause.
        c.  Why does this work?
          *  If the lowest-indexed free literal is different than the
               highest-indexed one, then there must be at least two free 
               literals in the clause.
          *  If the lowest-index and highest-index are the same
            +  If this variable has been decided, and the literal is true,
                 then the clause is satisfied for this assignment
            +  If this variable has been decided, and the literal is false,
                 then we've detected a conflict.
            +  If this variable is still free, then the clause is now a
                 unit clause.
        d.  Time complexity:
          *  m clauses, n variables, and l literals per clause
          *  a variable is the head or tail of, on average, 2m/n clauses
          *  more work when the assigned variable is at the head or tail
               pointer, but
            +  zero work to do when our assignment satisfies the clause.
            +  I'll guess O(l/log(l)) work, on average, when the assignment
                 of a value to a variable at the head or tail of a clause 
                 does not satisfy the clause.  I'd need to read the SATO
                 paper to find out the details.
          *  backtracking has similar cost to the forward process.
      4.  2-literal watching
        a.  Key idea: the head-tail lists work because we don't need to do
              anything if there are at least two free literals left in a
              clause.  2-literal watching generalizes this idea by noting
              that *any* two free literals will suffice (sketched in code
              below).
        b.  Data structures
          *   pos_watched(v) -- v appears in non-negated form in each clause
                on this list, and v is one of the two "watched" variables for
                that clause.
          *   neg_watched(v) -- like pos_watched, but v appears in negated form.
        c.  When v is assigned 1:
          *  check each clause on the neg_watched list.
            +  If there is a literal that satisfies the clause, the clause is
                 satisfied.  We don't need to do anything.
            +  Otherwise, if there are no free literals left for this clause,
                 the clause is a conflict.  We backtrack.
            +  Otherwise, if the only free literal left is the other watched
                 literal, the clause is a unit clause.
            +  Otherwise, we choose a free literal in the clause that isn't
                 being watched and put this clause on the watch list for
                 that variable.  We remove this clause from the watch list
                 of v.
          *  similar actions when v is assigned 0, but we use the pos_watched list.
        d.  Complexity
          *  For decisions and unit-propagation, same as head-tail lists.
          *  For backtracking: nothing needs to be done!
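        e.  A Python sketch of the case analysis in (c) (illustrative
              representation: value(lit) returns True/False/None for a free
              literal, and each clause records its two watched literals):

                def on_literal_false(lit, watch, value):
                    """lit just became false; visit only clauses watching lit."""
                    still_watching, units, conflict = [], [], None
                    for clause in watch[lit]:
                        other = next(w for w in clause.watched if w != lit)
                        # look for a replacement watch: any unwatched
                        # literal that is not currently false
                        repl = next((l for l in clause.literals
                                     if l not in clause.watched
                                     and value(l) is not False), None)
                        if repl is not None:
                            clause.watched = (other, repl)    # move the watch
                            watch[repl].append(clause)
                        else:
                            still_watching.append(clause)
                            if value(other) is True:
                                pass                          # already satisfied
                            elif value(other) is None:
                                units.append((clause, other)) # unit clause
                            else:
                                conflict = clause             # all literals false
                    watch[lit] = still_watching
                    return units, conflict

              Note there is no corresponding undo routine: retracting
              assignments only makes watched literals free again, so the
              watch lists remain valid.  That is why backtracking is free.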

      5.  Other approaches
        a.  Main idea: these techniques seem to help on particular classes
              of problems, but few show overall improvement on a large
              class of benchmarks
        b.  Detect pure literals:
          *  if a variable appears only in non-negated form in the remaining
               clauses, set it to true.
          *  if a variable appears only in negated form in the remaining
               clauses, set it to false.
        c.  Detect variables that must be equivalent (or each other's
               negations)
        d.  Check 2-literal clauses for consistency
        e.  See also "SATzilla"

    D.  analyze_conflict()
      1.  Main idea: resolution
        a.  What is resolution
          *   Given two clauses (a1 OR a2 OR ... am OR c), and
                (b1 OR b2 OR ... bn OR NOT c) the resolvent is
                (a1 OR ... am OR b1 OR ... bn)
          *   The key property is:
                        ((a1 OR ... am OR c) AND (b1 OR ... bn OR NOT c))
                IMPLIES (a1 OR ... am OR b1 OR ... bn)
          *   Proof:
            +   First consider the simple case; we want to show:
                    ((a OR c) AND (b OR NOT c)) IMPLIES (a OR b)
              -   Proof 1: just try it in Z3py:
                    >>> from z3 import *
                    >>> a, b, c = Bools('a b c')
                    >>> prove(Implies(And(Or(a, c), Or(b, Not(c))), Or(a, b)))
                    proved
              -   Proof 2: case split on c:
                    case c==True:
                          ((a OR c) AND (b OR NOT c)) IMPLIES (a OR b)
                      <=> ((a OR True) AND (b OR NOT True)) IMPLIES (a OR b)
                      <=> (True AND b) IMPLIES (a OR b)
                      <=> b IMPLIES (a OR b)
                      <=> True
                    case c==False:
                      Similar reasoning applies.
            +   Now consider the general case.
                  Let a = (a1 OR ... OR am), and b = (b1 OR ... OR bn)
                  Then
                                ((a1 OR ... am OR c) AND (b1 OR ... bn OR NOT c))
                        IMPLIES (a1 OR ... am OR b1 OR ... bn)
                    <=> ((a OR c) AND (b OR NOT c)) IMPLIES (a OR b)  ; def. of a and b
                    <=> True ; previous lemma
        b.  Note that modus ponens is a special case of resolution:
          *  Modus ponens:  (p IMPLIES q), p
                            -----------------
                                 q
             (p IMPLIES q) <=> ((Not p) OR q)
             (p IMPLIES q) AND p <=> ((Not p) OR q) AND p
             ((Not p) OR q) AND p => q  ; resolution
             (p IMPLIES q) AND p => q ; substitution
          *  Modus ponens is one of those silly Latin phrases computer science
                 undergrads are forced to learn.
            +  Thanks a lot, Aristotle
            +  Why Latin?  Wasn't Aristotle Greek?
            +  Yeah, he was, but his "Rhetoric" was translated to Latin before
                 it was added to the core-curriculum for scholars in the
                 Middle Ages.
          *  More generally, resolution is refutation complete: repeated
               resolution can derive the empty clause from any unsatisfiable
               set of clauses, so it is sufficient as a decision procedure
               for propositional logic.
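        c.  On the set-of-integer-literals representation (x as x, NOT x as
              -x, as in the DPLL sketch in section I), resolution is a
              one-liner:

                def resolve(c1, c2, lit):
                    """Resolvent of c1 and c2 on lit; c1 must contain lit
                    and c2 must contain -lit (clauses are sets of literals)."""
                    assert lit in c1 and -lit in c2
                    return (c1 - {lit}) | (c2 - {-lit})

                # (a OR c) AND (b OR NOT c) resolves on c to (a OR b):
                print(resolve({1, 3}, {2, -3}, 3))   # {1, 2}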

      2.  Implication graphs and resolution
        a.  When reading "Complete Algorithms" from "Handbook of
              Satisfiability", conflict analysis was presented using
              implication graphs.
        b.  There is an equivalent formulation using resolution
          *   Suppose an implication graph has an edge from vertex v1 to
                vertex v2.
          *   Let c2 be a conflicting clause associated with vertex v2.
          *   Then there is some variable, var, which is free at v1, and
                bound at v2.
          *   This means there is a unit clause that determined the value of
                var at v2.  Call this clause ante -- it's the antecedent of var.
          *   Note that every literal of c2 is set to false in the current
                assignment -- it's a conflicting clause.
          *   Every literal of ante except for the one for var is set to false
                in ante -- it's a unit clause.
          *   Var must appear with opposite polarity in c2 and ante.
          *   We can use resolution to merge c2 and ante.
          *   All literals of the resolvent are false in the current
                assignment.
          *   Thus, the resolvent is a conflicting clause.
        c.  This means we can start at the conflicting clause that we first
              found, and work our way back using resolution to derive a
              learned clause.

      3.  Figure 4 in the paper gives pseudo-code for this approach.
        a.  The key loop is:
                while(!stop_criterion_met(cl)) {
                  lit = choose_literal(cl);
                  var = variable_of_literal(lit);
                  ante = antecedent(var);
                  cl = resolve(cl, ante, var);
                }
          * As with the overall algorithm, different solvers choose
              different implementations of the functions above (a Python
              rendering of the whole loop is sketched at the end of this
              item).
        b.  lit = choose_literal(cl)
          *   most common: choose variables in reverse chronological order.
                This is guaranteed to reach an asserting clause because it
                will eventually reach the point where the only variable left
                at the current decision level is the decision variable.
          *   if variables are chosen in some other order, care must
                be taken to ensure that the stop criterion is eventually
                satisfied.
        c.  var = variable_of_literal(lit): simple
        d.  ante = antecedent(var)
              Record the clause that was actually used to deduce the value
              of var.  That clause is the antecedent.
        e.  cl = resolve(cl, ante, var)
              Resolution, as described above.
        f.  stop_criterion_met(cl)
          *   We want an asserting clause: one that has exactly one literal
                whose variable was assigned at the current decision level.
          *   rel_sat performs resolution until it reaches the decision
                 variable for the current level.
          *   citation [39] observes that as we compute conflicting
                clauses for each vertex in the implication graph, we can
                observe when such a clause is an asserting clause.
                This approach stops at the first asserting clause.
            +  Note that a vertex that is labeled with an asserting clause
                 is a UIP.
            +  The solver described in [39] stops at a first UIP.
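        g.  A Python sketch of the whole loop, using resolve() from
              section 1 above.  The trail (variables in assignment order),
              level[], and antecedent[] are illustrative bookkeeping that a
              real solver maintains during deduce():

                def analyze_conflict(conflict, trail, level, antecedent, d):
                    """conflict: conflicting clause (set of int literals);
                    d: current decision level.  Returns an asserting clause."""
                    cl = set(conflict)
                    # stop at the first asserting clause: exactly one
                    # literal assigned at the current level (a first UIP)
                    while sum(1 for l in cl if level[abs(l)] == d) > 1:
                        # reverse chronological order: take the most
                        # recently assigned variable appearing in cl
                        var = next(v for v in reversed(trail)
                                   if v in {abs(l) for l in cl})
                        lit = next(l for l in cl if abs(l) == var)
                        # lit is false in the current assignment, so var's
                        # antecedent clause contains the opposite literal
                        cl = resolve(antecedent[var], cl, -lit)
                    return cl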

    E.  backtrack()
      1.  Conflict analysis produced an asserting clause.
      2.  The asserting clause has exactly one literal whose variable was
            assigned at the current (most recent) decision level.
      3.  Backtrack to the second highest level of a literal in the
            asserting clause.
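      4.  In code, with the same illustrative level[] lookup as above:

            def backtrack_level(asserting_clause, level):
                """Second-highest decision level among the clause's
                literals; 0 if the clause has only one literal."""
                lvls = sorted((level[abs(l)] for l in asserting_clause),
                              reverse=True)
                return lvls[1] if len(lvls) > 1 else 0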


III.  Other stuff
    A.  Implementation issues
      1.  Memory is the biggest issue
        a.  total memory usage *and* cache performance both matter
          *   Should I say a bit about caches?
        b.  This favors using arrays instead of linked structures.
          *   Arrays are very compact
          *   Arrays provide for memory access locality
          *   But, linked lists are easier to manipulate
          *   And arrays require special code for "garbage collection", i.e.
                compressing gaps out of the array.
      2.  Sparse matrix representation
        a.  Each clause is stored as a list of indices for the literals
               in the clause
          *  Use some bit-fiddling to encode the variable and negation in a
                 single word
            +  e.g. a positive integer value is the index of a non-inverted
                 variable, and a negative integer value is the negation of
                 the index of an inverted variable.
            +  Or odd values represent inverted variables, even values
                 represent non-inverted variables, and arrayElement >> 1
                 is the index for the variable (sketched in (c) below).
          *  Each clause starts with a header that has elements for:
            +  How many literals are in the clause (making it easier to find
                 the next clause)
            +  Info. for tracking unit clauses, such as the indices of the
                 two watched variables.
            +  The index of this clause's antecedent, if this clause is a
                 unit clause or conflicting clause
            +  Etc.
        b.  Other arrays for pos_watched and neg_watched and other data
              structures.
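        c.  A sketch of the odd/even literal encoding from (a):

              def encode(var, negated):   # variable index -> literal code
                  return (var << 1) | negated

              def var_of(lit):            # literal code -> variable index
                  return lit >> 1

              def is_negated(lit):
                  return lit & 1

              def negate(lit):            # NOT x: flip the low bit
                  return lit ^ 1

              assert var_of(encode(7, 1)) == 7
              assert negate(encode(7, 1)) == encode(7, 0)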
      3.  Tries, BDD-like structures, and others have been tried
        a.  Advantage is that a wider range of operations can be implemented
               efficiently
          *  Example: detecting subsumed clauses
        b.  Disadvantage: the impact of degraded memory performance seems
              to outweigh the advantages of being able to implement more
              sophisticated algorithms.

    B.  Preprocessing (as mentioned earlier)
      1.  Some SAT solvers perform a more sophisticated analysis of the
            problem at the beginning, when it's a one-time cost.
      2.  Then, hand the "simplified" problem to the optimized SAT solver.
      3.  Reference [45] provides a survey.

    C.  Restarts
      1.  Random restarts
        a.  Try different variable orderings, clause orderings, etc.
        b.  If the solver is "stuck" exploring a complicated part of the
              solution space, a restart may allow it to finish faster.
        c.  We can think of the run-time as being a random variable that's
              always positive and has a huge variance.  Taking the minimum
              of several trials may be faster than spending lots of time
              on one, long-running approach.
      2.  Restarts also let the solver "clean up" after accumulating lots
              of learned clauses.
        a.  Some care needs to be taken to ensure that the algorithm terminates.
        b.  One approach is to run each successive restart longer than its
              predecessor (sketched in (c) below).
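        c.  A sketch of that approach.  Here solve_with_limit() is a
              hypothetical solver entry point that returns None when it
              exceeds its conflict budget; the growth factor is
              illustrative:

                def solve_with_restarts(clauses, first_limit=100, growth=1.5):
                    limit = first_limit
                    while True:
                        result = solve_with_limit(clauses, conflict_limit=int(limit))
                        if result is not None:   # SAT assignment, or "UNSAT"
                            return result
                        limit *= growth          # each run longer than the last

              Learned clauses are typically kept across restarts, so a
              restart does not throw away everything the solver has learned.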


               
 IV.  Summary
    A.  The paper presented a general framework for understanding
            state-of-the-art SAT solvers.
    B.  Divided the implementation of DPLL into
          decide_next_branch(), deduce(), analyse_conflict(), and backtrack().
    C.  Discussion of key trade-offs for each of these steps.
    D.  Described key issues in data-structures for SAT solvers:
          sparse arrays have been the most effective data structure
          because they make efficient use of memory.
    E.  The role of randomization was briefly mentioned and highlighted.

