CpSc 538G: The DPLL algorithm
September 19, 2016
Based on "The Handbook of Satisfiability", (A. Biere, M. Heule, H. van Maaren,
and T. Walsh, Eds), Chapter 3: "Complete Algorithms", A. Darwiche and
K. Pipatsrisawat. Sections 3.5-3.7
I. Basic DPLL: Decisions and Unit-Propagation
A. Given a Boolean formula (in CNF), find a satisfying assignment
B. Problem formulation
1. View the problem as searching a tree
a. Vertices are labeled with variables
b. edges are labeled "true" or "false" according to the choice
for that variable
c. leaves either satisfy the formula or don't.
d. Note: there are many possible trees:
for each subtree, we just need to pick a variable that isn't
already assigned.
2. A formula is satisfiable iff the decision tree has a satisfying leaf
a. The path to the leaf gives such an assignment
b. Showing that all leaves fail to satisfy the formula is a
refutation.
c. The number of leaves is 2^n where n is the number of variables.
C. Unit propagation
1. With each decision
a. Eliminate clauses that are now satisfied
b. Reduce the number of disjuncts in clauses that
have a false literal due to the decision.
2. If this produces a clause that has one literal
a. Such a clause is called a "unit clause"
b. We can make a decision for that variable right away --
the other choice falsifies the formula.
c. We can do this for each unit clause produced by the decision
d. These choice can produce more unit clauses, and allow
choices to be made for more variables.
3. If this produces an empty clause
a. The clause is unsatisfiable
b. Back-up to the most recent decision where there's a choice
remaining
c. Make the other choice, and continue
4. If all clauses are eliminated (i.e. satisfied), then we've
found a satisfying assignment.
D. Observations
1. DPLL doesn't have to go all the way to the leaves when exploring
assignments
a. Unit propagation can refute a subtree or find an assignment by
exploring a single path in that subtree.
b. Note that a satisfying assignment can be found without making
assignments to all variables. In this case, the remaining
variables are "don't cares" -- they can be set arbitrarily
to true or false and still produce a satisfying assignment.
c. Likewise, a subtree can be refuted without making assignments
to all variables.
2. These optimizations make DPLL useful in practice
a. However, it is still, worst-case exponential time.
II. Key improvements: conflict-clauses and non-chronological backtracking
A. Implication graphs
1. Track the unit-propagation
2. This is a valuable data structure for analysing conflicts
3. Details:
a. Vertices are labelled with a tuple: (level, assignment)
* An assignment is a variable and a value
* If an assignment is a decision, its level is the number
of decisions that were made prior to this one.
* If an assignment is an inference from unit propagation,
its level is the level for the last decision that
led to this inference.
* Special case: a label of {} means that a conflict,
i.e. empty clause, was inferred.
b. Edges are the inferences, they are labelled with clause
instances.
B. conflict clauses
1. Conflict analysis. When a clause becomes unsatisfiable,
a. The implication graph shows how the decisions led to this
conflict.
b. Any cut in the graph that separates the decisions from the
conflict yields a clause that excludes this conflict.
2. We want to find a small cut -- fewer literals in the learned
clause make the learned clause more general.
3. In practice, we choose exactly one literal from the last
level.
4. Add this inferred clause to the clauses of the problem
a. Any satisfying assignment must satisfy this clause.
b. Thus, we haven't changed the set of assignments that
satisfy the formula.
c. But, the new clause allows the SAT solver to avoid repeating
variations of this conflict.
C. non-chronological backtracking
1. What decisions should we undo?
a. Original DPLL goes back to the most recent decision.
* This can result in lots of exploration of the same conflict
* We'll do the example from the book.
b. Key observation:
* The learned clause forces the choice of the most recent
decision, once the previous decision (in the learned clause)
is made.
2. Asserting level
a. Each literal in a learned clause has an associated level.
b. The highest level is the last decision, the one that led to the
conflict.
c. The second highest level is called the "asserting level"
d. Once all of the decisions have been made up to the asserting level,
the learned clause forces a unit-propagation for the variable at
the last level.
3. With non-chronological backtracking, the solver backtracks to the
asserting level.
a. Effectively, it discards all of the "irrelevant" decisions that
it made between the asserting level and the actual conflict.
b. The solver retains the decision at the asserting level, adds
the new clause, and continues with unit-propagation.
III. More refinements
A. Asserting clauses
1. non-chronological backtracking exploits unit-propagation from
the asserting level.
a. this means that once we've made decision up to and including
the asserting level, the final decision should be implied
by unit propagation.
b. this requires that we have exactly one variable in the conflict
clause from the level at which the conflict was detected.
2. Now we see why the SAT solver makes this restriction on cuts.
a. The restriction is always satisfiable.
b. The decision variable at the last decision level suffices.
B. Unique implication points
1. There may be more than one variable at the last level that
can be chosen as for the cut.
a. Each such variable is called a "unique implication point" (UIP)
b. Note that there can be more than one *unique* implication point
c. UIPs can be ordered, because all paths from the last decision
to the conflict go through all UIPs, and in the same order.
d. Heuristically, picking the UIP closest to the conflict seems
best.
2. On the other hand, picking the UIP closest to the conflict at
other levels seems to degrade performance.
C. Deleting conflict driven clauses
1. Even with all these optimizations, SAT can still be hard
a. This means that the solver learns a lot of conflict clauses
b. If a problem has n variables, the solver must learn a new clause
at least once for every n decisions.
c. The performance of the solver can be degraded by having too many
clauses that aren't very useful
* extra memory
* extra work to maintain data structures
2. Solution: delete some clauses
a. subsumed clauses: clause A subsumes clause
D. Restarts
IV. A few more observations
A. 2SAT
B. Discovering equivalences
C. Exploiting symmetry
D. Certifying results