CpSc 538G: Binary Decision Diagrams
September 21, 2016
0. The big picture
A. The graph <-> function duality
1. BDDs are graphs: DAGs to be more precise
2. BDDs are a representation of boolean functions
a. We can think of the formula (a & b) | c as a function with three
parameters:
function f(bool a, bool b, bool c): bool
return (a & b) | c
b. We can represent functions in many different ways:
* code in a programming language
* a mathematical or pseudo-code notation
* a sequence of machine instructions or byte codes
* a data-flow graph or data-control flow graph
c. BDDs are a special case of a data-control flow graph
d. Evaluate the function by traversing the graph from the root to
a leaf.
e. BDDs let us perform operations on functions
* Example: restriction
Let f be the function defined above
f|_(b=True)
produces a two-parameter function
function f_r(bool a, bool c): bool
return f(a, True, c)
This is also known as "partial evaluation".
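In plain Python (a sketch of my own, not from the notes), restriction really is just partial evaluation; the standard library's functools.partial does the same job with keyword arguments:

```python
def f(a, b, c):
    return (a & b) | c

def f_r(a, c):            # f|_(b=True)
    return f(a, True, c)

# f|_(b=True) simplifies to a | c; check every input pair
for a in (False, True):
    for c in (False, True):
        assert f_r(a, c) == (a | c)
```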
B. ROBDDs are canonical
1. A BDD is a decision DAG
2. An RBDD is a special class of BDDs where:
a. no two vertices correspond to the same function.
b. If the "true" and "false" edges of a vertex go to the same descendant,
we eliminate the upper vertex, and replace all edges into the upper
vertex with edges into the descendant. In English, we skip a test
if it's not needed.
c. R stands for "Reduced"
d. Transforming a BDD to an RBDD is an optimization that saves space.
3. An ROBDD is a special class of BDDs where the tests appear in the
same order on all paths.
a. If path1 tests v_i before testing v_j, and path2 tests v_i and v_j,
then path2 tests v_i before v_j.
b. "O" stands for ordered.
4. ROBDDs are canonical
a. If f and g have the same parameters and produce the same result for
each valuation of the parameters, then f and g have the same
ROBDD representation.
b. This makes testing for equality of functions easy.
c. Testing satisfiability is easy: f is satisfiable if its ROBDD is
not the "False" vertex.
C. Terminology
1. Technically, BDDs are a superset of RBDDs and OBDDs, and
RBDDs and OBDDs are supersets of ROBDDs.
2. OTOH, I'll often say BDD as a shortcut for ROBDD because ROBDDs
are generally the only version we're interested in.
I. Representing Boolean Functions with ROBDDs
(Reduced Ordered Binary Decision Diagrams)
A. From decision trees to ROBDDs
0. Running example: one-bit adder
a. compute sum and carry_out given 'a', 'b', and carry_in
b. sum = a ^ b ^ c_in (where ^ denotes exclusive-OR)
c. carry_out = majority(a, b, carry_in)
= (a & b) | (a & c_in) | (b & c_in)
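The formulas above can be sanity-checked against integer addition. A quick sketch of my own, relying on Python booleans coercing to 0/1:

```python
from itertools import product

for a, b, c_in in product((False, True), repeat=3):
    s = a ^ b ^ c_in                                # sum bit
    c_out = (a & b) | (a & c_in) | (b & c_in)       # majority
    # the pair (c_out, s) is the 2-bit result of a + b + c_in
    assert 2 * int(c_out) + int(s) == int(a) + int(b) + int(c_in)
```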
1. Decision tree
2. From tree to DAG:
a. remove duplicate terminals
b. share identical subtrees.
3. Remove redundant tests
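Both reduction steps can be captured by a single hash-consed constructor. This is a minimal sketch of my own (vertices as tuples `(var, lo, hi)`, terminals as the Python booleans), not the paper's data structure:

```python
_unique = {}   # hash-consing table: (var, lo, hi) -> the unique vertex

def mk(var, lo, hi):
    """Make the vertex testing 'var' with false/true descendants lo/hi."""
    if lo == hi:                 # rule: skip a redundant test
        return lo
    # rule: share identical subgraphs (duplicate terminals come for free,
    # since the terminals are just the constants True and False)
    return _unique.setdefault((var, lo, hi), (var, lo, hi))

# !a & b with ordering a -> b: the a=False edge tests b, a=True gives False
b_node = mk('b', False, True)
f = mk('a', b_node, False)
# a vertex whose two edges agree collapses away entirely:
g = mk('a', False, False)
assert g is False
```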
B. ROBDDs are canonical
Proof: by induction
1. Base case: ROBDD with no variables
2. Induction step:
assume ROBDDs with k variables are canonical
show that ROBDDs with k+1 variables are canonical as well
C. Variable ordering matters
1. Example: Let x and y be n-bit vectors
2. x == y can be represented by an ROBDD with O(n) nodes
if bits are interleaved.
a. At each level, compare the bits
b. Pass the conjunction along
3. x == y requires Omega(2^n) nodes if the bits of x precede
the bits of y:
Need to distinguish all possible values of x,
and there are 2^n such values.
=> The ROBDD has Omega(2^n) branches going from the last bit of
x to the first bit of y
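To see the gap concretely, here is a small experiment of my own (`mk`, `build`, and `count_nodes` are hypothetical helpers, not a real BDD package) that builds the ROBDD for x == y from its truth table under both orderings and compares vertex counts:

```python
_unique = {}

def mk(var, lo, hi):
    """Hash-consing constructor applying the two ROBDD reduction rules."""
    if lo == hi:
        return lo
    return _unique.setdefault((var, lo, hi), (var, lo, hi))

def build(order, truth, env=None, i=0):
    """Build the ROBDD for 'truth' by enumerating its truth table."""
    env = {} if env is None else env
    if i == len(order):
        return truth(env)
    v = order[i]
    lo = build(order, truth, {**env, v: False}, i + 1)
    hi = build(order, truth, {**env, v: True}, i + 1)
    return mk(v, lo, hi)

def count_nodes(u, seen=None):
    seen = set() if seen is None else seen
    if isinstance(u, bool) or u in seen:
        return 0
    seen.add(u)
    return 1 + count_nodes(u[1], seen) + count_nodes(u[2], seen)

n = 3
def eq(env):
    return all(env['x%d' % i] == env['y%d' % i] for i in range(n))

interleaved = [v for i in range(n) for v in ('x%d' % i, 'y%d' % i)]
separated = ['x%d' % i for i in range(n)] + ['y%d' % i for i in range(n)]
small = count_nodes(build(interleaved, eq))
big = count_nodes(build(separated, eq))
assert small < big   # O(n) vs. exponential growth in n
```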
II. Operations on ROBDDs
A. Shannon expansion
1. f = (!x & f|_(x=False)) | (x & f|_(x=True))
a. f is a function, for example
def f(a,b,c):
return (a & b) | (a & c) | (b & c)
b. f|_(x=v) means f with the parameter x replaced by the value v.
using the example from part (a),
f|_(a=False) = (False & b) | (False & c) | (b & c)
= b & c
f|_(a=True ) = (True & b) | (True & c) | (b & c)
= b | c | (b & c)
= b | c
c. Shannon expansion:
f = (!a & f|_(a=False)) | (a & f|_(a=True))
= (!a & (b & c)) | (a & (b | c))
= (!a & b & c) | (a & b) | (a & c)
= (!a & b & c) | (a & b) | (a & c) | (a & b & c)
= (b & c) | (a & b) | (a & c)
2. BDDs support Shannon expansion by having separate "true" and "false"
subtrees at each level.
3. Shannon expansions are used to implement the operations on BDDs
described below.
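A quick check (a sketch of my own) that the expansion agrees with the majority example above for every valuation:

```python
from itertools import product

def f(a, b, c):
    return (a & b) | (a & c) | (b & c)              # majority(a, b, c)

for a, b, c in product((False, True), repeat=3):
    # Shannon expansion on variable a
    expanded = ((not a) and f(False, b, c)) or (a and f(True, b, c))
    assert expanded == f(a, b, c)
```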
B. Apply
1. Let's say we have boolean functions f and g represented by BDDs.
a. We want to compute f | g.
b. More generally, we want to compute f op g where op is any
boolean function of two variables.
c. Shannon expansion
(f op g) = (!a & (f op g)|_(a=False)) | (a & (f op g)|_(a=True))
d. Boolean operators commute with Shannon expansion:
(f op g)|_(a=False) = f|_(a=False) op g|_(a=False)
(f op g)|_(a=True) = f|_(a=True) op g|_(a=True)
2. To compute f op g, with variable ordering a -> b -> c
a. Compute f|_(a=False); this is just the left descendant of the
root vertex for f.
b. Likewise, g|_(a=False) is the left descendant of the root vertex
for g.
c. Recurse to the problem of computing
f|_(a=False) op g|_(a=False)
d. When we reach the leaves, they are labeled with constants; just
compute f_leaf op g_leaf.
e. When we have the results for our left and right subgraphs,
create a new vertex with these results as its left and right
subgraphs.
3. A detail: an ROBDD will skip over levels for which the left and
right subtrees are the same.
a. If f skips one or more levels that g doesn't (or vice-versa),
we just treat the vertex of f as the match of the intermediate
vertices of g.
b. I'll draw a picture on the board. This is just implicitly
un-reducing that vertex of f on demand.
4. An example: f = (a & b), g = (a & c), compute f | g
a. I'll write h as the function f | g.
b. Let's say we have the ROBDDs for f and g.
c. We first explore the left descendant from the a-vertex for the
two functions.
* Both of these are the False terminal.
* False | False -> False
* The left descendant of the a-vertex for h is the False terminal.
* Our tree for h looks like:
!a -> False
a -> remaining work
d. Now, explore the right descendant of the a-vertex for f and g.
* The right descendant for f is labeled with b.
* The right descendant for g is labeled with c.
We treat this as if there were a b vertex for g, where both
outgoing edges go to the current vertex.
* We recurse again, and look at the left descendant of the b-vertex
for f.
+ It goes to the terminal False.
+ We're now at the c-vertex for g.
- The left descendant is the False terminal.
False | False -> False
The left descendant of this c-vertex for h will be False.
- The right descendant is the True terminal.
False | True -> True
The right descendant of this c-vertex for h will be True.
+ Putting together what we've done so far, h looks like
!a -> False
a -> !b -> !c -> False
-> c -> True
b -> remaining work
* We now look at the right descendant of the b-vertex for f.
+ It goes to the terminal True.
+ We descend to the c-vertex for g.
- The left descendant is the False terminal.
True | False -> True
The left descendant of this c-vertex for h will be True.
- The right descendant is the True terminal.
True | True -> True
The right descendant of this c-vertex for h will be True.
- Both descendants from the c-vertex are the same,
so we can skip this decision level.
+ The right descendant of the b-vertex for h is True.
e. Putting this together, the ROBDD for h looks like
!a -> False
a -> !b -> !c -> False
-> c -> True
b -> True
5. Some optimizations:
a. Hash subproblems.
* Let f_i and g_j be vertices of the ROBDDs for f and g.
* Check to see if we have already computed the result for f_i and g_j.
- If so, use the previously computed subgraph
- Otherwise, do the recursive computation to get the new subgraph
and enter it into the hash table.
b. Hash results:
* Let's say a vertex for variable v has subgraphs v_0 and v_1.
* Check to see if we already have the result for (v, v_0, v_1)
+ If so, use the previously computed vertex.
+ Otherwise, construct the vertex and put it in the hash table.
c. Skip unnecessary decisions.
* Let's say that a vertex for variable v produces subgraphs v_0
and v_1, and that v_0 = v_1.
* We don't need a decision vertex for v. We just return v_0 to
the computation for the previous variable (in our variable
ordering).
d. Short-circuit when encountering a dominant constant
* If op is AND, and we encounter a False, we don't need to
explore the subgraph for the other argument.
* Likewise, if op is OR and we encounter a True.
* The paper doesn't describe the case where op is AND, we are
considering variable v, and the f vertex for v is True.
In this case, our result vertex is just the subgraph for g.
Likewise if op is OR, and we encounter a False vertex.
6. Complexity.
a. |f op g| = O(|f| |g|).
b. The time to compute |f op g| is O(|f| |g|) as well.
c. These are worst-case bounds. In some cases, the time may
be much lower. See the discussion of variable ordering
and BDD sizes earlier in the paper.
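Putting the pieces together, APPLY can be sketched in a few dozen lines. This is my own minimal illustration (tuple-encoded vertices, a hash-consing `mk`, and a memo table for subproblems), not the paper's implementation, and it omits the short-circuit optimizations:

```python
_unique, _memo = {}, {}

def mk(var, lo, hi):
    if lo == hi:
        return lo                      # skip an unneeded test
    return _unique.setdefault((var, lo, hi), (var, lo, hi))

def top_var(u):
    return u[0] if isinstance(u, tuple) else None

def branches(u, var):
    # If u tests 'var', follow its edges; otherwise u skipped this level,
    # so both "virtual" edges lead back to u itself.
    if isinstance(u, tuple) and u[0] == var:
        return u[1], u[2]
    return u, u

def apply_op(op, u, v, order):
    key = (op, u, v)
    if key in _memo:
        return _memo[key]              # reuse a previously computed subgraph
    if isinstance(u, bool) and isinstance(v, bool):
        r = op(u, v)                   # both arguments are terminals
    else:
        # recurse on the earliest variable appearing at either root
        var = min((x for x in (top_var(u), top_var(v)) if x is not None),
                  key=order.index)
        u0, u1 = branches(u, var)
        v0, v1 = branches(v, var)
        r = mk(var, apply_op(op, u0, v0, order),
                    apply_op(op, u1, v1, order))
    _memo[key] = r
    return r

# The example from the notes: f = a & b, g = a & c, compute f | g
order = ['a', 'b', 'c']
f = mk('a', False, mk('b', False, True))
g = mk('a', False, mk('c', False, True))
h = apply_op(lambda x, y: x or y, f, g, order)
```

Running this reproduces the ROBDD derived by hand above: !a -> False; a -> (!b -> If(c, True, False); b -> True).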
C. Restrict
1. Want to compute f|_(x=True) or f|_(x=False) where x is a parameter
to function f.
2. Consider the case of restricting to x=True, the other is similar.
a. Traverse the ROBDD for f with a depth first search.
b. Whenever we encounter an edge to a vertex labeled x
+ Replace that with an edge to the 'True' descendant of the
vertex labeled x.
3. Need to use the hash table of previously computed results to
keep the ROBDD reduced.
a. Example: variable ordering p -> w -> x -> y -> z
p -> w1 -> x1 -> y0 -> z0 -> True
-> !z0 -> False
-> !y0 -> False
-> !x1 -> y1 -> z0
-> !y1 -> z1 -> False
-> !z1 -> True
-> !w1 -> False
!p -> w2 -> x2 -> y0
-> !x2 -> y2 -> z1
-> !y2 -> z0
-> !w2 -> False
b. I wrote x1, w1, y0, and so on to indicate different vertices.
The letter in the vertex name indicates the decision variable.
c. Note, this function is
If(p, w & If(x, y & z, y == z),
w & If(x, y & z, y != z))
d. Restrict to x=True.
p -> w1r -> y0
-> !w1r -> False
!p -> w2r -> y0
-> !w2r -> False
e. Note that w1r and w2r are the same.
- We need to detect this and make them the same vertex.
- This is what the hash table is for.
f. Restrict takes O(|f|) time and space.
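A minimal sketch of RESTRICT in the same tuple encoding used above (my own illustration, not the paper's code): edges into a vertex testing x are replaced by edges into the chosen descendant, and the hash-consing `mk` keeps the result reduced.

```python
_unique = {}

def mk(var, lo, hi):
    if lo == hi:
        return lo
    return _unique.setdefault((var, lo, hi), (var, lo, hi))

def restrict(u, x, value, memo=None):
    """Compute u|_(x=value) by depth-first traversal with memoization."""
    memo = {} if memo is None else memo
    if isinstance(u, bool):
        return u
    if u in memo:
        return memo[u]
    var, lo, hi = u
    if var == x:
        # skip the test on x; continue into the chosen descendant
        r = restrict(hi if value else lo, x, value, memo)
    else:
        r = mk(var, restrict(lo, x, value, memo),
                    restrict(hi, x, value, memo))
    memo[u] = r
    return r

# f = If(a, b & c, b) with ordering a -> b -> c; restrict to b=True
c_node = mk('c', False, True)
bc = mk('b', False, c_node)          # b & c
b_node = mk('b', False, True)        # b
f = mk('a', b_node, bc)
r = restrict(f, 'b', True)           # expect If(a, c, True)
```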
D. Other operations:
1. Quantifiers:
a. ForAll x. f is implemented as:
f|_(x=False) & f|_(x=True)
b. Likewise for Exists x. f
c. Quantification takes time O(|f|^2)
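The quantifier definitions are easy to check over ordinary Python functions (a sketch of my own; a BDD package would use two RESTRICT operations and an APPLY):

```python
from itertools import product

def f(x, y, z):
    return (x & y) | z

def exists_x(g):
    # Exists x. g  ==  g|_(x=True) | g|_(x=False)
    return lambda y, z: g(True, y, z) or g(False, y, z)

def forall_x(g):
    # ForAll x. g  ==  g|_(x=True) & g|_(x=False)
    return lambda y, z: g(True, y, z) and g(False, y, z)

for y, z in product((False, True), repeat=2):
    assert exists_x(f)(y, z) == any(f(x, y, z) for x in (False, True))
    assert forall_x(f)(y, z) == all(f(x, y, z) for x in (False, True))
```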
2. Function composition
a. Let f and g be functions
b. Let's say we want to compute f|_(x=g)
c. This is (!g & f|_(x=False)) | (g & f|_(x=True))
* This involves two RESTRICT and three APPLY operations.
* The time complexity is O(|f|^2 |g|^2)
* The operations can be combined into a single depth-first
traversal of the graphs representing f and g.
* This optimized version takes time O(|f| |g|^2).
d. "Modern" BDD packages will replace a vector of variables
with a vector of functions.
* Doing all the replacements at once is more efficient.
* I don't have a reference handy for the efficiency.
E. ROBDDs are canonical.
1. Using the hash table methods described above
a. Two ROBDDs represent the same function iff they have the
same top-level vertex.
b. Checking for equality is a simple pointer comparison.
2. This feature is used in many algorithms that exploit ROBDDs
3. Note that "hash table" is the terminology of the paper.
Any "collection" with O(1) insert and look-up will suffice.
III. A few implementation notes
A. The method of using a single ROBDD structure to represent
multiple functions has been very successful.
1. This allows sharing of common sub-graphs.
B. Implementations tend to be memory-bound
1. Lots of work goes into tuning the data structures to use
as little memory per vertex as possible -- lots of bit
fiddling.
2. Efficient allocation and reclaiming of vertices is critical.
C. The paper mentioned "promising" early results on parallel
implementations
1. Didn't really work out very well.
2. As the paper noted, synchronization overhead was a serious
bottleneck.
IV. Representing other data types
A. Domains described in the paper:
logic, finite domains, sets, functions, relations.
B. I won't go over the details (not enough time)
1. But, this is essential to many applications of formal methods
2. Representing various domains with boolean variables is not
just for BDDs. It's a key concept for using any SAT solver.
3. Note: BDDs have some nice capabilities with functions and
relations that aren't as easily achieved with other SAT
methods.
a. The function composition operations of ROBDDs provide
tremendous versatility.
V. Applications for Verification
A. Equivalence checking
1. An early application of formal methods in HW design.
2. Here, the canonical properties of ROBDDs are very useful
a. When checking if a gate-level netlist is equivalent to the
expressions in a hardware description language,
b. BDDs can detect when a gate output corresponds to a
sub-expression in the HDL.
c. Typical logic networks have multiple inputs and multiple outputs.
BDDs exploit the common sub-expressions used for several
different outputs.
d. See Kuehlmann's paper on the reading list. It's pretty intense,
but an early, compelling implementation of a logical
equivalence checking tool.
B. Symbolic simulation
1. Use a mix of concrete values and symbolic variables in a logic
simulator for hardware, or symbolic interpreter for software.
2. By using symbolic variables for only some of the inputs, the
computations can be much more tractable than a purely symbolic
approach.
3. For some problems (e.g. arithmetic hardware), this has been
highly effective. A small number of test cases can *prove*
the design for all inputs.
4. For hardware verification, see the work by Carl Seger on
"Symbolic Trajectory Evaluation". The logical framework has
a rather steep learning curve, but the results that have
been obtained are impressive.
5. Currently, there is much interest in mixed concrete and symbolic
techniques for software analysis. For example, "concolic"
execution.
C. Model checking
1. We'll start on model-checking a week from today.
a. The main idea is to go from formulas about a single logical
state to formulas about sequences of states.
b. For example, we might want to show that a system *never*
reaches a "bad" state (buffer overflow, corrupted cache
data, etc.)
c. Or, we may want to show that the system *eventually* reaches
some goal: for example, all requests are eventually granted.
d. Model checking is a way to reason about such sequences of states.
2. We can view the transition from one state to the next as a function.
a. More generally, we can view transitions as satisfying a
relation.
b. The added flexibility of a relation, "non-determinism" allows
us to model uncertain inputs, uncertain timing relations
(which processor writes to a particular memory location first),
etc.
3. BDDs are good at representing
a. Functions: the next state
b. Composition of functions: a sequence of states
c. Relations, and composition of relations.
d. Detecting if two relations are identical
4. Relation squaring.
a. Let R(s0, s1) indicate that state s1 can be reached in one step
from s0.
b. A step can be one step of program execution, or one clock-cycle of
a hardware design, or one action in a distributed protocol.
c. If S0 is the set of all initial states, then
S0 union R(S0)
is the set of all states that can be reached in at most one step.
Where
R(S0) = {s' | Exists s. (s, s') in R}
d. Using BDDs, we can implement an operation on relations -- I'll
call it Fwd, where
Fwd(R) = {(s, s') | (s == s') OR ((s,s') in R) OR
Exists s". ((s,s") in R AND (s",s') in R)}
* (s,s') in Fwd(R) iff s' can be reached from s in 0, 1, or 2
steps.
e. With Fwd we have
* s in Fwd(R)(S0) iff s can be reached from some state in S0 in
0, 1, or 2 steps
* s in Fwd(Fwd(R))(S0) iff s can be reached from some state in S0
in 0, 1, 2, 3, or 4 steps.
* s in Fwd^k(R)(S0) iff s can be reached from some state in S0
in 0, 1, 2, ... 2^k steps.
f. If Fwd^{k+1}(R)(S0) = Fwd^k(R)(S0), then every state, s', that can be
reached after *any* number of steps is in Fwd^k(R)(S0)
g. BDDs provide a natural way to implement relation squaring.
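Relation squaring can be illustrated with Python sets standing in for BDD-represented relations (my own sketch; a real implementation keeps R and the state sets symbolic as BDDs):

```python
def fwd(R, states):
    """One squaring step: identity U R U (their composition)."""
    pairs = {(s, s) for s in states} | R
    comp = {(s, t) for (s, m1) in pairs for (m2, t) in pairs if m1 == m2}
    return pairs | comp

def reachable(R, S0, states):
    """Iterate Fwd until a fixpoint, then read off states reachable from S0."""
    Rk = R
    while True:
        Rn = fwd(Rk, states)
        if Rn == Rk:                   # fixpoint: all reachable pairs found
            return {t for s in S0 for (a, t) in Rk if a == s} | S0
        Rk = Rn

states = {0, 1, 2, 3, 4}
R = {(0, 1), (1, 2), (2, 3)}           # a simple chain; state 4 is unreachable
assert reachable(R, {0}, states) == {0, 1, 2, 3}
```

Note the doubling: each call to fwd covers paths twice as long as before, so the fixpoint is reached in O(log(path length)) iterations.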
VI. Review/Preview
A. Review
1. ROBDDs are an optimized decision tree
a. They are efficient for many problems in practice
b. But they aren't efficient for everything:
if they were, we'd have a proof that P = NP
2. ROBDDs are a canonical form
a. Easy to test if two boolean functions represented by the
ROBDDs with the same variable ordering are the same
function -- O(1) time
b. Easy to test if a function is a tautology -- O(1) time
c. Easy to test if a function is satisfiable -- O(1) time
3. ROBDDs provide a way to treat boolean functions as first
class citizens. Given ROBDDs for f and g, we can
a. Compute the ROBDD for ~f, f&g, f|g, f^g, etc.
b. Compute ForAll x. f(x, y), Exists x. f(x, y)
c. Compute function composition f(g(x))
4. ROBDDs are useful for
a. equivalence checking
b. model-checking (i.e. fixpoint computations)
B. A bit of context and history
1. BDDs were a big leap ahead of DPLL SAT solvers when BDDs
were first used for verification (~1986)
a. They provided enough capability to make formal methods
practical for industrial use in the early to mid 1990s.
b. BDDs were the dominant method for boolean reasoning until
the early 2000s.
c. Then, conflict-clause-learning and non-chronological
backtracking pushed DPLL algorithms back into the lead.
2. Today
a. Academics will tell you that BDDs are history, no one uses
them any more.
b. People in industry will tell you that they still use BDDs for
certain problems.
* Variable ordering is both the biggest weakness and biggest
strength of BDD based methods.
+ Variable ordering is a weakness because it often involves
insight by the user to find a good ordering.
+ The advantage is that some problems, especially arithmetic
with bit-vectors, have well-understood, good orderings.
Here, BDDs can be extremely effective.
* The ability to operate on functions and relations gives BDDs
capabilities that are harder to achieve with SAT.
+ Recent advances in "interpolants" have made fix-point
computations feasible with SAT (and SMT) solvers.
+ Thus, SAT-based methods are overtaking BDDs for
model-checking as well.
+ But, there are still niches where BDDs excel.
C. Further reading:
1. http://www.itu.dk/courses/AVA/E2005/bdd-eap.pdf
A detailed tutorial, going into more detail than
we'll need for this course. By Henrik Reif Andersen
2. http://dx.doi.org/10.1145/123186.123222
Efficient Implementation of a BDD Package
Brace, Rudell, and Bryant
3. http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=73590&userType=&tag=1
"On the complexity of VLSI implementations and graph
representations of boolean functions with application
to integer multiplication"
Bryant.
4. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=480018
"Binary decision diagrams and beyond: enabling technologies
for formal verification". Bryant.
A survey paper discussing many extensions to the BDD idea
including dynamic (automatic) variable re-ordering, diagrams
with more than two terminal values (for representing arithmetic
and other domains), etc.
D. Preview
0. Python/Jython + BDDs tutorial on Friday, Jan. 11
1. Reading for Jan. 14:
"Equivalence Checking Using Cuts and Heaps", Kuehlmann.
This is a short, but dense paper. I hope to get some
notes up on the web by the weekend.