CpSc 538G: Binary Decision Diagrams

September 21, 2016
  0.  The big picture
    A.  The graph <-> function duality
      1.  BDDs are graphs: DAGs to be more precise
      2.  BDDs are a representation of boolean functions
        a.  We can think of the formula (a & b) | c as a function with three
              parameters:
                function f(bool a, bool b, bool c): bool 
                  return (a & b) | c
        b.  We can represent functions in many different ways:
          *  code in a programming language
          *  a mathematical or pseudo-code notation
          *  a sequence of machine instructions or byte codes
          *  a data-flow graph or data-control flow graph
        c.  BDDs are a special case of a data-control flow graph
        d.  Evaluate the function by traversing the graph from the root to
              a leaf.
        e.  BDDs let us perform operations on functions
          *  Example: restriction
               Let f be the function defined above
                 f|_(b=True)
               produces a two-parameter function
                 function f_r(bool a, bool c): bool
                   return f(a, True, c)
               This is also known as "partial evaluation".
    B.  ROBDDs are canonical
      1.  A BDD is a decision DAG
      2.  An RBDD is a special class of BDDs where:
        a.  no two vertices correspond to the same function.
        b.  If the "true" and "false" edges of a vertex go to the same descendant,
              we eliminate the upper vertex, and replace all edges into the upper
              vertex with edges into the descendant.  In English, we skip a test
              if it's not needed.
        c.  R stands for "Reduced"
        d.  Transforming a BDD to a RBDD is an optimization that saves space.
      3.  An ROBDD is a special class of BDDs where the tests appear in the
              same order on all paths.
        a.  If path1 tests v_i before testing v_j, and path2 tests v_i and v_j,
              then path2 tests v_i before v_j.
        b.  "O" stands for ordered.
      4.  ROBDDs are canonical
        a.  If f and g have the same parameters and produce the same result for
              each valuation of the parameters, then f and g have the same
              ROBDD representation.
        b.  This makes testing for equality of functions easy.
        c.  Testing satisfiability is easy: f is satisfiable if its ROBDD is
              not the "False" vertex.
    C.  Terminology
      1.  Technically BDDs are a superset of RBDDs and OBDDs, and
            RBDDs and OBDDs are a superset of ROBDDs.
      2.  OTOH, I'll often say BDD as a shortcut for ROBDD because ROBDDs
            are generally the only version we're interested in.
                   
  I.  Representing Boolean Functions with ROBDDs
        (Reduced Ordered Binary Decision Diagrams)
    A.  From decision trees to ROBDDs
      0.  Running example: one-bit adder
        a.  compute sum and carry_out given 'a', 'b', and carry_in
        b.  sum = a ^ b ^ c_in  (where ^ denotes exclusive-OR)
        c.  carry_out = majority(a, b, carry_in)
                = (a & b) | (a & c_in) | (b & c_in)
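The running example can be checked against ordinary integer addition by brute force (a small sketch; full_adder is my name for it):

```python
from itertools import product

def full_adder(a, b, c_in):
    """One-bit full adder: sum = a ^ b ^ c_in, carry_out = majority."""
    s = a ^ b ^ c_in
    c_out = (a and b) or (a and c_in) or (b and c_in)
    return s, c_out

# sanity check: a + b + c_in == sum + 2 * carry_out
for a, b, c_in in product([False, True], repeat=3):
    s, c_out = full_adder(a, b, c_in)
    assert int(a) + int(b) + int(c_in) == int(s) + 2 * int(c_out)
```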
      1.  Decision tree
      2.  From tree to DAG:
        a.  remove duplicate terminals
        b.  share identical subtrees.
      3.  Remove redundant tests
    B.  ROBDDs are canonical
      Proof: by induction
        1.  Base case: ROBDD with no variables
        2.  Induction step:
              assume ROBDDs with k variables are canonical
              show that ROBDDs with k+1 variables are canonical as well
    C.  Variable ordering matters
      1.  Example: Let x and y be n-bit vectors
      2.  x == y can be represented by an ROBDD with O(n) nodes
             if bits are interleaved.
        a.  At each level, compare the bits
        b.  Pass the conjunction along
      3.  x == y requires Omega(2^n) nodes if the bits of x precede
              the bits of y:
           Need to distinguish all possible values of x
        &  There are 2^n such values
        => The ROBDD has Omega(2^n) branches going from the last bit of
             x to the first bit of y
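The Omega(2^n) claim can be seen by counting the distinct residual functions of y left after fixing all bits of x: each distinct residual needs its own vertex at the x/y boundary. A brute-force sketch (count_cofactors is my name, not standard terminology):

```python
from itertools import product

def count_cofactors(n):
    """For the function x == y with all x bits ordered before all y bits:
    count the distinct residual functions of y after fixing the x bits."""
    residuals = set()
    for x in product([0, 1], repeat=n):
        # residual function of y, as its truth table over all 2^n y values
        table = tuple(int(x == y) for y in product([0, 1], repeat=n))
        residuals.add(table)
    return len(residuals)

# all 2^n cofactors are distinct, so the ROBDD needs Omega(2^n) vertices
assert [count_cofactors(n) for n in (1, 2, 3)] == [2, 4, 8]
```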
           
 II.  Operations on ROBDDs
    A.  Shannon expansion
      1.  f = (!x & f|_(x=False)) | (x & f|_(x=True))
        a.  f is a function, for example
              def f(a,b,c):
                return (a & b) | (a & c) | (b & c) 
        b.  f|_(x=v) means f with the parameter x replaced by the value v.
              using the example from part (a),
                f|_(a=False) = (False & b) | (False & c) | (b & c)
                             = b & c
                f|_(a=True ) = (True & b) | (True & c) | (b & c)
                             = b | c | (b & c)
                             = b | c
        c.  Shannon expansion:
                f = (!a & f|_(a=False)) | (a & f|_(a=True))
                  = (!a & (b & c)) | (a & (b | c))
                  = (!a & b & c) | (a & b) | (a & c)
                  = (!a & b & c) | (a & b) | (a & c) | (a & b & c)
                  = (b & c) | (a & b) | (a & c)
      2.  BDDs support Shannon expansion by having separate "true" and "false"
            subtrees at each level.
      3.  Shannon expansions are used to implement the operations on BDDs
            described below.
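The Shannon expansion above can be verified by brute force over all valuations (a sketch using the majority example from part (a)):

```python
from itertools import product

def f(a, b, c):
    # the example from part (a): majority(a, b, c)
    return (a and b) or (a and c) or (b and c)

def shannon_a(a, b, c):
    # f = (!a & f|_(a=False)) | (a & f|_(a=True))
    return ((not a) and f(False, b, c)) or (a and f(True, b, c))

for a, b, c in product([False, True], repeat=3):
    assert f(a, b, c) == shannon_a(a, b, c)
```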
            
    B.  Apply
      1.  Let's say we have boolean functions f and g represented by BDDs.
        a.  We want to compute f | g.
        b.  More generally, we want to compute f op g where op is any
              boolean function of two variables.
        c.  Shannon expansion
              (f op g) = (!a & (f op g)|_(a=False)) | (a & (f op g)|_(a=True))
        d.  Boolean operators commute with Shannon expansion:
              (f op g)|_(a=False) = f|_(a=False) op g|_(a=False)
              (f op g)|_(a=True)  = f|_(a=True)  op g|_(a=True)
      2.  To compute f op g, with variable ordering a -> b -> c
        a.  Compute f|_(a=False), this is just the left descendant of the
              root vertex for f.
        b.  Likewise, g|_(a=False) is the left descendant of the root vertex
              for g.
        c.  Recurse to the problem of computing 
                f|_(a=False) op g|_(a=False)
        d.  When we reach the leaves, they are labeled with constants, just
              compute f_leaf op g_leaf.
        e.  When we have the results for our left and right subgraphs,
              create a new vertex with these results as its left and right
              subgraphs.
      3.  A detail: an ROBDD will skip over levels for which the left and
              right subtrees are the same.
        a.  If f skips one or more levels that g doesn't (or vice-versa),
              we just treat the vertex of f as the match of the intermediate
              vertices of g.
        b.  I'll draw a picture on the board.  This is just implicitly
              un-reducing that vertex of f on demand.
      4.  An example:  f = (a & b), g = (a & c), compute f | g
        a.  I'll write h as the function f | g.
        b.  Let's say we have the ROBDDs for f and g.
        c.  We first explore the left descendant from the a-vertex for the
                  two functions.
          * Both of these are the False terminal. 
          * False | False -> False
          * The left descendant of the a-vertex for h is the False terminal.
          * Our tree for h looks like:
              !a -> False
               a -> remaining work
        d.  Now, explore the right descendant of the a-vertex for f and g.
          *  The right descendant for f is labeled with b.
          *  The right descendant for g is labeled with c.
               We treat this as if there were a b vertex for g, where both
               outgoing edges go to the current vertex.
          *  We recurse again, and look at the left descendant of the b-vertex
                 for f.
            +  It goes to the terminal False.
            +  We're now at the c-vertex for g.
              -  The left descendant is the False terminal.
                   False | False -> False
                   The left descendant of this c-vertex for h will be False.
              -  The right descendant is the True terminal.
                   False | True -> True
                   The right descendant of this c-vertex for h will be True.
            +  Putting together what we've done so far, h looks like
                !a -> False
                 a -> !b -> !c -> False
                         ->  c -> True
                       b -> remaining work
          *  We now look at the right descendant of the b-vertex for f.
           +  It goes to the terminal True.
           +  We descend to the c-vertex for g.
              -  The left descendant is the False terminal.
                   True | False -> True
                   The left descendant of this c-vertex for h will be True.
              -  The right descendant is the True terminal.
                   True | True -> True
                   The right descendant of this c-vertex for h will be True.
              -  Both descendants from the c-vertex are the same, so we
                   can skip this decision level entirely.
           +  The right descendant of the b-vertex for h is True.
        e.  Putting this together, the ROBDD for h looks like
                !a -> False
                 a -> !b -> !c -> False
                         ->  c -> True
                       b -> True
      5.  Some optimizations:
        a.  Hash subproblems.
          *  Let f_i and g_j be vertices of the ROBDDs for f and g.
          *  Check to see if we have already computed the result for f_i and g_j.
            -  If so, use the previously computed subgraph
            -  Otherwise, do the recursive computation to get the new subgraph
                 and enter it into the hash table.
        b.  Hash results:
          *  Let's say a vertex for variable v has subgraphs v_0 and v_1.
          *  Check to see if we already have the result for (v, v_0, v_1)
            +  If so, use the previously computed vertex.
            +  Otherwise, construct the vertex and put it in the hash table.
        c.  Skip unnecessary decisions.
          *  Let's say that a vertex for variable v produces subgraphs v_0
               and v_1, and that v_0 = v_1.
          *  We don't need a decision vertex for v.  We just return v_0 to
               the computation for the previous variable (in our variable
               ordering).
        d.  Short-circuit when encountering a dominant constant
          *  If op is AND, and we encounter a False, we don't need to
               explore the subgraph for the other argument.
          *  Likewise, if op is OR and we encounter a True.
          *  The paper doesn't describe the case where op is AND, we are
               considering variable v, and the f vertex for v is True.
               In this case, our result vertex is just the subgraph for g.
               Likewise if op is OR, and we encounter a False vertex.
      6.  Complexity.
        a.  |f op g| = O(|f| |g|).
        b.  The time to compute |f op g| is O(|f| |g|) as well.  
        c.  These are worst-case bounds.  In some cases, the time may
              be much lower.  See the discussion of variable ordering
              and BDD sizes earlier in the paper.
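The APPLY recursion, with the subproblem memo table (5a), hash-consed result vertices (5b), and redundant-test skipping (5c), can be sketched in Python. The tuple representation (var, low, high) and the names mk and apply_op are my own choices, not from the paper:

```python
_unique = {}                     # 5b: hash-cons (var, low, high) -> vertex

def mk(var, low, high):
    if low == high:              # 5c: skip an unnecessary decision
        return low
    return _unique.setdefault((var, low, high), (var, low, high))

def apply_op(op, f, g, order):
    """Compute (f op g) for ROBDDs f, g under a shared variable ordering."""
    memo = {}                    # 5a: hash subproblems (f_i, g_j)
    def level(h):                # terminals sort below every variable
        return len(order) if isinstance(h, bool) else order.index(h[0])
    def rec(f, g):
        if isinstance(f, bool) and isinstance(g, bool):
            return op(f, g)      # leaves: just compute f_leaf op g_leaf
        key = (f, g)
        if key in memo:
            return memo[key]
        i, j = level(f), level(g)
        v = order[min(i, j)]
        # if one side skips this level, treat it as a vertex for v with
        # both branches equal (implicit un-reduction, detail 3 above)
        f0, f1 = (f[1], f[2]) if i <= j else (f, f)
        g0, g1 = (g[1], g[2]) if j <= i else (g, g)
        memo[key] = mk(v, rec(f0, g0), rec(f1, g1))
        return memo[key]
    return rec(f, g)

# the worked example: f = a & b, g = a & c, h = f | g
f = mk('a', False, mk('b', False, True))
g = mk('a', False, mk('c', False, True))
h = apply_op(lambda x, y: x or y, f, g, ['a', 'b', 'c'])
assert h == ('a', False, ('b', ('c', False, True), True))
```

The final assertion matches the ROBDD derived in step 4 above: !a goes to False, and under a, !b leads to a c-test while b leads straight to True.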

    C.  Restrict
      1.  Want to compute f|_(x=True) or f|_(x=False) where x is a parameter
            to function f.
      2.  Consider the case of restricting to x=True, the other is similar.
        a.  Traverse the ROBDD for f with a depth first search.
        b.  Whenever we encounter an edge to a vertex labeled x
          +  Replace that with an edge to the 'True' descendant of the
               vertex labeled x.
      3.  Need to use the hash table of previously computed results to
              keep the ROBDD reduced.
        a.  Example: variable ordering p -> w -> x -> y -> z
              p  ->  w1 ->  x1 ->  y0 ->  z0 -> True
                                      -> !z0 -> False
                               -> !y0 ->  False
                        -> !x1 ->  y1 ->  z0
                               -> !y1 ->  z1 -> False
                                      -> !z1 -> True
                 -> !w1 -> False
              !p ->  w2 ->  x2 ->  y0
                        -> !x2 ->  y2 ->  z1
                                -> !y2 ->  z0
                 -> !w2 -> False
        b.  I wrote x1, w1, y0, and so on to indicate different vertices.
              The letter in the vertex name indicates the decision variable.
        c.  Note, this function is
              If(p, w & If(x, y & z, y == z),
                    w & If(x, y & z, y != z))
        d.  Restrict to x=True.
              p  ->  w1r -> y0
                 -> !w1r -> False
              !p ->  w2r -> y0
                 -> !w2r -> False
        e.  Note that w1r and w2r are the same.
          -  We need to detect this and make them the same vertex.
          -  This is what the hash table is for.
        f.  Restrict takes O(|f|) time and space.
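The RESTRICT traversal, with a hash-consing helper to keep the result reduced and shared, can be sketched as follows (vertices are tuples (var, low, high); the names mk and restrict are my own):

```python
_unique = {}

def mk(var, low, high):
    if low == high:                    # drop tests that are no longer needed
        return low
    return _unique.setdefault((var, low, high), (var, low, high))

def restrict(f, x, val):
    """f|_(x=val): redirect every edge into an x-vertex to its val branch."""
    memo = {}                          # re-share results (the 'hash table')
    def rec(h):
        if isinstance(h, bool):
            return h
        if h in memo:
            return memo[h]
        v, low, high = h
        if v == x:
            r = rec(high if val else low)
        else:
            r = mk(v, rec(low), rec(high))
        memo[h] = r
        return r
    return rec(f)

# h = a & (b | c); restricting b=True collapses it to just a
h = mk('a', False, mk('b', mk('c', False, True), True))
assert restrict(h, 'b', True) == ('a', False, True)
assert restrict(h, 'b', False) == ('a', False, ('c', False, True))
```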

    D.  Other operations:
      1.  Quantifiers:
        a.  ForAll x. f is implemented as:
              f|_(x=False) & f|_(x=True)
        b.  Likewise, Exists x. f is implemented as:
              f|_(x=False) | f|_(x=True)
        c.  Quantification takes time O(|f|^2)
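Semantically, quantification is just two restrictions combined, which can be checked with plain Python functions (a small sketch):

```python
def f(x, y):
    return x or y

# ForAll x. f  ==  f|_(x=False) & f|_(x=True)
forall_x = lambda y: f(False, y) and f(True, y)
# Exists x. f  ==  f|_(x=False) | f|_(x=True)
exists_x = lambda y: f(False, y) or f(True, y)

for y in (False, True):
    assert forall_x(y) == all(f(x, y) for x in (False, True))
    assert exists_x(y) == any(f(x, y) for x in (False, True))
```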
      2.  Function composition
        a.  Let f and g be functions
        b.  Let's say we want to compute f|_(x=g)
        c.  This is (!g & f|_(x=False)) | (g & f|_(x=True))
          *  This involves two RESTRICT and three APPLY operations.
          *  The time complexity is O(|f|^2 |g|^2)
          *  The operations can be combined into a single depth-first
               traversal of the graphs representing f and g.
          *  This optimized version takes time O(|f| |g|^2).
        d.  "Modern" BDD packages will replace a vector of variables
                with a vector of functions.
          * Doing all the replacements at once is more efficient.
          * I don't have a reference handy for the efficiency.
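The composition formula in (c) can be checked semantically with plain functions (a sketch; the particular f and g below are arbitrary examples of mine):

```python
def f(x, y):
    return x and not y

def g(y):                 # the function substituted for parameter x
    return not y

# f|_(x=g) = (!g & f|_(x=False)) | (g & f|_(x=True))
def compose(y):
    return ((not g(y)) and f(False, y)) or (g(y) and f(True, y))

for y in (False, True):
    assert compose(y) == f(g(y), y)
```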

    E.  ROBDDs are canonical.
      1.  Using the hash table methods described above
        a.  Two ROBDDs represent the same function iff they have the
              same top-level vertex.
        b.  Checking for equality is a simple pointer comparison.
      2.  This feature is used in many algorithms that exploit ROBDDs
      3.  Note that the "hash table" is the language of the paper.
            Any "collection" with O(1) insert and look-up will suffice.

III.  A few implementation notes
    A.  The method of using a single ROBDD structure to represent
            multiple functions has been very successful.
      1.  This allows sharing of common sub-graphs.
    B.  Implementations tend to be memory-bound
      1.  Lots of work goes into tuning the data structures to use
            as little memory per vertex as possible -- lots of bit
            fiddling.
      2.  Efficient allocation and reclaiming of vertices is critical.
    C.  The paper mentioned "promising" early results on parallel
            implementations
      1.  Didn't really work out very well.
      2.  As the paper noted, synchronization overhead was a serious
            bottleneck.

 IV.  Representing other data types
    A.  Domains described in the paper:
          logic, finite domains, sets, functions, relations.
    B.  I won't go over the details (not enough time)
      1.  But, this is essential to many applications of formal methods
      2.  Representing various domains with boolean variables is not
            just for BDDs.  It's a key concept for using any SAT solver.
      3.  Note:  BDDs have some nice capabilities with functions and
              relations that aren't as easily achieved with other SAT
              methods.
        a.  The function composition operations of ROBDDs provide
              tremendous versatility.

  V.  Applications for Verification
    A.  Equivalence checking
      1.  An early application of formal methods in HW design.
      2.  Here, the canonical properties of ROBDDs are very useful
        a.  When checking if a gate-level netlist is equivalent to the
              expressions in a hardware description language,
        b.  BDDs can detect when a gate output corresponds to a
              sub-expression in the HDL.
        c.  Typical logic networks have multiple inputs and multiple outputs.
              BDDs exploit the common sub-expressions used for several
              different outputs.
        d.  See Kuehlmann's paper on the reading list.  It's pretty intense,
              but an early, compelling implementation of a logical
              equivalence checking tool.
    B.  Symbolic simulation
      1.  Use a mix of concrete values and symbolic variables in a logic
            simulator for hardware, or symbolic interpreter for software.
      2.  By using symbolic variables for only some of the inputs, the
            computations can be much more tractable than a purely symbolic
            approach.
      3.  For some problems (e.g. arithmetic hardware), this has been
            highly effective.  A small number of test cases can *prove*
            the design for all inputs.
      4.  For hardware verification, see the work by Carl Seger on
            "Symbolic Trajectory Evaluation".  The logical framework has
             a rather steep learning curve, but the results that have
             been obtained are impressive.
      5.  Currently, there is much interest in mixed concrete and symbolic
            techniques for software analysis.  For example, "concolic"
            execution.
    C.  Model checking
      1.  We'll start on model-checking a week from today.
        a.  The main idea is to go from formulas about a single logical
              state to formulas about sequences of states.
        b.  For example, we might want to show that a system *never*
              reaches a "bad" state (buffer overflow, corrupted cache
              data, etc.)
        c.  Or, we may want to show that the system *eventually* reaches
              some goal: for example, all requests are eventually granted.
        d.  Model checking is a way to reason about such sequences of states.
      2.  We can view the transition from one state to the next as a function.
        a.  More generally, we can view transitions as a satisfying a
              relation.
        b.  The added flexibility of a relation ("non-determinism") allows
              us to model uncertain inputs, uncertain timing relations
              (which processor writes to a particular memory location first),
              etc.
      3.  BDDs are good at representing
        a.  Functions: the next state
        b.  Composition of functions: a sequence of states
        c.  Relations, and composition of relations.
        d.  Detecting if two relations are identical
      4.  Relation squaring.
        a.  Let R(s0, s1) indicate that state s1 can be reached in one step
              from s0.
        b.  A step can be one step of program execution, or one clock-cycle of
              a hardware design, or one action in a distributed protocol.
        c.  If S0 is the set of all initial states, then
                S0 union R(S0)
              is the set of all states that can be reached in one step.
              Where
                R(S0) = {s' | Exists s in S0. (s, s') in R}
        d.  Using BDDs, we can implement an operation on relations -- I'll
              call it Fwd, where
                Fwd(R) = {(s, s') | (s == s') OR ((s,s') in R) OR
                                     Exists s". (s,s") in R AND (s",s') in R}
          *  (s,s') in Fwd(R) iff s' can be reached from s in 0, 1, or 2
               steps.
        e.  With Fwd we have
          *  s in Fwd(R)(S0) iff s can be reached from some state in S0 in
               0, 1, or 2 steps
          *  s in Fwd(Fwd(R))(S0) iff s can be reached from some state in S0
               in 0, 1, 2, 3, or 4 steps.
          *  s in Fwd^k(R)(S0) iff s can be reached from some state in S0
               in 0, 1, 2, ... 2^k steps.
        f.  If Fwd^{k+1}(R)(S0) = Fwd^k(R)(S0), then every state, s', that can be
             reached after *any* number of steps is in Fwd^k(R)(S0)
        g.  BDDs provide a natural way to implement relation squaring.
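Relation squaring can be sketched over an explicit set of pairs (with BDDs, the relation would be a boolean function of (s, s') rather than an enumerated set; fwd is my name for the operation):

```python
def fwd(R, states):
    """Fwd(R): relates s to s' reachable in 0, 1, or 2 R-steps."""
    identity = {(s, s) for s in states}
    squared = {(s, t) for (s, m1) in R for (m2, t) in R if m1 == m2}
    return identity | R | squared

# reachability in a 5-state chain 0 -> 1 -> 2 -> 3 -> 4
states = range(5)
R = {(i, i + 1) for i in range(4)}
Rk = R
while True:                      # iterate to the fixpoint Fwd^{k+1} = Fwd^k
    nxt = fwd(Rk, states)
    if nxt == Rk:
        break
    Rk = nxt

reachable_from_0 = {t for (s, t) in Rk if s == 0}
assert reachable_from_0 == {0, 1, 2, 3, 4}
```

Each call to fwd doubles the number of steps covered, so the fixpoint is reached in a logarithmic number of iterations rather than one iteration per step.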
           

 VI.  Review/Preview
    A.  Review
      1.  ROBDDs are an optimized decision tree
        a.  They are efficient for many problems in practice
        b.  But they aren't efficient for everything:
             if they were, we'd have a proof that P = NP
      2.  ROBDDs are a canonical form
        a.  Easy to test if two boolean functions represented by
              ROBDDs with the same variable ordering are the same
              function -- O(1) time
        b.  Easy to test if a function is a tautology -- O(1) time
        c.  Easy to test if a function is satisfiable -- O(1) time
      3.  ROBDDs provide a way to treat boolean functions as first
              class citizens.  Given ROBDDs for f and g, we can
        a.  Compute the ROBDD for ~f, f&g, f|g, f^g, etc.
        b.  Compute ForAll x. f(x, y), Exists x. f(x, y)
        c.  Compute function composition f(g(x))
      4.  ROBDDs are useful for
        a.  equivalence checking
        b.  model-checking (i.e. fixpoint computations)
    B.  A bit of context and history
      1.  BDDs were a big leap ahead of DPLL SAT solvers when BDDs
              were first used for verification (~1986)
        a.  They provided enough capability to make formal methods
              practical for industrial use in the early to mid 1990s.
        b.  BDDs were the dominant method for boolean reasoning until
              the early 2000s.
        c.  Then, conflict-clause-learning and non-chronological
              backtracking pushed DPLL algorithms back into the lead.
      2.  Today
        a.  Academics will tell you that BDDs are history, no one uses
              them any more.
        b.  People in industry will tell you that they still use BDDs for
               certain problems.
          *  Variable ordering is both the biggest weakness and biggest
                 strength of BDD based methods.
            +  Variable ordering is a weakness because it often involves
                 insight by the user to find a good ordering.
            +  The advantage is that some problems, especially arithmetic
                 with bit-vectors, have well-understood, good orderings.
                 Here, BDDs can be extremely effective.
          *  The ability to operate on functions and relations gives BDDs
                 capabilities that are harder to achieve with SAT.
            +  Recent advances in "interpolants" have made fix-point
                 computations feasible with SAT (and SMT) solvers.
            +  Thus, SAT-based methods are overtaking BDDs for
                 model-checking as well.
            +  But, there are still niches where BDDs excel.

    C.  Further reading:
      1.  http://www.itu.dk/courses/AVA/E2005/bdd-eap.pdf
            A detailed tutorial, going into more detail than
            we'll need for this course.  By Henrik Reif Andersen
      2.  http://dx.doi.org/10.1145/123186.123222
            Efficient Implementation of a BDD Package
            Brace, Rudell, and Bryant
      3.  http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=73590&userType=&tag=1
            "On the complexity of VLSI implementations and graph
              representations of boolean functions with application
              to integer multiplication"
            Bryant.
      4.  http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=480018
            "Binary decision diagrams and beyond: enabling technologies
            for formal verification".  Bryant.
            A survey paper discussing many extensions to the BDD idea
            including dynamic (automatic) variable re-ordering, diagrams
            with more than two terminal values (for representing arithmetic
            and other domains), etc.
    D.  Preview
      0.  Python/Jython + BDDs tutorial on Friday, Jan. 11
      1.  Reading for Jan. 14:
            "Equivalence Checking Using Cuts and Heaps", Kuehlmann.
            This is a short, but dense paper.  I hope to get some
            notes up on the web by the weekend.