CpSc 513	     Verification in Real Life			Jan. 7, 2013
		SLAM and verifying the i7 design

  0.  Topics for today
      I.  Formal verification methods in today's papers
     II.  SLAM/SDV
    III.  Verification in the i7 design
     IV.  Review and Preview

  I.  Formal verification methods in today's papers
    A.  Model checking
      1.  Model a program as having a finite set of states
        a.  For example, a program that has a fixed number of boolean
		variables.
        b.  If a program has N boolean variables, then these can take on
		2^N combinations of values.
        c.  If the same program has K possible values for the program
		counter (for now, assume no recursion), then there are
		K*2^N possible program states.
        d.  Not all of these states can occur in real executions.
      2.  The verification problem:
        a.  Let q0 be the initial state of the program
	  * e.g. PC=0, all program variables are 'false'
	  * more generally, we might have, Q0, a set of possible initial
	      states (e.g. some variables may be uninitialized).
        b.  Let Q be the set of all possible states.
        c.  Let R be the state transition relation of the program
	  * (q, q') IN R iff the program can go from state q to state q'
	      in a single step.
	  * Why is R a relation?
	      Allow uncertainty in the model to account for:
		inputs
		actions by the environment
		simplifications that we made to the program to make
		  verification tractable, etc.
	  * Usually we don't write down R explicitly (i.e. we don't
	  	list all of the state pairs in R)
	    + Instead, for any q, we can find all q' such that (q, q') IN R
	        by looking at the program source code.
	    + Note that q tells us the current PC and the values of all
	        program variables.
	    + We just consider executing that statement with those
	    	values for the variables in the program.
        d.  Let Spec: Q -> boolean be the specification.
	  * We want to show for all reachable states q, Spec(q) holds.
      3.  Model checking
	      let Pending := { q0 };
	      let Reachable := EmptySet;
	      while Pending != EmptySet do
		let q := choose(Pending);
		Pending = Pending - {q};
		if NOT Spec(q) return(fail);
		endif
		for each q' such that (q, q') IN R do
		  if NOT (q' IN Reachable)
		    Pending = Pending UNION {q'};
		    Reachable = Reachable UNION {q'};
		  endif
		end  // for
	      end // while
	      return(pass);
        a.  This algorithm returns 'fail' if it is possible to reach a
	      state that doesn't satisfy Spec.  Otherwise, all reachable
	      states satisfy Spec and model checking returns 'pass'.
        b.  The algorithm terminates because:
	  *  There are a finite number of possible states
	  *  Each state is added to Reachable at most once
	  *  Each state is added to Pending at most once
	  *  Each iteration of the while loop removes a state from Pending,
	       thus, the while loop body is executed at most |Q| times.
	  *  Likewise, each time the for loop is encountered, its body
	       is executed at most |Q| times.
	  *  The total runtime is at most |Q|^2
        c.  The algorithm can be used to produce counterexample traces.
	      For every state q' in Reachable,
	        either q' = q0,
		or we can find a state q such that q is in Reachable and
		  (q, q') is in R.
	      Note: q is just the state we were expanding when q' was added
	        to Reachable; recording these predecessors lets us walk a
	        trace back from a failing state to q0.  (A Python sketch
	        with this bookkeeping appears at the end of this subsection.)
        d.  Later, we'll learn more about model checking
	  *  Algorithms that work well in practice.
	  *  How to check other kinds of properties as well.
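      A Python sketch of the algorithm above (mine, not from the papers), with
      the predecessor bookkeeping from c. so a failure is reported as a trace;
      successors(q) is assumed to enumerate all q' with (q, q') IN R:

          def model_check(q0, successors, spec):
              # Explore every state reachable from q0.  Returns ('pass', None)
              # or ('fail', trace), where trace is a path from q0 to a bad state.
              pred = {q0: None}            # predecessor map; doubles as Reachable
              pending = [q0]
              while pending:
                  q = pending.pop()
                  if not spec(q):
                      trace = []           # walk the predecessor chain back to q0
                      while q is not None:
                          trace.append(q)
                          q = pred[q]
                      return ('fail', list(reversed(trace)))
                  for q2 in successors(q):
                      if q2 not in pred:
                          pred[q2] = q
                          pending.append(q2)
              return ('pass', None)

          # Toy example: a 3-bit counter that is never supposed to reach 7.
          print(model_check(0, lambda q: [(q + 1) % 8], lambda q: q != 7))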

    B.  Abstraction
      1.  In general
        a.  Real programs can have an unbounded number of variables
	  *  Recursion
	  *  Dynamic allocation
	  *  Even a program with a small number of int's, float's, or
	       double's has astronomically more states than we can hope
	       to explore.
        b.  In practice, model checking is performed on a "simplified"
	      version of the program.  Abstraction is one way to do this.
        c.  Let Q be the set of states of the program, and A be a set of
	      "abstracted" states.
        d.  Define a function abs: Q -> A so that for q in Q, abs(q) is
	      the abstraction of q.
        e.  Now, define an abstract transition relation R_abs so that
	      (a, a') IN R_abs if Exists (q, q') IN R s.t.
	        a = abs(q) AND a' = abs(q')
	f.  Let Spec(q) = Spec'(abs(q))
        g.  If Spec' is a safety property of the program with state space
	      A, transition relation R_abs, and initial state abs(q0), then
	      Spec is a safety property of the original program.
        h.  Thus, we can prove safety properties of complicated programs
	      by proving the property for a simpler program.  (A small
	      sketch of this lifting follows.)
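      For small finite examples, the lifting in e. can be computed directly.
      A Python sketch (my illustration, not from the papers), with R given as
      a set of concrete state pairs:

          def lift(R, abs_fn):
              # R_abs = { (abs(q), abs(q')) : (q, q') in R } -- existential lifting
              return {(abs_fn(q), abs_fn(q2)) for (q, q2) in R}

          def lift_init(Q0, abs_fn):
              # abstract initial states are the images of the concrete ones
              return {abs_fn(q) for q in Q0}

          # Example: concrete states 0..5 with q -> q+1, abstracted to "q > 0".
          R = {(q, q + 1) for q in range(5)}
          print(lift(R, lambda q: q > 0))         # {(False, True), (True, True)}
          print(lift_init({0}, lambda q: q > 0))  # {False}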
      2.  Predicate abstraction
        a.  An abstraction function where state variables of the abstract
	      program correspond to predicates over the values of the
	      variables of the detailed program.
        b.  Often, we can use the predicates that appear as conditions of
	      if-statements, or while-loops, etc. as the variables for the
	      abstract program.
        c.  There's a bunch of cool theory (that we'll cover later) that
	      shows this approach is very general.
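      A predicate-abstraction function can be as simple as evaluating a fixed
      list of predicates on the concrete state; a toy Python sketch
      (illustration only):

          def predicate_abs(predicates):
              # abs: Q -> A, where an abstract state is the tuple of truth
              # values of the chosen predicates on the concrete state
              return lambda q: tuple(p(q) for p in predicates)

          # Example: the concrete state is a dict of program variables, and the
          # predicates are the branch conditions "x > 0" and "lock == 1".
          abs_fn = predicate_abs([lambda q: q['x'] > 0, lambda q: q['lock'] == 1])
          print(abs_fn({'x': 3, 'lock': 0}))      # (True, False)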
      3.  CEGAR -- Counter-Example Guided Abstraction Refinement
        a. If our desired property fails to hold for the abstract program
		 what can we conclude about the original program?
	       Not much.
          *  The original program may have a bug,
          *  Or, the abstraction may be too coarse.
        b. Let a_0, a_1, ... a_k be a counter-example for the abstract
		program.
	  * Can we find
	      q_0, q_1, ... q_k in the original program such that
	           ForAll 0 <= i <= k. a_i = abs(q_i)
		AND ForAll 0 <= i < k.  (q_i, q_(i+1)) IN R
	      ?
	  * If "yes", then we've found a counter-example for the original
	       program.
	  * If "no", then in many cases, this tells us why our abstraction
	       is too coarse, and we can use this to (in some cases
	       automatically) find a better abstraction.
	  * In future lectures we will:
	   + Look at abstraction in more detail.
	   + Consider more general versions where the original program
	       may take several steps for one step of the abstract program
	   + ...
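      The question in b. -- can the abstract counter-example be concretized? --
      can be answered by brute force on small finite examples (SLAM uses an
      SMT solver instead).  A Python sketch (my illustration):

          def concretize(trace_abs, Q0, successors, abs_fn):
              # Search for concrete q_0 ... q_k with abs(q_i) = a_i and
              # (q_i, q_(i+1)) in R.  Returns such a list, or None if the
              # abstract counter-example is spurious.
              def extend(prefix):
                  i = len(prefix)
                  if i == len(trace_abs):
                      return prefix
                  candidates = Q0 if i == 0 else successors(prefix[-1])
                  for q in candidates:
                      if abs_fn(q) == trace_abs[i]:
                          found = extend(prefix + [q])
                          if found is not None:
                              return found
                  return None
              return extend([])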

    C.  Inductive Invariants
      1.  Another way to show that a safety property, Spec, holds in all
	      reachable states.
        a.  Find a predicate I such that
		ForAll (q, q') IN R. I(q) => I(q')
	b.  In English, this means that if I holds in some state, it
	      will hold in the next state as well.
	c.  By induction (hence the name "inductive invariant"), if I
	      holds in some state, it will hold in all future states.
	d.  Thus, if I holds in state q0, it will hold in all reachable states.
	e.  If I holds in q0 and I => Spec, then Spec holds in all
		reachable states.
      2.  Showing that a predicate, I, is an invariant can be *much* easier
      	    than finding all reachable states.
      3.  Finding I is the hard part:
        a.  human ingenuity (scarce)
        b.  software that "discovers" invariants (hot research topic)
        c.  combined human/computer methods
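      For a finite model, the three checks behind 1.a-e are mechanical; a
      Python sketch (my illustration), with R given as a set of state pairs:

          def check_inductive(R, q0, inv, spec):
              # The three local conditions: I holds initially, I is preserved
              # by every transition, and I implies Spec.  No reachability search.
              if not inv(q0):
                  return False
              if any(inv(q) and not inv(q2) for (q, q2) in R):
                  return False
              states = {q for pair in R for q in pair} | {q0}
              return all(spec(q) for q in states if inv(q))

          # Example: a counter mod 4 starting at 0; the invariant "q < 4"
          # is inductive and implies the spec "q != 7".
          R = {(q, (q + 1) % 4) for q in range(4)}
          print(check_inductive(R, 0, lambda q: q < 4, lambda q: q != 7))  # True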

    D.  Symbolic Simulation
      1.  Replace some of the values in a simulation with symbolic variables.
        a.  Propagate the variables through the simulation.
        b.  Check properties at the end.
      2.  Generalizes simulation
        a.  Increases coverage
	b.  To keep it tractable, not all variables are symbolic.
      3.  There are systematic ways to combine symbolic simulation results
        to get a complete proof.
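      A cartoon of symbolic simulation in Python (illustration only): a value
      is either a concrete boolean or a small expression tree, and the "gates"
      build terms whenever an operand is symbolic:

          class Sym:
              # a symbolic boolean: a variable name, or an operator over values
              def __init__(self, op, *args):
                  self.op, self.args = op, args
              def __repr__(self):
                  if not self.args:
                      return self.op
                  return self.op + '(' + ', '.join(map(repr, self.args)) + ')'

          def NOT(a):
              return (not a) if isinstance(a, bool) else Sym('not', a)
          def AND(a, b):
              if isinstance(a, bool) and isinstance(b, bool):
                  return a and b
              return Sym('and', a, b)
          def OR(a, b):
              if isinstance(a, bool) and isinstance(b, bool):
                  return a or b
              return Sym('or', a, b)

          # Simulate a mux, out = (sel AND a) OR (NOT(sel) AND b), with sel
          # concrete and a, b symbolic: one run covers all values of a and b.
          a, b = Sym('a'), Sym('b')
          print(OR(AND(True, a), AND(NOT(True), b)))  # or(and(True, a), and(False, b))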
        

 II.  SLAM/SDV
    0.  Overview
      A.  Motivation
        1.  Research goals, demonstrate an application of
	  *   Boolean programs (predicate abstraction)
	  *   Model-checking for pushdown automata
	  *   Pointer-alias analysis
        2.  Practical goal
	  Provide better bug-finding for Windows device drivers.
      B.  Glossary:
	API  -- Applications Program Interface
	DDK  -- Driver Development Kit (for Windows Server 2003)
		  see also WDK
	ML   -- originally "MetaLanguage", a functional programming language,
	  see
	    http://en.wikipedia.org/wiki/ML_(programming_language)
	    http://www.cs.cmu.edu/~rwh/introsml/
	  Functional languages are widely used in formal verification because
	  it is relatively easy to give a mathematically precise meaning
	  (i.e. semantics) to programs written in functional languages.
	SDV  -- Static Driver Verifier
	SLAM -- Software, programming Languages, And Model checking
	     or Specifications, programming Languages, And Model checking
	SLIC -- Specification Language for Interface Checking
	WDK  -- Windows Driver Kit (for Windows Vista)
	WDF  -- Windows Driver Framework (or "Windows Driver Foundation")
	  A toolkit for writing drivers for windows.  Includes Microsoft
	  supplied "good" drivers that the programmer can extend or override
	  to create their own driver.  See:
	    http://en.wikipedia.org/wiki/Windows_Driver_Foundation
	WDM  -- Windows Driver Model
	  Predecessor to the WDF.  A framework for Windows device drivers
	  that was used with Windows 98, 2000, XP, Server 2003, and Vista.
	  The programming model is more a "write the driver from scratch"
	  compared with the "override and extend" approach for the WDF.  See:
	    http://en.wikipedia.org/wiki/Windows_Driver_Model

    A.  SLAM -- Software, programming Languages, And Model checking
      1.  Stateful usage rules
        a.  Many rules for device drivers are about the order in which
		kernel operations may be performed:
	  +  E.g. it is an error to attempt to acquire a lock that the
	  	thread already holds.
	  +  E.g. it is an error to attempt to release a lock that the
	  	thread does not hold.
        b.  These can be expressed as rules about state machines.
	      Model checking can be used to show that the program never
	      reaches a bad state.
      2.  Boolean programs
        a.  All state variables are boolean valued
        b.  These variables correspond to predicates in the original
	      program and predicates identified in the CEGAR process.
      3.  Monitors:  SLIC
        a.  Properties are specified in a C-like language that describes
		monitoring operations in the kernel functions.
	b. These functions aren't actually compiled into the kernel
	c. Instead, they are used by the verifier to say:
	     Whenever you see the driver call kernel function KeFunc,
	     take the following actions in the abstract program.
	d. In their example, the function to acquire a lock sets a variable
	     called "state" to the value Locked, and the function to
	     release the lock sets the value of "state" to Unlocked.
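      The rule in d. is just a small state machine attached to the kernel entry
      points.  A Python sketch of such a monitor (loosely following the SLAM
      lock example; not actual SLIC syntax):

          UNLOCKED, LOCKED, ERROR = 'Unlocked', 'Locked', 'Error'

          def lock_monitor(trace):
              # trace: the sequence of kernel calls the driver makes.  ERROR
              # means a double acquire or a release without a matching acquire.
              state = UNLOCKED
              for call in trace:
                  if call == 'KeAcquireSpinLock':
                      state = LOCKED if state == UNLOCKED else ERROR
                  elif call == 'KeReleaseSpinLock':
                      state = UNLOCKED if state == LOCKED else ERROR
                  if state == ERROR:
                      break
              return state

          print(lock_monitor(['KeAcquireSpinLock', 'KeReleaseSpinLock']))  # Unlocked
          print(lock_monitor(['KeReleaseSpinLock']))                       # Error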
      4.  Model-checking in SLAM
        a.  Same idea as described in I.A.3
        b.  Extended to handle pushdown automata (machines with stacks)
	  *   They use interprocedural analysis to reduce the complexity
	        of the pushdown automaton to keep the analysis practical.
        c.  They use binary decision diagrams (to be covered in the Jan. 9
		lecture) to represent machine states symbolically.
	  *   A large number of machine states can be represented by a
	  	"simple" formula.
      5.  CEGAR (CounterExample Guided Abstraction Refinement) in SLAM
        a.  Start with a very simple, automatically generated abstraction
	  *   The abstraction is a boolean program.
        b.  Model check the abstract program.
        c.  If the abstract program passes the check, then the real program
	      does as well.
        d.  If the abstract program fails the check, try to create a
	      counter-example in the original program.  This is done using
	      a "SMT" solver (to be covered later in this course).
        e.  If a counter-example is found, it is reported.
        f.  If there is no such counter-example, the SMT solver provides a
	      proof that the formula describing the counter-example is
	      unsatisfiable.
        g.  The really cool thing is that by using this proof, they can
	       automatically extract one or more predicates that can be
	       added to the abstract program (as additional boolean valued
	       variables).
	  *  This produces a boolean program whose executions are a subset
	       of those of the previous boolean program.  In particular,
	       the bogus counterexample is now excluded.
	  *  Steps b and beyond are re-executed with this new abstraction.
        h.  Although each refinement produces a tighter abstraction, the
	      verification is not guaranteed to terminate.
	      Why?
      6.  Example:
        a.  Verifying a lock...unlock sequence where both operations
		are guarded by (x > 0)
	b.  Originally the boolean program allows an unlock without a lock
	  *  The x > 0 conditions had been replaced with non-deterministic
	       choices.
	  *  The model-checker finds a trace where the lock is skipped
	       but the unlock is performed.
	c.  This trace of the boolean program doesn't correspond to any
	       execution of the real program.
	  *  The refinement process adds a boolean variable corresponding
	       to the predicate "x > 0" to the boolean program.
	d.  Now, the condition for performing the lock and the condition
	      for the unlock always do the same thing, and the program
	      passes the check.
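      A Python rendering of this example (my sketch; SLAM's boolean programs
      have their own syntax).  In the coarse abstraction each "x > 0" guard
      becomes an independent nondeterministic choice; the refinement replaces
      both with a single boolean b standing for the predicate:

          from itertools import product

          def coarse(c1, c2):              # c1, c2: independent guesses for "x > 0"
              locked = False
              if c1:
                  locked = True            # lock()
              if c2:
                  assert locked            # unlock() requires the lock to be held
                  locked = False

          def refined(b):                  # b is the abstract value of "x > 0"
              locked = False
              if b:
                  locked = True
              if b:
                  assert locked
                  locked = False

          def violates(f, *args):
              try:
                  f(*args)
                  return False
              except AssertionError:
                  return True

          print([v for v in product([False, True], repeat=2) if violates(coarse, *v)])
          # -> [(False, True)]   the spurious "unlock without lock" trace
          print([b for b in (False, True) if violates(refined, b)])
          # -> []                the refined boolean program passes the check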

    B.  SDV -- Static Driver Verifier
      0.  Verification needs more than a model checker
        a.  Need to specify properties to verify
        b.  Need an environment model
        c.  Need to integrate into development environment
      1.  Specifying properties to verify
        a.  specify properties for each driver API
        b.  separate properties specified for
	  *  network drivers
	  *  storage device drivers
	  *  graphics device drivers
        c.  writing these rules requires formal methods expertise
	  *  The effort is amortized over the large number of drivers to be
	  	verified.
	  *  Some rules developed by the formal methods team at Microsoft
	  	research.
	  *  Most developed by other groups in Microsoft:
	        The formal methods research group provides formal methods
		  training so other groups can have their own "rule writers".
		They describe the success of hiring summer interns to
		write rules.
      2.  Modeling the environment
        a.  Model the application program that calls the driver, and
	  *  Use non-determinism to express that the application program
	  	may make an arbitrary sequence of calls.
	  *  If there are restrictions on how the application program
	  	can call the driver, these are included in the model as
		well.
        b.  Model the behaviour of system calls
	  *  Use non-determinism for
	   +  return codes
	   +  return values (e.g. pointers into buffers) --
	        need to make this finite-state for the model checker.
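      A caricature of such an environment model in Python (illustration only;
      the entry-point and status names below are made up): the "application"
      makes an arbitrary bounded sequence of calls, kernel returns are
      nondeterministic, and nondeterminism is resolved by enumerating every
      choice:

          from itertools import product

          CALLS = ['open', 'read', 'close']           # hypothetical driver entry points
          STATUS = ['SUCCESS', 'PENDING', 'FAILURE']  # nondeterministic kernel returns

          def run(calls, statuses, monitor):
              # drive the monitor with one fully resolved sequence of choices
              state = monitor('init', None, None)
              for call, status in zip(calls, statuses):
                  state = monitor(call, status, state)
                  if state == 'ERROR':
                      return False
              return True

          def explore(depth, monitor):
              # "for all environments", made finite and explicit
              for calls in product(CALLS, repeat=depth):
                  for statuses in product(STATUS, repeat=depth):
                      if not run(calls, statuses, monitor):
                          return ('fail', calls, statuses)
              return ('pass',)

          # Example rule: the application must not call 'read' after 'close'.
          def rule(call, status, state):
              if call == 'init':
                  return 'OPEN'
              if state == 'CLOSED' and call == 'read':
                  return 'ERROR'
              return 'CLOSED' if call == 'close' else state

          print(explore(2, rule))  # ('fail', ('close', 'read'), ('SUCCESS', 'SUCCESS'))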
      3.  Integrating into software development environment
        a.  Scripts to combine driver code with rules and environment model
        b.  The user runs these scripts
        c.  The script produces an error summary along with counter-example
	      traces.

    C.  Experience
      1.  Successes
        a.  Found many real bugs in real drivers
	b.  Boolean abstraction works well with programs up to a few
		tens of thousands of source code lines.
      2.  Limitations
        a.  Only checks for "sequence of operation" rules
	      I don't believe it finds pointer errors, array bound violations
	  	or similar programming errors.
        b.  Only checks sequential programs
	      But some recent work has studied verification of concurrent
	      programs assuming "bounded context switching" (see ref. 40).
        c.  CEGAR and predicate abstraction impractical for programs
		with hundreds of thousands of source code lines or more.

III.  Verification in the i7 design
    0.  Motivation and Glossary
      A.  Motivation
        1.  Find more design errors earlier in the "execution cluster"
	      of the i7 processor architecture.
        2.  Claim that verification is superior to simulation for designs
	        of this complexity in terms of
	  a.  quality of design
	  b.  time and cost of verification effort
      B.  Glossary
        AGU   -- "Address Generation Unit".  See III.A.4.b.iv.
        AVX   -- "Advanced Vector Extensions".  Instructions for 256-bit
	             "vectors".
	CTE   -- "Cluster Test Environment", for simulation.
	CVE   -- "Cluster Verification Environment", for formal verification.
        EXE   -- The "EXecution Cluster".  See III.A.4
	FEC   -- The "Front End Cluster".  See III.A.3.a
        FPU   -- "Floating Point Unit".  See III.A.4.b.iii.
	IA-32 -- The Intel version of the x86 instruction set.
        IEU   -- "Integer Execution Unit".  See III.A.4.b.i.
        JEU   -- "Jump Execution Unit".  See III.A.4.b.v.
        MEC   -- "MEmory Cluster", see III.A.3.d
	MIU   -- "Memory Interface Unit".  See III.A.4.b.vi.
	MMX   -- SIMD instructions for "packed integers"
		   Uses the 64-bit wide FPU datapath to do any of
		     2 32-bit integer operations
		     4 16-bit integer operations
		     8  8-bit integer operations
		   as a single operation
	OOO   -- Out-Of-Order cluster.  See III.A.3.b
	SIMD  -- Single-Instruction-Multiple-Data, a simple form of
		   parallel computation where multiple execution units
		   perform the same operation on several sets of operands.
	SIU   -- SIMD Integer Unit.  See III.A.4.b.ii.
        SMT   -- Simultaneous Multi-Threading, see III.A.2.c
	SSE   -- "Streaming SIMD Extensions".  Instructions for 128-bit
	             "vectors" that can be treated as any of:
		   2 doubles, 4 floats, 4 32-bit integers, 8 16-bit integers,
		   16 8-bit integers.
	STE   -- "Symbolic Trajectory Evaluation", a form of symbolic
		    simulation.
	vPro  -- Hardware support for security.

    A.  Organization of an i7 computer
      1.  The CPU (central processing unit) interfaces  with
        a.  memory (DRAM)
        b.  storage devices (disk, FLASH, etc.)
        c.  displays
        d.  networks (ethernet, WiFi, Infiniband, etc.)
	e.  other I/O devices
      2.  The CPU consists of multiple "cores"
        a.  These cores form a shared-memory, parallel computer
	  *  abbreviated CMP for "Chip MultiProcessor"
	  *  a.k.a. SMP for "Shared Memory multiProcessor"
        b.  Each core can fetch and execute instructions independently
        c.  In fact, an i7 core can fetch and execute instructions
	      from two different threads or processes and execute them
	      at the same time.  This is known as SMT ("Simultaneous
	      Multi-Threading").  In Intel-marketing lingo, it's called
	      "hyperthreading".
      3.  Each i7 core is organized with the following top-level units
      	    called "clusters" in the paper:
        a.  Front End Cluster (FEC):
	  * fetches instructions from the L1 instruction cache
	  	(see the MEC below).  Each core can fetch up to
		four instructions per cycle.
	  * translates each x86 instruction into one or more
	  	"microinstructions".  The microinstructions are
		basically RISC instructions.  An x86 instruction may
		perform multiple operations.  RISC instructions, in
		general, each perform a single operation.
		  Translated instructions are stored in a microinstruction
		cache to save power when the same batch of instructions
		is executed multiple times (e.g. in a loop).
	  * branch prediction:
	    +   Each core fetches up to four instructions per cycle.
	    +   It takes multiple cycles to execute a branch.
	    =>  10's of instructions may be fetched between fetching
	    	  a branch and knowing for sure whether or not the
		  branch is taken.
	    ~>  Each core records the history of recently executed branches
	    	  and guesses the outcome based on the history.  If it
		  guesses right (in practice, most of the time), execution
		  continues.  When a guess turns out to be wrong, the CPU
		  aborts instructions that were fetched after the branch,
		  rolls back its state to the point of the branch, and
		  resumes from there.
        b.  Out-Of-Order cluster (OOO):
	  * receives microinstructions from FEC
	  * dynamically computes dependency graph for instructions.
	  * an instruction can execute as soon as its operands are available.
	  * The OOO commits instructions in program order.
	    + If an instruction generates an exception (e.g. a page fault,
	    	an arithmetic exception, etc.), the instruction and all
		subsequent ones are aborted.
	    + This is like committing transactions in databases.
        c.  EXEcution cluster (EXE):
	  *  Has separate hardware for various classes of instructions
	  *  Details given under III.A.4
        d.  MEmory Cluster (MEC):
	  *  on-chip caches
	  *  cache coherence: make sure that reads and writes to the same
	  	location from different cores produce consistent results.
	  *  interface to off-chip DRAM
      4.  The EXEcution cluster
        a.  The focus of this paper.
	b.  Has separate hardware for various classes of instructions
	    i. scalar integer (IEU)
	         Performs integer add, subtract, comparisons, bitwise
		   logical, shifts, etc.
	   ii. SIMD integer (SIU)
		 Performs SIMD integer instructions (MMX and SSE).
	  iii. floating point (FPU)
		 Performs floating point operations such as
		   FADD, FMULT, FDIV, FCOS, etc.
	   iv. Address generation for memory loads and stores (AGU)
	   	 Computes addresses for loads and stores, access checks, etc.
	    v. Jumps and branches (JEU)
	         Handles jump and branch instructions to confirm (or not)
	           the branch predictor.
	   vi. Memory interface unit (MIU)
		 Handles data transfers with the caches.
		 Receives data for loads.
		 Handles store buffering and bypasses.
	c.  Queues for (roughly) instructions of each class
	d.  Instructions are executed as soon as their operands are ready.
	  * If multiple instructions from the same queue are ready to
	      execute on the same cycle, the oldest one is taken first
	      (I'm guessing because this is the standard way of doing it).
	e.  Some added clever tricks for SSE and AVX instructions that
	    use multiple functional units in parallel.
	f.  More cleverness for vPro instructions, special floating point
	    "shuffling", etc.
	g.  Integer ALUs included on many of the execution ports because
	  * They are relatively small.
	  * They can use the queue and other resources when the program
	      doesn't make heavy use of the dedicated unit for this
	      execution port.

    B.  Design verification at Intel
      1.  Overall objectives
        a.  minimize time-to-market:
          *  profit margins are highest when the processor is first sold
          *  trade-off between pre-silicon verification and post-silicon
	         debug
	    +  pre-silicon lets the designer/verifier control and/or observe
	    	 every signal.
	    -  But, simulation is *much* slower than the real chip.
	    +  post-silicon lets the tests run at full speed.
	    -  But, it's much harder to figure out what went wrong
	    	 (or even if something went wrong).
        b.  need to find nearly all the bugs before selling the chip
	  +  recalls are expensive, and lower the value of the brand
	  ~  need to keep bug-rate low enough that crashes can be blamed
	  	on Microsoft.  ;)

      2.  Simulation-based methods
        a.  Many kinds of simulation:
	  * exercise the design: "easy" checks, finds simple bugs.
	  * stress tests: corner cases, etc.
	  * coverage tests: lots of random and biased random tests
	      to look for unanticipated bugs. 
        b.  Multiple levels of simulation
	  *  Unit testing is the norm in hardware
	  *  The paper focuses on cluster simulation/verification, and
	  *  Below the cluster, mix of simulation (for easy cases) and
	  	formal methods (for coverage).
	  *  At the whole chip level, simulation is used.

      3.  Formal verification
        a.  Based on symbolic simulation
	b.  A theorem proving approach is used to combine symbolic
	      simulation results into complete proof.
	c.  Proofs are hard
	  *  This has led to an emphasis on
	    +  Proof engineering
	    +  Proof re-use
	  *  This works well in the Intel environment where successive
	  	microprocessors have very similar specifications and
		architectures.

    C.  Verifying the i7 execution cluster
      1.  Scope
        a.  Formal verification used instead of simulation for
		stress testing and coverage
        b.  Formal verification included control state and bypassing logic.
	c.  Formal methods set up as nightly regression checks.
	d.  But, not all verification completed by tape-out
	  *  The expectation is that some bugs will be found post-Si
	  *  A bug escape may be OK if it doesn't block other tests.
	  *  Especially good if you know what it is and how to handle it
	       by the time the chip is back from fabrication.
	  *  Of course, it's always better to find the bug before fabrication.
      2.  Some technical aspects
        a.  Most instructions verified by symbolic simulation
        b.  Interactions between the execution units complicated the
	        verification
	  *  but this is probably a big reason *why* verification at
	       the cluster level has a value beyond just verifying the
	       individual units.
        c.  Symbolic simulation handles most instructions
	  *  floating multiply and divide must be broken into smaller steps:
            +  BDDs don't represent "product" efficiently
            +  This is where theorem proving excels.
	  *  Some operations required case-splits and parametric simulations,
	       in other words, multiple symbolic simulations to cover all
	       of the cases while keeping each simulation tractable.
	  *  Special cases for the AGU because it involves indexing into
	       tables of addresses for pending memory operations.
        d.  Use of inductive invariants for the control path
	  *  complicated by clock-gating:
	    +  can't assume that all state variables are updated each cycle
	    ?  are they up-to-date when they are needed?
      3.  Success
        a.  Fewest post-Si bugs of any cluster
	b.  Value seen when verification replaces simulation
	  *  they argue against using verification merely to supplement simulation:
	    -  doing both incurs the costs of both
	    -  the extra cost of the formal effort is then repaid "only" by
	         the additional bugs that it finds.
	c.  Shows value of moving formal verification to higher levels
	      of the design hierarchy.
      4.  Limitations: five bugs made it past tape-out
	a.  one because of an incomplete formal specification
	b.  two because the verification was incomplete before tape-out
	c.  one because the protocol didn't work as intended
	  *  Formal verification showed the hardware implemented the protocol
	  *  The chip-level outcome of using the protocol had not been
	       verified.
	d.  two more bugs because of re-use of specifications and designs
	     from an earlier processor even though an architectural change
	     had been intended.

 IV.  Review and Preview
    A.  Review
      1.  Summary of SLAM
        a.  model checking finds real bugs in production code
        b.  extensive use of abstraction.  As much as possible,
	  *   ignore data values
	  *   focus on control flow
        c.  to make this useful for real designs, the model checker must
	      be integrated with the software development toolkit to be
	      as "pushbutton" as possible.
      2.  Summary of i7 verification
        a.  symbolic simulation finds real bugs in big chip designs
        b.  As much as possible
	  *   focus on data values using symbolic simulation
	  *   capture control flow using invariants, manual effort
        c.  to make this useful for real designs, 
	  *   formal methods must *replace* entire categories of simulation
	  *   "proof engineering" needed to re-use the large manual effort
	  	required for specifications and proofs.

    B.  Where are we?
      1. Done with course intro
        a.  Jan. 2: course overview and quick sketch of the topic
        b.  Jan. 7: two examples of applications of formal verification
	      to large, industrial designs.
      2. Next: some formal verification basics
        a. symbolic booleans with OBDDs
	     OBDDs are, perhaps, passé, but they are fairly easy to explain
	       and they work well for many examples.
        b. equivalence checking
        c. semantics of programs:
	     from source code to a mathematical representation
        d. model checking
      3. Then, scaling this up to real-world applications
        a.  SAT solvers
        b.  SMT solvers
        c.  abstraction, interpolation
        d.  maybe some theorem proving
      4. Finally, other domains
        a.  robots and vehicles
        b.  analog circuits
        c.  other things involving real numbers

    C.  Preview of the OBDD paper.
      1.  Key ideas:
        a.  OBDDs represent boolean functions as "compressed"
		decision trees
        b.  this representation is canonical, and works well with
		many operations that are commonly needed for boolean
		functions: AND, OR, EXISTS, FORALL, function composition,
		etc.
        c.  while OBDDs work well in practice for many problems, they are
		not magical.  There are problems (e.g. multiplication) for
		which OBDDs require exponential space.
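      To make the "compressed decision tree" idea concrete, a minimal
      reduced-OBDD sketch in Python (mine, not Bryant's packages; the memo
      table for apply is omitted for brevity):

          UNIQUE = {}                       # the "unique table": hash-consed nodes

          def node(v, low, high):
              # canonical node testing variable index v; low/high are the
              # cofactors for v = 0 / v = 1 (nodes are tuples, leaves are bools)
              if low == high:               # reduction rule: the test is redundant
                  return low
              return UNIQUE.setdefault((v, low, high), (v, low, high))

          def mkvar(i):
              return node(i, False, True)

          def apply_op(op, f, g):
              # Apply: combine two OBDDs with a boolean operator by recursing
              # on the topmost (smallest-index) variable of the two.
              if isinstance(f, bool) and isinstance(g, bool):
                  return op(f, g)
              fv = f[0] if not isinstance(f, bool) else float('inf')
              gv = g[0] if not isinstance(g, bool) else float('inf')
              v = min(fv, gv)
              f0, f1 = (f[1], f[2]) if fv == v else (f, f)
              g0, g1 = (g[1], g[2]) if gv == v else (g, g)
              return node(v, apply_op(op, f0, g0), apply_op(op, f1, g1))

          AND = lambda f, g: apply_op(lambda a, b: a and b, f, g)
          OR  = lambda f, g: apply_op(lambda a, b: a or b, f, g)

          # Canonicity means equivalent formulas build the identical node:
          x0, x1 = mkvar(0), mkvar(1)
          print(AND(OR(x0, x1), x0) == x0)  # True: (x0 OR x1) AND x0 == x0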
      2.  Applications
        a.  representing finite domains (Section 4)
        b.  logic circuit verification (Section 5.1)
        c.  error correction, logic optimization (Sections 5.1 & 5.2)
	d.  model checking (Section 6)
	e.  Note: the probabilistic stuff (Section 5.4), and AI stuff
	      (Section 7) seemed like cool ideas at the time, but I
	      don't think they've really gone anywhere.  Likewise, the
	      intro promises applications of OBDDs to problems from
	      mathematical logic, but the paper doesn't seem to give
	      any examples.
      3.  Now, read the paper, and form your own evaluation, ask your
      	    own questions, etc.

  V.  Further references:
    A.  Some papers about the i7
      "A fully integrated multi-CPU, GPU and memory controller 32nm processor"
	Two-page paper about the Sandy Bridge family (IEEE Solid State
	Circuits Conf.)
	http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5746311
      "Power-Management Architecture of the Intel Microarchitecture
        Code-Named Sandy Bridge"
	IEEE Micro paper (accessible to general audience) describing
	design decisions in the Sandy Bridge family.
	http://dx.doi.org/10.1109/MM.2012.12
      http://hardware-review24.com/load/cpu/intel_sandy_bridge_microarchitecture/1-1-0-22
        An on-the-web description of Sandy Bridge.  Of the three references
	that I've found, this one has the block diagrams that can help you
	decipher the names of all the clusters and units described in the
	i7 verification paper.