Murphi Annotated Reference Manual Release 3.1 Ralph Melton, David L. Dill Updated by C. Norris Ip and Ulrich Stern July 1996 Contents: 1. Introduction 1.1 Whence "Murphi"? 1.2 Evolution of Murphi 2. Overview of Murphi 2.1 Structure of a Murphi Description 2.2 Four simple steps to use Murphi to verify a system 2.3 Murphi Execution Model 2.4 Options of the Murphi Compiler 2.5 Using the Generated Special Purpose Verifier 3. Basic Concepts 3.1 BNF 3.2 Lexical Conventions 3.3 Program Structure 4. Declarations 4.1 Constant, type, and variable declarations. 4.2 Procedure and Function declarations. 5. Expressions 6. Statements 7. Rules, Startstates, and Invariants 8. Reduction Techniques 8.1 Symmetry and Multiset Reduction 8.2 Hash Compaction A. Symmetry Reduction B. Multiset Reduction 1. INTRODUCTION The Murphi Verification System consists of the Murphi Compiler and the Murphi description language. The Murphi Compiler generates a special purpose verifier from a Murphi description. The special purpose verifier can be used to check the properties of the system, such as error assertion, invariant and deadlock. The Murphi description language is a high-level description language for finite-state asynchronous concurrent systems. Murphi is high-level in the sense that many features found in common high-level programming languages such as Pascal or C are part of Murphi. For example, Murphi has user-defined data types, procedures, and parameterization of descriptions. 1.1 Whence "Murphi" The system used to be called "Trans", because descriptions consist of a collection of transition rules. Aftering seeing 200 references to other systems called Trans, we decided to come up with a more unique name. We thought about calling the system Murphy, after one of Murphy's laws: "The bug is always in the case you didn't test", since the system can be considered an exhaustive tester. In an attempt to enhance individuality, we decided to spell it "Murphi" and use the Greek letter to represent "phi" whenever the font was available. This was probably inspired by Don Knuth's use of "chi" for the "X" in "TEX". 1.2 Evolution of Murphi In the summer of 1990, we set out to verify some cache coherence protocols by two methods. One method was to write the protocols as Boolean formulas and use methods based on Binary decision diagrams for verification. The other was to program the protocol directly in C++, and embed it in a search procedure that explored all the states. By the end of the summer, everyone concluded that it would be faster to develop a better description language and verification system and then verify the protocols than to continue with ad hoc approaches. We then designed a very limited form of Murphi (then called "Trans") that fall. This design was driven largely by the desire to compile into Boolean formulas. After modest experimentation, we decided to re-implement a somewhat less constrained version. During the summer of 1991, several of us applied the system to some designs being developed by Dr. Andreas Nowatzyk at Sun Microsystems, with encouraging results. We changed the name to "Murphi" in the fall of 1991. In the spring and summer of 1992, we re-designed Murphi based on our accumulated experience with the previous version, adding many convenience features such as reasonable loops, local variables, and recursive procedures. Ralph Melton re-implemented it almost entirely from scratch, with the explicit objective of making the implementation clean and easy to modify. 2. Overview of Murphi 2.1 Structure of a Murphi Description: A Murphi description consists of declarations of constants, types, global variables, and procedures; a collection of transition rules; a description of the initial states; and a set of invariants. The behavioral part of Murphi is a collection of transition rules. Each transition rule is an guarded command consists of a condition (a Boolean expression on the global variables) and an action (a statement that can modify the values of the variables). The condition and the action are both written in a Pascal-like language. The action can be an arbitrarily complex statement containing loops and conditionals. No matter how complex it is, the action is executed ATOMICALLY -- no other rule can change the variables or otherwise interfere with it while it is being executed. 2.2 Four simple steps to use Murphi to verify a system: a) Write a program in the Murphi Language, say pingpong.m. b) Run the Murphi compiler over it, typically by typing: $(murphipath)/mu pingpong.m This yields a file pingpong.C. c) Run the c++ compiler over pingpong.C, typically by typing: $(g++path)/g++ pingpong.C -I$(includepath) Remember to specify the path of all the include files. d) Finally, you can run a simulation or a verification by invoking a.out. To see the list of switches, type a.out -h In fact, the whole procedure can be further simplified using the Makefile provided in the Example directory "ex". However, you have to set the variables CPLUSPLUS, MURPHIPATH and INCLUDEPATH appropriately. 2.3 Murphi Execution Model: A Murphi state is an assignment of values to all of the global variables of the description. An execution of the description can be generated by cycling infinitely through the following: Repeat forever: a) Find all rules whose conditions are true in the current state. (i.e. conditional expressions are true, given the current values of the global variables). b) Choose one arbitrarily and execute the action, yielding a new state. Note that Murphi descriptions are nondeterministic, because of the arbitrary choice in step b. The user has no control over how this choice is made, so a "correct" Murphi program must do the right thing no matter which rules are chosen. However, once a rule has been chosen, the action is deterministic (there is a unique next state). This execution model is good for describing asynchronous systems (where different processes run at arbitrary speed) which interact via shared variables (process A affects process B by writing to a variable the process B reads). Message passing can be modelled by reading from and writing to a buffer variable or array. The simulator in Murphi chooses among the rules arbitrarily to get the new state. On the other hand, the verifier considers the results for ALL possible choices. The current verifier does this by either a breadth-first search procedure or a depth-first search procedure, both of which stores states in a large hash table so that it can cut off the search when it encounters a state it has seen before. As states are generated by the verifier, various conditions are checked. There can be run-time errors of various kinds, most notably out-of-bounds errors on assignments or array indexing. There is are explicit "assert" and "error" statement that can be called within an action. If one of these conditions occurs, the verifier halts and prints a diagnostic consisting of a reconstructed sequence of states that leads from the initial state to the error state. The verifier also does this if one of the invariant expressions (given by the user as part of the description) is false for the current state, or if the current state is a "deadlock state" (has no successor states except itself). 2.4 Options of the Murphi Compiler The Murphi Compiler has several command-line options: -h help -l print license -b use bit-compacted states -c use hash compaction. By default, Murphi aligns all state variables on byte boundaries. As a result, the state descriptor typically becomes 2-4 times larger than a state descriptor in which every bit is used (bit- compacted, option "-b"). However, Murphi will run faster by approx- imately 25%. When using hash compaction (option "-c"), compressed values with typically 40 bits will be stored in the hash table instead of full state descriptors. See Section 8.2 for more details. Hash compaction can be combined with bit-compacted states to reduce the memory re- quirements for the queue of active states in which full state de- scriptors have to be used. 2.5 Using the Generated Special Purpose Verifier The Murphi Compiler generates a special-purpose verifier for a particular Murphi description. The list shown below are the command-line options for the special-purpose verifier: 1) General: -h help. -l print license. 2) Verification Strategy: (default: -v) -s simulate. -v or -vbfs verify with breadth-first search. -vdfs verify with depth-first search. -ndl do not check for deadlock. 3) Others Options: (default: -m8, -p3, -loop1000) -m amount of memory for closed hash table in Mb. -k same, but in Kb. -loop allow loops to be executed at most n times. -p make simulation or verification verbose. -p report progress every 10^n events, n in 1..5. -pn print no progress reports. -pr print out rule information. 4) Error Trace Handling: (default: -tn) -tv write a violating trace (with default -td). -td write only state differences from the previous states. (in simulation mode, write only state differences in verbose mode.) -tf write full states in trace. (in simulation mode, write full states in verbose mode.) -ta write all generated states at least once. -tn write no trace (default). 5) Reduction Technique: (default: -sym3 with -permlimit 10 and multiset reduction) -nosym no symmetry reduction (multiset reduction still effective) -nomultiset no multiset reduction -sym reduction by symmetry -permlimit max num of permutation checked in alg 3 (for canonicalization, set it to zero) n | methods ----------------------------------- 1 | exhaustive canonicalize 2 | heuristic fast canonicalization (can be slower or faster than alg 3 canonicalization) (use a lot of auxiliary memory for large scalarsets) 3 | heuristic small mem canonicaliztion/normalization (depends on -permlimit) 4 | heuristic fast normalization (alg 3 with -permlimit 1) 6) Hash Compaction: (default: hash compaction with 40 bits) -b number of bits to store. -d dir write trace info into file dir/sci.trace. The command line option "-s" turn the verifier into a a simulator for Murphi, which chooses among the rules arbitrarily. Without the option "-s", the verifier considers the results for ALL possible choices. The current verifier does this by either a breadth-first search procedure (used as default) or a depth-first search procedure (set by the "-vdfs" command line option) A large hash table is used to store the states that are reachable. The size of which is determined by the option "-m" and "-k". The default size for the memory usage is 8 Mbytes. The constant "gPercentActiveStates" in file "mu_prolog.inc" can be modified to change the capacity of the queue used in breadth-first-search or depth-first-search verification. See Chapter 8 for more details about the "Reduction Technique" and "Hash Compaction". 3. BASIC CONCEPTS 3.1 Backus-Naur Form (BNF) The syntax is specified in this manual in a Backus-Naur Form: <> denote nonterminals; [] denote optional sections; {} denote repetition zero or more times. a | b denotes either a or b. () denote grouping. When any of these symbols are required within the language, they are escaped with backslashes. 3.2 Lexical Conventions The following are reserved words in Murphi: alias array assert begin boolean by case clear const do else elsif end endalias endexists endfor endforall endfunction endif endprocedure endrecord endrule endruleset endstartstate endswitch endwhile enum error exists false for forall function if in interleaved invariant of procedure process program put record return rule ruleset startstate switch then to traceuntil true type var while Reserved words are written out in the BNF. Some of these reserved words do not yet have defined meanings; these are reserved for future expansion. Those words are in, interleaved, process, program, traceuntil. CASE-SENSITIVITY: Murphi is case-sensitive, except for the reserved words. 'foo' and 'Foo' represent different identifiers. 'Begin' and 'BeGiN' represent the same keyword. SYNONYM: The keyword 'end' is a synonym for every specific type of end: 'end' may be used freely in place of 'endrule', 'endfor', IDENTIFIERS: An identifier is any sequence of letters, underscores, and digits beginning with a letter. All identifiers beginning with underscore are reserved for use by the system. Identifiers are referred to in the BNF below as . STRINGS: A string, referred to in the BNF as is a sequence of characters other than double quote (\") enclosed in double quotes. INTEGER CONSTANTS: Integer constants, in the BNF, are specified in base 10. Comments: There are two types of comments in Murphi: Ada-style comments that begin with -- and end with a newline. C-style comments that begin with /* and end with */. C-style comments do not nest. 3.3 Program Structure A Murphi program has the following structure, with comments about each section set off by --: ::= { } -- Constant, type, and variable declarations { } -- Procedure and function declarations { } -- rules, startstates, and invariants A Murphi program implicitly determines a state graph. A state is an assignment of a value to each global variable. The start states of the graph are defined by startstates in the section of the program. The next state relation of the graph is defined by rules within the section of the program. 4. DECLARATIONS 4.1 Constant, type, and variable declarations Declarations have the following syntax: ::= const { ; } | type { ; } | var { ; } Constant declarations: ::= : The of a constant declaration must have a value that can be evaluated at compile time. Type declarations: ::= : The special enumerated type "boolean" is predefined, along with the constants "true" and "false". The type integer is not predefined, because using general integers without restricting them to subranges would consume memory horribly. The simple types are booleans, enumerations, finite subranges of integers. The compound types are arrays of compound or simple types, records of compound or simple types. The index types of arrays must be simple types. ::= -- a previously defined type. ::= .. -- Integer subrange. ::= enum \{ {, } \} -- enumeration. ::= record { } end ::= array \[ \] of Variable declarations: ::= { , } : Example: The following example illustrates declarations. const I: 2; J: 31415 * I / 9; -- J will be 6981 type val_t: 0..99; -- simple types. ind_t: 1..i; enum_t: enum { x, y, z }; b: boolean; r_t: record f:0..1; g: 0..2; end; -- Record. rr_t: record r: r_t; s: r_t; end; -- Record of record. a_t: array [ ind_t ] of 19..29; -- 1-dimensional array. aa_t: array [ ind_t ] of a_t; -- 2-dimensional array. ar_t: array [ 1..2 ] of r_t; -- Array of record. ra_t: record a1: a_t; foo: ind_t; end; -- Record with array. ae_t: array [ enum_t ] of enum_t; -- Array with enum index and range. aae_t: array [ ind_t ] of ae_t; -- 2-dim array, 2nd index is enum. re_t: record f: enum_t; end; -- Record of enum. var val : val_t; arr : ae_t; rec : record foo: ind_t; bar: boolean; end; 4.2 Procedure and function declarations All procedures and functions must be declared at the top level of the program, with the following syntax: ::= | ::= procedure \( [ { ; } ] \) ; [ { } begin ] [ ] end; ::= function \( [ { ; } ] \) : ; [ { } begin ] [ ] end; Unlike Pascal procedures, procedures and functions with no arguments still need the parentheses surrounding the empty parameter list. Functions must return a value with a return statement (q.v.) at some point in the function. Functions can have side effects; however, there are restrictions on the use of functions with side effects. The format of the parameter list in a procedure or a function is: ::= [var] { , } : Formal parameters declared "var" are passed by reference. Formals that are not declared "var" are passed by reference, but the function or procedure is not allowed to modify them. Formal parameter declarations and local declarations shadow declarations outside their scope. Example: the following example illustrates procedures. procedure Swap(var i, j: val_t); var temp: val_t; begin temp := i; i := j; j := temp; end; function plustwo(input: val_t): val_t; const two : 2; begin return (input + two); end; 5. EXPRESSIONS Type equivalence is by name. Expressions of any integer subrange type are legal wherever an integer expression is legal (although it may generate a run-time error; see below about assignments). Booleans are not type-compatible with integer expressions. It is an error to take the value of an variable that has not yet been assigned a value. This error is detected at run time. (All run-time errors is detected by the verifier.) It is an error to use an out-of-bounds index for an array. This too is detected at run time. Designators: := { . | \[ \] } As usual, the form . refers to selecting a field of a record, and the form [] is for selecting an element of an array. Expressions: := \( expr \) | | | \( \) -- a function call. | forall do endforall -- universal quantification. | exists do endexists -- existential quantification. | + | - | * -- multiplication. | / -- integer division. | % -- remainder. | ! -- logical negation. | | -- logical disjunction. | & -- logical conjunction. | -> -- logical implication. | < | <= | > | >= | = | != | ? : -- C-style conditional expression. Operators: The priority of operators is as follows, with lowest-priority operators first and operators on the same line having equal priority: ?: -> | & ! < <= = != >= > + - * / % a) '+', '-', '*', '/', '%', '<', '<=', '>=', and '>' are only defined on integer operands. b) '=' and '!=' are only defined on simple operands. c) '!', '&', '!', and '->' are only defined on boolean operands. d) For the '?:' operator, the test must be a boolean expression, and the two alternatives must be of compatible type. e) '+', '-', '*', '/', and '*' return an integer, the rest return booleans, except for '?:'. 'Forall' and 'Exists' Operators: See below, under the 'for' statement, for the specification of the used in quantified expressions, i.e. exist's and forall's. For a quantified expression, the subexpression must be a boolean expression; it is evaluated once for each value of the quantifier. A forall is true iff its expression is true for every value of the quantifier; an exists is true if its expression is true for some value of the quantifier. 6. STATEMENTS. The followings are the statements in Murphi: ::= {; [] } ::= /* assignment */ | /* if statement */ | /* switch statement */ | /* for statement */ | /* while statement */ | /* alias statement */ | /* procedure call */ | /* clear statement */ | /* error assertion */ | /* assertion */ | /* output statement */ | /* function return */ Assignment: ::= := The target and the expression must have compatible types, and the target must not be declared const. It is an error to assign a value to a variable that is outside the range for that variable. This error is detected at run time. If statement: ::= if then [ ] { elsif then [ ] } [ else [ ] ] endif Each of the 's must be of boolean type. Switch statement: ::= switch { case {, expr} : [ ] } [ else [ ] ] endswitch Each of the expressions in the case must be a constant of a compatible type with the switch expression. If no case expression is matched, the code labelled 'else' is executed. There is no fall through on cases. For statement: ::= for do [stmts] endfor Quantifiers apply to for statements, to quantified expressions, and to rulesets. ::= : | := to [ by ] The first form executes the body of the for statement or evaluates the body of the quantified expression, etc., for each value in the (which must be a simple type), from least to greatest value. The second form corresponds to the Modula-2 FOR statement. The two expressions must be of integer type, and the by expression must be a constant expression. Using a quantifier, in a for statement or a quantified expression, declares the of the quantifier local to the for statement, shadowing any external declarations. It is illegal to modify the quantifier variable from within the body of the for loop. While statement: ::= while do [stmts] end An infinite loop is a runtime error. Although this is decidable because a bound on the total number of states can be computed, infinite loops obviously pose a practical problem for the verifier. Right now, the verifier stops with an error message after 1000 iterations as an default option (which may be changed by changing a constant in the file mu_prolog.inc). The user may change this limit by a command-line argument. Alias statement: ::= alias { ; } do [ ] end ::= : 's declared in aliases shadow external declarations of the same ID. Aliases behave differently depending on whether the is or is not an lvalue. If the is an lvalue, then is defined as the lvalue associated with that expression when the alias statement is entered. If is not an lvalue, then gets the value associated with when the alias statement is entered, and may not be changed within the alias block. Example: -- i = 2 beforehand; arr[2] = 1. alias foo : arr[i] -- foo gets identified with arr[2] bar : arr[i] + 1 -- bar gets identified with 2. do arr[i] := 3; -- now, foo = 3, but bar = 2. i := 1; -- foo is still bound to a[2]. foo := 4; -- arr[2] is now 4. bar = 2. bar := 2; -- Illegal. end Procedure call: ::= \( {, } \) This obeys all the standard rules of procedures. Const formal parameters can be passed an actual of any compatible type; var parameters must be passed an lvalue of the same type; a var parameter of a subrange type must be passed an lvalue of the same subrange type. Clear statement: ::= clear This sets all components of an lvalue to the minimum values of their type. The minimum value of an enumerated type is considered to be the first value declared in the list of names. The minimum value of the type boolean is false. NOTE: Clear is frequently used to set "uninteresting" variables to a fixed value; otherwise, many states would be created during verification with random values in these variables. Use of clear for other purposes is not encouraged. We are thinking about replacing this with an "undefine" operator that assigns a reserved undefined value, which would be safer. Error assertion: ::= error An error statement generates a run-time error. In the verifier, if an error statement is executed, verification terminates, the specified string is printed, and a failure trace is printed if requested. Assertion: ::= assert [ ] "assert " is completely equivalent to "if ! then error end" Output statement: ::= put ( | ) Prints out the indicated value, each time the statement is executed. This is handed straight to printf, so be careful to include a put "\n" after each line you want to print. Generally, this will cause a huge amount of stuff to be printed during verification (with much duplication). We have used it for debugging, and for generating a file of all of the possible values of certain variables (which is then processed to eliminate duplicates). Function return: ::= return [ ] Exit the current procedure, function, rule, or startstate. If exiting a function, the must be provided and must match the return type of the function; otherwise, there must be no return value. 7. RULES, STARTSTATES, AND INVARIANTS The syntax for rules, startstates and invariants is as follows: ::= {; } [;] ::= | | | | Simple rule: ::= rule [] [ ==> ] [ { } begin ] [ stmts ] end A simple rule determines a transition from one state of the nondeterministic finite automaton to another. A simple rule defines a transition between states. Logically, it consists of a body, which is a set of statements to be executed, and a condition, a boolean expression characterizing the states under which the body may be executed. If the condition is true in a state, then the body of the rule may be executed to provide a transition to another state. The condition of a rule is optional. If no condition is specified, then the rule is assumed to always be enabled. It is an error to use an expression with side effects in a rule condition. The rule may declare local variables, constants and types, which are not part of the state. If no variables are declared, the begin that starts the body may be omitted. Unfortunately, if there is neither a condition nor local declarations, the parser often misparses the input. Therefore, rules without conditions should always have the bodies started with the reserved word begin. A rule with a condition can be equivalently expressed as a rule without a condition by the transformation rule rule ==> begin --> if begin end end end Although they are functionally redundant, conditions have allowed speedups in verification by factors of three or four. It is an error if the program does not have at least one simple rule. Startstate: ::= startstate [ ] [ { } begin ] [ ] end A startstate is a special type of rule. It is only executed at the beginning of an execution run. Another way to phrase this is that every execution consists of executing one startstate, and then zero or more simple rules. A startstate must assign a value to every global variable, or it is a run-time error which will be caught by the verifier. It is an error if the program does not have at least one startstate. Invariant: ::= invariant [ ] The form invariant "foo" is syntactic sugar for rule ! ==> Error "Invariant violated: foo" end Many programmers find it more natural to use an embedded specification style with assert and error statements than to use invariants for some conditions. However, for properties that are conveniently expressed as invariants, it is generally more efficient to express them as invariants, because the compiler can then take advantage of the restricted properties of that invariant. It is an error to use an expression with side effects in an invariant. Ruleset: ::= ruleset {; } do [] end A ruleset can be thought of as syntactic sugar for creating a copy of its component rules for every value of its quantifier. rule: ::= alias {; } do [] end An aliased rule creates aliases (see above under the alias statement) which can be used in all the component rules. 8. REDUCTION TECHNIQUES Murphi has two techniques to reduce the memory requirements during verification: symmetry and multiset reduction, and hash compaction. 8.1 Symmetry and Multiset Reduction The basic idea of symmetry reduction is that for many protocol one can use a reduced state graph for verification instead of the full one. Consider, for example, a protocol that has two processors that form a list. The state where processor A is the head of the list and B is the tail is - for verification purposes - equivalent to the state where B is the head and A is the tail. Symmetry and multiset reduction are described in more detail in ap- pendix A and B, respectively. You might also want to look at the papers ID93.ps and ID93A.ps for more details on symmetry reduction. 8.2 Hash Compaction When using hash compaction, compressed values are stored in the state table instead of full state descriptors. The resulting memory savings come at the price of a certain probability that some states of the protocol will be omitted during verification. However, by choosing the number of bits for the compressed values, the user can control this probability and make it very small. Furthermore, if one re-runs the Murphi verifier, the omission probabilities from both runs can be multiplied, since Murphi picks independent compression functions for each run. There are two different ways how one can use the hash compaction scheme: - If one wants to have a small bound (typically 0.1%) on the probability of even one omission, typically 40 bits should be used for the com- pressed values. - However, if one is only concerned about the probability of "false positives", i.e. the verifier claiming an error-free protocol even though it actually has errors, using only 20 bits is typically suf- ficient. The verifier reports the actual omission probabilities at the end of the verification run. However, the probability of false positives (or "even one undetected error") can only be reported when breadth-first search is used. Increasing the number of bits by one roughly halves the reported probabilities. When using hash compaction together with breadth-first search the infor- mation needed to generate the error traces can be stored in a temporary file ("protocol-name.trace"). The user has to specify the directory for this file with the verifier option "-d dir" - otherwise the file will not be created and no trace can be printed in the case of a protocol error. The memory requirements for each state in this file are the compressed value (rounded up to full bytes) plus another 4 bytes. Remember that you have to compile your protocol with "mu -c" to use hash compaction. The papers SD95A.ps and SD96A.ps contain more info on hash compaction. A. SYMMETRY REDUCTION Overview of Symmetry Extension C. Norris Ip March 1994 Content: 1) Introduction 2) Specification of symmetry in Murphi 2.x a) new datatype: scalarset b) new datatype: scalarsetunion c) extended datatypes: undefined value d) new operation: undefine e) new predicate: isundefine(...) f) new predicate: ismember(...) 3) Verification using symmetry reduction 4) Underlying mechanism 5) Things to be aware of for maximium performance 6) Converting from Murphi 1.x with symmetry a) Declaration b) Clearing a scalarset/scalarsetunion c) ismember(...) and isundefined(...) 7) Other Files and Directories 1) Introduction: Symmetry can be used to reduce the amount of time and memory used in verification. This document describes how you may take advantage of symmetry in a system to make verification more efficient. This document assumes the basic knowledge about Murphi 2.x 2) Specification of symmetry in Murphi 2.x: a) new datatype: Scalarset The syntax is as follows: typeExpr : typeid /* An already defined type. */ | enumtype | subrangetype | recordtype | arraytype | scalarsettype /* scalarset */ ... ; scalarsettype : SCALARSET "(" expr ")" You can replace any symmetrical subrange by a scalarset by specifying its size. A subrange is symmetrical if all the operations involving this subrange do not depend on the ordering of the subrange elements. For example, i) the values are not used in any comparison operation except equality testing, ii) the values are not used in any arithmetic operation, iii) the result from the for loop with the subrange as index does not depend on the order of the iteration. The compiler will flag an error if you have operations involving a scalarset that break symmetry. (Note that in the current implmentation, the for loop is not checked.) An example would be the processor id in a multiprocessor system. Type Proc: Scalarset(NumProc); Var Processor: Array [Proc] of processor_state; You cannot use any literal constant to refer to any particular value in the scalarset. You can assign value to a scalarset variable through the following means: i) Forall n : Proc Do ... Endforall ii) Exists n : Proc Do ... Endexists iii) For n : Proc Do ... Endfor iv) Ruleset n : Proc Do ... Endruleset v) assignment from another scalarset variables Note that the "for" construct is a restricted version of the "for" construct for subrange. The set of variables written in an iteration should be disjoint to the set of variables referenced in another iteration. The net effect on the variables are therefore independent to the ordering of the execution of the iterations. There should be no return statement. *** In the current implementation, the user is required *** *** to check this condition manually. *** b) new datatype: Union The syntax is as follows: typeExpr : typeid /* An already defined type. */ | enumtype | subrangetype | recordtype | arraytype | scalarsettype /* scalarset */ | uniontype /* scalarset */ ... ; uniontype : UNION "{" list "}" ; list : list "," listelt | listelt "," listelt /* at least two elements */ ; scalarsetlistelt : ID /* scalarset/enum that has already been declared */ | enumtype This uniontype provides a short hand if we want to refer to a few scalarsets at the same time. For example, in a multiprocessor system, we will have a set of Memory module and a set of processor. Then we may want to have channels between every two of them: Type Mem: Scalarset(NumMem); Proc: Scalarset(NumProc); Node: Union {Mem, Proc}; Var Channel: Array [Node] of incoming_message; If we have only 1 Mem, we can use the following: Type Proc: Scalarset(NumProc); Node: Union { enum {Mem} , Proc}; Var Channel: Array [Node] of incoming_message; c) extended datatypes: undefined value All base datatype (enumtype, subrangetype, scalarsettype and uniontype) have been extended to have a undefined value. This is necessary for scalarsettype and uniontype, because we have to set the value to a undefined value if they are not used. (otherwise, the value will be considered in the symmetry consideration and generates states that are redundant.) This is also a good extension for other datatypes, because the verifier can now check that an intentionally undefined value is not used in any calculation. (For more detailed explaination, take a look at section 5. d) new operation: undefine The syntax is as follows: undefinestmt : UNDEFINE designator; You can set any variable to its respective undefined value using "undefine". If the variable is an aggregate datatype, all elements will be set to the undefined value. e) new predicate: isundefined(...) The syntax is as follows: expr : ... | ISUNDEFINED '(' designator ')'; You can check whether a variable is undefined using this predicate. "designator" must be a simple type (enum, subrange, scalarset or scalarsetunino). f) new predicate: ismember(...) The syntax is as follows: expr : ... | ISMEMBER '(' designator ',' typeExpr ')' ; You can check whether the value of a scalarsetunion variable belongs to a particular member of the union. For example you may have: Alias n: Message.from Do Rule Ismember(n,Proc) ==> handle_message_from_processor() Endrule; Rule Ismember(n,Home) ==> handle_message_from_memory() Endrule; Endalias; 3) verification using symmetry reduction: Symmetry reduction is now automatically invoked if appropriate. You can suppress symmetry reduction by the online flag "-nosym". You can also change the symmetry reduction algorithm by the online flag "-sym" where is the algorithm number for the reduction algorithm. There are three algorithms provided: i) Fast normalization (algorithm 1) This algorithm handles the complexity of symmetry reduction by an approximation. Therefore, the time used to canonicalize a state is much shorter, while maximium saving is not guaranteed. ii) Simple straight-forward canonicalization (algorithm 2) This algorithm is used in the papers memtioned at the end of this document. Most of the time it is slower than the other algorithms. iii) Fast canonicalization (algorithm 3) This algorithm considers the interrelation between scalarset variables in the states and canonicalizes the state much faster then algorithm 1,2) In summary, algorithm 1 is the default. If it is taking too long and memory is not a concern, you should try algorithm 3. If algorithm 1 and 3 give funny error, you have probably discovered a bug in the new algorithm and you should use algorithm 2. 4) Underlying Mechanism: The underlying mechanism for symmetry reduction is very simple. Whenever the standard search algorithm puts a state into the hash table, we convert it to the unique representative (or, in the case of normalization, a member in a subset) of the set of symmetry equivalent states. Therefore, whenever we check whether we have seen a state before, we are checking whether we have seen an equivalent state before. If so, we don't do the redundant work. For details, please take a look at "CHDL93.ps" and "ICCD93.ps". 5) Things to be aware of for maximium performance: You should set variables to their undefined values, if their values do not affect the functionality of the protocol. This is always a good practise. Even if we don't have symmetry, we should set the irrelevent variable to the same value, otherwise we will have two different states which differ only in this irrelevent value. By setting them to the same value, we have a state graph of a smaller size. Because we don't have access to a particular value in scalarset, we have to set it to a value outside the scalarset: the undefined value. On the other hand, the use of the undefined values actually caught some of the errors in the examples, which uses a field in a network message that has not been set by the sender! 6) Converting from Murphi 1.x with symmetry a) Declaration The syntax for declaring scalarset/union is different, but the basic is the same. You have to declare the individual scalarset first before you put them into a union. b) Clearing a scalarset/union You have to convert all Murphi 1.x "clear" operations on scalarset/union to Murphi 2.x "undefine" operations on scalarset/union. It is also advantageous to convert Murphi 1.x "clear" operations on other type to "undefine", if you mean that you will not use the value in the variable. However, if your "clear" operations is just a short hand to set the variables to the base value, don't convet them to "undefine". c) ismember(...) and isundefined(...) you may also find these predicates useful in simplifying your program, or make it possible to model things that are impossible to express in Murphi 1.x symmetry version. 7) Other Files and Directories: a) Please see "CHDL93.ps" and "ICCD93.ps" at the "doc" directory for the papers on the use of symmetry, the behind theory of symmetry reduction and the results from different examples. Please note that the symmetry algorithm in this release has been redesigned and provide a much faster canonicalization or normalization than the one in the papers. Therefore, time reduction is almost guaranteed. B. MULTISET REDUCTION Overview of Multiset Extension C. Norris Ip Feb 1995 Content: 1) Introduction 2) Specification of symmetry in Murphi 2.x a) new datatype: multiset 3) Verification using multiset reduction 4) Underlying mechanism 1) Introduction: Symmetry can be used to reduce the amount of time and memory used in verification. While scalarset (presented in "Symmetry.Extension") handles process symmetry and data symmetry very well, it does not handle temporary symmetry in unordered buffer in which elements are constantly being created and removed. Multiset is used to handle such aspect of the system. Basically a multiset is very similar to an array indexed by a scalarset, except that the "multiset index" cannot be stored in any state variable and most of the local variables. Such restriction actually allows us to achieve a larger reduction and to obtain a much faster canonicalization function. For example, if we have an array of multiset: OutgoingChannels: Array [0..1] of Multiset [Max] of Messages; the two states 1) [ {msg1, msg2}, {msg1, msg2} ] 2) [ {msg1, msg2}, {msg2, msg1} ] are equivalent (since the entries of the two multisets are permuted independently. However if we model it with scalarset: indexes: scalarset [ Max ]; OutgoingChannels: Array [0..1] of Array [indexes] of Messages; the two states 1) [ {msg1, msg2}, {msg1, msg2} ] 2) [ {msg1, msg2}, {msg2, msg1} ] will not be equivalent (since the index to the two array are of the same scalarset, they are permuted in the same way). This document describes how you may take advantage of multiset symmetry in a system to make verification more efficient. You should also take a look at the examples in "ex/multiset+sym" to get a better idea on how to use multiset. This document assumes the basic knowledge about Murphi 2.x and scalarsets. 2) Specification of multiset in Murphi 2.x: a) new datatype: multiset The syntax is as follows: typeExpr : typeid /* An already defined type. */ | enumtype | subrangetype | recordtype | arraytype | scalarsettype | multisettype ... ; multisettype : MULTISET "[" expr "]" OF typeExpr You can replace any unorder buffer by a multiset, by specifying the type of its entries and the maximum number of entries in the buffer. The compiler will flag an error if you have operations involving a multiset that break symmetry. An example would be the processor id in a multiprocessor system. Type message: Record Sender: pid; Receiver: pid; MessageType: msg_type; End Var Network: Multiset [ MaxEntries ] of message; You cannot use any literal constant to refer to any particular entry in the multiset. Assuming "net" is a multiset variable, and "msg" is a local variable representing a new entry: You can put in a new entry by: statement i) MultisetAdd( msg, net ); You can select an entry in the multiset by the Choose construct (similar to "ruleset"): Choose i:net Do Rule net[i].Sender = net[i].Receiver ==> ... EndChoose You can also check the entries in the multiset by: predicates i) MultisetCount(i:net, net[i].Receiver = net[i].Sender) where "i" is a new local variable, which count the number of elements satisfying the predicate You can remove the entries in the multiset by statement i) MultisetRemove( i, net ) where "i" is bounded by a choose construct, which remove the entry bounded by the choose construct. ii) MultisetRemovePred( i:net, net[i].Receiver = net[i].Sender where "i" is a new local variable, which remove the entries satisfying the predicate. 3) verification using multiset reduction: Multiset reduction is automatically invoked. Due to the nature of mulisets, you cannot suppress multiset reduction. (You will get a very large state graph, since the same entry in a different position of the multiset array represents a different state without multiset reduction) 4) Underlying Mechanism: The underlying mechanism for multiset reduction is very simple. Whenever the standard search algorithm puts a state into the hash table, we convert it to the unique representative (or, in the case of normalization, a member in a subset) of the set of symmetry equivalent states, by rearranging the entries of the multiset. Therefore, whenever we check whether we have seen a state before, we are checking whether we have seen an equivalent state before. If so, we don't do the redundant work.