An LSME specification file is split into two basic parts: the pattern and action descriptions, and an (optional) analysis description.
Patterns are associated hierarchically. To indicate the position of a
pattern within the hierarchy, each pattern description is preceded by
a name known as the software structure entry name. The software
structure entry name consists of a sequence of alphanumeric characters
with the dot character acting as a hierarchical separator. For
instance, the name:
directory
indicates a software structure entry at the highest level of the
hierarchy. The name:
directory.file
introduces a software structure entry at the second level of the hierarchy, and so on. The software structure entry name may optionally be followed by an identifier (see below for more information on identifiers). See the SMEGenerator manual page for an example.
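The naming rule above can be illustrated with a short Python sketch. It is not part of LSME: the function name entry_level is hypothetical, and the sketch assumes only what is stated above, namely that names are alphanumeric and that the dot separates hierarchy levels.

```python
def entry_level(name: str) -> int:
    """Return the hierarchy level of a software structure entry name.

    Assumption: levels are counted from 1 and separated by dots, as
    described in the text. This is an illustration, not LSME code.
    """
    parts = name.split(".")
    # Every component must be a non-empty alphanumeric sequence.
    if not all(part.isalnum() for part in parts):
        raise ValueError(f"invalid entry name: {name!r}")
    return len(parts)


print(entry_level("directory"))       # 1
print(entry_level("directory.file"))  # 2
```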
Pattern descriptions are attached to software structure entry
names between % signs. For example:
directory
%
<aPattern>
%
attaches a simple pattern to the directory software structure entry.
A pattern description consists of pattern characters, one-character
tokens and identifiers. The pattern characters include:
{ }+
indicating anything between the brackets repeats one or more times.
{ }
indicating anything between the brackets repeats exactly once.
[ ]
indicating anything between the brackets appears zero or one times.
( )
indicating alternative choices (each choice is separated by a |).
One-character tokens consist of any single character within a pattern.
For example:
directory
%
!
%
introduces ! as a one-character token. The following characters may
be escaped as one-character tokens: ( ) { } [ ] \n
.<
Identifiers consist of variable names within < and > symbols.
For example:
directory
%
<aPattern>
%
introduces aPattern as a variable name. This identifier will match any non-whitespace sequence of characters other than a one-character token. The aPattern variable may be accessed within action code.
Action code may be attached within a pattern description. It is
introduced by an @ sign. The action code then follows. The action
code is any Icon code; each line of action code must be terminated
by \n. The action code terminates with another @ sign. For example:
directory
%
<aPattern>@
write ("foo") \n
@
%
will write out foo whenever an identifier is matched in the source artifact.
Pattern characters, one-character tokens, and identifiers must be separated by whitespace in pattern descriptions. This is a limitation of the current parser.
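Because the elements of a pattern description are separated by whitespace, a description can be split and labelled piece by piece. The Python sketch below illustrates this under stated assumptions: the names PATTERN_CHARS and classify are hypothetical, the repetition marker is assumed to be written as the single piece }+, and escaping and action code are not handled. It is not the LSME parser.

```python
# Pieces treated as pattern characters (assumed spelling, see lead-in).
PATTERN_CHARS = {"{", "}", "}+", "[", "]", "(", ")", "|"}


def classify(pattern):
    """Split a pattern description on whitespace and label each piece."""
    labelled = []
    for piece in pattern.split():
        if piece in PATTERN_CHARS:
            kind = "pattern-char"
        elif len(piece) > 2 and piece.startswith("<") and piece.endswith(">"):
            kind = "identifier"
        elif len(piece) == 1:
            kind = "token"
        else:
            # Anything else is outside what this sketch models.
            kind = "unknown"
        labelled.append((piece, kind))
    return labelled


for piece, kind in classify("[ <type> ] <name> { <arg> }+ !"):
    print(piece, kind)
```

A piece such as <a>! is labelled unknown here, which mirrors the requirement that elements be separated by whitespace.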
The starting and ending character sequences for comments within
the source artifacts scanned may be described as:
comment start end
where start and end are the character sequences. For example:
comment /* */
comment // \n
describes how to ignore comments in C++ code.
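The effect of a pair of comment declarations can be sketched in Python as follows. This is an illustration only: the function name strip_comments is hypothetical, and the handling of unterminated comments and of the newline terminator are assumptions, since the text above does not specify them. LSME itself may behave differently.

```python
def strip_comments(text, delimiters):
    """Remove every span from a start sequence through its end sequence.

    Assumptions: delimiters is a list of (start, end) pairs; the
    earliest start sequence wins; a newline terminator is kept so
    line structure survives; an unterminated comment runs to the
    end of the text.
    """
    out = []
    pos = 0
    while pos < len(text):
        # Find the comment start that occurs first at or after pos.
        hits = [(text.find(s, pos), s, e) for s, e in delimiters]
        hits = [h for h in hits if h[0] != -1]
        if not hits:
            out.append(text[pos:])
            break
        start, s, e = min(hits)
        out.append(text[pos:start])
        end = text.find(e, start + len(s))
        if end == -1:
            break  # unterminated comment: discard the rest
        pos = end if e == "\n" else end + len(e)
    return "".join(out)


# Mirrors the two declarations shown above for C++ code.
cpp = [("/*", "*/"), ("//", "\n")]
print(strip_comments("int a; /* note */ int b; // tail\nint c;\n", cpp))
```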
Comments may appear between pattern descriptions or within action
code. Comments appearing between pattern descriptions are lines
beginning with #. Comments in action code are lines beginning
with # and ending with \n.
Initialization (Icon) code (including global variable declarations) may
be placed within an:
init @
@
block at the beginning of the file.
Comments to
murphy@cs.ubc.ca