An Unambiguous Scanner for Special Character Tokens

ID: TR-77-05
Authors: R. A. Fraley
Publishing date: June 1977
Abstract

A fast algorithm for a general-purpose scanner is presented. It includes a mechanism for permitting user-defined special character tokens. The scanner is able to separate strings of special characters without imposing arbitrary spacing rules on the programmer. An analysis shows that most special character tokens from selected languages could be handled properly by the scanner, even if they all appeared in the same language. Many of the omitted tokens could be confused with combinations of operators, which demonstrates the utility of the scanner for preventing lexical ambiguity. The special character analysis is then extended to other classes of tokens.
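
The kind of confusion the abstract mentions, where a token can also be read as a combination of operators, can be illustrated with a small check. The sketch below is not the report's algorithm, which the abstract does not describe; it is a generic illustration that assumes a token set is given as plain strings. The function names `count_splits` and `confusable_tokens` and the sample token set are hypothetical.

```python
from functools import lru_cache


def count_splits(text, tokens):
    """Count the ways `text` can be split into a sequence of tokens."""
    token_set = frozenset(tokens)

    @lru_cache(maxsize=None)
    def ways(start):
        # Reaching the end of the string means one complete split was found.
        if start == len(text):
            return 1
        total = 0
        # Try every token that could begin at position `start`.
        for end in range(start + 1, len(text) + 1):
            if text[start:end] in token_set:
                total += ways(end)
        return total

    return ways(0)


def confusable_tokens(tokens):
    """Return the tokens that can also be read as a sequence of shorter tokens."""
    return sorted(t for t in tokens if count_splits(t, tokens) > 1)


# Hypothetical token set, chosen only for illustration.
SAMPLE_TOKENS = ["<", ">", "=", "-", "+", "<=", "->"]

if __name__ == "__main__":
    print(confusable_tokens(SAMPLE_TOKENS))
```

In this sample set, `<=` and `->` are flagged because each can also be read as two one-character operators, while `+` has only one reading. A scanner that must avoid spacing rules needs some way to rule out or resolve such cases.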