WART

Wart is a program that implements a small subset of the Unix 'lex' lexical
analyzer generator.  Unlike lex, wart may be distributed without requirement
for a Unix license.  Wart was written in 1985 by Jeff Damens at the Columbia
University Center of Computing Activities to facilitate development of Unix
Kermit, and modified over the ensuing years by Frank da Cruz.

Wart is intended for production of state table switchers.  It allows a set of
states to be defined, along with a function for getting input, and a table of
state transitions.  A C program is generated which performs actions and
switches states based on the current state and the input.

The following short program demonstrates some of the capabilities and
limitations of Wart.  The program accepts from the command line a binary
number, preceded by an optional minus sign, and optionally containing a
fractional part.  It prints the decimal equivalent.

#include <stdio.h>

int state, s = 1, m = 0, d;
float f;
char *b;
                                        /* Declare wart states */
%states sign mantissa fraction

%%                                      /* Begin state table */
<sign>-     { s = -1; BEGIN mantissa; } /* Look for sign */
<sign>0     { m = 0;  BEGIN mantissa; } /* Got digit, start mantissa */
<sign>1     { m = 1;  BEGIN mantissa; }
<sign>.     { fatal("bad input"); }     /* Detect bad format */
<mantissa>0 { m *= 2; }                 /* Accumulate mantissa */
<mantissa>1 { m = 2 * m + 1; }
<mantissa>$ { printf("%d\n", s * m); return; }
<mantissa>. { f = 0.0; d = 1; BEGIN fraction; } /* Start fraction */
<fraction>0 { d *= 2; }                 /* Accumulate fraction */
<fraction>1 { d *= 2; f += 1.0 / d; }
<fraction>$ { printf("%f\n", s * (m + f) ); return; }
<fraction>. { fatal("bad input"); }
%%

input() {                               /* Define input() function */
    int x;
    return(((x = *b++) == '\0') ? '$' : x );
}

fatal(s) char *s; {                     /* Error exit */
    fprintf(stderr,"fatal - %s\n",s);
    exit(1);
}

main(argc,argv) int argc; char **argv; { /* Main program */
    if (argc < 2) exit(1);
    b = *++argv;
    state = sign;                       /* Initialize state */
    wart();                             /* Invoke state switcher */
    exit(0);                            /* Done */
}

The wart program accepts as input a C program containing lines that start
with "%" or a section delimited by "%%" (there can be only one such section).
The directive "%states" declares the program's states.  The section enclosed
by "%%" markers is the state table, with entries of the form

  <state>X { action }

which is read as "if in state <state> with input X perform { action }".

The optional <state> field tells the current state or states the program must
be in to perform the indicated action.  If no state is specified, the action
is performed regardless of the current state.  If more than one state is
specified, the action is performed in any of the listed states.  Multiple
states are separated by commas.

The required input field consists of a single literal printable 7-bit ASCII
character (i.e. in the range 32 through 126).  Control characters and 8-bit
characters are not allowed.  This is to keep the state-table array (whose size
is the product of the number of states and the number of possible input
characters) small enough to be handled by any C compiler.  When in the
indicated state, if the input is the specified character, then the associated
action is performed.  The character '.' matches any input character.  No
pattern matching or range notation is provided.  The input character is
obtained from the input() function, which you must define.  The character it
returns should be alphanumeric, or else one of the characters ".% -$@" (quotes
not included).  Note that the program above recognizes the binary point '.'
through a ruse: its '.' entry in the mantissa state is the any-character
match, so in effect any character other than 0, 1, or the end-of-input marker
starts the fraction.

The action is a series of zero or more C language statements, enclosed in
curly braces (even if the action consists of only one statement).
The BEGIN macro is defined simply to be "state = ", as in lex.

The wart() function is generated by the wart program based on the state
declarations and the state transition table.  It loops through calls to
input(), using the result to index into a big case statement it has created
from the state table.

The wart program is invoked as follows:

  wart          (Input from stdin, output to stdout)

  wart fn1      (Input from fn1, output to stdout)

  wart fn1 fn2  (Input from fn1, output to fn2.  Example:  wart a.w a.c)

Wart programs have the conventional filetype '.w'.

- F. da Cruz, Columbia University, November 1991