Algol 60 Syntactic Chart
M
y involvement with parsing began as soon as I arrived in Grenoble in the early sixties. At that time, Burroughs had published a graphic syntactic chart of Algol that was very handy to use [TTW 61]. Essentially, that chart displayed Backus Naur Form (BNF) as a 2D map: Every appearance of a nonterminal (metavariable) in the right hand side of a grammar rule contained an indicator of the 2D coordinates where that nonterminal is defined. This visualization was unusually influential. We had copies of it posted in our offices and got to know it almost by heart.
It was by that time that Brooker and Morris developed their recursive descent syntaxdirected parsers [BMR 67] in England. I recall reading about another recursive method proposed by E. T. Irons, who was then a doctoral student at Princeton [Iro 61]. The BrookerMorris approach was topdown whereas Iron’s was bottomup. I remember implementing both algorithms, but it is important to understand that recursion in the sixties had not yet reached the widespread usage that it now has. Few compilers were able to handle recursion correctly mainly because the parameters were called in a variety of forms (e.g., value, reference, name) and it was not semantically clear how those parameters should be passed.
During the sixties, there was an articulated interest in Chomsky’s papers on grammars and hierarchies. S. Ginsburg had written a text on formal grammars and languages [Gin 66], but in that work there were no hints as how to apply the theory to practical algorithms that we all sensed were applicable to compilers. There were a number of influential papers that attempted to do precisely that and those papers had a considerable impact on my view of the field of parsing and compiling. Among them I recall the following:

R. Floyd succeeded to explain the empirical work of Dijkstra on operator precedence parsing and its relationship with restricted types of contextfree grammars [Flo 63].

E. Irons illustrated how to use ambiguous grammars in error detection [Iro 63].

D. Knuth proposed an interesting deterministic algorithm for bottomup parsing by looking ahead to a fixed number of input characters [Knu 65]. Knuth himself thought that the proposed algorithm would not be practical but was contradicted a few years later when researchers like deRemer proved otherwise [DeR 71].

N. Wirth and H. Weber were able to generalize Floyd’s precedence algorithm [WW 66], and subsequently, J. Ichbiah and S. Morse improved precedence parsing to its present form [IM 70].

T. Griffiths and S. Petrick published a key paper on the relative efficiency of bottomup and topdown parsers. Their approach consisted of simulating parsers using a twostack Turing machines, simulating their behavior and establishing empirical estimates for efficiency [GP 65].
The aforementioned papers certainly influenced my views on syntaxdirected compilers. They also had a considerable impact on the work of my Grenoble colleague, Alain Colmerauer, who was able to generalize Floyd’s work on precedence grammars by introducing twostack parsers instead of one [Col 70]. His work covered both deterministic and nondeterministic parsers and was, perhaps, inspirational to his subsequent key role in developing Prolog.
My own interest during my time in Grenoble became the development of languages for compiler writing, which ultimately became the subject of my second doctoral dissertation [Coh 67a]. This quest explored almost all the ingredients for implementing packages like YACC and LEX. However, that project would have been too ambitious at that time; computers lacked the necessary memory and speed and the available programming languages were relatively primitive as compared to those that became available in the seventies.
Nevertheless, I have been among the first researchers to propose higherlevel languages for writing compilers. In fact, with the help of Monique Brasseur, a colleague from Grenoble, I wrote a report in which we used Algol 60 to implement a syntaxdirected interpreter for Algol 60. In that work, we used the stack machine of Randell and Russell [RR 64]to carry out the interpretation of the compiled program, pretty much in the spirit of what Java accomplishes nowadays.
It was N. Wirth who later carried out complete and beautiful implementations of Pascal using Pascal itself. It is interesting to note that Wirth started using bottomup precedence parsers and then switched to recursive descent parsers. This is probably a result of his philosophy that one should be able to provide a complete and readable implementation of the language using the language itself, as Lisp implementers have been doing since that language inception.
It was only in the midseventies that I succeeded in implementing a full syntaxdirect compiler generator with the help of some of my very bright Brandeis undergraduate students. To date, that has probably been the largest software implementation project that I have ever supervised. Brandeis had just purchased and made available to students a PDP10. Since portability was extremely desirable at the time, we had the chutzpah to use Fortran as the implementation language. We had the help of Jean Ichbiah in providing the details of his weakprecedence parsers. Lexical scanners were also generated upon user specification of finite state tables.
The project was a success because of its originality — being among the first compiler generators — as well as the fact that it was the result of successful collaboration among highly motivated students implementing a complex software package. Of course, the package lacked the professional veneer of an implementation like YACC and LEX, which were written by the research group at Bell Labs a few years later. A publication describing my experience in developing the Brandeis’ compiler generator appeared in Software – Practice and Experience [CC 75].
My subsequent work in parsing and compiling occurred with the assistance of talented Brandeis undergraduates, who helped me to coauthor of a number of papers on the subject. Among them, I mention with pride the ACM Communications paper with Marty Roth, which I consider to be the first serious analysis of deterministic parsing algorithms [CR 78]. Marty made the clever observation that to specify the number of rule usages in deterministic parsers, it was sufficient to count the number of certain terminals in the input string that also appeared in each grammar rule. By using a variant of Kirchhoff’s law, one can determine the number of times each rule is used by a parser as a function of the number of certain terminals appearing in an input string. For example, in the case of a grammar with rules for specifying arithmetic expressions, a count of the number of +’s, *’s, left and write parentheses, identifiers, etc. that appear in an input string can reveal the number of times each of the grammar rules is used.
An important consideration for performing an average case analysis of deterministic parsing algorithms is to be able to determine the number of strings of a fixed length n, generated by a given Chomsky type grammar. In a paper with Joel Katcoff, we illustrated how that number could be determined (even symbolically) in the case of finitestate grammars [CK 77a, CK 77b]. In a paper with Timothy Hickey, we demonstrated that one could determine the number of strings of length n in the case of contextfree languages [CH 83]. A key to both papers is the generation of finitedifference equations that specify the number of strings of a given length.
Coauthored with Robin Sitver, our paper showed how one could use program transformation rules to demonstrate the equivalence of a precedence parser (a la Floyd) to Dijkstra’s empirical parser using operator’s weights [CS 79]—a work that is related to parser optimization.
Neal Carpenter and I developed a language for inquiring about the execution configuration of programs written in a highlevel language [CC 77]. Basically, we implemented an interpreter for a highlevel computer language like Pascal in which the user could specify labels (identifiers) that preceded any command. During the execution of a program, records were kept of the values of variables before and after passing through a given label. After execution, one could inquire about the state of the program variables by specifying a regular expression on the labels, establishing a given state of the execution (e.g., after passing any number of times through a given loop followed by the execution of a false branch of a conditional).
In a paper written with Eileen Carter, we developed a variant of Fortran in which one could specify nondeterministic primitives like choice, success, and failure [CC 74]. That work was inspired by Floyd’s paper on nondeterministic algorithms [Flo 67].
In the early eighties, after my involvement in Logic Programming, I challenged myself to describe in Prolog, most of the algorithms used in the “Dragon Book”, a wellknown text on compilers written by Aho, Sethi and Ullman [ASU 86]. Alain Colmerauer and David H. Warren had already shown the advantages of using Prolog in certain aspects of parsing and compiling [Col 78], [War 80], but I wanted to show that many other approaches were possible. A paper describing that effort was coauthored with Timothy Hickey [CH 87].