In this paper, the authors present some details of an implementation of Celentano’s suggestion for an efficient incremental LR(1) parser [1], extended also to grammars with &egr;-productions. Familiarity with basic concepts of LR and incremental parsing is assumed.
Let G be and LR(1) grammar and uvw a string in L(G). The problem is to parse a modified string uv′w, retaining as much as possible from the already known parse of uvw. The authors’ solution uses a compact representation of the sequence of parser configurations during the parse of uvw: an essential part of the stack history is factorized in an unordered tree, and the input tokens are made to point to the tree nodes associated with stack top states just before shift actions. Clearly, the parsing configurations of uv′w are those of uvw, at least up to the last token of u. Also, in many practical cases associated with structured programming, from some token of w on, the final parsing configurations of uv′w will match those of uvw. A key problem is the correct identification of the above mentioned resuming token.
Unfortunately, this paper contains some minor bugs. One of them effects the identification of the resuming token. The bug is in Algorithm 2, where a mere stack top equality is taken as a whole stack identity. A simple counterexample may be the following: G is given by the rules A:2WZbAc and A:2WZ&egr;, u=bb, v=bc, w=cc, and Vv′=bbc. For both uvw and uv′w, the parsing configurations before shifting the first token of w are distinct but have the same stack top state. Hence, the parsing of the modified string uv′w is stopped and success is incorrectly declared, despite the differences between stack contents.
A possible correction is to add a pointer equality test beside the value equality test. For example, replace Step 4 of Algorithm 2 by the following:
following:
4. If Action (st, LP.token)=Shift and then
:.BW4. :.BWIf LP.tree-pointer.state=st and then
:.BW4. :.BWIf LP.tree-pointer.parent=TP then Stop
With all of the minor bugs stamped out, the proposed implementation would be indeed more space-and time-efficient than the original one [1]. The topic is interesting, and the reviewer intends to publish a short note on this subject with a slightly revised algorithm. The concepts described in this paper are said to have been implemented in PASCAL by the modification of an existing LALR(1) parser and successfully tested on a subset ADA grammar.