4.6.4 Constructing SLR-Parsing Tables

The SLR method for constructing parsing tables is a good starting point for studying LR parsing. We shall refer to the parsing table constructed by this method as an SLR table, and to an LR parser using an SLR-parsing table as an SLR parser. The other two methods augment the SLR method with lookahead information.

The SLR method begins with LR(0) items and LR(0) automata, introduced in Section 4.5. That is, given a grammar, G, we augment G to produce G', with a new start symbol S'. From G', we construct C, the canonical collection of sets of items for G' together with the GOTO function.

Figure 4.38: Moves of an LR parser on id * id + id

The ACTION and GOTO entries in the parsing table are then constructed using the following algorithm. It requires us to know FOLLOW(A) for each nonterminal A of a grammar (see Section 4.4).

Algorithm 4.46: Constructing an SLR-parsing table.

INPUT: An augmented grammar G'.

OUTPUT: The SLR-parsing table functions ACTION and GOTO for G'.

METHOD:

1. Construct C = {I0, I1, ..., In}, the collection of sets of LR(0) items for G'.

2. State i is constructed from Ii. The parsing actions for state i are determined as follows:

(a) If [A -> α@aβ] is in Ii and GOTO(Ii, a) = Ij, then set ACTION[i, a] to "shift j". Here a must be a terminal.

(b) If [A -> α@] is in Ii, then set ACTION[i, a] to "reduce A -> α" for all a in FOLLOW(A); here A may not be S'.

(c) If [S' -> S@] is in Ii, then set ACTION[i, $] to "accept".

If any conflicting actions result from the above rules, we say the grammar is not SLR(1). The algorithm fails to produce a parser in this case.

3. The goto transitions for state i are constructed for all nonterminals A using the rule: If GOTO(Ii, A) = Ij, then GOTO[i, A] = j.

4. All entries not defined by rules (2) and (3) are made "error".

5. The initial state of the parser is the one constructed from the set of items containing [S' -> @S]. □
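The steps of Algorithm 4.46 can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the names PRODS, closure, goto, canonical_collection, follow_sets and slr_table are hypothetical, an item is encoded as a pair (production number, dot position), and the FOLLOW computation assumes a grammar without ε-productions, which holds for the expression grammar used here. State numbers depend on the order in which grammar symbols are explored; the order below was chosen so that the numbering lines up with the states discussed in Example 4.47.

```python
from collections import defaultdict

# Augmented expression grammar; production 0 is E' -> E.
PRODS = [("E'", ("E",)), ("E", ("E", "+", "T")), ("E", ("T",)),
         ("T", ("T", "*", "F")), ("T", ("F",)),
         ("F", ("(", "E", ")")), ("F", ("id",))]
NONTERMS = ["E", "T", "F"]
TERMS = ["+", "*", "(", ")", "id"]

def closure(items):
    """Add [B -> @gamma] for every nonterminal B that follows a dot."""
    items, work = set(items), list(items)
    while work:
        p, dot = work.pop()
        rhs = PRODS[p][1]
        if dot < len(rhs):
            for q, (head, _) in enumerate(PRODS):
                if head == rhs[dot] and (q, 0) not in items:
                    items.add((q, 0))
                    work.append((q, 0))
    return frozenset(items)

def goto(state, sym):
    """Move the dot over sym in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in state
             if d < len(PRODS[p][1]) and PRODS[p][1][d] == sym}
    return closure(moved) if moved else None

def canonical_collection():
    """Step 1: the sets of LR(0) items and the GOTO function on them."""
    states, trans, i = [closure({(0, 0)})], {}, 0
    while i < len(states):
        for sym in NONTERMS + TERMS:
            nxt = goto(states[i], sym)
            if nxt is not None:
                if nxt not in states:
                    states.append(nxt)
                trans[i, sym] = states.index(nxt)
        i += 1
    return states, trans

def follow_sets():
    """FIRST/FOLLOW by fixed-point iteration (assumes no epsilon-productions)."""
    first = {t: {t} for t in TERMS}
    first.update({n: set() for n in ["E'"] + NONTERMS})
    follow = {n: set() for n in ["E'"] + NONTERMS}
    follow["E'"].add("$")
    def size():
        return sum(len(s) for s in list(first.values()) + list(follow.values()))
    while True:
        before = size()
        for head, rhs in PRODS:
            first[head] |= first[rhs[0]]
            for k, sym in enumerate(rhs):
                if sym in follow:
                    follow[sym] |= first[rhs[k + 1]] if k + 1 < len(rhs) else follow[head]
        if size() == before:
            return follow

def slr_table():
    states, trans = canonical_collection()
    follow = follow_sets()
    action, goto_tab = defaultdict(dict), {}
    def put(i, a, act):
        # Two different actions for one entry means the grammar is not SLR(1).
        if action[i].get(a, act) != act:
            raise ValueError(f"not SLR(1): conflict in state {i} on {a!r}")
        action[i][a] = act
    for i, state in enumerate(states):
        for p, dot in state:
            head, rhs = PRODS[p]
            if dot < len(rhs):
                if rhs[dot] in TERMS:                 # rule 2(a): shift
                    put(i, rhs[dot], ("shift", trans[i, rhs[dot]]))
            elif p == 0:                              # rule 2(c): accept
                put(i, "$", ("accept",))
            else:                                     # rule 2(b): reduce
                for a in follow[head]:
                    put(i, a, ("reduce", p))
    for (i, sym), j in trans.items():                 # rule 3: goto entries
        if sym in NONTERMS:
            goto_tab[i, sym] = j
    return action, goto_tab                           # rule 4: missing entry = error

ACTION, GOTO = slr_table()
print(ACTION[0]["id"], ACTION[1]["$"], GOTO[0, "E"])
```

Representing "error" as a missing dictionary entry keeps the sketch short; a production-quality generator would more likely emit dense arrays and report every conflict rather than stopping at the first one.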

The parsing table consisting of the ACTION and GOTO functions determined by Algorithm 4.46 is called the SLR(1) table for G. An LR parser using the SLR(1) table for G is called the SLR(1) parser for G, and a grammar having an SLR(1) parsing table is said to be SLR(1). We usually omit the "(1)" after the "SLR", since we shall not deal here with parsers having more than one symbol of lookahead.

Example 4.47: Let us construct the SLR table for the augmented expression grammar. The canonical collection of sets of LR(0) items for the grammar was shown in Fig. 4.31. First consider the set of items I0:

E'->@E

E->@E+T

E->@T

T->@T*F

T->@F

F->@(E)

F->@id

The item F -> @(E) gives rise to the entry ACTION[0, (] = shift 4, and the item F -> @id to the entry ACTION[0, id] = shift 5. Other items in I0 yield no actions. Now consider I1:

E'->E@

E->E@+T

The first item yields ACTION[1, $] = accept, and the second yields ACTION[1, +] = shift 6. Next consider I2:

E->T@

T->T@*F

Since FOLLOW(E) = {$, +, )}, the first item makes

ACTION[2, $] = ACTION[2, +] = ACTION[2, )] = reduce E -> T

The second item makes ACTION[2, *] = shift 7. Continuing in this fashion we obtain the ACTION and GOTO tables that were shown in Fig. 4.37. In that figure, the numbers of productions in reduce actions are the same as the order in which they appear in the original grammar (4.1). That is, E -> E + T is number 1, E -> T is 2, and so on. □

Example 4.48: Every SLR(1) grammar is unambiguous, but there are many unambiguous grammars that are not SLR(1). Consider the grammar (4.49) with productions

S->L=R|R

L->*R|id

R->L

Think of L and R as standing for l-value and r-value, respectively, and * as an operator indicating "contents of".@5 The canonical collection of sets of LR(0) items for grammar (4.49) is shown in Fig. 4.39.

@5: As in Section 2.8.3, an l-value designates a location and an r-value is a value that can be stored in a location.

I0	S'->@S S->@L=R S->@R L->@*R L->@id R->@L
I1	S'->S@
I2	S->L@=R R->L@
I3	S->R@
I4	L->*@R R->@L L->@*R L->@id
I5	L->id@
I6	S->L=@R R->@L L->@*R L->@id
I7	L->*R@
I8	R->L@
I9	S->L=R@

Figure 4.39: Canonical LR(0) collection for grammar (4.49)

Consider the set of items I2. The first item in this set makes ACTION[2, =] be "shift 6". Since FOLLOW(R) contains = (to see why, consider the derivation S => L = R => *R = R), the second item sets ACTION[2, =] to "reduce R -> L". Since there is both a shift and a reduce entry in ACTION[2, =], state 2 has a shift/reduce conflict on input symbol =.

Grammar (4.49) is not ambiguous. This shift/reduce conflict arises from the fact that the SLR parser construction method is not powerful enough to remember enough left context to decide what action the parser should take on input =, having seen a string reducible to L. The canonical and LALR methods, to be discussed next, will succeed on a larger collection of grammars, including grammar (4.49). Note, however, that there are unambiguous grammars for which every LR parser construction method will produce a parsing action table with parsing action conflicts. Fortunately, such grammars can generally be avoided in programming language applications.
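The conflict in state 2 can be reproduced mechanically. The following Python sketch, under the same kind of assumptions as any small illustration (hypothetical names PRODS, closure, goto and follow_sets; items encoded as pairs of production number and dot position; no ε-productions), builds I2 as GOTO(I0, L) for grammar (4.49) and compares the terminals that call for a shift with the FOLLOW-based lookaheads that call for a reduction.

```python
# Grammar (4.49), augmented; production 0 is S' -> S.
PRODS = [("S'", ("S",)), ("S", ("L", "=", "R")), ("S", ("R",)),
         ("L", ("*", "R")), ("L", ("id",)), ("R", ("L",))]
NONTERMS = {"S'", "S", "L", "R"}

def closure(items):
    """Add [B -> @gamma] for every nonterminal B that follows a dot."""
    items, work = set(items), list(items)
    while work:
        p, dot = work.pop()
        rhs = PRODS[p][1]
        if dot < len(rhs):
            for q, (head, _) in enumerate(PRODS):
                if head == rhs[dot] and (q, 0) not in items:
                    items.add((q, 0))
                    work.append((q, 0))
    return frozenset(items)

def goto(state, sym):
    """Move the dot over sym in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in state
             if d < len(PRODS[p][1]) and PRODS[p][1][d] == sym}
    return closure(moved)

def follow_sets():
    """FIRST/FOLLOW by fixed-point iteration (assumes no epsilon-productions)."""
    symbols = {s for _, rhs in PRODS for s in rhs} | NONTERMS
    first = {s: (set() if s in NONTERMS else {s}) for s in symbols}
    follow = {n: set() for n in NONTERMS}
    follow["S'"].add("$")
    def size():
        return sum(len(s) for s in list(first.values()) + list(follow.values()))
    while True:
        before = size()
        for head, rhs in PRODS:
            first[head] |= first[rhs[0]]
            for k, sym in enumerate(rhs):
                if sym in NONTERMS:
                    follow[sym] |= first[rhs[k + 1]] if k + 1 < len(rhs) else follow[head]
        if size() == before:
            return follow

I0 = closure({(0, 0)})
I2 = goto(I0, "L")            # contains S -> L@=R and R -> L@
FOLLOW = follow_sets()

# Terminals immediately after a dot call for a shift (rule 2(a)).
shift_on = {PRODS[p][1][d] for p, d in I2
            if d < len(PRODS[p][1]) and PRODS[p][1][d] not in NONTERMS}
# Completed items call for a reduce on every symbol in FOLLOW(head) (rule 2(b)).
reduce_on = set().union(*(FOLLOW[PRODS[p][0]] for p, d in I2
                          if d == len(PRODS[p][1])))
conflicts = shift_on & reduce_on
print(sorted(conflicts))
```

The overlap is non-empty precisely because FOLLOW(R) is computed for the grammar as a whole and ignores the left context in which R -> L is being considered, which is the weakness of the SLR method described above.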

Posted 2012-09-24 14:38 by cuishengli