Getting started with parsing – A simple parser

The simplest way to write a parser is to use the recursive descent technique. This creates a top-down parser (which may formally be described a LL(1)). To start the example we first have to establish the grammar rules for our language. In this example we will use simple arithmetic expression assignments for expressions that can only use the plus operator:

 Assignment --> Identifier := Expression
 Expression --> Expression + Term | Term
 Term --> Identifier | (Expression)
 Identifier --> x | y | z 

For each rule of the grammar we can write a procedure to recognise the incoming tokens to the rule. For the purposes of this example we can assume a routine called NextToken which invokes a lexical analyser to supply the token, and a routine called error which is used to output an error message. The language used is based on Pascal.

 var token:char;  (* Updated by NextToken *) 

 procedure identifier;
 begin
    if token in ['x','y','z']
    then
        NextToken
    else
        error('Identifier expected')
 end; 

You can see that this code implements the rule for recognising an Identifier. We can then implement the rule for a Term similarly:

 procedure expression forward; 

 procedure term;
 begin
     if token = '('
     then
         begin
         nextToken;
         expression;
         if token <> ')'
         then
             error(') expected')
         else NextToken
         end
     else
         identifier
 end; 

When we come to implement the rule for Expression we have a problem; the first element of the Expression rule is itself an Expression. This would cause us to write the following:

procedure expression;
begin
expression;
...
end;

This is directly self-recursive and thus would loop forever. Grammar parsed by top-down algorithms cannot be left-recursive for this reason. An easy way out of this problem is to recast the recursion as iteration in the following way:

Expression --> Term { + Term}* 

We can now code up the grammar rule as:

 procedure expression;
 begin
     term;
     while token = '+'
         do
         begin
             NextTerm;
             term
         end
 end; 

We can now complete the parser with the remaining rule for the assignment:

procedure assignment;
begin 
    identifier;
    if token <> '='
    then
        error('= expected')
    else
        begin
        NextToken;
        expression;
        end
 end; 

if you want to reproduce, please indicate the source:
Getting started with parsing – A simple parser - CodeDay