{

                           MODULE 9
 
 
                       Syntax Analysis
                             and
                    Syntax Error Recovery
 
 
 
 
 
 9.1 Syntax analysis.
 
 Syntax analysis is implemented as a set of recursive descent  pro-
 cedures  based  on  syntax  rules given in the ISO definition. The
 order, names, and nesting of the procedures is as follows:
 
    Programme
       ProgramHeading
       Block
          LabelDeclarationPart
          UnsignedConstant
          ConstExpression
             SimpleConstExpression
                ConstTerm
                   ConstFactor
          PragmaList
          ConstDefinitionPart
          NewSubType
          TypeDenoter
             OrdinalType
                EnumeratedType
                SubrangeType
             StructuredType
                ArrayType
                RecordType
                   FieldList
                SetType
                FileType
             PointerType
          TypeDefinitionPart
          DeclarationPart
             FixVarId
             DesignateVars
             Initialize
                ScalarInitialisation
                PointerIntialisation
                SetInitialisation
                RecordInitialisation
                ArrayInitialisation
          PfDeclaration
             PfHeading
                FormalParameterList
          Selector
          Variable
          Call
          Expression
             SimpleExperssion
                Term
                   Factor
          BooleanExpression
          Statement
             Assignment
             GotoStatement
             CompoundStatement
             IfStatement
             CaseStatement
             WhileStatement
             RepeatStatement
             ForStatement
             WithStatement
          StatementSequence
          StatementPart
 
 The procedure Selector  implements  the  analysis  of  the  syntax
 category  variable  (as  does the procedure Variable which follows
 it) but assumes that the  initial  variable  identifier  has  been
 checked and accepted before Selector is called.
 
 The procedure Call is introduced to implement the analysis of  the
 construct  [  actual-parameter-list  ]  as it appears in function-
 designator and procedure-statement.  It does so both for  program-
 defined  procedures and functions, and for the irregular parameter
 lists allowed for standard procedures and functions.
 
 The syntax analysers are written on the assumption that  the  next
 syntactic  goal  can always be selected by inspection of (at most)
 the next incoming symbol ( i.e.  that the  underlying  grammar  is
 LL(1)  ).  This is not so at the following points in the syntax of
 Pascal:-
 
 1.   A statement beginning with an identifier  may  be  either  an
      assignment or a procedure call.
 
 2.   A factor beginning with an identifier may be either  a  vari-
      able, a constant or a function call.
 
 3.   A type beginning with an identifier may  be  either  a  named
      type  already  declared  or a subrange whose lower bound is a
      constant identifier.
 
 In all cases to resolve the choice on  a  purely  syntactic  basis
 would  require a look ahead of a further symbol. However if paral-
 lel semantic analysis  is  used  these  choices  can  be  resolved
 without  further  lookahead, by inspection of the current semantic
 attributes of the identifier involved. For this  reason  syntactic
 resolution of these choices is not attempted.
 
 A similar problem arises with the actual parameter  lists  of  the
 standard  procedures and functions of Pascal which are analysed by
 special purpose syntax routines. These are  again  selected  by  a
 semantic rather than syntactic test.
 
 The syntax of Pascal is also not LL(1) in variant  records,  where
 the syntax is effectively:
 
    'case' [ identifier ':' ] identifier 'of' ....
 
 In this case semantic resolution is not possible,  and  an  ad-hoc
 lookahead technique is used.
 
 
 
 9.2 Syntactic error recovery.
 
 Recovery in the syntax analysis process following the discovery of
 a  syntax  error is incorporated into the syntax procedures on the
 following basis:
 
 1.   Each procedure when called  is  passed  an  actual  parameter
      which  is a set of symbols forming the (right) context of the
      string which it should scan.  This context normally  includes
      all  symbols  which  may legitimately follow the string to be
      scanned, and such additional symbols as a superior  (calling)
      procedure may wish to handle in the event of error recovery.
 
 2.   When entered the procedure may ensure that the current symbol
      is an acceptable starter for the string to be scanned, and if
      not scan forward until such a symbol is found (subject to  4.
      below).
 
 3.   When calling a subsidiary  syntax  procedure,  the  procedure
      passes  on  as  context its own context plus those symbols if
      any which it may determine as right context for the substring
      to be scanned.
 
 4.   To recover from a syntax error the procedure  may  scan  over
      (skip) any symbol provided it is not contained in the context
      passed to it.
 
 5.   On exit, the syntax procedure ensures that the current symbol
      is contained in the context passed to it, flagging a terminal
      error and skipping if this is not initially the case.
 
 In practice this general recovery strategy  may  be  omitted  from
 those  syntax  procedures which are only used as an alternative or
 partial subpath of a superior procedure which inevitably  enforces
 recovery.
 
 
 
 9.3 Syntax utilities.

 The following procedures enable the analysis  and  recovery  stra-
 tegies  outlined  above to be expressed neatly in the the analyser
 itself.
 
 They also  improve  the  quality  of  syntax  error  reporting  by
 suppressing  vague  "unexpected  symbol"  messages whenever a more
 precise subsequent diagnosis is made available, using  the  global
 variables Recovering, SymbolsSkipped, and ErrorReported to do so.
 
 Most syntax errors are reported using  a  code  derived  from  the
 expected  symbol,  by  the formula ord(expected symbol) + 10. Thus
 the codes used are interpreted as follows:
 
 10. An identifer symbol was expected.
 11. An integer-constant was expected.
   .
   .
   .
 69. Other symbols were expected.
 
 In practice only a subset of these can be generated by the  Pascal
 syntax analyser as some symbols are never mandatory.
 
 The syntax analyser makes use of several 'constant' sets  of  sym-
 bols  during  analysis.  The variables used to hold these sets are
 initialised by the procedure  InitSyntax,  which  must  be  called
 before syntax analysis is invoked.

                                                                     }

program SyntaxAnalyser;

#include "globals.x"
#include "source.pf"
#include "lexical.pf"

function Missing(Symbol: SymbolType): integer;
  visible;

  begin Missing := ord(Symbol) + 10 end;

procedure AcceptSymbol;
  visible;

  begin
    if Recovering
    then
      begin
        if not ErrorReported
        then
          Error(Missing(OtherSy) + ord(SymbolsSkipped), StartOfSymbol);
        Recovering := false
      end;
    NextSymbol
  end { acceptsymbol };

procedure Accept(SymbolExpected: SymbolType);
  visible;

  begin
    if Symbol = SymbolExpected
    then AcceptSymbol
    else
      begin
        Error(Missing(SymbolExpected), StartOfSymbol);
        ErrorReported := true
      end
  end { accept };

procedure AcceptEquals;
  visible;

  begin
    if (Symbol = RelOp) and (Operator = EqOp)
    then AcceptSymbol else Error(71, StartOfSymbol)
  end { acceptequals };

procedure RecoverAt(RelevantSymbols: SetOfSymbols);

  begin
    if not Recovering
    then
      begin
        StartError(StartOfSymbol);
        Recovering := true;
        ErrorReported := false;
        SymbolsSkipped := not (Symbol in RelevantSymbols)
      end;
    while not (Symbol in RelevantSymbols) do NextSymbol
  end { recoverat };

procedure CheckContext(ContextExpected: SetOfSymbols);
  visible;

  begin
    if not (Symbol in ContextExpected) then RecoverAt(ContextExpected)
  end { checkcontext };

procedure CheckNextOrContext(SymbolsExpected, DefaultContext:
                               SetOfSymbols);
  visible;

  begin
    if not (Symbol in SymbolsExpected)
    then RecoverAt(SymbolsExpected + DefaultContext)
  end { checknextorcontext };

procedure InitSyntax;
  visible;

  begin
    BlockBegSys :=
      [LabelSy, ConstSy, TypeSy, VarSy, ProcSy, FuncSy, BeginSy];
    if ICLPascal
    then BlockBegSys := BlockBegSys + [PresetSy, ReadOnlySy];
    ConstBegSys :=
      [AddOp, IntConst, RealConst, CharConst, StringConst, Ident];
    if ICLPascal then ConstBegSys := ConstBegSys + [BasedConst];
    SimpTypeBegSys := ConstBegSys + [LeftParent];
    TypeDels := [ArraySy, RecordSy, SetSy, FileSy];
    TypeBegSys := SimpTypeBegSys + TypeDels + [Arrow, PackedSy];
    StatBegSys :=
      [BeginSy, GoToSy, IfSy, CaseSy, WhileSy, RepeatSy, ForSy, WithSy];
    FacBegSys :=
      [IntConst, RealConst, CharConst, StringConst, Ident, NilSy,
       LeftParent, LeftBracket, NotSy];
    if ICLPascal then FacBegSys := FacBegSys + [BasedConst];
    ConstFacBegSys := FacBegSys - [NilSy, LeftBracket];
    SelectSymbols := [Arrow, Period, LeftBracket];
    if InHousePascal
    then SelectSymbols := SelectSymbols + [Shriek];
    ParamBegSys := [ProcSy, FuncSy, VarSy, Ident];
    if ICLPascal then ParamBegSys := ParamBegSys + [ReadOnlySy];
    case OptionValue[Level] of
      Level0 : ConformantParamBegSys := [];
      Level1, Level2, Level3 :
        ConformantParamBegSys := [PackedSy, ArraySy]
    end;
    IndexSpecBegSys := [Ident];
    if ICLPascal then IndexSpecBegSys := IndexSpecBegSys + ConstBegSys;
    Recovering := false
  end { initsyntax };

begin
  { end of module }
end.