SKIMP MK II
D. J. Rees
Introduction
SKIMP is a very simple language designed with the teaching of compilers in
mind. Although simple, it possesses many of the structural features of
full-scale languages, for example the subroutine and parameter structure, making
it a worthwhile non-trivial vehicle for teaching. Many useful features of
full-scale languages are missing but can bp incorporated with very little
difficulty. This is quite intentional. The reasons are firstly that SKIMP is a
sufficiently rich language for the compiler to be written in SKIMP itself with
little inconvenience (the SKIMP MKI compiler was written in SKIMP) and secondly
that incorporation of the missing features provides a very suitable form of
student exercise. Although it would be desirable for students to write a
complete compiler of their own, this is a considerable task which they would
probably not have the time to complete. The task of extending a simple language
such as SKIMP for which a compiler already exists is much more feasible and
still allows students to get to grips with the compiling process and so
understand it more fully.
The name SKIMP arises fron the language's origin. Essentially it is a much
smaller version of the language IMP, an ALGOL-60 type language developed in
Edinburgh for systems implementation and general use.
Since SKIMP is intended for teaching rather than production programming, it
has not been necessary for the compiler to produce object code for a particular
real machine. Instead, the opportunity has been taken to invent a machine whose
order code does not have the inconsistencies and exceptional cases that all
actual machines seen to have. The compiler produces symbolic code for this
invented machine together with a certain amount of optional monitoring of the
compilation.
This document describes the SKIMP language, the object machine and details of
the method of compilation. Examples of compiler output and suggested extensions
to SKIMP suitable for student exercises are given in appendices. To enable
students to test their modified compilers an assembler/interpreter for the
object code is available. The use of this is also described in an appendix.
1
Chapter 1
The SKIMP Language
A SKIMP program consists of a sequence of keywords, identifiers and constants
together with arithmetical operators and various separator characters.
Keywords
A keyword consists of a sequence of upper or lower case letters preceded by
the character '%'. For example:
%ROUTINE %END %THEN
The '%' character is regarded as a shift character in that all following letters
up to the first non-letter are shifted to keyword mode. This implies that no
spaces or other non-letter characters may be inserted into the middle of a
keyword. The '%' can, however, be repeated after a space. Thus,
%INTEGERNAME and %INTEGER %NAME
are valid and equivalent, but
%INTEGER NAME
is invalid.
Identifiers
An identifier consists of a string of letters and digits of any length the
first character of which must be a letter. For example,
SKIMP I J2 X1A
In descriptions of SKIMP syntax below, the phrase <NAME> is used to denote the
presence of an identifier.
Constants
Constants appear as operands in arithmetic expressions and may be of two
forms, decimal and character. Both represent integer valued quantities since
SKIMP only has integer valued variables and performs only integer arithmetic. A
decimal constant is represented by a sequence of digits. For example:
123 98765 4
No decimal points or powers of 10 are allowed.
Strings of up to four characters can be represented by surrounding the
characters with quotes. For example:
'FRED' '123*' 'X'
2
The internal value is formed by packing the (ASCII) values of the characters
right-justified in 8-bit fields of a word. A quote can become one of the
characters of the value by repeating it. For example:
'JIM''S' ''''
Input Lexical Processing
Spaces are deleted on input everywhere except when they appear between
quotes. Before deletion, however, they are significant in terminating the
keyword '%' shift mode. Text between quotes remains intact with the exception
that a repeated quote is regarded as a single quote. Newlines can therefore
appear between quotes. For example, to test for the value of a newline
character:
%IF I = '
' %THEN %RETURN
This is equivalent to:
%IF I = 10 %THEN %RETURN
since the ASCII value of a newline character is 10.
SKIMP statements are separated by either ';' or a newline. Thus several
statements can appear on one line of text. For example:
X = 1 ; Y = 2 ; Z = 3
In order to represent a single statement on more than one line intermediate
newline characters can be inserted when preceded by the continuation keyword %C.
For example:
%IF X = 1 %THEN %C
A = B
This is equivalent to:
%IF X = 1 %THEN A = B
Program Structure
A SKIMP program starts with the statement:
%BEGIN
and ends with the statement:
%ENDOFPROGRAM
Inside the program, routines and functions may appear in sequence or nested
as required. A routine takes the form:
3
%ROUTINE RT
.....
%END
and a function takes the form:
%INTEGERFN FN
.....
%END
The only difference between routines and functions is that functions produce a
value as their result and routines do not. The structure of a program could
take the following form, for example:
%BEGIN
%ROUTINE A
.....
%END
%INTEGERFN B
%ROUTINE C
.....
%END
.....
%END
.....
%ENDOFPROGRAM
Details of routines and functions and their parameters are given later.
Declarations
Names of variables and arrays denoting storage objects are declared at the
start of a routine or function. Execution of declaration statements has the
affect of setting aside storage for those variables or arrays. When the routine
or function in which the declaration occurred is returned from, that storage is
deleted so that it may be re-used for future declarations. When routines and
functions are entered recursively new storage is set aside for the variables and
arrays declared there without the storage for the old incarnation of the routine
or functions being lost. This becomes accessible again when the recursive
activation is left. A stack is used to accomplish this. The only objects SKIMP
manipulates are integer valued.
All names must be declared before they can be used. Scalar declarations take
the form:
%INTEGER <NAME>, <NAME>, <NAME>, . . .
Arrays are declared:
%INTEGERARRAY <NAME>, <NAME>, . . .( <EXPR> : <EXPR> )
where the phrase <EXPR> is used to denote the presence of an arithmetical
expression. For example:
%INTEGER I, J, K
sets aside three locations and names them 'I', 'J' and 'K'. The expressions in
4
the array declaration are the lower and upper bounds on the index of the arrays.
They may be either constants, for example:
%INTEGERARRAY A ( I : 10 )
or expressions which are evaluated at run-time, for example:
%INTEGERARRAY B, C ( -1 : N+1 )
Only one-dimensional arrays are defined in SKIMP (but see possible extensions in
Appendix 2).
Note that the first statement of a routine or function has the effect of
declaring its name. Hence, this must appear before any calls on the names.
This allows self-recursive calls but not cross-recursive calls between two
routines at the same nesting level. (SKIMP does not have routine and function
SPECs as in IMP).
Scope of Variables
When a variable or array is declared, that name can be used anywhere in the
routine or function in which the declaration appears, but not outside it. If
the routine or function has nested routines or functions the name can also be
used in these (as 'global' to them). For example:
%ROUTINE A
%INTEGER I
%ROUTINE B
%INTEGER J
....
%END
%ROUTINE C
%INTEGER K
%ROUTINE D
....
%END
....
%END
....
%END
The name 'I' can be used anywhere in routines 'A', 'B', 'C' and 'D'. The name
'J' can be used only in routine 'B' and the name 'K' can be used in routines 'C'
and 'D'.
Although, clearly, all the names of variables declared in a single routine or
function must be distinct, the same names may be declared in different routines
to refer to different storage objects. When a name is redeclared in a nested
routine or function, the storage for the global object of the same name becomes
inaccessible whilst within that routine or function. The name will always refer
to the most 'local' declaration.
Arithmetic Expressions
An arithmetic expression consists of a sequence of operands separated by
operators, possibly preceded by a unary operator (+, -, \). Operands can be
5
array elements, constants, calls on functions or bracketed sub-expressions.
The operators are:
+ : addition
- : subtraction
* : Multiplication
/ : division (rounded)
** : exponentiation
<< : logical shift left
>> : logical shift right
& : logical AND
! : logical OR
!! : logical Exclusive OR
\ : logical NOT
Their precedences are:
\ : highest
** << >>
* / &
+ - ! !! : lowest
The precedence is left to right for operators of equal precedence. Thus:
1 / J * K
is equivalent to:
( 1 / J ) * K
Array element operands take the form:
<NAME> ( <EXPR> )
For example:
A ( 1 )
B ( 2 * A ( J ) )
Assignment
Variables and array elements are assigned new values using statements of the
form :
<NAME> = <EXPR>
<NAME> ( <EXPR> ) = <EXPR>
For example:
I = 2
A ( I * J ) = ( K + L ) / 2
6
Routines and Functions
A routine is called by executing a statement consisting of the name of the
routine. For example:
%ROUTINE RT
....
%END
would be called by the statement:
RT
Similarly, the call of a function appears as an operand of an expression. For
example:
%INTEGERFN FN
....
%END
might be called by:
I = FN
Routines and functions can be called recursively to any depth. The exit from a
routine is specified by the statement:
%RETURN
This is also implied by executing the %END statement of a routine, The exit
from a function must give in addition a result for the function:
%RESULT = <EXPR>
For example:
%RESULT = I * J - K
Execution of the %END statement of a function is invalid. A routine or function
can only be entered by an appropriate call as described above. If a routine or
function is encountered in the normal flow of control of a program at run-time
it is skipped around.
Routine and Function Parameters
Parameters of the following types can be passed to routines and functions:
%INTEGER
%INTEGERNAME
%INTEGERARRAYNAME
The first is a 'value' type parameter and the other two are 'reference' types.
For the %INTEGER type, the value of an expression in the routine or function
call is calculated and passed to the routine or function as the initial value of
the parameter. The initialised parameter can be regarded thereafter as a local
7
variable. For the %INTEGERNAME type, a reference to some storage object outside
the routine or function is passed and this object will be referred to whenever
the parameter is referenced within the routine or function. Hence the only
admissible forms of parameter in the call are either the name of a variable or
the name of an array element. Similarly for the %INTEGERARRAYNAME type, a
reference to an array is passed and the only admissible form of 'actual'
parameter is the name of an array. For example:
%INTEGER A,B,C
%INTEGERARRAY M,N (1:10)
%ROUTINE JIM(%INTEGER I,%INTEGERNAME J, %INTEGERARRAYNAME K)
....
%END
JIM (A+B,C,M)
JIM (A,M(B),N)
Jumps
SKIMP labels are numerical, For example:
123: 987:
To jump to these labels the instructions would be:
->123 ->987
Labels are local to a routine or function and jumps can only take place within
the routine or function and not either to an outer textual level or to a nested
inner level. Labels are not declared and jumps may precede or follow the
occurrence of the corresponding label without restriction.
Conditional Statements
The general form of the conditional statement is:
%IF <COND> %THEN <INSTN> %ELSE <INSTR>
where <COND> represents the condition to be tested and <INSTR> represents a
restricted class of instructions described below. If the condition is true the
first of these instructions is executed, otherwise the second. The %ELSE clause
can be omitted in which case if the condition is false execution proceeds to the
following statement.
A <COND> is formed by one or more simple relations of the form:
<EXPR> <COMP> <EXPR>
where <COMP> represents one of the following comparators:
= # <= < >= >
For example:
I < J + K
These simple relations can be combined using the keywords %AND and %OR. For
8
example:
I = J %AND K = 0
No precedence is defined between %AND and %OR. Instead, brackets must be used.
A single level of bracketting may only contain a sequence of simple comparisons
connected with either %AND or %OR but not both. For example:
I = 0 %AND ( J = 0 %OR K = 0 %OR L = 0 )
The relations are evaluated left to right but only up to the point where the
truth or falsity of the whole condition can be determined.
An <INSTR> may be any of the following types of statement:
assignment
routine call
jump
%RETURN
%RESULT=<EXPR>
%STOP
For example:
%IF I = J %THEN %RETURN
%IF K > 0 %THEN K = 0 %ELSE K = -1
In addition, groups of statements can take the place of an <INSTR> and be made
conditional. The grouping is made by preceding the first statement by the
statement:
%START
and following the last one by the statement:
%FINISH
For example:
%IF J = K %THEN L = 0 %ELSE %START
%IF J < K %THEN %START
L = 1
M = 2
%FINISH %ELSE %START
L = 2
M = 1
%FINISH
%FINISH
%STARTs and %FINISHes may be nested to any depth but must balance within any
routine or function.
Comments
Comment statements start with the exclamation character '!'. Thereafter, all
text is ignored up to the next statement separator i.e. semi-colon or newline.
For example:
9
! this is a comment
I = 0 ;! so is this ; J = 0
Input-Output
The standard IMP set of input-output routine names are built into SKIMP.
These are:
%ROUTINE READ SYMBOL(%INTEGERNAME X)
Read a single symbol from the current input stream into X.
%INTEGERFN NEXT SYMBOL
Return as the result the next single symbol which will be read.
%ROUTINE SKIP SYMBOL
Skip over the next single symbol in the input.
%ROUTINE PRINT SYMBOL(%INTEGER X)
Output the single symbol X.
%ROUTINE SPACE.
Output a space symbol.
%ROUTINE SPACES(%INTEGER X)
Output X spaces.
%ROUTINE NEWLINE
Output a newline symbol.
%ROUTINE NEWLINES(%INTEGER X)
Output X newlines.
%ROUTINE NEWPAGE
Output a newpage symbol.
%ROUTINE READ(%INTEGERNAME X)
Read a decimal number into X skipping preceding spaces and newlines.
%ROUTINE WRITE(%INTEGER X, Y)
Output the decimal number X of Y digits.
10
Chapter 2
The Object Machine.
The SKIMP compiler produces object code output for an invented machine. The
output takes the form of what would be this machine's assembly code rather than
a binary form so that it can easily be inspected by students. Since the machine
is only an invented one and does not exist, the object code cannot be run in the
ordinary way, but an assembler/interpreter for this code is available which
allows the output to be tested.
No attempt has been made to define the machine completely. Only those
features necessary and relevant for the compiler are defined. Interrupt
structure and input-output instructions, for example, are not defined.
The machine has a number of registers which can either be used as
accumulators or as index registers. In this respect, it is similar to machines
such as IBM 370, DEC 10 etc.. It is not a true multi-accumulator machine,
however, in that no register to register arithmetic operations are defined. The
instruction format is shown below:
operation , register , base , displacement
For example:
LOAD, ACC, DR1, 2
Instructions operate between the register and an effective storage address
formed from the sum of the contents of the base register and the displacement.
The machine is word-addressed and assumed to have sufficient storage and
registers for the purposes of the compiler. Although not strictly necessary, it
is also useful to define a word size and the instruction word layout. A 32-bit
word is envisaged with the layout:
+----------------+------------+--------+------------------+
| operation | register | base | displacement |
+----------------+------------+--------+------------------+
(8) (4) (4) (16)
This provides for 16 registers and a displacement of up to 65535. All sixteen
registers, 0-15, can be used in the register field of the instruction, but only
registers 1-15 in the base field allowing a zero in the base field to indicate
no indexing in the effective address.
The compiler object code output does not in practice use register numbers.
Instead, it uses register mnemonics for convenience of inspection. The
mnemonics used by the compiler are:
ACC, STP, COT, WK, DR0, DR1, DR2, DR3,.....
Their individual use is explained in later chapters.
11
The operations defined for this machine are:
LOAD : Load the register with the contents of the word at the effective
storage address.
LDA : Load the register with the effective storage address.
ADD : Add the contents of the word at the effective storage address to the
register.
SUB : Subtract the contents of the word at the effective storage address
from the register.
MLT : Multiply the register by the contents of the word at the effective
storage address.
DIV : Divide the register by the contents of the word at the effective
storage address.
EXP : Raise the register to the power given by the contents of the word at
the effective storage address.
STR : Store the contents of the register in the location at the effective
storage address.
NEG : Negate the contents of the register (the effective address is
ignored).
NOT : Perform the logical NOT on the register (the effective address is
ignored).
SHL : Shift the register logically left by the amount given by the contents
of the word at the effective storage address (zeros shifted in from
the right, bits shifted out to the left lost).
SHR : Shift the register logically right by the amount given by the
contents of the word at the effective storage address (zeros shifted
in from the right, bits shifted out to the right lost).
AND : Perform the logical AND between the register and the contents of the
word at the effective storage address.
OR : Perform the logical OR between the register and the contents of the
word at the effective storage address.
XOR : Perform the logical Exclusive OR between the register and the
contents of the word at the effective storage address.
BAL : Unconditional branch to the effective address and the address of the
next instruction stored in the specified register.
B : Unconditional branch to the effective address (any register present
is ignored).
BZ : Branch to the effective address if the contents of the register are
zero.
BNZ : Branch to the effective address if the contents of the register are
not zero.
12
BG : Branch to the effective address if the contents of the register are
greater than zero.
BNG : Branch to the effective address if the contents of the register are
not greater than i.e. less than or equal to, zero.
BL : Branch to the effective address if the contents of the register are
less than zero
BNL : Branch to the effective address if the contents of the register are
not less than i.e. grater than or equal to, zero
STOP : Stop execution.
In addition to these instructions, the compiler also outputs an assembler
directive 'FILL' in the same instruction format. This is used to direct the
assembler to fill in previously assembled holes in the object code with
appropriate values. This is intended primarily for dealing with forward jumps
and is described in detail in later chapters.
13
Chapter 3
The Structure of the SKIMP Compiler
The SKIMP compiler consists of a set of external routines written in IMP. It
is a 1-pass system in that the compiler makes one pass over the source text and
generates the object code statement by statement. Each statement is processed
separately and the processing consists of four phases. These are:
1 : line reconstruction
2 : lexical analysis
3 : syntax analysis
4 : code generation
The flow of control through these phases is shown in figure 3.1.
+--------------------+
+--------------->| line reconstruct |
| | next statement |
| +----------+---------+
| |
| +----------+---------+
| | analyse lexical |
| | structure |
| +----------+---------+
| |
| +----------+---------+
| | analyse syntactic |
| | structure |
| +----------+---------+
| |
| +---------+--------+
| | valid syntax ? |
| +-+--------------+-+
| | no yes|
| | |
| +--------------+---+ +---+-----------+
|<---| print "SYNTAX ?" | | generate code |
| +------------------+ +-------+-------+
| |
| +----------+---------+
| | end of program ? |
| +-+---------------+--+
| |no yes|
+-----------------------------+ |
figure 3.1
Syntax Definitions
For convenience when continually modifying and developing the compiler, a
description of the syntax of statements is read in from a file each time a
program is compiled. The syntax takes the form of phrase structure definitions.
14
For example:
<DIGIT> = '0','1','2','3','4','5','6','7','8','9';
This would define a phrase <DIGIT> as any of the digits 0 to 9. Characters such
as these digits are enclosed within single quotes as shown. Groups of
characters can also appear between quotes if required. Commas separate the
alternative forms the phrase may take and a semi-colon terminates the
definition.
Keywords are enclosed within double quote characters. For example:
<PROC> = "ROUTINE" , "INTEGERFN" ;
Note that the '%' character can be omitted. When it is not clear what
constitutes a single keyword, for instance %INTEGERARRAYNAME, it may be divided
up as convenient. For example, the forms:
"INTEGERARRAYNAME"
"INTEGER" "ARRAY" "NAME"
are both valid and equivalent.
A decimal number might be defined by the phrase <N> as follows:
<N> = <DIGIT> <N> , <DIGIT> ;
That is, a number is either a digit followed by a number or is a single digit.
This illustrates how phrases can be defined recursively.
It is also possible to have a null alternative i.e. one with no items. For
example, if it was desired to define a decimal number optionally preceded by a
plus or minus sign, a phrase <SIGN> could be defined:
<SIGN> = '+' , '-' , ;
Then the required definition could be:
<SN> = <SIGN> <N> ;
In other words, the third alternative form of <SIGN>, namely nothing, can
precede the <N>. A phrase definition with only one alternative such as this is
rather redundant but nevertheless quite valid. The particular syntax analysis
scheme used in the SKIMP compiler imposes certain restrictions on the forms
phrase definitions may take but discussion of these is postponed until the
scheme itself has been described.
The syntax definitions for SKIMP are shown in figure 3.2. The types of
statement in SKIMP are defined by the phrase <STATEMENT> and the remaining
definitions follow from this. There are two phrases built into the compiler and
which therefore do not need to be defined. These are <NAME> and <CONST>. They
represent identifiers and constants respectively. The reason for them being
built-in is that identifiers and constants are recognised at the lexical
analysis stage and hence do not require definitions for use by the syntax
analyser.
15
<STATEMENT>= <INSTR>,
"IF"<COND>"THEN"<INSTR><ELSE>,
<CONST>':'<STATEMENT>,
"FINISH"<ELSE>,
"INTEGER"<ARRAY>,
<PROC> <NAME><FORMAL>,
"END"<OFPROG>,
"BEGIN";
<INSTR> = <NAME><ACTUAL><ASSIGN>,
'->'<CONST>,
"START",
"RETURN",
"RESULT"'='<EXPR>,
"STOP";
<EXPR> = <UNARY><OPERAND><EXPRREST>;
<UNARY> = '-','\','+', ;
<OPERAND> = <NAME><ACTUAL>,<CONST>,'('<EXPR>')';
<EXPRREST> = <OP><OPERAND><EXPRREST>, ;
<OP> = '<<','>>','&','!!','!','**','/','*','+','-';
<COND> = <TEST><CONDREST>;
<TEST> = <EXPR><COMP><EXPR>,'('<COND>')';
<COMP> = '=','#','<=','<','>=','>';
<CONDREST> = "AND"<TEST><ANDCOND>,"OR"<TEST><ORCOND>, ;
<ANDCOND> = "AND"<TEST><ANDCOND>, ;
<ORCOND> = "OR"<TEST><ORCOND>, ;
<ELSE> = "ELSE"<INSTR>, ;
<ACTUAL> = '('<EXPR><EXPRS>')', ;
<EXPRS> = ','<EXPR><EXPRS>, ;
<ASSIGN> = '='<EXPR>, ;
<ARRAY> = "ARRAY"<NAME><NAMES>'('<EXPR>':'<EXPR>')',<NAME><NAMES>;
<NAMES> = ','<NAME><NAMES>, ;
<PROC> = "ROUTINE","INTEGERFN";
<FORMAL> = '('"INTEGER"<FORM><NAME><NAMES><FORMALS>')', ;
<FORM> = "ARRAYNAME","NAME", ;
<FORMALS> = ','"INTEGER"<FORM><NAME><NAMES><FORMALS>, ;
<OFPROG> = "OFPROGRAM", ;
figure 3.2
Syntax Reduction
When the syntax definitions are read in, they are reduced to a numerical form
to suit the syntax analyser. The keywords are also picked out and built up into
a dictionary for the benefit of the lexical analyser. This dictionary is a tree
structure of cells which each represent an individual keyletter. When two or
more keywords start with the same keyletters, the same cells are used for the
letters in common. For example, the keywords %START and %STOP would be
represented by the structure shown in figure 3.3.
16
+---+---+---+
| S | | |
+---+-+-+---+
|
+-+-+---+---+
| T | | |
+---+-+-+---+
|
+-+-+---+---+ +---+---+---+
| A | | |---->| O | | |
+---+-+-+---+ +---+-+-+---+
| |
+-+-+---+---+ +-+-+---+---+
| R | | | | P | | |
+---+-+-+---+ +---+---+---+
|
+-+-+---+---+
| T | | |
+---+-+-+---+
figure 3.3
As can be seen, the letters S and T share the same cells. Each cell has two
links in addition. The first links together cells containing successive letters
in keywords and the second links together cells containing alternative letters.
Thus, the cell containing O is linked from the cell containing A through the
second link since A and O are the alternative letters which may follow T. In
addition to the three items already described, ther is a further item in the
cells at the leaves of the tree which contains an identification number of the
keyword which terminates there. These numbers are assigned as the dictionary is
created and range from 128 upwards in order to distinguish them from normal
characters at the lexical analysis stage. To improve dictionary searching
efficiency, the first letter of each keyword is used to index into an array of
26 cells. Tree searching only takes place thereafter.
Just as keywords are assigned identification numbers, so are phrase names.
Phrases <NAME> and <CONST> are assigned the numbers 256 and 257 respectively,
and other phrases the numbers 258 upwards. These numbers are used in the
reduced numerical form of the syntax definitions. In general, the reduced form
of a definition is as shown in figure 3.4:
+-----------------------> +----------------------> +-- - - -->
| | | | | |
+-+-+-----+---------------++-++-----+---------------++-++-- - --+-+-+
| | NT1 | alternative 1 | | NT2 | alternative 2 | | | 0 |
+---+-----+---------------+---+-----+---------------+---+-- - --+---+
figure 3.4
The zero signifies the end of the definition. The contents of the alternatives
fields in the diagram correspond one for one with the items in the definition
read in. They are either the numerical equivalents of the literal characters or
keyword or phrase identification numbers. NT1, NT2 etc. are the number of
phrase items in that alternative. This is of relevance to the syntax analyser.
The definitions reduced in this way are stored one after another in an array as
shown in figure 3.5.
17
1 4 1 259 11 3 135 260 145 259 261 16 2 257 58 258 20
17 1 133 261 24 1 136 262 29 3 263 256 264 33 1 131 265
33 36 0 130 0 42 3 256 266 267 47 1 45 62 257 50 0
49 143 53 0 140 58 1 141 61 268 61 0 144 0 67 3 269
65 270 271 0 71 0 45 74 0 92 77 0 43 79 0 0 84
81 2 256 266 87 1 257 92 1 40 268 41 0 98 3 272 270
97 271 100 0 0 105 0 60 60 109 0 62 62 112 0 38 116
113 0 33 33 119 0 33 123 0 42 42 126 0 47 129 0 42
129 132 0 43 135 0 45 0 140 2 273 274 0 146 3 268 275
145 268 151 1 40 260 41 0 155 0 61 158 0 35 162 0 60
161 61 165 0 60 169 0 62 61 172 0 62 0 178 2 128 273
177 276 183 2 138 273 277 185 0 0 191 2 128 273 276 193 0
193 0 199 2 138 273 277 201 0 0 206 1 132 259 208 0 0
209 215 2 40 268 278 41 217 0 0 223 2 44 268 278 225 0
225 0 230 1 61 268 232 0 0 243 4 129 256 279 40 268 58
241 268 41 247 2 256 279 0 253 2 44 256 279 255 0 0 259
257 0 142 263 0 136 134 0 273 4 40 136 280 256 279 281 41
273 275 0 0 280 0 129 137 283 0 137 285 0 0 294 4 44
289 136 280 256 279 281 296 0 0 300 0 139 302 0 0
figure 3.5
The identification numbers of the keywords and phrases corresponding to this are
shown in figure 3.6. This figure also gives the starting index positions in the
array of each of the phrase definitions.
Keywords Phrases
128 AND 256 0 NAME
129 ARRAY 257 0 CONST
130 BEGIN 258 1 STATEMENT
131 END 259 37 INSTR
132 ELSE 260 136 COND
133 FINISH 261 202 ELSE
134 FN 262 233 ARRAY
135 IF 263 256 PROC
136 INTEGER 264 264 FORMAL
137 NAME 265 297 OFPROG
138 OR 266 209 ACTUAL
139 OFPROGRAM 267 226 ASSIGN
140 RETURN 268 62 EXPR
141 RESULT 269 68 UNARY
142 ROUTINE 270 80 OPERAND
143 START 271 93 EXPRREST
144 STOP 272 101 OP
145 THEN 273 141 TEST
274 173 CONDREST
275 152 COMP
276 186 ANDCOND
277 194 ORCOND
278 218 EXPRS
279 248 NAMES
280 276 FORM
281 286 FORMALS
figure 3.6
If any faults are discovered in the syntax definitions by the reduction routine
18
the compiler will halt without attempting to compile the source.
Implementation Details
The routines which implement everything described in this chapter are
contained in the file 'SKIMPA'. The routine which reads in the syntax
definitions is named 'READ PS' and it takes the definitions from a file named
'SKIMPPS'. The resultant reduced form is stored in the array named 'PS'. The
positions in this where phrase definitions start are given by the array 'PP'
which is indexed by phrase identification number. The information given in
figures 3.2, 3.5 and 3.6 is output to a file named 'SKIMPPSL'.
The keyword dictionary is formed from cells which are records of the record
array named 'KD'. Their format is:
%RECORDFORMAT KDF(%INTEGER L,N,A,B)
'L' represents the key letter, 'N' the identification number at a leaf cell and
'A' and 'B' the two links used in the tree structuring. The creation of the
dictionary is slightly involved since keywords in the dictionary have to be the
'lowest common denominator' of the keywords in the syntax description. This
means that keywords may have to be split up if a new keyword is encountered
which is a prefix of an existing one. A local string array 'KA' of keywords is
used to aid this process.
The main control loop of the compiler which is illustrated in figure 3.1 is
implemented in the main routine named 'SKIMP'. The box 'generate code'
corresponds to routine 'STATEMENT' in file 'SKIMPB'. Subroutines used by this
routine appear in the other files 'SKIMPC', SKIMPD' and 'SKIMPE'. The general
strategy in processing statements is to break them down into component syntactic
parts and process each of these simpler parts individually. At the highest
level, each alternative form in the definition of phrase <STATEMENT> is
processsed separately by executing a switch on the alternative number in the
analysis record of the statement (see next chapter). This is also done at the
lower hierarchical levels.
19
Chapter 4
Line Reconstruction, Lexical and Syntax Analysis
Line Reconstruction
Line reconstruction is the first phase of processing of the raw source text
and is a simple cleaning up operation. Its functions are:
1. to mark keyword letters
2 to remove spaces
3. to concatenate continuation lines
4. to discard comments
Since the use of '%' as a prefix can lead to many different ways of
representing keywords (c.f. %INTEGERARRAYNAME above), this variability must be
removed before they can conveniently be processed any further. This is done
during line reconstruction by deleting the '%' and adding the value 128 to each
keyletter.
The only significance of spaces is in terminating keywords. Hence, after
keyletters have been marked spaces are discarded. All symbols within quotes
(and this can include spaces and newlines) are left intact, however.
One statement at a time is read from the source file i.e. up to the first
semi-colon or newline. Any number of lines may be concatenated using the '%C'
at the end of a line subject only to a limitation of 256 characters in total.
This is dealt with by looking at the previous character whenever a newline is
encountered. If it is the value 'C'+128 i.e. a '%'-shifted 'C', then it and the
newline are discarded and processing continues with the next line. If the first
character of a statement is a '!' the statement is regarded as a comment and
discarded.
Lexical Analysis
The cleaned up text received from the line reconstruct phase is next
processed to find lexical objects. These are:
1. keywords
2. names
3. constants
The start of any lexical object is determined by inspecting each character in
turn. A value greater than 128 indicates the start of a keyword. The adjacent
characters also greater than 128 are concatenated into a string which is then
looked up in the keyword dictionary described previously. The identification
number thus produced (ranging from 128 upwards) is stored in the lexical output
array in place of the keyletters. Potentially more than one identification
number can result from a compound keyword. In this case the look up routine is
called repeatedly until all the keyletters have been identified.
If a character value lies between 'A' and 'Z', it and the following letters
and digits comprise a name which is then looked up in a dictionary. If the name
20
is already in the dictionary then its existing identification number is used.
If it is not already in, it is inserted and a new identification number
generated for it. The output to the lexical array consists of the number 256
(the phrase number of <NAME>) followed by the identification number of the name.
The identification numbers are the positions at which the names are stored in
a 'hash table'. The hash table operates as follows. A calculation is performed
on the characters of the name itself to produce a 'hash index'. If this
position is not already occupied, the name is inserted at that point. If the
position is already occupied the name stored there may or may not be the same as
the new name being looked up. If it is the same then no further action is
required. If it is not, a 'clash' has occurred which must be resolved.
Numerous methods of doing this have been invented but the simplest possible one
is used here. This is to scan forward cyclically inspecting each successive
position in turn. If the new name is encountered in any of these positions,
again no further action is required. If, however, an unoccupied position is
encountered then the new name cannot be in already and so can be inserted there.
The requirements of the hash calculation are that it should be fast and that
as far as possible different names should produce different indexes. This is,
of course, unattainable in general since there are many more potential names
than indexes but a fairly good attempt can be made. The scheme used in this
compiler packs character values into a word, divides this word by a prime number
just less than the size of the hash table and takes the remainder as the hash
index.
Constants start with either a digit or a single quote character. The lexical
action in this case is to evaluate the constant and to insert the number 257
(for phrase <CONST>) followed by the value into the lexical array. For
character constants the value consists of the individual character values packed
into 8-bit fields of a word right justified.
Characters which are not part of keywords, names or constants are passed
through to the lexical array unchanged. Since characters have values less than
128, the type of a lexical item can be determined from the range it lies in.
Syntax Analysis
The compiler uses a top-down recursive descent method of syntax analysis
which is directed by the table of reduced syntax definitions described earlier.
This analyses the output of the lexical phase with respect to the phrase
<STATEMENT>. The algorithm is implemented as a routine which considers just one
phrase definition. Where literal characters or keywords appear as items in the
definition they are compared directly with the lexical information and where a
sub-phrase appears as an item, the routine is called recursively to match it.
The technique involved in matching a phrase is essentially a search in that
each alternative in the phrase definition is considered in turn. Similarly, the
items in an alternative are considered one after another. An alternative
matches if all the items in it have been matched. If any item in the definition
does not match, the whole alternative in which that item appears is abandoned
and the algorithm moves on to try the next alternative. A flow diagram of the
algorithm is given in figure 4.1. Figure 4.2 shows the stages in the analysis
of the statement:
%INTEGER I,J
21
|
compare phrase |
+ - - - - - - - - - - - - - - - -|- - - - - - - - - - - +
| | |
+---------------+
| +------------------>| another |--------------|-------->
| | alternative ? | no failure
| | +---------------+ |
| yes |
| | | |
| +-----------+
| | +--------------->| another |----------------|-------->
| | | item ? | no success
| | | +-----------+ |
| | yes |
| | | | |
| | +---------------+
| | | | phrase ? | |
| | +---------------+
| | | s | | no |
| | | |
| | | + - - - - - -+ +------------+ |
| | | compare | | item |
| | | | phrase | | match ? | |
| | + - - - - - -+ +------------+
| | | success | | failure yes | | no |
| | | | | |
| | +<---------+--|-----------------+ | |
| | |
| +<-----------------+--------------------+ |
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - +
figure 4.1
The restrictions upon the form phrase definitions may take which were
mentioned in the previous chapter arise from this comparison algorithm.
Consider the definition:
<N> = <N><DIGIT> , <DIGIT> ;
If the analyser were to use this a recursive loop would result since <N> appears
as the first item in the first alternative. Similarly, consider the definition:
<N> = <DIGIT> , <DIGIT><N> ;
In this case, the analyser would only be able to recognise single digit numbers
since the first alternative would always prove successful and the second
alternative would never be tried. Situations of these kinds must be avoided.
Fortunately this usually presents no difficulties.
From the efficiency point of view, it is also desirable to avoid other forms
of definition. An example might be:
<EXPR> = <OPERAND><OP><EXPR> , <OPERAND> ;
The inefficiency results from <OPERAND> being the first item in both
alternatives. Clearly, whenever the second alternative is recognised ( the last
22
compare <STATEMENT>
1st alternative
1st item = <INSTR>
compare <INSTR>
1st alternative
1st item = <NAME> / no match
2nd alternative
1st item = '-' / no match
3rd alternative
1st item = "START" / no match
4th alternative
1st item = "RETURN" / no match
5th alternative
1st item = "RESULT" / no match
6th alternative
1st item = "STOP" / no match
failure to match <INSTR>
2nd alternative
1st item = "IF" / no match
3rd alternative
1st item = <CONST> / no match
4th alternative
1st item = "FINISH" / no match
5th alternative
1st item = "INTEGER" / match
2nd item = <ARRAY>
compare <ARRAY>
1st alternative
1st item = "ARRAY" / no match
2nd alternative
1st item = <NAME> / match (I)
2nd item = <NAMES>
compare <NAMES>
1st alternative
1st item = ',' / match
2nd item = <NAME> / match (J)
3rd item = <NAMES>
compare <NAMES>
1st alternative
1st item = ',' / no match
2nd alternative
null item / match
successful match of <NAMES>
no more items
successful match of <NAMES>
no more items
successful match of <ARRAY>
no more items
successful match of <STATEMENT>
figure 4.2
23
operand of an expression) phrase <OPERAND> will have been recognised twice.
This could have been avoided by writing the definition of <EXPR> in the
following way:
<EXPR> = <OPERAND><REST> ;
<REST> = <OP><OPERAND><REST> , ;
Definitions with only one alternative are, strictly, unnecessary and result
in the inefficiency of an extra recursive call, but are occasionally used for
the sake of clarity. <EXPR> and <COND> are examples in the SKIMP syntax.
Efficiency is also partly the reason for recognising names and constants at
the lexical stage. If their recognition were left to this algorithm, two
phrases such as:
<LETTER> = 'A' , 'B' , 'C', . . . . 'Z' ;
<DIGIT> = '0' , '1' , '2' , . . . . '9' ;
would probably have to be introduced. Since the algorithm searches through the
alternatives, the recognition of letters and digits would be very slow.
Also in view of the searching, it will clearly be beneficial to put the most
commonly occurring forms of alternative first in the definition of a phrase. In
particular, <STATEMENT> and <INSTR> are ordered in what is hopefully a sensible
way with this consideration in mind.
Analysis Records
During the course of syntax analysis, a record is kept of which phrases have
been recognised. This information is passed to the code generation phase of the
compiler and directs its course of action. The record consists of a
linearisation of the analysis tree corresponding to the statement recognised.
For instance, the analysis tree corresponding to the example above would take
the form shown in figure 4.3.
<STATEMENT> / 5
| |
| +-------+
| |
"INTEGER" <ARRAY> /2
| |
| +----------+
| |
<NAME> <NAMES> / 1
| | | |
| +----+ | +------+
| | | |
'I' ',' <NAME> <NAMES> /2
| |
| |
'J' null
figure 4.3
The number after the each phrase name in the tree is the number of the
24
alternative in its definition which was matched. These numbers and the pointers
emanating from each phrase comprise the information stored in the linearised
form of analysis record. For each phrase, the record takes the form shown in
figure 4.4:
+--------------------+-----------+-----------+-----------
| alternative number | pointer 1 | pointer 2 | . . . .
+--------------------+-----------+-----------+-----------
figure 4.4
The information is pruned to a certain extent, however. There are only pointers
for the phrase items matched. All reference to keywords, literals and null
alternatives is omitted since their presence can be inferred from the
alternative numbers. Phrases <NAME> and <CONST> are treated specially since
they were recognised at the lexical stage. To be consistent with other phrases,
both are given an alternative number matched of 1. In addition, for <NAME>, the
following word is set to contain the identification number of the name and for
<CONST>, the following word is set to the value of the constant. The analysis
record for the axample is therefore as shown in figure 4.5:
+--------------> +--------------->
+--> +---|--> | +----|---> |
| | | | | | | | | |
+---+---+---+---+---+---+-------+---+----+----+---+-------+---+
| 5 | 3 | 2 | 6 | 8 | 1 | ID(I) | 1 | 11 | 13 | 1 | ID(J) | 2 |
+---+---+---+---+---+---+-------+---+----+----+---+-------+---+
figure 4.5
The phrase names themselves also do not appear in the record as can be seen
above. The pointers are just indexes into the array in which the record is
stored. To aid clarity, when the analysis record is printed out by the compiler
(the ANAL option) the positions at which phrases start are indicated by inserts
such as:
(1/STATEMENT) (3/ARRAY) (8/NAMES)
Implementation Details
Line reconstruction and lexical analysis are implemented together in the one
routine which is named 'READ STATEMENT' and which appears in file 'SKIMPA'.
Subsidiary to this are routines which deal with each kind of lexical object.
These are named 'KEYWORD', 'NAME' and 'CONST'. Reconstructed text is placed in
an array named 'TT' and this is lexically reduced into an array named 'T'. The
name dictionary which is created during lexical processing comprises an array
named 'NAMEDLINK' and a byte array named 'NAMED'. The former is used as the
hash table and the entries in it point to positions in the latter array where
the characters of the names are stored as concatenated strings. For example,
figure 4.6:
25
+----+-+----+-+----+-+------+-+--------+
| | | | | | | | | |
+----+++----+++----+++------+++--------+
| | | |
+---------+ +----|------+ +--+
| | +----+ |
| | | |
+-+-----------+---------+-------+-----------------
| 5 H A R R Y 4 D I C K 3 T O M 4 F R E D . . . .
+-------------------------------------------------
figure 4.6
The syntax analysis routine is named 'COMPARE', also in file 'SKIMPA', and is
a function which returns the value 1 for success and 0 for failure. It takes as
its single parameter the identification number of the phrase to be matched in
this call. Hence, it is initially called with the number of phrase <STATEMENT>.
Since <STATEMENT> is assumed always to be the first phrase definition to be read
in, this number is 258. Phrase numbers 256 and 257, <NAME> and <CONST>
respectively, are picked out on entry to the routine and treated as literals
since they have already been recognised. The analysis record is placed in an
array named 'A'.
Processing Analysis Records
The basis of the processing rests on knowing the starting position in the
analysis record of each phrase which is of concern. The values in these
positions indicate the alternative which was actually recognised by the
comparison routine and it is this information which directs the code generation.
The form of analysis record facilitates finding out where the phrases start.
These are simply the values of the pointers which follow each alernative number.
Consider, for example, the phrase structure associated with the declaration of
variables:
<STATEMENT> = . . . . . ."INTEGER"<ARRAY>, . . . .
<ARRAY> = "ARRAY"<NAME><NAMES>'('<EXPR>':'<EXPR>')',<NAME><NAMES>;
The analysis record for:
%INTEGER I, J, K
would be:
5 3 2 6 8 1 ID(I) 1 11 13 1 1D(J) 1 16 18 1 ID(K) 2
The compiler fairly consistently uses names for the positions of where phrases
start in the analysis record which consist of the name of the phrase with 'P'
(for 'pointer') appended. In the case of phrase <STATEMENT> the variable is
'STATEMENTP' and it is the value in the analysis record at that point which
causes the main switch in routine 'STATEMENT'. For the positions of phrases
<ARRAY>, <NAME> and <NAMES>, the variables are 'ARRAYP', 'NAMEP' and 'NAMESP'
respectively. The following IMP code shows how the values for these are set up
and the declaration of scalars is handled:
26
ARRAYP=A(STATEMENTP+1)
NAMEP=A(ARRAYP+1)
NAMESP=A(ARRAYP+2)
%IF A(ARRAYP)=2 %START
%CYCLE ;! for each name declared
PUSHTAG(A(NAMEP*1), etc. )
NEXTRAD(LEVEL)= etc.
%IF A(NAMESP)=2 %THEN %EXIT
NAMEP=A(NAMESP+1)
NAMESP=A(NAMESP+2)
%REPEAT
%FINISH %ELSE %START
! array declarations
. . . .
The routine call on 'PUSHTAG' and the following line deal with the declaration
of each name. This is described in more detail in chapter 6. Each time round
the loop the variables 'NAMEP' and 'NAMESP' are updated for the next name and
the exit from the loop takes place when the null alternative of phrase <NAMES>,
alternative 2, is encountered.
27
Chapter 5
Storage Organisation
Layout of Store
The compiler produces object code which assumes the layout of store shown in
figure 5.1. The object code itself starts at address zero and the table of
constants and run-time stack follow consecutively. If it was so desired,
however, the code could easily be made relocatable and the two other areas could
be located elsewhere, for instance in separate segments on a machine with such
an architecture.
0 +------------------+
| | |
v | |
| Object Code |
| |
| |
+------------------+
| |
| |
| Table of |
| Constants |
| |
| |
+------------------+
| |
| |
| Run-time |
| Stack |
| |
| |
+------------------+
figure 5.1
The entry point to the object code is at address zero. The code dumped at
this position sets the register 'COT' to the base of the table of contants and
also sets the first display register, 'DR1', to the start of the stack (see
below). Accesses to data are always made via index registers.
Run-time Storage Allocation and Addressing
The block structure of SKIMP leads naturally to the use of a stack for the
allocation of storage. In particular, the possibly recursive calling of
routines and functions is most conveniently dealt with using a stack. Storage
for local variables and arrays is allocated when the routine or function is
entered and deallocated when it is left. These areas are sometimes known as
'local name spaces'. Return addresses can also be preserved on the stack.
Consider a number of routines 'A', 'B', 'C' etc. and the storage allocation
for variables and arrays in those routines. Suppose routine 'A' is called which
then calls 'B' and which then calls 'C'. A picture of the stack would be:
28
-----+------------------+-------------+-------------+-----
| storage for A | for B | for C |
-----+------------------+-------------+-------------+-----
For calls in a different order, say 'A' called 'C', 'C' called 'D' and 'D'
called 'B', the picture would be:
-----+-----------+-----------+-----------+-----------+-----
| A | C | D | B |
-----+-----------+-----------+-----------+-----------+-----
Similarly recursive calls, say 'B' called 'A', 'A' called 'A', 'A' called 'C'
and 'C' called 'B':
--+-----------+------------+-----------+-----------+------------+--
| B (1) | A (1) | A (2) | C | B (2) |
--+-----------+------------+-----------+-----------+------------+--
Note that completely new storage areas are allocated for recursive calls. A
result of this 'allocate when called' scheme is that the storage for routine is
liable to start anywhere on the stack and not in a fixed place. The compiler
cannot therefore assign addresses to the variables as it comes across them in
declarations at compile-time. However, it can assign them addresses relative to
the start of the storage area for that routine and this it does. To access a
variable at run-time therefore requires both this 'relative address' and the
address of the start of the appropriate local name space. These start addresses
can be kept in registers thus allowing the 'base register and displacement' form
in a single object instruction to be used to access variables.
A register need not be allocated to each routine in the program, however.
Only those which are currently active i.e. called, and which contain variables
in scope at the current point of execution of the program, need have
corresponding registers. Since only one routine can be in scope at each textual
level i.e. routine nesting depth, only one register per textual level is needed.
This set of registers is known as a 'Display'. For machines with no or only a
few index registers, a display can be kept in store rather than registers but
then extra overheads are involved whenever variables are accessed. A typical
situation is that in which there are sufficient registers for one to be
permanently allocated to the most recent local name space and perhaps another to
be used for all global accesses. The loss of efficiency here is usually fairly
acceptable. No such difficulties arise with the SKIMP machine.
Clearly, the display will need to be updated on routine entry and exit.
Consider the following section of program with two routines at the same textual
level:
%ROUTINE A
. . .
%END
%ROUTINE B
. . .
%END
When 'A' is called, the stack situation will be:
29
--------+-------------------+-------------------------
| A |
--------+-------------------+-------------------------
|
DR-->
where 'DR' is the register allocated to point at the start of 'A's local name
space. When 'B' is called from 'A', the variables in 'A' are no longer in scope
i.e. are inaccessible, so 'DR' can be re-used to point to 'B's local name space:
---+------------------+-------------------+-------------
| A | B |
---+------------------+-------------------+-------------
|
DR-->
When 'B' returns, 'A's variables must again become accessible. Hence, the
previous value of 'DR' must be preserved. This will be true for all the
registers of the display. Their values must be preserved when they are updated
so that they can eventually be restored. The stack can also be used for this
purpose. The object code the SKIMP compiler produces puts preserved display
values in the first location of each local name space followed by the return
address to the calling routine. Storage for variables starts after that.
This discussion leads to the following scheme for updating the display. The
display is represented by the registers 'DR1', 'DR2', 'DR3', etc. and the
register which is used to point to the start of unallocated storage on the stack
is named 'STP', standing for STack Pointer. The call of a routine or function
is made by means of an instruction of the form;
BAL, WK, , (start address)
which leaves the return address in a 'work' register named 'WK'. The code at
the point of entry to a routine is:
STR, DR(level of routine), STP, 0
LDA, DR(level of routine), STP, 0
STR, WK, STP, 1
ADD, STP, , (storage area size)
At a point of exit i.e. %RETURN, %RESULT=, or %END, the code is:
LDA, STP, DR(level of routine), 0
LOAD, DR(level of routine), STP, 0
LOAD, WK, STP, 1
B, , WK, 0
Array Addressing
As mentioned previously, variables are allocated locations relative to the
start of a local name space as they are declared. A problem arises for arrays
since they may have bounds which are evaluated at run-time which means that the
amount of space they need is not known at compile-time. If the space for arrays
and variables were allocated in the order of declaration, variables declared
after an array would not have a fixed relative address. The solution is to
30
allocate the space for arrays after that for the variables as shown in the
diagram:
--+--------+-------------+-----------------+------------------+--
| old DR | return addr | static area | dynamic area |
--+--------+-------------+-----------------+------------------+--
|
DR-->
Each array is now allocated a single location in the static area and this points
to the place in the dynamic area where the storage for the array is located.
For example;
%INTEGER I, J
%INTEGERARRAY A, B( ? : ? )
%INTEGER K, L
+------------------------->
+-----|-------------> |
| | | |
-+--+--+-----+-----+--+--+--+--+-----+-----+-----------+-----------+-
| | | I | J | | | K | L | A | B |
-+--+--+-----+-----+-----+-----+-----+-----+-----------+-----------+-
| |
<--DR STP-->
When register 'STP' os incremented at routine entry, it is just incremented
past the static storage area. The object code output for an array declaration
allocates space beyond 'STP' and moves 'STP' up as appropriate for the size of
the array. Variables in the static area have an address:
DR(textual level of declaration) + relative address of variable
which allows them to be accessed in one instruction. Array elements on the
other hand can only be accessed indirectly via the pointer. If this points to
the position of the possibly hypothetical zero element of the array rather than
the first actual location, the address of an element such as:
A( expression )
will be:
[ DR(level) + rel. addr. of pointer ] + value of expression
where the square brackets indicate 'contents of'.
In addition to incrementing 'STP' the object code for an array declaration
must also fill in this zero address. For the general case of an array
declaration:
%INTEGERARRAY A,B,....( lower : upper )
the form of the object code is:
31
code to evaluate 'lower' in 'ACC'
STR, ACC, DR(level), temp1
code to evaluate 'upper' in 'ACC'
LDA, ACC, ACC, 1
STR, ACC, DR(level), temp2
SUB, STP, DR(level), temp1
STR, STP, DR(level), rel. addr. of 'A's pointer
ADD, STP. DR(level), temp2
SUB. STP, DR(level), temp1
STR, STP. DR(level), rel. addr. of 'B's pointer
ADD, STP. DR(level), temp2
. . . . .
'temp1' and 'temp2' are temporary working storage locations set aside in the
local name space to hold the values of 'lower' and 'upper+1' respectively.
(Similar working storage is also used for partial results when evaluating
expressions.) For each array declared the value 'STP-lower' before 'STP' is
incremented gives the zero element address and that plus 'upper+1' gives the
final value of 'STP' having allowed space of length 'upper-lower+1' for the
array.
Parameter Passing
Parameters are regarded as initialised local variables of routines in SKIMP.
Passing parameters for a routine call therefore consists of setting these
initial values. This is quite straightforward since the the parameters will
have the first relative addresses to be allocated in the static area of the
routine's local name space. Before the entry code has been executed, the start
of the future local name space will pointed to by the current value of 'STP'.
This allows the values of the parameters to be stored using addresses relative
to 'STP' before the 'BAL' is executed.
The three types of parameter are each treated differently. For %INTEGER
value type parameters, the expression in the call is evaluated in register 'ACC'
and then stored in the appropriate place. For %INTEGERNAME type parameters, the
address of the variable or array element is evaluated in register 'ACC' and then
stored. Thereafter, the parameter has to be accessed indirectly via this
address. For %INTEGERARRAYNAME type parameters, the word containing the zero
element address of the array in the call is copied to the parameter location.
This allows accesses to array name parameter elements to be handled by the
compiler in exactly the same way as those to ordinary array elements.
To illustrate the scheme, consider the stack when routine 'R' in the
following program is called:
%BEGIN
%ROUTINE R(%INTEGER X, %INTEGERNAME Y,%INTEGRRARRAYNAME Z)
. . . . .
%END
%INTEGERARRAY A ( 0 : ? )
%INTEGER I, J, K
R(I+J, K, A)
. . . . .
32
+------------------->+<-------------------------------+
| <-|--------------------------+ |
| | | | |
-+--+--+-----+-----+-----+-----+-----------+--+--+-----+-----+-----+--
| | | | I | J | K | A | | | X | Y | Z |
-+--+--+-----+-----+-----+-----+-----------+--+--+-----+-----+-----+--
| (A0) | (I+J) (A0)|
<--DR1 DR2--> STP-->
The object code for such a call would be:
LOAD, ACC, DR1, 3 \ I+J
ADD, ACC, DR1, 4 /
STR, ACC, STP, 2 -->
LDA, ACC, DR1, 5 \ name
STR, ACC, STP, 3 /
LOAD, ACC, DR1, 2 \ array name
STR, ACC, STP, 4 /
BAL, WK, , (start address of R)
If a variable of %INTEGERNAME type i.e. a parameter, is passed as the actual
parameter for another %INTEGERNAME parameter, the word containing the reference
to the original actual parameter is copied to the position for the new parameter
thereby maintaining just one level of indirection to reach the required address.
Implementation Details
Declarations of both scalar variables and arrays are processed as alternative
5 of phrase <STATEMENT> in file 'SKIMPB'. The routine 'PUSHTAG' which records
details of declarations for future use is described in the next chapter.
Relative addresses in the static area of any local name space are allocated
sequentially. However, routines may be nested and may appear in the middle of
the variable declarations of the enclosing routine. Hence, the next relative
address to be allocated must be retained for each textual level up to the
current level to allow for this. These values are stored in the array 'NEXTRAD'
which has an element for each possible textual level. Since the compiler
variable 'LEVEL' always contains the value of the current textual level, the
assignment of relative addresses always uses 'NEXTRAD(LEVEL)'.
Routine 'GET WORK' allocates locations in the local name space for temporary
storage of partial results, in this instance in connection with array
declarations. They are handled in exactly the same way as scalar declarations.
Routine entry and exit sequences are dumped by the routines 'ENTER' and 'DUMP
RETURN' which appear in file 'SKIMPD'. Note that 'ENTER' also handles the
'%BEGIN' statement since this is similar in many respects.
33
Chapter 6
Identifier Tag Handling
All identifiers in SKIMP either have to be declared or have to be specified
before they can be used. This is to enable the compiler to generate the
appropriate code when the identifier is later encountered. In practice this
means that the compiler must preserve information about each identifier as it is
declared or specified and this will be referred back to each time the identifier
is used. This information on each identifier is often called a 'tag'. There
are several different items of information which are packed together to form the
tag which are:
1. Form of identifier
2. Type of identifier
3. Number of parameters (or dimensions)
4. Textual level of declaration
5. Relative address or start address
1. The form of an identifier is intended to differentiate between scalar
variables and arrays, value and reference types of parameter, and data and
program objects.
2. The type of an identifier specifies whether it is %INTEGER in nature or
not. (In extensions to SKIMP the type would distinguish between %INTEGER, %REAL,
%COMPLEX, %STRING etc.)
The various forms and types in SKIMP as at present implemented are given the
code values shown below:
form type
%INTEGER 0 1
%INTEGERNAME 1 1
%INTEGERARRAY 2 1
%INTEGERARRAYNAME 3 1
%ROUTINE 4 0
%INTEGERFN 4 1
3. If SKIMP implemented arrays of more than one dimension, the number of
dimensions would have to be saved. As this is not the case this information is
redundant. The corresponding information for routines and functions is needed,
however. This is the number of parameters they take. Information about the
parameters occupies further tag words using a scheme described later.
4. The textual level of declaration of each name must be saved since this is
used to select the appropriate display base register for addressing purposes.
5. To complete the information needed for addressing variables and arrays,
the relative addresses assigned must be saved. For routines and functions the
address of their entry points must be stored so that the correct address can be
output for the BAL instructions.
These items are packed into a single word in the following format:
34
+--------+--------+----------+---------+----------------------+
| form | type | params | level | address |
+--------+--------+----------+---------+----------------------+
(4) (4) (4) (4) (16)
As an example, consider the program:
%BEGIN
%INTEGER I,J
%ROUTINE R(%INTEGERNAME K)
%INTEGERARRAY A(1: 10)
.......
%END
.......
%ENDOFPROGRAM
The tag words for these names would be:
+---+---+---+---+-----------+
I | 0 | 1 | 0 | 1 | 2 |
+---+---+---+---+-----------+
+---+---+---+---+-----------+
J | 0 | 1 | 0 | 1 | 3 |
+---+---+---+---+-----------+
+---+---+---+---+-----------+
R | 4 | 0 | 1 | 1 | (addr) |
+---+---+---+---+-----------+
+---+---+---+---+-----------+
K | 1 | 1 | 0 | 2 | 2 |
+---+---+---+---+-----------+
+---+---+---+---+-----------+
A | 2 | 1 | 0 | 2 | 3 |
+---+---+---+---+-----------+
Values of tags can be inspected in the output from the compiler by using the
option '/TAGS' when calling the compiler. They are printed out (in hexadecimal)
at the %END of every routine and function and at %ENDOFPROGRAM for the names
that were locally declared. An example of this can be found in appendix 1.
The packing of the tag word places some limits on the programs the compiler
can compile, for instance only up to fifteen routine parameters and only up to
fifteen textual levels, but these are not serious limits for this compiler. (In
fact, less than fifteen textual levels are permitted due to lack of registers.)
This word of information has to be made available whenever the name it
corresponds to is used. Since names are identified by a dictionary position
i.e. hash position, a further array is used in parallel. This contains pointers
to the list cells which contain the tag information. This is be illustrated in
figure 6.1 for the preceding example again.
35
+---+---+ TAG LINK
| | | v v
+---+---+ +---+---+---+---+------++------+
+-----------| | |------->| 2 | 1 | 0 | 2 | 3 || 0 |
| +---+---+ +---+---+---+---+------++------+
| | | |
+---+ | +---+---+ +---+---+---+---+------++------+
| I |<--|-----------| | |------->| 0 | 1 | 0 | 1 | 2 || 0 |
+---+ | +---+---+ +---+---+---+---+------++------+
| J |<--|-------+ | | |
+---+ | | +---+---+ +---+---+---+---+------++------+
| R |<--|---+ +---| | |------->| 0 | 1 | 0 | 1 | 3 || 0 |
+---+ | | +---+---+ +---+---+---+---+------++------+
| K |<--|---|---+ | | |
+---+ | | | +---+---+ +---+---+---+---+------++------+
| A |<--+ | +---| | |------->| 1 | 1 | 0 | 2 | 2 || 0 |
+---+ | +---+---+ +---+---+---+---+------++------+
| | | | | |
| | | +---+---+ +---+---+---+---+------++------+
+-------| | |------->| 4 | 0 | 1 | 1 |(addr)|| |--->
+---+---+ +---+---+---+---+------++------+ |
| | | <---------------------------------------+
| | | | +---+---+---+---+------++------+
+---+---+ +--->| 1 | 1 | 0 | 2 | 2 || 0 |
+---+---+---+---+------++------+
figure 6.1
The diagram also shows how the information on parameters is stored. This is
in further list cells beyond that for the name of the routine. In general, the
structure is as shown in figure 6.2. Note that there is now a duplication of
the tag information on the parameter, firstly, in the chain of cells after the
tag for the name of the routine and secondly in the dictionary position for the
parameter name. However, this second occurrence is only present during
compilation of the routine i.e. when the parameter name is in scope, whereas the
first occurrence is present whenever the routine name is in scope i.e. whenever
parameter information is needed to compile the routine calls.
| | |
+---+---+ +------+--+ +-------+--+ +-------+--+
| | |----->| name | |--->| par 1 | |--->| par 2 | |---> . . .
+---+---+ +------+--+ +-------+--+ +-------+--+
| | |
figure 6.2
Only those names which are in scope at the position in the program the
compiler is working should have a tag cell. A tag cell is therefore pushed down
whenever a name is declared and popped up again (together with all the parameter
tag cells for routine names) when it is 'undeclared' i.e. at the end of the
routine or function in which it was declared. This also allows for the
redeclaration of the same name at an inner nested routine level. The tag at the
head of the pushdown list is thus always the one for the latest declaration. If
the redeclaration is a routine name, of course, the original tag is beyond all
the parameter information cells in the list.
Consider the following program:
36
%BEGIN
%INTEGER I
%ROUTINE R
%INTEGERARRAY I(1:10)
......
%END
......
%ENDOFPROGRAM
The tags for this whilst the compiler is compiling routine 'R' will be:
| | |
+---+ +---+---+ +-+-+-+-+---+---+ +-+-+-+-+---+---+
| I |<-----| | |---->|2|1|0|2| 2 | |---->|0|1|0|1| 2 | |
+---+ +---+---+ +-+-+-+-+---+---+ +-+-+-+-+---+---+
| | | | |
2nd declaration 1st declaration
If when a call is popped up, nothing remains on the list i.e. this was not a
redeclaration, then the space in the name dictionary for that name can also be
recovered. This is the case since the dictionary works on a last-in-first-out
basis, a consequence of the nested structure of SKIMP programs. The question
arises as to how the compiler knows which tags to pop up when an %END statement
is reached. An obvious method would be to search through the dictionary for
tags having the appropriate textual level in the level field. A more efficient
method, however, which avoids searching, is to remember in a list the ones which
have to be popped up. Each cell in the list contains the dictionary position of
a name, so that this list can be scanned down at the %END statement and a
'popup' performed for every position mentioned in it (plus extra 'popups' for
routine parameter tag cells). Because of the nesting of routines in SKIMP a
separate list has to be maintained for each textual level. An array with one
position per textual level contains pointers to the heads of these lists as
shown in figure 6.3:
+---+ +---+---+ +---+---+ +---+---+
| |----------->| | |--->| | |--->| | |
+---+ +---+---+ +---+---+ +---+---+
| |-------+
+---+ | +---+---+
| |---+ +--->| | |
+---+ | +---+---+
| | |
| | | +---+---+ +---+---+
| | +------->| | |--->| | |
+---+---+ +---+---+
figure 6.3
Implementation Details
Lists of calls are frequently used by the compiler to maintain current status
information during compilation. Some are used as pushdown popup stacks but not
37
all. A list cell in this implementation consists of two words, one an
information word and the other a link word:
+---------------+---------------+
| information | link |
+---------------+---------------+
Information words are taken from an array named 'TAG' and link words from an
array named 'LINK'. Initially, all these pairs of words are formed into an
'Available Space List' or 'Free List' and thereafter cells are taken from this
list as required and returned to it when no longer needed. The variable
'TAGASL' always points to the head of the available space list:
+---+---+ +---+---+ +---+---+
TAGASL--->| | |--->| | |--->| | |---> . . .
+---+---+ +---+---+ +---+---+
By convention, the link of the last call in any list is set to zero. Links to
other list calls take the form of the index of the words in the 'TAG' and 'LINK'
arrays. These are therefore declared from 1 upwards to avoid conflict. Both of
the arrays are initialised in file 'SKIMPA' to set up the tags and links for
built-in routine names such as 'READ SYMBOL' etc. (The name dictionary is also
initialised for the same reason.)
A function 'NEWTAG', in file 'SKIMPD', takes the cell indicated by 'TAGASL'
from the available space list whenever a cell is required. 'TAGASL' is then
moved down the list to the next one. If ever 'TAGASL' is zero, the compiler
stops with a suitable monitor. The result of the 'NEWTAG' function is the index
of the cell in 'TAG' and 'LINK'. A common sequence of instructions might take
the form:
CELL = NEWTAG
TAG (CELL) = ....
LINK (CELL) = ....
A function named 'RETURNTAG',also in file 'SKIMPD', has the reverse effect.
It takes a single parameter which is the index of the cell to be returned to the
available space list. The result of the function is the value in the 'LINK'
word of the cell returned. This is purely a matter of programming convenience.
The two parallel arrays which form the hash dictionary of names and pointers
to tags are named 'NAMEDLINK' and 'TAGLINK' and point to the 'NAMED' and
'TAG/LINK' arrays respectively.
Routines 'PUSHTAG' and 'POPTAGS', in file 'SKIMPD', push and pop tags from
'TAGLINK' as required. 'PUSHTAG' is called with the identification number of the
name whose tag is being pushed together with it's form, type, etc. as
parameters. A subsidiary function of this routine is to check whether the same
name has already been declared at this level and monitor the fault if so.
'POPTAGS' is called at the %END of each routin e or function to pop all the tags
for names declared at that level, monitoring them if the '/TAGS' option was set.
The array which contains pointers to lists of names declared at each level is
named 'NAMELIST'.
38
Chapter 7
Compilation of Expressions
Although all the registers in the object machine can be used as accumulators
the machine is regarded, for the sake of simplicity, as a single-accummulator
machine and all expressions are evaluated in the one register 'ACC'. This
avoids the necessity of having register to register operations which would be
needed for a true multi-accumulator machine and avoids the extra complexity this
introduces for expression compilation.
A 'tree' algorithm is used in the SKIMP compiler. Many other methods are
possible but this one produces reasonably good object code without too many
redundant 'LOAD' and 'STR' operations. It is also straightforward to
comprehend. Expressions are regarded as trees with operators at the nodes and
operands as leaves. For example, the expression:
I * J - K / (L + M)
has the tree form shown in figure 7.1.
-
|
+-----+-----+
| |
* /
| |
+--+--+ +--+--+
| | | |
I J K +
|
+--+--+
| |
L M
figure 7.1
The algorithm examines the 'shape' of the tree in order to decide the best
order of evaluation of the expression. The commutativity of the operators, that
is, whether 'X?Y' is the same as 'Y?X' or not, is taken into account and also
whether the sub-trees of operator nodes are leaves or not. When the sub-tree is
a leaf, that operand is capable of being immediately loaded into 'ACC' or can
operate immediately on 'ACC'.
When the sub-tree is not a leaf i.e. a further tree, the algorithm works
recursively to compile it. The algorithm can be tabulated for the various
combinations possible, the whole thing being regarded as 'evaluate tree' whose
object is to dump code which puts the result in 'ACC'. The table is shown in
figure 7.2.
This algorithm would not be acceptable if the order of evaluation was defined
as part of the language, for instance strictly left to right. As an example,
the object code is improved by evaluating the second operand first for
non-commutative operators. To illustrate the working of the algorithm, consider
the expression which was used above:
I * J - K / (L + M)
39
+--------------+------------+----------------+-------------------------+
| left branch | operator | right branch | actions |
+--------------+------------+----------------+-------------------------+
| leaf 1 | all | leaf 2 | LOAD,ACC,leaf 1 |
| | | | (operation),ACC,leaf 2 |
+--------------+------------+----------------+-------------------------+
| tree | all | leaf | evaluate tree |
| | | | (operation),ACC,leaf |
+--------------+------------+----------------+-------------------------+
| leaf | comm. | tree | evaluate tree |
| | | | (operation),ACC,leaf |
+--------------+------------+----------------+-------------------------+
| leaf | non-comm. | tree | evaluate tree |
| | | | STR,ACC,(work) |
| | | | LOAD,ACC,leaf |
| | | | (operation),ACC,(work) |
+--------------+------------+----------------+-------------------------+
| tree 1 | all | tree 2 | evaluate tree 2 |
| | | | STR,ACC,(work) |
| | | | evaluate tree 1 |
| | | | (operation),ACC,(work) |
+--------------+------------+----------------+-------------------------+
figure 7.2
The steps, with indentation for recursive depth are shown in below. The
'work' locations used for partial results are in the local name space for the
current routine as was described in chapter 5 in connection with array
declarations.
evaluate tree [ I*J-K/(L+M) ]
tree [ I*J ] all [ - ] tree [ K/(L+M) ]
evaluate tree [ K/(L*M) ]
leaf [ K ] non-comm [ / ] tree [ L+M ]
evaluate tree [ L+M ]
leaf [ L ] all [ + ] leaf [ M ]
LOAD,ACC, L
ADD,ACC, M
STR,ACC, work 1
LOAD,ACC, K
DIV,ACC, work 1
STR,ACC, work 1
evaluate tree [ I*J ]
leaf [ I ] all [ * ] leaf [ J ]
LOAD,ACC, I
MLT,ACC, J
SUB,ACC, work 2
The case of array elements and function calls as operands must be considered
in more detail. These cannot be regarded as leaves of the tree since they
involve further computation which needs register 'ACC'. They are therefore
treated as trees from the point of view of the algorithm. Array element
accessing puts the value of the element in 'ACC' and the result of functions is
also left in 'ACC'. This satisfies the requirements of 'evaluate tree'.
Variables of type %INTEGERNAME i.e. parameters, can still be regarded as
leaves by making use of another index register, 'WK', for the indirect
40
reference. Constants are leaves of trees and, in general, are accessed from the
table of constants following the object code using register 'COT' as the base.
The use of a location in the constant table is avoided if a 'LOAD' of the
constant is required and the value is less than 65536. This is because the
'LDA' instruction can be used with the index register set to zero. For
example,an instruction to 'LOAD' the value 1234 would be:
LDA, ACC, , 1234
The tree representation is that of Reverse Polish form together with
pointers. In Reverse Polish notation the operator follows the operands. In
general,
operand 1 operator operand 2
becomes:
operand 1 operand 2 operator
and the same transformation takes place for the operands when they are
expressions. For example, the infix expression:
I * (J + K) / L - M
becomes:
I J K + * L / M -
The extra pointers that were mentioned come after each operator and
correspond to the branches of the tree. In general the form is:
<-------------------------+
<--------------|----------------------+ |
| | | |
+-----------+ +-----------+ +----------+---+---+
| operand 1 | | operand 2 | | operator | | |
+-----------+ +-----------+ +----------+---+---+
When an operand is a further sub-expression, the pointer points to the
operator in it's repesentation. Otherwise, the pointer points to the first word
of a pair of words which describe the operand. The first of the pair is a type
number and the second is consequent on the value of the first. The type values
are: ^operators
|
-1 : function call
-2 : array element
leaves/ -3 : scalar variable
\ -4 : constant
They are negative to distinguish them from values which denote operators and
hence sub-trees. The values for operators are their alternative numbers from
phrase <OP> (with two additional values beyond for the unary operators '-' and
'\') and are thus always positive. Values greater than or equal to -2 cause the
algorithm to be called recursively, a requirement which was mentioned above.
For types -1 and -2, the second word of the pair contains an analysis record
pointer which will allow the function call or array element accession object
code to be generated when it is required. For type -3, the second word is the
41
tag of the variable and for type -4, the second word is the value of the
constant. As an example, consider the expression:
I + 12345
The representation of this is:
<------------------------+
<--------------|-------------------+ |
| | | |
+------+--------+------+-------+-----+-----+-----+
| -3 | tag(I) | -4 | 12345 | 9 | 1 | 3 |
+------+--------+------+-------+-----+-----+-----+
An example to show sub-trees might be:
(I + 99) * (J - 88)
The representation of this is:
<--------------------+
<-------------------------------|----------------+ |
<-------|-------+ <--------|--------+ | |
<---------|--------|---+ | <---------|---------|---+ | | |
| | | | | | | | | | | |
+---+------+---+----+---+---+---+---+------+---+----+----+---+----+---+---+----+
|-3 |tag(I)|-4 | 99 | 9 | 1 | 3 |-3 |tag(J)|-4 | 88 | 10 | 8 | 10 | 8 | 5 | 12 |
+---+------+---+----+---+---+---+---+------+---+----+----+---+----+---+---+----+
The technique used to convert infix expressions to this form combines a
method of converting them to Reverse Polish with a scheme of pseudo-evaluation
to calculate the values of the pointers. The conversion to Reverse Polish is
shown diagrammatically in figure 7.3. A pushdown stack is used for operators to
take account of their precedence. This delays their storage until the
appropriate place in the output. A new stack is declared at each recursive
level to deal with bracketed sub-expressions. The conversion of the
expression:
I + J * (K - L) / M
would proceed as shown in figure 7.4. The dotted box indicates the recursive
call to deal with '(K-L)'.
When operators and operands are 'stored' according to the flow diagram, a
pseudo-evaluation takes place. To explain what is meant by 'pseudo'-evaluation.
first consider an 'evaluation' of a reverse Polish expression. A pushdown stack
is used to retain operand values and partial results during the evaluation.
Proceeding from left to right, whenever an operand is encountered it's value is
pushed down. Whenever a (binary) operator is encountered, the top two items on
the stack are popped and the operation between them is performed. The result is
then pushed back onto the stack. (For unary operators, just one item is popped
from the stack). Proceeding in this way. when the whole of the reverse Polish
string has been dealt with, the value of the whole expression is what remains in
the stack. For example:
10 + 8 * (6 - 4) / 2
42
to reverse Polish |
+ - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - - - - +
|
| +---------+-------+ |
<--------| unary operator ?|
| | yes +-----------------+ |
+----------+ | no
| +-->| stack op |------------->| |
| +----------+ +--------+--------+ + - - - - - -+
| | | next operand a |---->| to reverse | |
| | sub-expression ?| yes | Polish |
| | +--------+--------+ + - - - - - -+ |
| | no |
| | +-------+-------+ | |
| | store operand | |
| | +-------+-------+ | |
| |<-------------------+
| | +---------+---------+ +-----------------+ |
| | another operator ?|--->| unstack & store |----->
| | +---------+---------+ no | operators left | |
| | yes +-----------------+
| | +----------+----------+ |
+------------------| new operator higher |<----+
| yes | precedence ? | | |
+----------+----------+ |
| | no | |
+--------+--------+ |
| | unstack & store |-------+ |
| operator |
| +-----------------+ |
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
figure 7.3
input op stack action cumulative output
I empty store I I
+ + stack + I
J + store J IJ
* +* stack * IJ
+ - - - - - - - - - - - - - - - - - - - - +
| K empty store K | IJK
| - - stack - | IJK
| L - store L | IJKL
| empty unstack & store - | IJKL-
+ - - - - - - - - - - - - - - - - - - - - +
/ + unstack & store * IJKL-*
+/ stack /
M + store M IJKL-*M
+ unstack & store / IJKL-*M/
empty unstack & store + IJKL-*M/+
figure 7.4
43
which in reverse Polish is:
10 8 6 4 - * 2 / +
The stages of evaluation in a pushdown stack are
| | | | | | | | | | | | | | | | | |
+----+ +----+ +----+ +----+ +----+ +----+ +----+ +----+ +----+
| | | | | | | 4 | | | | | | | | | | |
+----+ +----+ +----+ +----+ +----+ +----+ +----+ +----+ +----+
| | | | | 6 | | 6 | | 2 | | | | 2 | | | | |
+----+ +----+ +----+ +----+ +----+ +----+ +----+ +----+ +----+
| | | 8 | | 8 | | 8 | | 8 | | 16 | | 16 | | 8 | | |
+----+ +----+ +----+ +----+ +----+ +----+ +----+ +----+ +----+
| 10 | | 10 | | 10 | | 10 | | 10 | | 10 | | 10 | | 10 | | 18 |
+----+ +----+ +----+ +----+ +----+ +----+ +----+ +----+ +----+
In a 'pseudo'-evaluation, it is not the values of operands that are pushed down
but some other property. In this case, it is positions in the representation of
the tree. In place of evaluation when operators occur, the positions of
operands are popped up and it is these values which are appended to the tree
representation. The position of the operator is then pushed down in their
place. Consider the example used above:
(I + 99) * (J - 88)
which in reverse Polish is:
I 99 + J 88 - *
The successive contents of the pseudo-evaluation stack will be:
| | | | | | | | | | | | | |
+----+ +----+ +----+ +----+ +----+ +----+ +----+
| | | | | | | | | | | | | |
+----+ +----+ +----+ +----+ +----+ +----+ +----+
| | | | | | | | | 10 | | | | |
+----+ +----+ +----+ +----+ +----+ +----+ +----+
| | | 3 | | | | 8 | | 8 | | 12 | | |
+----+ +----+ +----+ +----+ +----+ +----+ +----+
| 1 | | 1 | | 5 | | 5 | | 5 | | 5 | | 15 |
+----+ +----+ +----+ +----+ +----+ +----+ +----+
The representations of the trees corresponding to all the expressions in a
program can be monitored by using the parameter '/EXPR' when calling the
compiler. Examples can be found in appendix 1.
Implementation Details
The routine concerned with the compilation of expressions is named 'EXPR' and
can be found in file 'SKIMPE'. Within 'EXPR', subroutine 'TO TREE' produces the
linear tree representation using a further subroutine 'PS EVAL' for the
pseudo-evaluation part. It also has the subsidiary function of checking that
all the names used in the expression have been declared and are of an
appropriate type, for instance, not routine names.
The tree code generating algorithm is implemented by routine 'EVALUATE' which
makes use of a subroutine 'OPN' to dump primitive operations i.e. routine calls.
array and variable accesses and constant handling.
44
A minor complication of routine 'EXPR' deals with the expressions involved
with simple comparisons in conditional statements. This is because a comparison
such as:
expr1 <COMP> expr2
is compiled as an evaluation of 'expr1 - expr2' and then the value tested
against zero. An external flag named 'CONDFLAG' is set in these circumstances
and its effect is form a tree with branches representing each expression the
topmost node being a '-'. The algorithm then proceeds as normal.
45
Chapter 8
Conditional Statements
The general form of conditional statement is:
%IF condition %THEN instruction 1 %ELSE instruction 2
where the 'instructions' include the case of a group of statements surrounded by
%START and %FINISH. In compiling this, the aim is to produce object code of the
form:
+---------------+
| condition |
+---------------+
->FALSE if false
+---------------+
| instruction 1 |
+---------------+
->NEXT
FALSE: +---------------+
| instruction 2 |
+---------------+
NEXT:
where the box containing 'condition' implies object code instructions which
evaluate the truth or falsity of the condition for subsequent testing. When the
%ELSE clause is absent, this simplifies to the form:
+---------------+
| condition |
+---------------+
->FALSE if false
+---------------+
| instruction 1 |
+---------------+
FALSE:
In addition, since the truth or falsity of the whole condition can sometimes be
determined without evaluating all the simple comparisons, there may be
intermediate jumps out from the middle of the condition to the consequent
instructions. For example, consider;
%IF sc1 %AND (sc2 %OR sc3) %THEN instr 1 %ELSE instr 2
This would produce code of the form shown in the next diagram. Analagously with
arithmetic expressions, however, the language may define conditions to be
evaluated 'in toto' as Boolean expressions which would preclude jumping out.
Fortunately, SKIMP does not require this.
A number of observations can be made with a view to producing a
straightforward simple compilation algorithm. The first concerns whether the
jump following each simple comparison in the condition should be on true or
false. Consider the case:
%IF sc1 %AND sc2 %AND sc3 %THEN instr 1 %ELSE instr 2
46
+ - - - - - - - - - -+
| +----------+ |
| | sc 1 | |
| +----------+ |
| ->FALSE if false |
| +----------+ |
| | sc 2 | |
| +----------+ |
| ->TRUE if true |
| +----------+ |
| | sc 3 | |
| +----------+ |
+ - - - - - - - - - -+
->FALSE if false
TRUE: +--------------------+
| instr 1 |
+--------------------+
->NEXT
FALSE: +--------------------+
| instr 2 |
+--------------------+
NEXT:
The code in this second case should be of the form:
+----------------+
| sc 1 |
+----------------+
->FALSE if false
+----------------+
| sc 2 |
+----------------+
->FALSE if false
+----------------+
| sc 3 |
+----------------+
->FALSE if false
+----------------+
| instr 1 |
+----------------+
->NEXT
FALSE: +----------------+
| instr 2 |
+----------------+
NEXT:
Next consider:
%IF sc1 %OR sc2 %OR sc3 %THEN instr 1 %ELSE instr 2
This should produce the code in the diagram below. It will be apparent that if
an %AND follows the simple comparison, the jump is on false and if an %OR
follows, the jump is on true. To make this rule consistent throughout, since
the jump following the last simple comparison ts always on false we can imagine
an %AND following the last simple comparison:
%IF ( condition ) %AND %THEN instr 1 %ELSE instr 2
47
+----------------+
| sc 1 |
+----------------+
->TRUE if true
+----------------+
| sc 2 |
+----------------+
->TRUE if true
+----------------+
| sc 3 |
+----------------+
->FALSE if false
TRUE: +----------------+
| instr 1 |
+----------------+
->NEXT
FALSE: +----------------+
| instr 2 |
+----------------+
NEXT:
The next observation concerns the destination of the jump. This is not
always to one of the unconditional instructions. It can be to a later simple
comparison. For example:
%IF ((sc1 %AND sc2) %OR sc3) %AND sc4 %THEN instr 1 %ELSE instr 2
This should produce code of the form:
+----------------+
| sc 1 |
+----------------+
->FALSE1 if false
+----------------+
| sc 2 |
+----------------+
->TRUE if true
FALSE1:+----------------+
| sc 3 |
+----------------+
->FALSE2 if false
TRUE: +----------------+
| sc 4 |
+----------------+
->FALSE2 if false
+----------------+
| instr 1 |
+----------------+
->NEXT
FALSE2:+----------------+
| instr 2 |
+----------------+
NEXT:
For a simple comparison followed by an %AND, the jump is to the simple
comparison following the next %OR at an outer bracketting level. For a simple
comparison followed by an %OR, the jump is to the simple comparison following
48
the next %AND at an outer bracketting level. This can again be made consistent
if the conditional statement is regarded as of the form:
%IF ( ( condition ) %AND %THEN instr 1 ) %OR %ELSE instr 2
To demonstrate this, we could redraw the example above showing bracketting
levels with each simple comparison (or instruction) at the level of the
following %AND or %OR:
sc1 %AND
|
| false sc2 %OR
| |
| |true sc3 %AND
| | | |
+-------------|-------> |false sc4 %AND
| | | |
+-----------|--------> |
| |false instr1 %OR
| |
+------------+---------------------->instr2
A further example might be:
%IF (sc1 %OR sc2 %OR sc3) %AND (sc4 % OR sc5) %THEN instr1 %ELSE instr2
The organisation required is:
sc1 %OR cs2 %OR sc4 %OR
| | | |
|true |true sc3 %AND | | true
| | | | |
| | |false | | sc5 %AND
| | | | | |
+---------+-----------|-------> | |false
| | |
| +--------|-------->instr1 %OR
| |
+--------------------+-------------------->instr2
In compiling conditional statements, the plan of action is to make several
passes over the condition, working out the structure required step by step and
finally, in a last pass, generating the object code. Five parallel arrays are
used, each with one position per simple comparison. The first pass traverses
the analysis record and locates the positions of all the simple comparisons.
Simultaneously, the nesting levels and truth or falsity of the branches is
determined and also stored. The next pass goes over this information to
determine the destinations of the branches using the rules of thumb described
above. For those simple comparisons which are the destination of branches, the
next pass allocates 'private' labels to which branches can be compiled using the
normal methods used elsewhere in the compiler (see chapter 9). Private labels
are just values allocated upwards from 10000, which allows them to be
distinguished from ordinary labels. For the sake of consistency, the
instructions made conditional are also allocated positions in the arrays and are
regarded as being at nesting levels -1 and -2. The rules can then be applied
uniformly throughout.
When the instruction made conditional is itself a branch, the object is
optimised slightly to allow branches to be made direct to the required place
49
from within the condition instead of having a branch to a branch. For example:
%IF sc1 %OR sc2 %OR sc3 %THEN ->100 %ELSE instr
The code produced takes the form:
+-------------+
| sc 1 |
+-------------+
->100 if true
+-------------+
| sc 2 |
+-------------+
->100 if true
+-------------+
| sc 3 |
+-------------+
->100 if true
+-------------+
| instr |
+-------------+
This is easily achieved by using the required label instead of a private label
to label the conditional branch instruction in the branch array. The only other
modification is to ensure that the branch on the previous i.e.last, simple
comparison is on true rather than false and to the label specified. This should
be clearer when the implementation details are discussed.
The contents of the five arrays can be displayed during compilation by using
the parameter '/COND'. Examples will be found in appendix 1.
%START-%FINISH Groups
When either of the instructions made conditional is a %START-%FINISH group,
the destinations of the final branches are not immediately known. This can only
be done after the %FINISH has appeared. Information therefore has to be
preserved during the compilation of the group. Since the group may itself
contain further conditional statements and hence nested %START-%FINISH groups, a
pushdown list is used for the purpose. Furthermore, since routine bodies may
also appear anywhere and be nested, a list will potentially be required for each
textual level of the program. An array of headcells to lists is used as
previously to achieve this:
+----+ +---+---+ +---+---+
| |------>| | |---->| | |
+----+ +---+---+ +---+---+
| |---+
+----+ | +---+---+
| | +---| | |
+----+ +---+---+
| |
Two items of information need to be stored for each group. Firstly, a marker
which distinguishes between groups before and those after %ELSE keywords and
secondly, a label number assigned to the destination of branches around the
group. This will be a private label.
50
Implementation Details
The code generation for conditions is done by integerfn 'COND' in file
'SKIMPC'. This is called from the section of 'STATEMENT' in file 'SKIMPB' which
deals with conditional statements. It is at this latter place that the code for
the instructions made conditional is generated by calling routine 'INSTR'. In
order to deal with the branch instruction optimisation, two parameters in
addition to the parameter which indicates where the condition starts in the
analysis record are passed in. These give the label numbers of the branches or
-1 if either of then is a non-branch instruction. These are then used instead
of private labels as will be seen below. The result of this function is the
private label number of the label to be put in front of the %ELSE clause.
%START-%FINISH groups are dealt with after the condition code has been
generated and also where %FINISH statements are processed. Two subroutines are
used to maintain the pushdown lists. These are 'PUSHSTART' and 'POPSTART' and
can be found in file 'SKIMPD'.
The five parallel arrays within function 'COND' are named 'TESTP', 'LEVEL',
'ANDOR', 'BRANCH' and 'LABEL'. They hold. respectively, the analysis record
positions of the simple comparisons (or 'tests'), the nesting levels of the
simple comparisons, the truth or falsity of the jumps(1 for false and 2 for
true), the destination of the jumps (in terms of column position in the table)
and the labels assigned to simple comparisons. For example, consider:
%IF (sc1 %OR sc2 %OR sc3) %AND (sc4 %OR sc5) %THEN instr1 %ELSE instr2
The contents of the arrays will be:
1 2 3 4 5 6 7
TESTP ap1 ap2 ap3 ap4 ap5
LEVEL 1 1 0 1 -1 -2
ANDOR 2 2 1 2 1 2
BRANCH 4 4 7 6 7
LABEL -1 -1 -1 10000 -1 10002 10001
where the 'ap?' represent analysis record positions. The first pass over the
statement is performed by the subroutine 'PROCESSCOND' and fills in rows
'TESTP', 'LEVEL' and 'ANDOR'. It works recursively following the recursive
definition of conditions. Subsequently, two extra columns are filled in with -1
and -2 for the pseudo-%AND and the pseudo-%OR. On the next pass the row
'BRANCH'is filled in and following that labels are assigned as necessary. (The
-1 indicates that no jump takes place to this simple comparison and hence no
label is required. The final pass generates the object code by calling 'EXPR'
with the 'CONDFLAG' set as described in the previous chapter to evaluate LHS
minus RHS, and dumping the appropriate intermediate jump instructions.
Another example may help to clarify the picture further . Consider the case:
%IF sc1 %OR ((sc2 %OR sc3) %AND sc4 %AND sc5) %THEH instr1 %ELSE instr2
The structure of the statement is:
51
sc2 %OR
|
|true sc3 %AND sc4 %AND
| | | |
sc1 %OR | |false | |false
| | | | |
| true +-----------|------> | sc5 %AND
| | | |
| | | |false
| | | |
+----------------------|----------|-----------|-------->instr1 %OR
| | |
+----------+-----------+---------------------->instr2
The table for this statement will be:
1 2 3 4 5 6 7
TESTP ap1 ap2 ap3 ap4 ap5
LEVEL 0 2 1 1 -1 -2
ANDOR 2 2 1 1 1 2
BRANCH 6 4 7 7 7
LABEL -1 -1 -1 10004 -1 10003 10005
As a final example, consider a statement containing conditional branches:
%IF sc1 %OR sc2 %OR sc3 %THEN ->88 %ELSE ->99
The table will be:
1 2 3 4 5
TESTP ap1 ap2 ap3
LEVEL 0 0 -1 -2
ANDOR 2 2 2 2
BRANCH 4 4 4
LABEL -1 -1 -1 88 99
52
Chapter 9
Labels and Jumps
As the SKIMP compiler is a one-pass compiler, jumps to labels not yet
encountered cannot be completely generated. Backward jumps are, of course, no
problem. For the forward jumps, all that can be done is to leave a hole in the
branch instruction dumped which can be filled in at some later stage. In this
case, the assembler which forms the first stage of the 'SKIMPAI' interpreter
system has the job to do. In order for this to be possible, information must be
recorded somewhere concerning the positions of the holes to be filled in and
what to fill them in with. The method adopted is to use the holes themselves as
links in lists which relate to each label and for the compiler to dump 'FILL'
assembler directives which are really the headcells to these lists and which
also include the value to be filled in. The 'FILL' directives are dumped
whenever a label which has been previously referenced is encountered. An
example should make this clear. Consider the program:
. . . .
->88
. . . .
->99
. . . .
->99
. . . .
99: . . . .
. . . .
The object code dumped will take the form:
. . . . . .
+------+
12$ B,,,| 0 |
+------+
. . . . . .
+------+
34$ B,,,| 12 |
+------+
. . . . . .
+------+
56$ B,,,| 34 |
+------+
As labels are local to each level of routine nesting, once again separate
lists must be used for each level. Each cell of a list in this case contains
three items of information, the label number, a flag and an address:
+---------------+------+-----------+
| label | flag | address |
+---------------+------+-----------+
(16) (1) (15)
53
The flag indicates whether the address is that of the latest forward reference
to the label i.e. the current top of the forward reference list, or the actual
address of the label which will be needed for any backward references. The
value 1 indicates the former and 0 the latter. When additional forward
references appear therefore, the address is pushed down 'through' this cell.
A cell of this form is created whenever a label is encountered for the first
time - either as a label or as a jump. The list for the current textual level
is first searched and a new cell added if no existing cell refers to that label.
The flag allows two vital checks to be made. Firstly that the label has not
occurred before (as a label) and secondly at the end of the routine that the
label has in fact appeared.
Routine calls use the 'BAL' instruction and are always backward references
since SKIMP does not allow routine %SPECs. The address of the routine body is,
in other words, already known when any routine calls are made. SKIMP allows
routines to be placed anywhere in a program. In particular at the head of
surrounding routines. Jumps around these routine bodies must therefore be
dumped. These are forward references but are simple in that only one backward
reference will occur. They are therefore treated specially without having to
invent a private label.
Calls on the built-in input-output routines are also treated specially.
Since these routines are implemented in the 'SKIMPAI' assembler/interpreter, the
'BAL' instructions dumped contain the marker 'EXT' and a number which refers to
the routine required. Were external references part of the language, the names
themselves would have to appear in the object file and likewise for any external
variables in the program. A linker would then resolve the references between
programs.
Implementation Details
The routines which accomplish the functions decrlbed above are named 'FILL
LABEL' and 'FILL BRANCH', in file 'SKIMPD'. A function 'GET LABEL' extracts a
label from the analysis record and checks that it is valid i.e.less than 10000.
'POP LABELS' pops up the cells and checks for missing labels at %END (and
%ENDOFPROGRAM).
54
Appendix 1
Examples of SKIMP Object Code Files
55
LEX, ANAL example
SKIMP COMPILER MKII
FILE: TESTL
OPTIONS: LEX ANAL
%begin
130
(1/STATEMENT) 8
0$ LDA,COT,,0
1$ LDA,DR1,,0
2$ LDA,STP,DR1,0
%integerfn r
136 134 256 82
(1/STATEMENT) 6 5 6 8 (5/PROC) 2 (6/NAME) 1
82 (8/FORMAL)
3$ B,,,0
4$ STR,DR2,STP,0
5$ LDA,DR2,STP,0
6$ STR,WK,STP,1
7$ LDA,STP,STP,0
%integer i,j,k
136 256 73 44 256 74 44 256 75
(1/STATEMENT) 5 3 (3/ARRAY) 2 6 8 (6/NAME) 1
73 (8/NAMES) 1 11 13 (11/NAME) 1 74 (13/NAMES) 1
16 18 (16/NAME) 1 75 (18/NAMES) 2
i=j+k
256 73 61 256 74 43 256 75
(1/STATEMENT) 1 3 (3/INSTR) 1 7 9 10 (7/NAME) 1
73 (9/ACTUAL) 2 (10/ASSIGN) 1 12 (12/EXPR) 1
16 17 23 (16/UNARY) 4 (17/OPERAND) 1 20 22
(20/NAME) 1 74 (22/ACTUAL) 2 (23/EXPRREST) 1 27
28 34 (27/OP) 9 (28/OPERAND) 1 31 33 (31/NAME) 1
75 (33/ACTUAL) 2 (34/EXPRREST) 2
8$ LOAD,ACC,DR2,3
9$ ADD,ACC,DR2,4
10$ STR,ACC,DR2,2
%if i>1234 %then %stop
135 256 73 62 257 1234 145 144
56
(1/STATEMENT) 2 5 36 37 (5/COND) 1 8 35 (8/TEST) 1
12 24 25 (12/EXPR) 1 16 17 23 (16/UNARY) 4
(17/OPERAND) 1 20 22 (20/NAME) 1 73 (22/ACTUAL) 2
(23/EXPRREST) 2 (24/COMP) 6 (25/EXPR) 1 29 30
34 (29/UNARY) 4 (30/OPERAND) 2 32 (32/CONST) 1
1234 (34/EXPRREST) 2 (35/CONDREST) 3 (36/INSTR) 6
(37/ELSE) 2
11$ LOAD,ACC,DR2,2
12$ SUB,ACC,COT,0
13$ BNG,ACC,,0
14$ STOP,,,0
15$ FILL,10000,13,15
%results=i+4321
141 61 256 73 43 257 4321
(I/STATEMENT) 1 3 (3/INSTR) 5 5 (5/EXPR) 1
9 10 16 (9/UNARY) 4 (10/OPERAND) 1 13 15 (13/NAME) 1
73 (15/ACTUAL) 2 (16/EXPRREST) 1 20 21 25 (20/OP) 9
(21/OPERAND) 2 23 (23/CONST) 1 4321 (25/EXPRREST) 2
15$ LOAD,ACC,DR2,2
16$ ADD,ACC,COT,1
17$ LDA,STP,DR2,0
18$ LOAD,DR2,STP,0
19$ LOAD,WK,STP,1
20$ B,,WK,0
%end
131
(1/STATEMENT) 7 3 (3/OFPROG) 2
21$ FILL,ALLOC,7,5
21$ STOP,,,0
22$ FILL,SKIP,3,22
%endofprogram
131 139
(1/STATEMENT) 7 3 (3/OFPROG) 1
22$ FILL,ALLOC,2,2
22$ STOP,,,0
23$ FILL,COT,0,23
23$ CONST...1234
24$ CONST,,,4321
25$ FILL,STACK,1,25
$ 0 FAULTS IN PROGRAM
57
EXPR example
SKIMP COMPILER MKII
FILE: TESTE
OPTIONS: EXPR
%begin
0$ LDA,COT,,0
1$ LDA,DR1,,0
2$ LDA,STP,DR1,0
%integer i,j,k,l
%integerarray a(1:10)
-4* 1
3$ LDA,ACC,,1
4$ STR,ACC,DR1,6
-4* 10
5$ LDA,ACC,,10
6$ LDA,ACC,ACC,1
7$ STR,ACC,DR1,7
8$ SUB,STP,DR1,6
9$ STR,STP,DR1,8
10$ ADD,STP,DR1,7
i=j+k
+
-3 01010003 -3 01010004 9* 1 3
11$ LOAD,ACC,DR1,3
12$ ADD,ACC,DR1,4
13$ STR,ACC,DR1,2
a(j+k)=i*l-j*k
+
-3 01010002 -3 01010005 8 1 3 -3 01010003 -3 01010004
8 8 10 10* 5 12
* -
14$ LOAD,ACC,DR1,3
15$ MLT,ACC,DR1,4
16$ STR,ACC,DR1,6
17$ LOAD,ACC,DR1,2
18$ MLT,ACC,DR1,5
19$ SUB,ACC,DR1,6
20$ STR,ACC,DR1,6
+
-3 01010003 -3 01010004 9* 1 3
21$ LOAD,ACC,DR1,3
22$ ADD,ACC,DR1,4
23$ ADD,ACC,DR1,8
24$ LOAD,WK,DR1,6
25$ STR,WK,ACC,0
58
i=i*(j+k)/(l-i**2)
+ *
-3 01010002 -3 01010003 -3 01010004 9 3 5 8 1 7
-3 01010005 -3 01010002 -4 2 6 15 17 10 13 19 7*
10 22 ** -
26$ LOAD,ACC,DR1,2
27$ EXP,ACC,COT,0
28$ STR,ACC,DR1,6
29$ LOAD,ACC,DR1,5
30$ SUB,ACC,DR1,6
31$ STR,ACC,DR1,6
32$ LOAD,ACC,DR1,3
33$ ADD,ACC,DR1,4
34$ MLT,ACC,DR1,2
35$ DIV,ACC,DR1,6
36$ STR,ACC,DR1,2
i=a(j)+a(k)
+
-2 17 -2 43 9* 1 3
-3* 01010004
37$ LOAD,ACC,DR1,4
38$ ADD,ACC,DR1,8
39$ LOAD,ACC,ACC,0
40$ STR,ACC,DR1,6
-3* 01010003
41$ LOAD,ACC,DR1,3
42$ ADD,ACC,DR1,8
43$ LOAD,ACC,ACC,0
44$ ADD,ACC,DR1,6
45$ STR,ACC,DR1,2
%endofprogram
46$ FILL,ALLOC,2,9
46$ STOP,,,0
47$ FILL,COT,0,47
47$ CONST,,,2
48$ FILL,STACK,1,48
$ 0 FAULTS IN PROGRAM
59
TAGS Example
SKIMP COMPILER MKII
FILE: TESTT
OPTIONS: TAGS
%begin
0$ LDA,COT,,0
1$ LDA,DR1,,0
2$ LDA,STP,DR1,0
%routine a(%integer i,j,k)
3$ B,,,0
4$ STR,DR2,STP,0
5$ LDA,DR2,STP,0
6$ STR,WK,STP,1
7$ LDA,STP,STP,0
%end
8$ FILL,ALLOC,7,5
75 K 01020004
74 J 01020003
73 I 01020002
8$ LDA,STP,DR2,0
9$ LOAD,DR2,STP,0
10$ LOAD,WK,STP,1
11$ B,,WK,0
12$ FILL,SKIP,3,12
%integerfn b(%integername l)
12$ B,,,0
13$ STR,DR2,STP,0
14$ LDA,DR2,STP,0
15$ STR,WK,STP,1
16$ LDA,STP,STP,0
%routine c(%integerarrayname m,n)
17$ B,,,0
18$ STR,DR3,STP,0
19$ LDA,DR3,STP,0
20$ STR,WK,STP,1
21$ LDA,STP,STP,0
%end
22$ FILL,ALLOC,21,4
78 N 31130003
77 M 31130002
62
22$ LDA,STP,DR3,0
23$ LOAD,DR3,STP,0
24$ LOAD,WK,STP,1
25$ B,,WK,0
26$ FILL,SKIP,17,26
%end
26$ FILL,ALLOC,16,3
67 C 40220012
31130002
31130003
76 L 11020002
26$ STOP,,,0
27$ FILL,SKIP,12,27
%integer i,j
%endofprogram
27$ FILL,ALLOC,2,4
74 J 01010003
73 I 01010002
66 B 4111000D
11020002
65 A 40310004
01020002
01020003
01020004
27$ STOP,,,0
28$ FILL,COT,0,28
28$ FILL,STACK,1,28
$ 0 FAULTS IN PROGRAM
63
Basic Example
SKIMP COMPILER MKII
FILE: HANOI
OPTIONS:
%begin
0$ LDA,COT,,0
1$ LDA,DR1,,0
2$ LDA,STP,DR1,0
%routine hanoi(%integer n,p1,p2)
3$ B,,,0
4$ STR,DR2,STP,0
5$ LDA,DR2,STP,0
6$ STR,WK,STP,1
7$ LDA,STP,STP,0
%if n>0 %then %start
8$ LOAD,ACC,DR2,2
9$ BNG,ACC,,0
hanoi(n-1,p1,6-p1-p2)
10$ LOAD,ACC,DR2,2
11$ SUB,ACC,COT,0
12$ STR,ACC,STP,2
13$ LOAD,ACC,DR2,3
14$ STR,ACC,STP,3
15$ LDA,ACC,,6
16$ SUB,ACC,DR2,3
17$ SUB,ACC,DR2,4
18$ STR,ACC,STP,4
19$ BAL,UK,,4
write(p1,1) ;
20$ LOAD,ACC,DR2,3
21$ STR,ACC,STP,2
22$ LDA,ACC,,1
23$ STR,ACC,STP,3
24$ BAL,WK,EXT,11
write(p2,1) ;
25$ LOAD,ACC,DR2,4
26$ STR,ACC,STP,2
27$ LDA,ACC,,1
28$ STR,ACC,STP,3
29$ BAL,HK,EXT,11
newline
30$ BAL,WK,EXT,7
64
hanoi(n-1,6-p1-p2,p2)
31$ LOAD,ACC,DR2,2
32$ SUB,ACC,COT,O
33$ STR,ACC,STP,2
31$ LDA,ACC,,6
35$ SUB,ACC,DR2,3
36$ SUB,ACC,DR2,4
37$ STR,ACC,STP,3
38$ LOAD,ACC,DR2,4
39$ STR,ACC,STP,4
40$ BAL,WK,,4
%finish
11$ FILL,10000,9,41
%end
41$ FILL,ALLOC,7,5
41$ LDA,STP,DR2,0
42$ LOAD,DR2,STP,0
43$ LOAD,WK,STP,1
44$ B,,WK,0
45$ FILL,SKIP,3,45
%integer a,b,c
1:read(a)
45$ LDA,ACC,DR1,2
46$ STR,ACC,STP,2
47$ BAL,WK,EXT,10
%if a=0 %then %stop
48$ LOAD,ACC,DR1,2
49$ BNZ,ACC,,0
50$ STOP,,,0
51$ FILL,10001,49,51
read(b) ;
51$ LDA,ACC,DR1,3
52$ STR,ACC,STP,2
53$ BAL,WK,EXT,10
read(c)
54$ LDA,ACC,DR1,4
55$ STR,ACC,STP,2
56$ BAL,WK,EXT,10
65
hanoi(a,b,c)
57$ LOAD,ACC,DR1,2
58$ STH,ACC,STP,2
59$ LOAD,ACC,DR1,3
60$ STR,ACC,STP,3
61$ LOAD,ACC,DR1,4
62$ STR,ACC,STP,4
63$ BAL,WK,,4
->1
64$ B,,,45
%endofprogram
65$ FILL,ALLOC,2,5
65$ STOP,,,0
66$ FILL,COT,0,66
66$ CONST,,,1
67$ FILL,STACK,1,67
$ 0 FAULTS IN PROGRAM
66
Appendix 2
Possible Extensions to SKIMP
67
1. a. Identifier labels instead of numerical labels:
LOOP: ->NEXT
NEXT: ->LOOP
b. Switches:
%SWITCH TYPE(1:9)
TYPE (I): ->TYPE(I*J)
2. a. Indefinite loops:
%CYCLE
. . . .
%REPEAT
b. Incremental loops:
%FOR I=1,1,N %CYCLE
. . . .
%REPEAT
c. 'While' loops:
FRED %WHILE I>0
%WHILE I>0 %CYCLE
. . . .
%REPEAT
d. 'Until' loops:
FRED %UNTIL I>0
%CYCLE
. . . .
%REPEAT %UNTIL I>0
e. %EXIT and %CONTINUE statements.
3. a. Reversed order conditional statements:
I=J*K %IF I=0
b. 'Unless' conditions:
%UNLESS I=J %THEN K=L
c. 'else-if' clauses:
%IF I=0 %THEN J=0 %ELSE %IF K=0 %THEN L=0
d. Multi-sided comparisons:
%IF A<B<C<D %THEN E=F
68
4. a. Multi-dimensional arrays:
%INTEGERARRAY A(1:9,-1:1),B,C(1:N,1:M*N,0:1)
b. Array bound checking:
5. a. Own variables:
%OWNINTEGER I,J=123,K=456
b. Own arrays:
%OWNINTEGERARRAY L(1:10)=1,2,3.4,5,6(5)
c. Constant variables and arrays:
%CONSTINTEGER I=1234
%CONSTINTEGERARRAY J(1:3)=2,7,4
6. Real variables:
%REAL X,Y,Z
%REALARRAY A(1:N)
%REALFN RF(%REAL X,%REALNAME Y,%REALARRAYNAME Z)
SKIMPAI has, built-in, the real operations:
ADDF, SUBF, MLTF, DIVF, EXPF, NEGF, FLT
7. String variables:
%STRING(15) S
%STRING(7)%ARRAY T(1:10)
%STRING(99)%FN SF(%STRING(3) S)
S=S."FRED".SF("JOE").T(I)
8. Records:
%RECORDFORMAT RF(%INTEGER I,J,%RECORD(RF)%NAME K)
%RECORD(RF) R
R_I=R_J+1
X=R_K_K_I
9. a. INTERDATA object code
b. PDP-11 object code
c. VAX 11-780 object code
69
Appendix 3
The SKIMP Assembler/Interpreter
70
The object file output from the SKIMP compiler forms the input to the
assembler/interpreter. The first stage identifies statements to be assembled by
means of the '$' symbol following code addresses. It is the remainder of such
lines that are regarded as significant. Any other lines such as compiler
monitoring are ignored. This also allows interpreter directives to be inserted
in the program by using SKIMP comments as shown:
! $ . . . .
The interpreter directives available are:
MONITOR
TRON
TROFF
The first of these, 'MONITOR', when encountered during interpretation prints out
the contents of the registers and the stack. Since this may generate a
substantial amount of output, this facility should be used sparingly. The
second and third turn on and off respectively a trace facility which prints out
the value of the program counter for each instruction interpreted. This can
also be turned on by means of a '/TR' parameter in the call of 'SKIMPAI'.
Tracing may also generate large amounts of output! For example:
! $ MONITOR
The interpreter will not attempt to interpret programs which failed to
compile correctly. Furthermore, the system makes various checks during
interpretation in an endeavour to ensure that the program has not gone out of
control. These include unassigned variable checking, that data addresses lie
within the stack or constant table and that program counter addresses lie within
the code area. A limit of 10000 instructions executed is also imposed to catch
loops.
To demonstrate the use of these facilities, consider the following program:
%BEGIN
%INTEGERARRAY A(1:10)
%INTEGER I
! $ TRON
I=1
1: A(I)=I*I
I=I+1
%IF I<=10 %THEN ->1
! $ TROFF
! $ MONITOR
%ENDOFPROGRAM
The output from the compiler and the interpreter are shown below:
71
SKIMP COMPILER MKII
FILE: TEST
OPTIONS:
%BEGIN
0$ LDA,COT,,0
1$ LDA,DR1,,0
2$ LDA,STP,DR1,0
%INTEGERARRAY A(1:10)
3$ LDA,ACC,,1
4$ STR,ACC,DR1,2
5$ LDA,ACC,,10
6$ LDA,ACC,ACC,1
7$ STR,ACC,DR1,3
8$ SUB,STP,DR1,2
9$ STR,STP,DR1,4
10$ ADD,STP,DR1,3
%INTEGER I
! $ TRON
I=1
11$ LDA,ACC,,1
12$ STR,ACC,DR1,5
1: A(I)=I*I
13$ LOAD,ACC,DR1,5
14$ MLT,ACC,DR1,5
15$ STR,ACC,DR1,2
16$ LOAD,ACC,DR1,5
17$ ADD,ACC,DR1,4
18$ LOAD,WK,DR1,2
19$ STR,WK,ACC,0
I=I+1
20$ LOAD,ACC,DR1,5
21$ ADD,ACC,COT,0
22$ STR,ACC,DR1,5
%IF I<=10 %THEN ->1
23$ LOAD,ACC,DR1,5
24$ SUB,ACC,COT,1
25$ BHG,ACC,,13
72
! $ TROFF
! $ MONITOR
%ENDOFPROGRAM
26$ FILL,ALLOC,2,6
26$ STOP,,,0
27$ FILL,COT,0,27
27$ CONST,,,1
28$ CONST,,,10
29$ FILL,STACK,1,29
$ 0 FAULTS IN PROGRAM
11$ 12$ 13$ 11$ 15$ 16$ 17$ 18$ 19$ 20$
21$ 22$ 23$ 24$ 25$ 13$ 14$ 15$ 16$ 17$
18$ 19$ 20$ 21$ 22$ 23$ 24$ 25$ 13$ 14$
15$ 16$ 17$ 18$ 19$ 20$ 21$ 22$ 23$ 24$
25$ 13$ 14$ 15$ 16$ 17$ 18$ 19$ 20$ 21$
22$ 23$ 24$ 25$ 13$ 14$ 15$ 16$ 17$ 18$
19$ 20$ 21$ 22$ 23$ 24$ 25$ 13$ 14$ 15$
16$ 17$ 18$ 19$ 20$ 21$ 22$ 23$ 24$ 25$
13$ 14$ 15$ 16$ 17$ 18$ 19$ 20$ 21$ 22$
23$ 24$ 25$ 13$ 14$ 15$ 16$ 17$ 18$ 19$
20$ 21$ 22$ 23$ 24$ 25$ 13$ 14$ 15$ 16$
17$ 18$ 19$ 20$ 21$ 22$ 23$ 24$ 25$ 13$
14$ 15$ 16$ 17$ 18$ 19$ 20$ 21$ 22$ 23$
24$ 25$
COT 27
DR1 29
STP 45
ACC 1
WK 100
29$ ? ? 100 11 34 11 1 4 9 16 25 36 49 64 81
100 ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?
? ? ?
STOPPED AT 26$, 143 INSTRUCTIONS EXECUTED
73
Source file: SKIMPA.IMP Compiled on 25-OCT-1979 at 09:38:57
Computer Science IMP77 Compiler. Version 6.01
1 %externalintegerarray a(1:500)
2 ! initialisation for i/o routines
3 %externalbyteintegerarray named(1:1024)=10,'R','E','A','D','S','Y','M',
4+ 'B','O','L',10,'N','E','X','T','S','Y','M','B','O','L',10,'S','K',
5+ 'I','P','S','Y','M','B','O','L',11,'P','R','I','N','T','S','Y','M',
6+ 'B','O','L',5,'S','P','A','C','E',6,'S','P','A','C','E','S',7,'N',
7+ 'E','W','L','I','N','E',8,'N','E','W','L','I','N','E','S',7,'N','E',
8+ 'W','P','A','G','E',4,'R','E','A','D',5,'W','R','I','T','E',0(930)
9 %externalintegerarray namedlink(0:255)=0,76,0(12),89,0(54),84,0(118),
10+ 52,0(11),1,12,23,34,0(4),67,0(23),46,0(5),59,0(17)
11 %externalintegerarray taglink(0:255)=0,13,0(12),16,0(54),14,0(118),8,
12+ 0(11),1,3,4,5,0(4),11,0(23),7,0(5),10,0(17)
13 %externalintegerarray tag(1:512)=16_40100001,16_01010002,16_41000002,
14+ 16_40000003,16_40100004,16_01010002,16_40000005,16_40100006,
15+ 16_01010002,16_40000007,16_40100008,16_01010002,16_40000009,
16+ 16_4010000A,16_11010002,16_4020000B,16_01010002,16_01010003,0(494)
17 %externalintegerarray link(1:512)=2,0,0,0,6,0,0,9,0,0,12,0,0,
18+ 15,0,17,18,0,0(494)
19 %externalinteger namedp=95
20 %externalinteger tagasl=19
21 %externalinteger expropt=0
22 %externalinteger condopt=0
23 %externalinteger tagsopt=0
24 !-----------------------------------------------------------------------
25 %externalroutinespec statement(%integer statementp)
26 %externalstring(255)%fnspec strint(%integer n,p)
27 %externalroutinespec fault(%string(63) mess)
28 %externalroutinespec dump(%string(7) opn,reg,base,%integer disp)
29 %externalstring(255)%fnspec name(%integer ident)
30 !-----------------------------------------------------------------------
31 %externalroutine skimp(%string(63) s)
32 %routinespec readps
33 %routinespec read statement
34 %routinespec rpsym(%integername l)
35 %integerfnspec findk(%string(*)%name k)
36 %integerfnspec compare(%integer p)
37 %recordformat kdf(%byteinteger l,n,a,b)
38 %record(kdf)%array kd(1:255)
39 %string(15)%array pn(256:319)
40 %integerarray pp(256:319)
41 %integerarray ps(1:512)
42 %integerarray t,tt(1:256)
43 %integer tp,ap,ttp,ttpp,i,j,psflag
44 %string(63) file,object,options,option,as
45 %owninteger lexopt=0
46 %owninteger analopt=0
47 %if s->("/").options %then %start
48 %unless options->options.(" ").file %then file=""
49 %cycle
50 %unless options->option.("/").options %then %c
51+ option=options %and options=""
52 %if option->("NO").option %then i=0 %else i=1
53 %if option="LEX" %then lexopt=i %else %c
54+ %if option="ANAL" %then analopt=i %else %c
55+ %if option="EXPR" %then expropt=i %else %c
56+ %if option="COND" %then condopt=i %else %c
57+ %if option="TAGS" %then tagsopt=i %else %c
58+ print string(option." OPTION ?
59+ ") %and %stop
60 %repeat %until options=""
61 %finish %else file=s
62 readps
63 %if file="" %then open output(1, ".TT") %else open output(1,file.".LIS")
64 select output(1)
65 print string("
66+
67+ SKIMP COMPILER MKII
68+
69+ FILE: ".file."
70+ OPTIONS: ")
71 %if lexopt=1 %then print string("LEX ")
72 %if analopt=1 %then print string("ANAL ")
73 %if expropt=1 %then print string("EXPR ")
74 %if condopt=1 %then print string("COND ")
75 %if tagsopt=1 %then print string("TAGS ")
76 newline
77 %if psflag#0 %then fault("PHRASE STRUCTURE FAULTY") %and %stop
78 %if file="" %then open input(1,".tt") %else open input(1,file.".IMP")
79 select input(1)
80 ! set up tags available space list
81 %for i=tagasl,1,511 %cycle
82 link(i)=i+1
83 %repeat
84 %cycle ;! for each statement
85 read statement
86 ttp=tp-1
87 tp=1
88 ap=1
89 %if compare(258)=0 %or tp#ttp %then fault("SYNTAX ?") %else %start
90 %if analopt=1 %then %start
91 newline
92 j=0
93 %for i=1,1,ap-1 %cycle ;! print analysis record
94 %if a(i)<0 %then as=(" (".strint(i,1)."/".pn(a(i)<<1>>17). %c
95+ ")") %and %c
96+ a(i)=a(i)&16_FFFF
97 write(a(i),4)
98 j=j+5
99 %if j>60 %then newline %and j=0
100 %repeat
101 newlines(2)
102 %finish %else %start
103 %for i=1,1,ap-1 %cycle ;! remove phrase numbers
104 %if a(i)<0 %then a(i)=a(i)&16_FFFF
105 %repeat
106 %finish
107 statement(1) ;! generate code for statement
108 %finish
109 %repeat
110 !-----------------------------------------------------------------------
111 %routine readps
112 ! read phrase structure from file 'SKIMPPS' and reduce it
113 %string(31)%array ka(1:128)
114 %integerarray kna(1:128)
115 %string(31) k
116 %integer kap,kdasl,kn,i,l,psp,pnp,alt
117 %integername np
118 %routinespec insert(%string(15) k)
119 %routinespec extract(%integer i,%string(15) k)
120 %routinespec assign(%integer i)
121 %integerfnspec newkd
122 %routinespec returnkd(%integer i)
123 %routinespec returnlist(%integer i)
124 %integerfnspec phrase
125 %routinespec literal
126 %routinespec keyword
127 open input(2,"SKIMPPS.IMP")
128 select input(2)
129 open output(2,"SKIMPPSL.LIS")
130 select output(2)
131 print string("
132+
133+ PHRASE STRUCTURE
134+
135+ ")
136 ! scan file to build keyword dictionary
137 kap=1
138 %cycle
139 rpsym(l)
140 %if l='$' %then %exit
141 %if l='"' %then %start
142 k=""
143 %cycle
144 rpsym(l)
145 %if l='"' %then %exit
146 %if 'A'<=l<='Z' %then k=k.tostring(l)
147 %repeat
148 ka(kap)=k
149 kap=kap+1
150 %finish
151 %repeat
152 %for i=1,1,26 %cycle
153 kd(i)=0
154 %repeat
155 %for i=27,1,254 %cycle
156 kd(i)_b=i+1
157 %repeat
158 kdasl=27
159 i=1
160 insert(ka(i)) %and i=i+1 %until i=kap
161 kn=128
162 %for i=1,1,26 %cycle
163 %if kd(i)_l#0 %then assign(i)
164 %repeat
165 kap=1
166 %for i=1,1,26 %cycle
167 %if kd(i)_l#0 %then extract(i,"")
168 %repeat
169 print string("
170+
171+ KEYWORDS
172+
173+ ")
174 %for i=1,1,kap-1 %cycle
175 print string(strint(kna(i),3)." ".ka(i)."
176+ ")
177 %repeat
178 ! reread file and reduce phrase structure
179 reset input
180 pn(256)="NAME"
181 pp(256)=0
182 pn(257)="CONST"
183 pp(257)=0
184 pnp=258
185 psp=1
186 %cycle ;! for each phrase definition
187 read symbol(l)
188 %if l='$' %then %exit
189 %if l='<' %then %start ;! start of phrase definition
190 pp(phrase)=psp
191 %cycle ;! for each alternative
192 alt=psp
193 np==ps(psp+1)
194 np=0 ;! number of phrases
195 psp=psp+2
196 %cycle ;! for each item
197 read symbol(l)
198 %if l='<' %then ps(psp)=phrase %and psp=psp+1 %and np=np+1
199 %if l='''' %then literal
200 %if l='"' %then keyword
201 %if l=',' %or l=';' %then %exit
202 %repeat
203 ps(alt)=psp
204 %if l=';' %then %exit
205 %repeat
206 ps(psp)=0
207 psp=psp+1
208 %finish
209 %repeat
210 psflag=0
211 %for i=258,1,pnp-1 %cycle
212 %if pp(i)=0 %then fault("<".pn(i)."> NOT DEFINED") %and psflag=1
213 %repeat
214 print string("
215+
216+ PHRASES
217+
218+ ")
219 %for i=256,1,pnp-1 %cycle
220 print string(strint(i,3).strint(pp(i),6)." ".pn(i)."
221+ ")
222 %repeat
223 print string("
224+
225+ REDUCED PHRASE STRUCTURE
226+
227+ ")
228 %for i=1,1,psp-1 %cycle
229 %if (i-1)&15=0 %then print string("
230+ ".strint(i,3)." ")
231 write(ps(i),3)
232 %repeat
233 newlines(2)
234 %return
235 !-----------------------------------------------------------------------
236 %routine insert(%string(15) k)
237 ! search for and insert keyword into dictionary
238 %integer i,j,l
239 l=charno(k,1)
240 k->(tostring(l)).k
241 i=l-'A'+1
242 %if kd(i)_l#0 %then %start
243 search:%if k="" %then %start
244 %if kd(i)_a#0 %then extract(kd(i)_a,"") %and %c
245+ returnlist(kd(i)_a) %and kd(i)_a=0
246 %return
247 %finish
248 %if kd(i)_a=0 %then insert(k) %and %return
249 l=charno(k,1)
250 k->(tostring(l)).k
251 i=kd(i)_a
252 %cycle
253 %if kd(i)_l=l %then ->search
254 %if kd(i)_b=0 %then %exit
255 i=kd(i)_b
256 %repeat
257 j=i
258 i=newkd
259 kd(j)_b=i
260 %finish
261 ! insert remainder of letters
262 %cycle
263 kd(i)_l=l
264 %if k="" %then %return
265 l=charno(k,1)
266 k->(tostring(l)).k
267 j=i
268 i=newkd
269 kd(j)_a=i
270 %repeat
271 %end
272 !-----------------------------------------------------------------------
273 %routine extract(%integer i,%string(15) k)
274 %string(15) kk
275 %if i=0 %then %return
276 kk=k.tostring(kd(i)_l)
277 %if kd(i)_a=0 %then ka(kap)=kk %and kna(kap)=kd(i)_n %and kap=kap+1%c
278+ %else extract(kd(i)_a,kk)
279 extract(kd(i)_b,k)
280 %end
281 !-----------------------------------------------------------------------
282 %routine assign(%integer i)
283 %if i=0 %then %return
284 %if kd(i)_a=0 %then kd(i)_n=kn %and kn=kn+1 %else assign(kd(i)_a)
285 assign(kd(i)_b)
286 %end
287 !-----------------------------------------------------------------------
288 %integerfn newkd
289 %integer i
290 %if kdasl=0 %then print string("KD ASL EMPTY") %and %stop
291 i=kdasl
292 kdasl=kd(i)_b
293 kd(i)=0
294 %result=i
295 %end
296 !-----------------------------------------------------------------------
297 %routine returnkd(%integer i)
298 kd(i)_b=kdasl
299 kdasl=i
300 %end
301 !-----------------------------------------------------------------------
302 %routine returnlist(%integer i)
303 %if i=0 %then %return
304 returnlist(kd(i)_a)
305 returnlist(kd(i)_b)
306 returnkd(i)
307 %end
308 !-----------------------------------------------------------------------
309 %integerfn phrase
310 %string(15) p
311 %integer i,l
312 p=""
313 %cycle
314 read symbol(l)
315 %if l='>' %then %exit %else p=p.tostring(l)
316 %repeat
317 %for i=256,1,pnp-1 %cycle
318 %if pn(i)=p %then %result=i
319 %repeat
320 pn(pnp)=p
321 pp(pnp)=0
322 pnp=pnp+1
323 %result=pnp-1
324 %end
325 !-----------------------------------------------------------------------
326 %routine literal
327 %integer l
328 %cycle
329 read symbol(l)
330 %if l='''' %then %return %else ps(psp)=l %and psp=psp+1
331 %repeat
332 %end
333 !-----------------------------------------------------------------------
334 %routine keyword
335 %string(31) k
336 %integer l
337 k=""
338 %cycle
339 read symbol(l)
340 %if l='"' %then %exit
341 %if 'A'<=l<='Z' %then k=k.tostring(l)
342 %repeat
343 ps(psp)=findk(k) %and psp=psp+1 %until k=""
344 %end
345 %end
346 !-----------------------------------------------------------------------
347 %routine read statement
348 %routinespec store(%integer l)
349 %routinespec keyword
350 %routinespec name
351 %routinespec const
352 %integer i,l,ksh
353 ! line reconstruct phase
354 newlines(3)
355 ttp=1
356 ksh=0
357 %cycle ;! for each character
358 rpsym(l)
359 %if l='%' %then ksh=128 %else %start
360 %unless 'A'<=l<='Z' %then ksh=0
361 %if l#' ' %then %start ;! discard spaces
362 %if l='!' %and ttp=1 %then %start
363 rpsym(l) %until l=';' %or l=nl ;! discard comments
364 %finish %else %start
365 store(l)
366 %if l='''' %then %start
367 rpsym(l) %and store(l) %until l=''''
368 %finish %else %start
369 %if l=';' %or l=nl %then %start
370 %if ttp=2 %then ttp=1 %else %start
371 %if l=';' %then newline %and %exit
372 %if tt(ttp-2)='C'+128 %then ttp=ttp-2 %else %exit
373 %finish
374 %finish
375 %finish
376 %finish
377 %finish
378 %finish
379 %repeat
380 ! lexical phase
381 tp=1
382 ttpp=1
383 %cycle ;! for each lexical item
384 i=tt(ttpp)
385 %if i>=128 %then keyword %else %start
386 %if 'A'<=i<='Z' %then name %else %start
387 %if '0'<=i<='9' %or i='''' %then const %else %c
388+ t(tp)=i %and tp=tp+1 %and ttpp=ttpp+1
389 %finish
390 %finish
391 %repeat %until ttpp=ttp
392 %if lexopt=1 %then %start
393 newline
394 %for ttpp=1,1,tp-2 %cycle
395 write(t(ttpp),4)
396 %if ttpp&16_f=0 %then newline
397 %repeat
398 newline
399 %finish
400 %return
401 !-----------------------------------------------------------------------
402 %routine store(%integer l)
403 %if ttp>256 %then fault("STATEMENT TOO LONG") %and %stop
404 tt(ttp)=l+ksh
405 ttp=ttp+1
406 %end
407 !-----------------------------------------------------------------------
408 %routine keyword
409 %string(255) k
410 %integer i
411 k=""
412 %while tt(ttpp)>128 %then k=k.tostring(tt(ttpp)-128) %and ttpp=ttpp+1
413 i=findk(k) %and t(tp)=i %and tp=tp+1 %until k="" %or i=0
414 %end
415 !-----------------------------------------------------------------------
416 %routine name
417 %string(*)%name sname
418 %integer i,l,hash
419 sname==string(addr(named(namedp)))
420 hash=0
421 sname=""
422 l=tt(ttpp)
423 %cycle
424 %if namedp+length(sname)>=1022 %then fault("NAME DICTIONARY FULL")%c
425+ %and %stop
426 %if length(sname)=255 %then fault("NAME TOO LONG") %and %stop
427 sname=sname.tostring(l)
428 hash=hash<<8!l
429 ttpp=ttpp+1
430 l=tt(ttpp)
431 %repeat %until l<'0' %or '9'<l<'A' %or l>'Z'
432 hash=hash-hash//251*251
433 i=hash
434 %cycle ;! scan dictionary
435 %if namedlink(i)=0 %then namedlink(i)=namedp %and %c
436+ namedp=namedp+length(sname)+1 %and %exit ;! insert name
437 %if sname=string(addr(named(namedlink(i)))) %then %exit
438 i=(i+1)&255
439 %if i=hash %then fault("NAME DICTIONARY FULL") %and %stop
440 %repeat
441 t(tp)=256 ;! <name>
442 t(tp+1)=i ;! ident
443 tp=tp+2
444 %end
445 !-----------------------------------------------------------------------
446 %routine const
447 %integer l,value,flag,count,maxby10,maxld
448 value=0
449 flag=0
450 %if tt(ttpp)='''' %then %start
451 count=0
452 %cycle
453 ttpp=ttpp+1
454 %if tt(ttpp)='''' %then %start
455 ttpp=ttpp+1
456 %if tt(ttpp)#'''' %then %exit
457 %finish
458 value=value<<8!tt(ttpp)
459 count=count+1
460 %repeat
461 %unless 1<=count<=4 %then flag=1
462 %finish %else %start
463 maxby10=16_7FFFFFFF//10
464 maxld=16_7FFFFFFF-maxby10*10
465 l=tt(ttpp)
466 %cycle
467 %if value>maxby10 %or (value=maxby10 %and l>maxld) %then flag=1 %c
468+ %else value=value*10+l-'0'
469 ttpp=ttpp+1
470 l=tt(ttpp)
471 %repeat %until l<'0' %or l>'9'
472 %finish
473 t(tp)=257 ;! <const>
474 %if flag#0 %then fault("CONSTANT INVALID") %and value=0
475 t(tp+1)=value
476 tp=tp+2
477 %end
478 %end
479 !-----------------------------------------------------------------------
480 %routine rpsym(%integername l)
481 read symbol(l)
482 print symbol(l)
483 %if 'a'<=l<='z' %then l=l-'a'+'A'
484 %end
485 !-----------------------------------------------------------------------
486 %integerfn findk(%string(*)%name k)
487 ! look keyword up in dictionary
488 %integer i,l
489 l=charno(k,1)
490 k->(tostring(l)).k
491 i=l-'A'+1
492 %if kd(i)_l=0 %then %result=0
493 search:%if k="" %or kd(i)_a=0 %then %result=kd(i)_n
494 l=charno(k,1)
495 k->(tostring(l)).k
496 i=kd(i)_a
497 %cycle
498 %if kd(i)_l=l %then ->search
499 %if kd(i)_b=0 %then %result=0
500 i=kd(i)_b
501 %repeat
502 %end
503 !-----------------------------------------------------------------------
504 %integerfn compare(%integer p)
505 %integer app,tpp,alt,altend,psp,psi
506 a(ap)=p<<16!16_80000001 ;! phrase number & alternative 1
507 %if p<=257 %then %start ;! <name> or <const>
508 %if p=t(tp) %then %start ;! success
509 a(ap+1)=t(tp+1)
510 ap=ap+2
511 tp=tp+2
512 %result=1
513 %finish %else %result=0
514 %finish
515 tpp=tp ;! preserve text pointer
516 app=ap ;! preserve analysis record pointer
517 psp=pp(p) ;! start of phrase definition
518 %cycle ;! for each alternative
519 alt=ap+1
520 altend=ps(psp)
521 ap=alt+ps(psp+1) ;! leave gap for forward pointers
522 %if ap>255 %then fault("ANALYSIS RECORD TOO LONG") %and %stop
523 psp=psp+2
524 %cycle ;! for each item
525 %if psp=altend %then %result=1 ;! success
526 psi=ps(psp)
527 %if psi>=256 %then %start ;! phrase
528 a(alt)=ap ;! forward pointer
529 %if compare(psi)=0 %then %exit
530 alt=alt+1
531 %finish %else %start ;! literal or keyword
532 %if psi#t(tp) %then %exit
533 tp=tp+1
534 %finish
535 psp=psp+1
536 %repeat
537 %if ps(altend)=0 %then %result=0 ;! failure
538 psp=altend
539 tp=tpp ;! backtrack text pointer
540 ap=app ;! backtrack analysis record pointer
541 a(ap)=a(ap)+1 ;! next alternative number
542 %repeat
543 %end
544 %endofprogram
Code 9996 bytes Glap 10136 bytes Diags 2355 bytes Total size 22487 bytes
464 statements compiled
Source file: SKIMPB.IMP Compiled on 25-OCT-1979 at 09:39:49
Computer Science IMP77 Compiler. Version 6.01
1 %externalintegerarrayspec a(1:500)
2 %externalintegerarrayspec taglink(0:255)
3 %externalintegerarrayspec tag(1:512)
4 %externalintegerarrayspec link(1:512)
5 !-----------------------------------------------------------------------
6 %externalroutinespec expr(%integer exprp)
7 %externalintegerfnspec cond(%integer condp,tlabel,flabel)
8 %externalstring(255)%fnspec strint(%integer n,p)
9 %externalintegerfnspec getwork
10 %externalroutinespec returnwork(%integer work)
11 %externalroutinespec clearwork
12 %externalintegerfnspec newtag
13 %externalroutinespec pushtag(%integer ident,form,type,dim,level,rad)
14 %externalroutinespec poptags
15 %externalintegerfnspec getlabel(%integer constp)
16 %externalroutinespec filllabel(%integer label)
17 %externalintegerfnspec fillbranch(%integer label)
18 %externalroutinespec poplabels
19 %externalintegerfnspec nextplabel
20 %externalroutinespec dump(%string(7) opn,reg,base,%integer disp)
21 %externalroutinespec fault(%string(63) mess)
22 %externalstring(255)%fnspec name(%integer ident)
23 %externalroutinespec pushstart(%integer flag,plab)
24 %externalroutinespec popstart(%integername flag,plab)
25 %externalroutinespec clearstart
26 %externalintegerfnspec enter
27 %externalroutinespec dump return
28 %externalroutinespec proc(%integer procp)
29 %externalroutinespec array(%integer arrayp)
30 %externalroutinespec endofprog
31 !-----------------------------------------------------------------------
32 %externalintegerarray nextrad(0:15)
33 %externalstring(4)%array display(0:15)="DR0","DR1","DR2","DR3","DR4",
34+ "DR5","DR6","DR7","DR8","DR9","DR10","DR11","DR12","DR13","DR14","DR15"
35 %externalinteger level,nextcad
36 !-----------------------------------------------------------------------
37 %ownintegerarray proctype(0:15)
38 %ownintegerarray staticalloc(0:15)
39 %ownintegerarray skipproc(0:15)
40 !-----------------------------------------------------------------------
41 %externalroutine statement(%integer statementp)
42 %routinespec instr(%integer instrp)
43 %switch sttype(1:8)
44 %integer condp,instrp,elsep,constp,arrayp,namep,namesp,expr1p,expr2p, %c
45+ instr2p,tlabel,flabel,label,fplabel,tplabel,work1,work2,flag,plabel,%c
46+ procp,formalp,formp,params,procid,ident,form,paramt,paraml,dim
47 ->sttype(a(statementp))
48 !-----------------------------------------------------------------------
49 sttype(1):! <instr>
50 instr(a(statementp+1))
51 %return
52 !-----------------------------------------------------------------------
53 sttype(2):! "IF"<cond>"THEN"<instr><else>
54 condp=a(statementp+1)
55 instrp=a(statementp+2)
56 elsep=a(statementp+3)
57 %if a(instrp)=2 %then %start ;! branch
58 constp=a(instrp+1)
59 tlabel=getlabel(constp)
60 %if a(elsep)=2 %then filllabel(cond(condp,tlabel,-1)) %else %start
61 instrp=a(elsep+1)
62 %if a(instrp)=2 %then %start ;! branch
63 constp=a(instrp+1)
64 flabel=getlabel(constp)
65 filllabel(cond(condp,tlabel,flabel))
66 dump("B","","",fillbranch(flabel))
67 %finish %else %start
68 filllabel(cond(condp,tlabel,-1))
69 %if a(instrp)=3 %then pushstart(1,-1) %else instr(instrp)
70 %finish
71 %finish
72 %finish %else %start
73 %if a(elsep)=2 %then %start
74 fplabel=cond(condp,-1,-1)
75 %if a(instrp)=3 %then pushstart(0,fplabel) %else %c
76+ instr(instrp) %and filllabel(fplabel)
77 %finish %else %start
78 instr2p=a(elsep+1)
79 %if a(instr2p)=2 %then %start ;! branch
80 constp=a(instr2p+1)
81 fplabel=cond(condp,-1,getlabel(constp)) ;! result always -1
82 instr(instrp)
83 %finish %else %start
84 fplabel=cond(condp,-1,-1)
85 instr(instrp)
86 tplabel=nextplabel
87 dump("B","","",fillbranch(tplabel))
88 filllabel(fplabel)
89 %if a(instr2p)=3 %then pushstart(1,tplabel) %else %c
90+ instr(instr2p) %and filllabel(tplabel)
91 %finish
92 %finish
93 %finish
94 %return
95 !-----------------------------------------------------------------------
96 sttype(3):! <const>':'<statement>
97 constp=a(statementp+1)
98 statementp=a(statementp+2)
99 label=getlabel(constp)
100 filllabel(label)
101 statement(statementp)
102 %return
103 !-----------------------------------------------------------------------
104 sttype(4):! "FINISH"<else>
105 elsep=a(statementp+1)
106 popstart(flag,plabel)
107 %if flag=0 %then %start ;! first %start/%finish
108 %if a(elsep)=1 %then %start
109 instrp=a(elsep+1)
110 tplabel=nextplabel
111 dump("B","","",fillbranch(tplabel))
112 filllabel(plabel)
113 %if a(instrp)=3 %then pushstart(1,tplabel) %else %c
114+ instr(instrp) %and filllabel(tplabel)
115 %finish %else filllabel(plabel)
116 %finish %else %start ;! second %start/%finish
117 %if a(elsep)=1 %then fault("SPURIOUS %ELSE") %else filllabel(plabel)
118 %finish
119 %return
120 !-----------------------------------------------------------------------
121 sttype(5):! "INTEGER"<array>
122 arrayp=a(statementp+1)
123 namep=a(arrayp+1)
124 namesp=a(arrayp+2)
125 %if a(arrayp)=1 %then %start ;! array declaration
126 expr1p=a(arrayp+3)
127 expr2p=a(arrayp+4)
128 expr(expr1p)
129 work1=getwork
130 dump("STR","ACC",display(level),work1)
131 expr(expr2p)
132 dump("LDA","ACC","ACC",1)
133 work2=getwork
134 dump("STR","ACC",display(level),work2)
135 %cycle
136 pushtag(a(namep+1),2,1,1,level,nextrad(level))
137 dump("SUB","STP",display(level),work1)
138 dump("STR","STP",display(level),nextrad(level))
139 dump("ADD","STP",display(level),work2)
140 nextrad(level)=nextrad(level)+1
141 %if a(namesp)=2 %then %exit
142 namep=a(namesp+1)
143 namesp=a(namesp+2)
144 %repeat
145 returnwork(work1)
146 returnwork(work2)
147 %finish %else %start
148 %cycle
149 pushtag(a(namep+1),0,1,0,level,nextrad(level))
150 nextrad(level)=nextrad(level)+1
151 %if a(namesp)=2 %then %exit
152 namep=a(namesp+1)
153 namesp=a(namesp+2)
154 %repeat
155 %finish
156 %return
157 !-----------------------------------------------------------------------
158 sttype(6):! <proc><name><formal>
159 %if level=0 %then fault("PROCEDURE BEFORE %BEGIN")
160 %if level=15 %then fault("PROCEDURE NESTING TOO DEEP")
161 procp=a(statementp+1)
162 namep=a(statementp+2)
163 formalp=a(statementp+3)
164 procid=a(namep+1)
165 skipproc(level)=nextcad
166 dump("B","","",0) ;! branch round procedure
167 pushtag(procid,4,a(procp)-1,0,level,nextcad)
168 level=level+1
169 proctype(level)=a(procp)
170 staticalloc(level)=enter
171 nextrad(level)=2
172 %if a(formalp)=2 %then %return ;! no parameters
173 params=0
174 paraml=taglink(procid)
175 %until a(formalp)=2 %cycle
176 formp=a(formalp+1)
177 namep=a(formalp+2)
178 namesp=a(formalp+3)
179 formalp=a(formalp+4)
180 %if a(formp)=1 %then form=3 %and dim=1 %else %start
181 %if a(formp)=2 %then form=1 %else form=0
182 dim=0
183 %finish
184 %cycle
185 ident=a(namep+1)
186 ! declare parameters as locals
187 pushtag(ident,form,1,dim,level,nextrad(level))
188 nextrad(level)=nextrad(level)+1
189 ! append parameter tag cells to procedure tag cell
190 paramt=newtag
191 tag(paramt)=tag(taglink(ident))
192 link(paramt)=link(paraml)
193 link(paraml)=paramt
194 paraml=paramt
195 params=params+1
196 %if params>15 %then fault(name(procid). %c
197+ " HAS TOO MANY PARAMETERS") %and %stop
198 %if a(namesp)=2 %then %exit
199 namep=a(namesp+1)
200 namesp=a(namesp+2)
201 %repeat
202 %repeat
203 ! insert number of parameters into tag cell
204 tag(taglink(procid))=tag(taglink(procid))!params<<20
205 %return
206 !-----------------------------------------------------------------------
207 sttype(7):! "END"<ofprog>
208 dump("FILL","ALLOC",strint(staticalloc(level),1),nextrad(level))
209 poptags
210 poplabels
211 clearstart
212 clearwork
213 %if proctype(level)=1 %then dump return %else dump("STOP","","",0)
214 level=level-1
215 %if a(a(statementp+1))=2 %then %start ;! %end
216 %if level<=0 %then fault("SPURIOUS %END") %and endofprog
217 dump("FILL","SKIP",strint(skipproc(level),1),nextcad)
218 %finish %else %start ;! %endofprogram
219 %if level#0 %then fault("TOO FEW %ENDS")
220 endofprog
221 %finish
222 %return
223 !-----------------------------------------------------------------------
224 sttype(8):! "BEGIN"
225 %if level#0 %then fault("SPURIOUS %BEGIN") %else %start
226 level=1
227 proctype(1)=0
228 staticalloc(1)=enter
229 %finish
230 %return
231 !-----------------------------------------------------------------------
232 %routine instr(%integer instrp)
233 %switch instype(1:6)
234 %string(4) base
235 %integer namep,assignp,constp,ident,actualp,exprp,nametag,disp,work
236 ->instype(a(instrp))
237 !-----------------------------------------------------------------------
238 instype(1):! <name><actual><assign>
239 namep=a(instrp+1)
240 actualp=a(instrp+2)
241 assignp=a(instrp+3)
242 ident=a(namep+1)
243 %if taglink(ident)=0 %then fault(name(ident)." NOT DECLARED") %c
244+ %and %return
245 nametag=tag(taglink(ident))
246 %if a(assignp)=1 %then %start
247 %if nametag>>28=4 %then fault(name(ident)." NOT A DESTINATION") %c
248+ %and %return
249 exprp=a(assignp+1)
250 %if nametag>>28>=2 %then %start ;! array variable
251 expr(exprp)
252 work=getwork
253 dump("STR","ACC",display(level),work)
254 array(instrp)
255 dump("LOAD","WK",display(level),work)
256 dump("STR","WK","ACC",0)
257 returnwork(work)
258 %finish %else %start
259 expr(exprp)
260 base=display(nametag>>16&16_F)
261 disp=nametag&16_FFFF
262 %if nametag>>28=1 %then %start ;! %name variable
263 dump("LOAD","WK",base,disp)
264 dump("STR","ACC","WK",0)
265 %finish %else dump("STR","ACC",base,disp)
266 %if a(actualp)=1 %then fault(name(ident)." DECLARED AS SCALAR")
267 %finish
268 %finish %else %start
269 %if nametag>>28=4 %and nametag>>24&16_F=0 %then proc(instrp) %c
270+ %else fault(name(ident)." NOT A ROUTINE NAME")
271 %finish
272 %return
273 !-----------------------------------------------------------------------
274 instype(2):! '->'<const>
275 constp=a(instrp+1)
276 label=getlabel(constp)
277 dump("B","","",fillbranch(label))
278 %return
279 !-----------------------------------------------------------------------
280 instype(3):! "START"
281 fault("ILLEGAL %START")
282 %return
283 !-----------------------------------------------------------------------
284 instype(4):! "RETURN"
285 %if proctype(level)#1 %then fault("%RETURN OUT OF CONTEXT")
286 dumpreturn
287 %return
288 !-----------------------------------------------------------------------
289 instype(5):! "RESULT"'='<expr>
290 %if proctype(level)#2 %then fault("%RESULT OUT OF CONTEXT")
291 expr(a(instrp+1))
292 dumpreturn
293 %return
294 !-----------------------------------------------------------------------
295 instype(6):! "STOP"
296 dump("STOP","","",0)
297 %end
298 %end
299 %endoffile
Code 7292 bytes Glap 1256 bytes Diags 1276 bytes Total size 9824 bytes
253 statements compiled
Source file: SKIMPC.IMP Compiled on 25-OCT-1979 at 09:51:22
Computer Science IMP77 Compiler. Version 6.01
1 %externalintegerarrayspec a(1:500)
2 %externalintegerspec condopt
3 !-----------------------------------------------------------------------
4 %externalroutinespec expr(%integer exprp)
5 %externalroutinespec dump(%string(7) opn,reg,base,%integer disp)
6 %externalroutinespec filllabel(%integer label)
7 %externalintegerfnspec fillbranch(%integer label)
8 %externalintegerfnspec nextplabel
9 %externalroutinespec fault(%string(63) mess)
10 !-----------------------------------------------------------------------
11 %externalinteger condflag=0
12 !-----------------------------------------------------------------------
13 %externalintegerfn cond(%integer condp,tlabel,flabel)
14 %routinespec processcond(%integer condp)
15 %routinespec test(%integer ltestp)
16 %routinespec condrest(%integer condrestp)
17 %routinespec store(%integer testp,level,andor)
18 %routinespec show(%string(7) an,%integerarrayname a,%integer p)
19 %conststring(3)%array true(1:6)="BZ","BNZ","BNG","BL","BNL","BG"
20 %conststring(3)%array false(1:6)="BNZ","BZ","BG","BNL","BL","BNG"
21 %string(3) opn
22 %constintegerarray index(1:17)=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17
23 %integerarray testpa,levela,andora,brancha(1:16),labela(1:17)
24 %integer p,pp,ppp,testp,level,andor,comp
25 level=0
26 p=1
27 processcond(condp)
28 store(testp,-1,1) ;! pseudo-%and
29 store(0,-2,2) ;! pseudo-%or
30 p=p-2
31 %for pp=1,1,p %cycle ;! find branch destinations
32 level=levela(pp)
33 andor=andora(pp)
34 %for ppp=pp+1,1,p+1 %cycle
35 %if levela(ppp)<level %then %start
36 level=levela(ppp)
37 %if andora(ppp)#andor %then %exit
38 %finish
39 %repeat
40 brancha(pp)=ppp+1
41 %repeat
42 %if tlabel>=0 %then %start
43 andora(p)=2 ;! change last branch to branch on true
44 brancha(p)=p+1
45 labela(p+1)=tlabel
46 %finish
47 labela(p+2)=flabel
48 %for pp=1,1,p %cycle ;! assign private labels where needed
49 %if labela(brancha(pp))<0 %then labela(brancha(pp))=nextplabel
50 %repeat
51 %if condopt=1 %then %start
52 newline
53 show(" ",index,p+2)
54 show("TESTP ",testpa,p)
55 show("LEVEL ",levela,p+1)
56 show("ANDOR ",andora,p+1)
57 show("BRANCH",brancha,p)
58 show("LABEL ",labela,p+2)
59 newline
60 %finish
61 %for pp=1,1,p %cycle ;! generate test code and fill labels
62 %if labela(pp)>=0 %then filllabel(labela(pp))
63 condflag=1
64 expr(testpa(pp))
65 comp=a(a(testpa(pp)+2))
66 %if andora(pp)=1 %then opn=false(comp) %else opn=true(comp)
67 dump(opn,"ACC","",fillbranch(labela(brancha(pp))))
68 %repeat
69 %if labela(p+1)>=0 %and tlabel<0 %then filllabel(labela(p+1))
70 %if flabel>=0 %then %result=-1 %else %result=labela(p+2)
71 !-----------------------------------------------------------------------
72 %routine processcond(%integer condp)
73 test(a(condp+1))
74 condrest(a(condp+2))
75 %end
76 !-----------------------------------------------------------------------
77 %routine test(%integer ltestp)
78 %if a(ltestp)=1 %then testp=ltestp %else level=level+1 %and %c
79+ processcond(a(ltestp+1)) %and level=level-1
80 %end
81 !-----------------------------------------------------------------------
82 %routine condrest(%integer condrestp)
83 %integer andor
84 andor=a(condrestp)
85 %unless andor=3 %then %start
86 store(testp,level,andor) %and test(a(condrestp+1)) %and %c
87+ condrestp=a(condrestp+2) %until a(condrestp)=2
88 %finish
89 %end
90 !-----------------------------------------------------------------------
91 %routine store(%integer testp,level,andor)
92 %if p>16 %then fault("CONDITION TOO LONG") %and %stop
93 testpa(p)=testp
94 levela(p)=level
95 andora(p)=andor
96 labela(p)=-1
97 p=p+1
98 %end
99 !-----------------------------------------------------------------------
100 %routine show(%string(7) an,%integerarrayname a,%integer p)
101 %integer pp
102 print string(an." ")
103 %for pp=1,1,p %cycle
104 write(a(pp),5)
105 %repeat
106 newline
107 %end
108 %end
109 %endoffile
Code 2492 bytes Glap 432 bytes Diags 613 bytes Total size 3537 bytes
99 statements compiled
Source file: SKIMPD.IMP Compiled on 25-OCT-1979 at 09:40:47
Computer Science IMP77 Compiler. Version 6.01
1 %externalintegerarrayspec a(1:500)
2 %externalbyteintegerarrayspec named(1:1024)
3 %externalintegerarrayspec namedlink(0:255)
4 %externalintegerarrayspec taglink(0:255)
5 %externalintegerarrayspec tag(1:512)
6 %externalintegerarrayspec link(1:512)
7 %externalintegerarrayspec nextrad(0:15)
8 %externalstring(4)%arrayspec display(0:15)
9 %externalintegerspec tagasl,level,tagsopt,nextcad,namedp
10 !-----------------------------------------------------------------------
11 %externalroutinespec expr(%integer exprp)
12 !-----------------------------------------------------------------------
13 %ownintegerarray worklist(0:15)=0(16)
14 %ownintegerarray namelist(0:15)=0(16)
15 %ownintegerarray branchlist(0:15)=0(16)
16 %ownintegerarray startlist(0:15)=0(16)
17 %ownintegerarray cot(0:127)
18 %owninteger cotp,faults,params
19 !-----------------------------------------------------------------------
20 %externalstring(255)%fn strint(%integer n,p)
21 %string(255) r
22 %string(1) s
23 %if n<0 %then s="-" %and n=-n %else s=""
24 r=""
25 r=tostring(n-n//10*10+'0').r %and n=n//10 %until n=0
26 r=s.r
27 %while length(r)<p %then r=" ".r
28 %result=r
29 %end
30 !-----------------------------------------------------------------------
31 %externalstring(8)%fn strhex(%integer n)
32 %conststring(1)%array h(0:15)="0","1","2","3","4","5","6","7","8",
33+ "9","A","B","C","D","E","F"
34 %integer i
35 %string(8) sh
36 sh=""
37 %for i=1,1,8 %cycle
38 sh=h(n&16_F).sh
39 n=n>>4
40 %repeat
41 %result=sh
42 %end
43 !-----------------------------------------------------------------------
44 %externalroutine fault(%string(63) mess)
45 print string("* ".mess."
46+
47+ ")
48 faults=faults+1
49 %end
50 !-----------------------------------------------------------------------
51 %externalroutine dump(%string(7) opn,reg,base,%integer disp)
52 print string(strint(nextcad,5)."$ ". %c
53+ opn.",".reg.",".base.",".strint(disp,1)."
54+ ")
55 nextcad=nextcad+1 %unless opn="FILL"
56 %end
57 !-----------------------------------------------------------------------
58 %externalstring(255)%fn name(%integer ident)
59 %unless 0<=ident<=255 %and namedlink(ident)#0 %then %result=""
60 %result=string(addr(named(namedlink(ident))))
61 %end
62 !-----------------------------------------------------------------------
63 %externalintegerfn newtag
64 %integer i
65 %if tagasl=0 %then fault("TAG SPACE FULL") %and %stop
66 i=tagasl
67 tagasl=link(tagasl)
68 %result=i
69 %end
70 !-----------------------------------------------------------------------
71 %externalintegerfn returntag(%integer tagi)
72 %integer l
73 l=link(tagi)
74 link(tagi)=tagasl
75 tagasl=tagi
76 %result=l
77 %end
78 !-----------------------------------------------------------------------
79 %externalintegerfn getwork
80 %integername cell
81 cell==worklist(level)
82 %while cell#0 %cycle
83 %if tag(cell)<0 %then tag(cell)=-tag(cell) %and %result=tag(cell)
84 cell==link(cell)
85 %repeat
86 cell=newtag
87 tag(cell)=nextrad(level)
88 nextrad(level)=nextrad(level)+1
89 link(cell)=0
90 %result=tag(cell)
91 %end
92 !-----------------------------------------------------------------------
93 %externalroutine returnwork(%integer work)
94 %integer cell
95 cell=worklist(level)
96 %while cell#0 %cycle
97 %if tag(cell)=work %then tag(cell)=-work %and %return
98 cell=link(cell)
99 %repeat
100 %end
101 !-----------------------------------------------------------------------
102 %externalroutine clearwork
103 %integer cell
104 cell=worklist(level)
105 %while cell#0 %then cell=returntag(cell)
106 worklist(level)=0
107 %end
108 !-----------------------------------------------------------------------
109 %externalintegerfn getcoti(%integer const)
110 %integer coti
111 %if cotp>0 %then %start
112 %for coti=0,1,cotp-1 %cycle
113 %if cot(coti)=const %then %result=coti
114 %repeat
115 %finish
116 %if cotp=128 %then fault("CONSTANT TABLE FULL") %and %stop
117 cot(cotp)=const
118 cotp=cotp+1
119 %result=cotp-1
120 %end
121 !-----------------------------------------------------------------------
122 %externalroutine pushtag(%integer ident,form,type,dim,level,rad)
123 %integer tagi
124 %if taglink(ident)#0 %and tag(taglink(ident))>>16&16_F=level %then %c
125+ fault("NAME ".name(ident)." DECLARED TWICE")
126 tagi=newtag
127 tag(tagi)=form<<28!type<<24!dim<<20!level<<16!rad
128 link(tagi)=taglink(ident)
129 taglink(ident)=tagi
130 tagi=newtag
131 tag(tagi)=ident
132 link(tagi)=namelist(level)
133 namelist(level)=tagi
134 %end
135 !-----------------------------------------------------------------------
136 %externalroutine poptags
137 %integer cell,ident,nametag,params
138 %string(63) s
139 %if tagsopt=1 %then newline
140 cell=namelist(level)
141 %while cell#0 %cycle
142 ident=tag(cell)
143 cell=returntag(cell)
144 nametag=tag(taglink(ident))
145 taglink(ident)=returntag(taglink(ident))
146 %if tagsopt=1 %then %start
147 s=name(ident)
148 print string(strint(ident,3)." ".s)
149 spaces(10-length(s))
150 print string(strhex(nametag))
151 %finish
152 %if nametag>>28=4 %then %start ;! procedure type
153 params=nametag>>20&16_F
154 %while params#0 %cycle
155 %if tagsopt=1 %then print string(" ". %c
156+ strhex(tag(taglink(ident))))
157 taglink(ident)=returntag(taglink(ident))
158 params=params-1 ;! pop up parameter tags
159 %repeat
160 %finish
161 %if tagsopt=1 %then newline
162 %if taglink(ident)=0 %then namedp=namedlink(ident) %c
163+ %and namedlink(ident)=0 ;! backtrack name dictionary
164 %repeat
165 %if tagsopt=1 %then newline
166 namelist(level)=0
167 %end
168 !-----------------------------------------------------------------------
169 %externalintegerfn getlabel(%integer constp)
170 %integer label
171 label=a(constp+1)
172 %if label>9999 %then fault("LABEL ".strint(label,1)." TOO LARGE") %c
173+ %and %result=-1 %else %result=label
174 %end
175 !-----------------------------------------------------------------------
176 %externalroutine filllabel(%integer label)
177 %integer cell
178 %return %if label<0 ;! for conditional statements
179 cell=branchlist(level)
180 %while cell#0 %cycle
181 %if tag(cell)>>16=label %then %start
182 %if tag(cell)&16_8000=0 %then fault("DUPLICATE LABEL ". %c
183+ strint(label,1)) %else %start
184 dump("FILL",strint(label,1),strint(tag(cell)&16_7FFF,1),nextcad)
185 tag(cell)=label<<16!nextcad
186 %finish
187 %return
188 %finish
189 cell=link(cell)
190 %repeat
191 cell=newtag
192 link(cell)=branchlist(level)
193 branchlist(level)=cell
194 tag(cell)=label<<16!nextcad
195 %end
196 !-----------------------------------------------------------------------
197 %externalintegerfn fillbranch(%integer label)
198 %integer cell,cad
199 %result=0 %if label<0
200 cell=branchlist(level)
201 %while cell#0 %cycle
202 %if tag(cell)>>16=label %then %start
203 cad=tag(cell)&16_7FFF
204 %if tag(cell)&16_8000#0 %then tag(cell)=label<<16!16_8000!nextcad
205 %result=cad
206 %finish
207 cell=link(cell)
208 %repeat
209 cell=newtag
210 link(cell)=branchlist(level)
211 branchlist(level)=cell
212 tag(cell)=label<<16!16_8000!nextcad
213 %result=0
214 %end
215 !-----------------------------------------------------------------------
216 %externalroutine poplabels
217 %integer cell
218 cell=branchlist(level)
219 %while cell#0 %cycle
220 %if tag(cell)&16_8000#0 %then fault("LABEL ".strint(tag(cell)>>16,%c
221+ 1)." NOT SET (BRANCH LIST ".strint(tag(cell)&16_7FFF,1).")")
222 cell=returntag(cell)
223 %repeat
224 branchlist(level)=0
225 %end
226 !-----------------------------------------------------------------------
227 %externalintegerfn nextplabel
228 %owninteger plabel=9999
229 plabel=plabel+1
230 %result=plabel
231 %end
232 !-----------------------------------------------------------------------
233 %externalroutine pushstart(%integer flag,plab)
234 %integer cell
235 cell=newtag
236 tag(cell)=flag<<16!plab&16_FFFF ;! plab may be -1
237 link(cell)=startlist(level)
238 startlist(level)=cell
239 %end
240 !-----------------------------------------------------------------------
241 %externalroutine popstart(%integername flag,plab)
242 %integer cell
243 cell=startlist(level)
244 %if cell=0 %then %start
245 fault("SPURIOUS %FINISH")
246 flag=0
247 plab=0
248 %finish %else %start
249 flag=tag(cell)>>16
250 plab=tag(cell)&16_FFFF
251 %if plab=16_FFFF %then plab=-1
252 startlist(level)=returntag(cell)
253 %finish
254 %end
255 !-----------------------------------------------------------------------
256 %externalroutine clearstart
257 %integer cell
258 cell=startlist(level)
259 %while cell#0 %then fault("%FINISH MISSING") %and cell=returntag(cell)
260 startlist(level)=0
261 %end
262 !-----------------------------------------------------------------------
263 %externalintegerfn enter
264 %string(4) base
265 %integer cad
266 %if level=1 %then %start
267 %if nextcad#0 %then fault("%BEGIN NOT FIRST STATEMENT")
268 dump("LDA","COT","",0) ;! cot base address to be filled
269 dump("LDA","DR1","",0) ;! stack base address to be filled
270 base="DR1"
271 %finish %else %start
272 dump("STR",display(level),"STP",0)
273 dump("LDA",display(level),"STP",0)
274 dump("STR","WK","STP",1)
275 base="STP"
276 %finish
277 cad=nextcad
278 dump("LDA","STP",base,0) ;! static allocation to be filled
279 nextrad(level)=2
280 %result=cad
281 %end
282 !-----------------------------------------------------------------------
283 %externalroutine dumpreturn
284 dump("LDA","STP",display(level),0)
285 dump("LOAD",display(level),"STP",0)
286 dump("LOAD","WK","STP",1)
287 dump("B","","WK",0)
288 %end
289 !-----------------------------------------------------------------------
290 %externalroutine array(%integer arrayp)
291 %integer namep,actualp,exprp,exprsp,ident,nametag
292 namep=a(arrayp+1)
293 actualp=a(arrayp+2)
294 ident=a(namep+1)
295 %if a(actualp)=1 %then %start
296 exprp=a(actualp+1)
297 exprsp=a(actualp+2)
298 expr(exprp)
299 nametag=tag(taglink(ident))
300 dump("ADD","ACC",display(nametag>>16&16_F),nametag&16_FFFF)
301 %if a(exprsp)=1 %then fault("ARRAY ".name(ident)." HAS EXTRA INDEX")
302 %finish %else fault("ARRAY ".name(ident)." HAS NO INDEX")
303 %end
304 !-----------------------------------------------------------------------
305 %externalroutine proc(%integer procp)
306 %string(4) opn,base
307 %integer namep,ident,nametag,ptagl,l,actualp,exprp,unaryp,operandp, %c
308+ npars,ptag,pnamep,pident,pnametag,pactualp,disp,exprrestp,exprsp, %c
309+ oldparams
310 %if params>2 %then dump("LDA","STP","STP",params)
311 oldparams=params
312 params=2
313 namep=a(procp+1)
314 actualp=a(procp+2)
315 ident=a(namep+1)
316 l=taglink(ident)
317 nametag=tag(l)
318 ptagl=link(l)
319 npars=nametag>>20&16_F
320 %if npars=0 %then %start
321 %if a(actualp)=1 %then fault(name(ident)." HAS PARAMETERS") %c
322+ %and %return
323 %finish %else %start
324 %if a(actualp)=2 %then fault(name(ident)." MISSING PARAMETERS") %c
325+ %and %return
326 exprp=a(actualp+1)
327 exprsp=a(actualp+2)
328 %cycle ;! for each parameter
329 ptag=tag(ptagl)
330 %if ptag>>28=0 %then expr(exprp) %else %start
331 unaryp=a(exprp+1)
332 operandp=a(exprp+2)
333 exprrestp=a(exprp+3)
334 %unless a(unaryp)=4 %and a(operandp)=1 %and a(exprrestp)=2 %c
335+ %then fault("NOT A %NAME PARAMETER") %else %start
336 pnamep=a(operandp+1)
337 pactualp=a(operandp+2)
338 pident=a(pnamep+1)
339 %if taglink(pident)=0 %then fault(name(pident). %c
340+ " NOT DECLARED") %else %start
341 pnametag=tag(taglink(pident))
342 %if pnametag>>28=4 %then fault(name(pident). %c
343+ " NOT A %NAME") %else %start
344 base=display(pnametag>>16&16_F)
345 disp=pnametag&16_FFFF
346 %if ptag>>28=1 %then %start ;! %name
347 %if pnametag>>28>=2 %then array(operandp) %else %start
348 %if pnametag>>28=1 %then opn="LOAD" %else opn="LDA"
349 dump(opn,"ACC",base,disp)
350 %if a(pactualp)=1 %then fault(name(pident). %c
351+ " DECLARED AS SCALAR")
352 %finish
353 %finish %else %start
354 dump("LOAD","ACC",base,disp) ;! %array
355 %if a(pactualp)=1 %then fault("%ARRAYNAME ". %c
356+ name(pident)." HAS INDEX")
357 %finish
358 %finish
359 %finish
360 %finish
361 %finish
362 dump("STR","ACC","STP",params)
363 params=params+1
364 npars=npars-1
365 %if npars=0 %then %start
366 %if a(exprsp)=1 %then fault(name(ident)." HAS EXTRA PARAMETERS")
367 %exit
368 %finish
369 ptagl=link(ptagl)
370 %if a(exprsp)=2 %then fault(name(ident). %c
371+ " IS MISSING PARAMETERS") %and %exit
372 exprp=a(exprsp+1)
373 exprsp=a(exprsp+2)
374 %repeat
375 %finish
376 ! external i/o routines at level 0
377 %if nametag>>16&16_F=0 %then base="EXT" %else base=""
378 dump("BAL","WK",base,nametag&16_FFFF)
379 params=oldparams
380 %if params>2 %then dump("SUB","STP","COT",getcoti(params))
381 %end
382 !-----------------------------------------------------------------------
383 %externalroutine endofprog
384 %integer i
385 dump("FILL","COT","0",nextcad)
386 i=0
387 %while i<cotp %cycle
388 dump("CONST","","",cot(i))
389 i=i+1
390 %repeat
391 dump("FILL","STACK","1",nextcad)
392 print string("
393+ $".strint(faults,4)." FAULTS IN PROGRAM
394+ ")
395 %stop
396 %end
397 %endoffile
Code 10012 bytes Glap 1832 bytes Diags 2391 bytes Total size 14235 bytes
345 statements compiled
Source file: SKIMPE.IMP Compiled on 25-OCT-1979 at 09:41:31
Computer Science IMP77 Compiler. Version 6.01
1 %externalintegerarrayspec a(1:500)
2 %externalintegerarrayspec taglink(0:255)
3 %externalintegerarrayspec tag(1:512)
4 %externalstring(4)%arrayspec display(0:15)
5 %externalintegerspec level,condflag,expropt
6 !-----------------------------------------------------------------------
7 %externalstring(255)%fnspec strint(%integer n,p)
8 %externalstring(8)%fnspec strhex(%integer n)
9 %externalroutinespec fault(%string(63) s)
10 %externalstring(255)%fnspec name(%integer ident)
11 %externalroutinespec dump(%string(7) opn,reg,base,%integer disp)
12 %externalintegerfnspec getwork
13 %externalroutinespec returnwork(%integer work)
14 %externalroutinespec proc(%integer ap)
15 %externalroutinespec array(%integer ap)
16 %externalintegerfnspec getcoti(%integer const)
17 !-----------------------------------------------------------------------
18 %externalroutine expr(%integer exprp)
19 %integerfnspec totree(%integer exprp)
20 %routinespec evaluate(%integer nodep)
21 %integerarray tree(1:64)
22 %integer treep,treenode,treenode1,treenode2,testp,expr1p,expr2p,compp,%c
23+ i,j,l
24 %constintegerarray reversecomp(1:6)=1,2,5,6,3,4
25 treep=1
26 %if condflag=0 %then treenode=totree(exprp) %else %start
27 condflag=0
28 testp=exprp ;! for <test>=<expr><comp><expr>
29 expr1p=a(testp+1)
30 compp=a(testp+2)
31 expr2p=a(testp+3)
32 treenode1=totree(expr1p)
33 treenode2=totree(expr2p)
34 %if tree(treenode1)=-4 %and tree(treenode1+1)=0 %then %start
35 a(compp)=reversecomp(a(compp))
36 treenode=treenode2
37 %finish %else %start
38 %if tree(treenode2)=-4 %and tree(treenode2+1)=0 %then %c
39+ treenode=treenode1 %else %start
40 tree(treep)=10 ;! -
41 tree(treep+1)=treenode1
42 tree(treep+2)=treenode2
43 treenode=treep
44 %finish
45 %finish
46 %finish
47 %if expropt=1 %then %start
48 newline
49 %if 0<tree(treenode)<=10 %then l=treenode+2 %else l=treenode+1
50 j=0
51 %for i=1,1,l %cycle
52 write(tree(i),4)
53 j=j+5
54 %if i=treenode %then print string("*")
55 %if tree(i)=-3 %then i=i+1 %and print string(" ". %c
56+ strhex(tree(i))) %and j=j+10
57 %if j>70 %then newline %and j=0
58 %repeat
59 newlines(2)
60 %finish
61 evaluate(treenode)
62 %return
63 !-----------------------------------------------------------------------
64 %integerfn totree(%integer exprp)
65 ! create tree form of expression
66 %routinespec pseval(%integer type,datum)
67 %integerarray os(1:4),ps(1:5) ;! operator & pseudo-evaluation stacks
68 %integer osp,psp,unaryp,operandp,exprrestp,opp,namep,actualp, %c
69+ ident,nametag
70 %constintegerarray prec(1:12)=3,3,2,1,1,3,2,2,1,1,1,4
71 ! <<,>>,&,!!,!,**,/,*,+,-,-(unary),\
72 unaryp=a(exprp+1)
73 operandp=a(exprp+2)
74 exprrestp=a(exprp+3)
75 %if a(unaryp)<=2 %then os(1)=a(unaryp)+10 %and osp=1 %else osp=0
76 psp=0
77 %cycle ;! for each operand
78 %if a(operandp)=1 %then %start ;! <name><actual>
79 namep=a(operandp+1)
80 actualp=a(operandp+2)
81 ident=a(namep+1)
82 %if taglink(ident)=0 %then %start
83 fault(name(ident)." NOT DECLARED")
84 pseval(-3,0) ;! pseval dummy tag
85 %finish %else %start
86 nametag=tag(taglink(ident))
87 %if nametag>>28<=1 %then %start ;! scalar variable
88 %if a(actualp)=1 %then %start
89 fault("SCALAR ".name(ident)." HAS PARAMETER")
90 pseval(-3,0)
91 %finish %else pseval(-3,nametag)
92 %finish %else %start
93 %if nametag>>28<=3 %then pseval(-2,operandp) %else %start
94 %if nametag>>24&16_F=0 %then %start
95 fault("ROUTINE NAME ".name(ident)." IN EXPRESSION")
96 pseval(-3,0)
97 %finish %else pseval(-1,operandp)
98 %finish
99 %finish
100 %finish
101 %finish %else %start
102 %if a(operandp)=2 %then pseval(-4,a(a(operandp+1)+1)) %else %c
103+ psp=psp+1 %and ps(psp)=totree(a(operandp+1))
104 %finish
105 %if a(exprrestp)=2 %then %exit ;! no more operands
106 opp=a(exprrestp+1)
107 operandp=a(exprrestp+2)
108 exprrestp=a(exprrestp+3)
109 %while osp>0 %and prec(a(opp))<=prec(os(osp)) %then %c
110+ pseval(os(osp),0) %and osp=osp-1 ;! unstack while prec(new op)<=
111 osp=osp+1 ;! stack new operator
112 os(osp)=a(opp)
113 %repeat
114 %while osp>0 %then pseval(os(osp),0) %and osp=osp-1 ;! unstack rest
115 %result=ps(1)
116 !-----------------------------------------------------------------------
117 %routine pseval(%integer type,datum)
118 %routinespec store(%integer t)
119 %integer nodep
120 nodep=treep
121 store(type)
122 %if type>0 %then %start ;! operator
123 %if type>10 %then store(ps(psp)) %else store(ps(psp-1)) %and %c
124+ store(ps(psp)) %and psp=psp-1
125 %finish %else store(datum) %and psp=psp+1
126 ps(psp)=nodep
127 !-----------------------------------------------------------------------
128 %routine store(%integer t)
129 %if treep>64 %then fault("EXPRESSION TOO LONG") %and %stop
130 tree(treep)=t
131 treep=treep+1
132 %end
133 %end
134 %end
135 !-----------------------------------------------------------------------
136 %routine evaluate(%integer nodep)
137 ! dump code to evaluate expression
138 %routinespec opn(%integer op,p)
139 %conststring(4)%array strop(0:12)="LOAD","SHL","SHR","AND","XOR","OR",
140+ "EXP","DIV","MLT","ADD","SUB","NEG","NOT"
141 %constintegerarray commut(1:10)=0,0,1,1,1,0,0,1,1,0
142 %integer op,opd1p,opd2p,work
143 %if tree(nodep)<0 %then opn(0,nodep) %and %return ;! operand
144 op=tree(nodep)
145 opd1p=tree(nodep+1)
146 %if op>10 %then %start ;! unary operator
147 %if tree(opd1p)>=-2 %then evaluate(opd1p) %else opn(0,opd1p)
148 dump(strop(op),"ACC","",0)
149 %return
150 %finish
151 opd2p=tree(nodep+2)
152 %if tree(opd1p)>=-2 %then %start ;! operand1 a node
153 %if tree(opd2p)>=-2 %then %start ;! operand2 a node
154 evaluate(opd2p)
155 work=getwork
156 dump("STR","ACC",display(level),work)
157 evaluate(opd1p)
158 dump(strop(op),"ACC",display(level),work)
159 returnwork(work)
160 %finish %else %start
161 evaluate(opd1p)
162 opn(op,opd2p)
163 %finish
164 %finish %else %start
165 %if tree(opd2p)>=-2 %then %start
166 evaluate(opd2p)
167 %if commut(op)#0 %then opn(op,opd1p) %and %return
168 work=getwork
169 dump("STR","ACC",display(level),work)
170 opn(0,opd1p)
171 dump(strop(op),"ACC",display(level),work)
172 returnwork(work)
173 %finish %else %start
174 opn(0,opd1p)
175 opn(op,opd2p)
176 %finish
177 %finish
178 %return
179 !-----------------------------------------------------------------------
180 %routine opn(%integer op,p)
181 ! dump object code for simple operation
182 %string(7) base
183 %integer type,datum,disp
184 type=tree(p)
185 datum=tree(p+1)
186 %if type=-1 %then proc(datum) %else %start ;! procedure
187 %if type=-2 %then %start ;! array
188 array(datum)
189 dump("LOAD","ACC","ACC",0)
190 %finish %else %start
191 %if type=-3 %then %start ;! scalar
192 base=display(datum>>16&16_F)
193 disp=datum&16_FFFF
194 %if datum>>28=1 %then %start ;! %name type
195 dump("LOAD","WK",base,disp)
196 dump(strop(op),"ACC","WK",0)
197 %finish %else dump(strop(op),"ACC",base,disp)
198 %finish %else %start ;! constant
199 %if op>0 %or datum>16_FFFF %then dump(strop(op),"ACC", %c
200+ "COT",getcoti(datum)) %else dump("LDA","ACC","",datum)
201 %finish
202 %finish
203 %finish
204 %end
205 %end
206 %end
207 %endoffile
?STRINT unused
Code 5164 bytes Glap 680 bytes Diags 1023 bytes Total size 6867 bytes
187 statements compiled
Source file: SKIMPAI.IMP Compiled on 30-OCT-1979 at 10:38:02
Computer Science IMP77 Compiler. Version 6.01
1 %externalroutine skimpai(%string(63) file)
2 %string(15)%fnspec readmn(%integername sep)
3 %routinespec dump(%integer on,rn,bn,dn)
4 %integerfnspec intstr(%string(15) s)
5 %string(15)%fnspec strint(%integer n)
6 %routinespec fault(%string(255) mess)
7 %routinespec fail(%string(255) mess)
8 %routinespec monitor
9 %conststring(7)%array imn(0:32)="STOP","LOAD","LDA","ADD","SUB","MLT",
10+ "DIV","EXP","STR","NEG","NOT","SHL","SHR","AND","OR","XOR","BAL","B",
11+ "BZ","BNZ","BG","BNG","BL","BNL","ADDF","SUBF","MLTF","DIVF","EXPF",
12+ "NEGF","FLT","FILL","CONST"
13 %string(15)%array rmn(0:15)
14 %integerarray r(0:15)
15 %integerarray s(0:4095)
16 %string(15) opn,reg,base,disp
17 %constinteger sl=4095
18 %constinteger datamask=2_0011111000000001111100011111010
19 %integer rp,sp,mon,tron,troff,on,rn,bn,dn,sep,i,faults,inst,trace,pc, %c
20+ pcl,tracepc,tracen,efad,stackb,stpr,accr,cyc,ic
21 %switch ins(0:31)
22 %switch ext(1:11)
23 %if file->("/TR").file %then trace=1 %and tracen=0 %else %start
24 %if file="" %then print string("file ?
25+ ") %and %stop
26 trace=0
27 %finish
28 open input(2,file.".lis")
29 select input(2)
30 select output(1)
31 rmn(0)=""
32 r(0)=0
33 rp=1
34 sp=0
35 mon=0
36 tron=0
37 troff=0
38 faults=0
39 instr:! process next instruction
40 read symbol(i) %until i='$'
41 %cycle
42 i=next symbol
43 %if i=' ' %then skip symbol %else %exit
44 %repeat
45 %if '0'<=i<='9' %then %start ;! "FAULTS IN PROGRAM"
46 %if i>'0' %then fault("PROGRAM WAS FAULTY") %and %stop
47 %if faults>0 %then fault("ASSEMBLY FAULTY") %and %stop
48 read symbol(i) %until i=nl
49 ->interpret
50 %finish
51 opn=readmn(sep)
52 %for on=0,1,32 %cycle
53 %if opn=imn(on) %then ->gotopn
54 %repeat
55 %if opn="MONITOR" %then mon=1 %and ->instr
56 %if opn="TRON" %then tron=1 %and ->instr
57 %if opn="TROFF" %then troff=1 %and ->instr
58 fault("INVALID OPERATION : ".opn.tostring(sep))
59 ->instr
60 gotopn:! get operands
61 %if sep#',' %then fault("INVALID FORMAT : ".opn.tostring(sep)) %c
62+ %and ->instr
63 reg=readmn(sep)
64 %if sep#',' %then fault("INVALID FORMAT : ".opn.",".reg. %c
65+ tostring(sep)) %and ->instr
66 base=readmn(sep)
67 %if sep#',' %then fault("INVALID FORMAT : ".opn.",".reg. %c
68+ ",".base.tostring(sep)) %and ->instr
69 disp=readmn(sep)
70 %if opn="FILL" %then %start
71 bn=intstr(base)
72 %if bn<0 %then fault("INVALID FILL : ".reg.",".base) %and ->instr
73 dn=intstr(disp)
74 %unless 0<=dn<=16_FFFF %then fault("INVALID FILL : ".reg. %c
75+ ",".base.",".disp) %and ->instr
76 cyc=0
77 %cycle
78 cyc=cyc+1
79 %if bn>=sp %or cyc>sp %then fault("INVALID FILL LIST AT ". %c
80+ strint(sp)) %and ->instr
81 i=bn
82 bn=s(bn)&16_FFFF
83 s(i)=s(i)&16_FFFF0000!dn
84 %repeat %until bn=0
85 %if reg="COT" %then pcl=dn-1 ;! program counter limit
86 ->instr
87 %finish
88 %if opn="CONST" %then %start
89 dn=intstr(disp)
90 %if dn<0 %then fault("INVALID CONSTANT : ".disp) %and ->instr
91 dump(0,0,0,dn)
92 ->instr
93 %finish
94 %if reg="" %and opn#"B" %and opn#"STOP" %then %c
95+ fault("REGISTER MISSING AT ".strint(sp)) %and rn=0 %and ->gotreg
96 %for rn=0,1,rp-1 %cycle
97 %if reg=rmn(rn) %then ->gotreg
98 %repeat
99 %if rp=16 %then fault("EXCESS REGISTER : ".reg) %and rn=0 %c
100+ %else rmn(rp)=reg %and rn=rp %and rp=rp+1
101 gotreg:%if base="EXT" %and opn="BAL" %then on=31 %and bn=0 %and->gotbase
102 %for bn=0,1,rp-1 %cycle
103 %if base=rmn(bn) %then ->gotbase
104 %repeat
105 %if rp=16 %then fault("EXCESS REGISTER : ".base) %and bn=0 %c
106+ %else rmn(rp)=base %and bn=rp %and rp=rp+1
107 gotbase:dn=intstr(disp)
108 %unless 0<=dn<=16_FFFF %then fault("INVALID DISPLACEMENT : ". %c
109+ disp) %and dn=0
110 dump(on,rn,bn,dn)
111 ->instr
112 !-----------------------------------------------------------------------
113 interpret:! interpretation section
114 trace=0
115 stpr=0
116 accr=0
117 %for i=0,1,rp-1 %cycle
118 %if rmn(i)="STP" %then stpr=i
119 %if rmn(i)="ACC" %then accr=i
120 r(i)=0
121 %repeat
122 stackb=sp
123 %while sp<=sl %then s(sp)=16_80000000 %and sp=sp+1
124 ic=0
125 pc=0
126 exi:! execute instruction
127 %if ic>10000 %then fail("10000 INSTRUCTIONS EXECUTED")
128 %unless 0<=pc<=pcl %then fail("PC OUT OF BOUNDS")
129 inst=s(pc)
130 tracepc=pc
131 pc=pc+1
132 %if inst&16_80000000#0 %then monitor
133 %if inst&16_40000000#0 %then trace=1
134 %if inst&16_20000000#0 %then trace=0
135 on=inst>>24&16_1F
136 rn=inst>>20&16_F
137 bn=inst>>16&16_F
138 dn=inst&16_FFFF
139 efad=r(bn)+dn
140 %if 1<<on&datamask#0 %then %start
141 %unless pcl<efad<=sl %then fail("DATA ADDRESS ". %c
142+ strint(efad)." OUT OF BOUNDS")
143 %if s(efad)=16_80000000 %then fail("UNASSIGNED VARIABLE AT ". %c
144+ strint(efad))
145 %finish %else %start
146 %if on=8 %then %start
147 %unless stackb<=efad<=sl %then fail("STR ADDRESS ". %c
148+ strint(efad)." OUT OF BOUNDS")
149 %finish
150 %finish
151 ->ins(on)
152 ins(0):print string("
153+ STOPPED AT ".strint(tracepc).", ".strint(ic)." INSTRUCTIONS EXECUTED
154+ ") %and %stop
155 ins(1):r(rn)=s(efad) ; ->tracep
156 ins(2):%if rn=stpr %and efad<r(stpr) %then %start
157 ! clear old area back to unassigned
158 %for i=efad+2,1,r(stpr)+2 %cycle
159 s(i)=16_80000000 %unless i>sl
160 %repeat
161 %finish
162 r(rn)=efad
163 ->tracep
164 ins(3):r(rn)=r(rn)+s(efad) ; ->tracep
165 ins(4):r(rn)=r(rn)-s(efad) ; ->tracep
166 ins(5):r(rn)=r(rn)*s(efad) ; ->tracep
167 ins(6):r(rn)=r(rn)//s(efad) ; ->tracep
168 ins(7):r(rn)=r(rn)\\s(efad) ; ->tracep
169 ins(8):s(efad)=r(rn) ; ->tracep
170 ins(9):r(rn)=-r(rn) ; ->tracep
171 ins(10):r(rn)=\r(rn) ; ->tracep
172 ins(11):r(rn)=r(rn)<<s(efad) ; ->tracep
173 ins(12):r(rn)=r(rn)>>s(efad) ; ->tracep
174 ins(13):r(rn)=r(rn)&s(efad) ; ->tracep
175 ins(14):r(rn)=r(rn)!s(efad) ; ->tracep
176 ins(15):r(rn)=r(rn)!!s(efad) ; ->tracep
177 ins(16):r(rn)=pc ; pc=efad ; ->tracep
178 ins(17):pc=efad ; ->tracep
179 ins(18):%if r(rn)=0 %then pc=efad ; ->tracep
180 ins(19):%if r(rn)#0 %then pc=efad ; ->tracep
181 ins(20):%if r(rn)>0 %then pc=efad ; ->tracep
182 ins(21):%if r(rn)<=0 %then pc=efad ; ->tracep
183 ins(22):%if r(rn)<0 %then pc=efad ; ->tracep
184 ins(23):%if r(rn)>=0 %then pc=efad ; ->tracep
185 ins(24):real(addr(r(rn)))=real(addr(r(rn)))+real(addr(s(efad)))
186 ->tracep
187 ins(25):real(addr(r(rn)))=real(addr(r(rn)))-real(addr(s(efad)))
188 ->tracep
189 ins(26):real(addr(r(rn)))=real(addr(r(rn)))*real(addr(s(efad)))
190 ->tracep
191 ins(27):real(addr(r(rn)))=real(addr(r(rn)))/real(addr(s(efad)))
192 ->tracep
193 ins(28):real(addr(r(rn)))=real(addr(r(rn)))\s(efad)
194 ->tracep
195 ins(29):real(addr(r(rn)))=-real(addr(r(rn))) ; ->tracep
196 ins(30):real(addr(r(rn)))=r(rn) ; ->tracep
197 ins(31):%if stpr=0 %then fail("REGISTER STP NOT DEFINED FOR I/O". %c
198+ " ROUTINE CALL")
199 ->ext(dn)
200 ext(1):read symbol(s(s(r(stpr)+2))) ; ->tracep
201 ext(2):%if accr=0 %then fail("REGISTER ACC NOT DEFINED FOR ". %c
202+ "'NEXT SYMBOL' I/O FUNCTION CALL")
203 r(accr)=next symbol
204 ->tracep
205 ext(3):skip symbol ; ->tracep
206 ext(4):print symbol(s(r(stpr)+2)) ; ->tracep
207 ext(5):space ; ->tracep
208 ext(6):spaces(s(r(stpr)+2)) ; ->tracep
209 ext(7):newline ; ->tracep
210 ext(8):newlines(s(r(stpr)+2)) ; ->tracep
211 ext(9):newpage
212 ext(10):read(s(s(r(stpr)+2))) ; ->tracep
213 ext(11):write(s(r(stpr)+2),s(r(stpr)+3))
214 tracep:ic=ic+1
215 %if trace#0 %then %start
216 write(tracepc,4)
217 print symbol('$')
218 tracen=tracen+1
219 %if tracen=10 %then newline %and tracen=0
220 %finish
221 ->exi
222 !-----------------------------------------------------------------------
223 %string(15)%fn readmn(%integername sep)
224 %string(255) s
225 %integer flag
226 s=""
227 flag=0
228 %cycle
229 read symbol(sep)
230 %if '0'<=sep<='9' %or 'A'<=sep<='Z' %then %start
231 %if length(s)<255 %then s=s.tostring(sep) %else flag=1
232 %finish %else %start
233 %if sep#' ' %then %exit
234 %finish
235 %repeat
236 %if length(s)>15 %or flag#0 %then fault("INVALID MNEMONIC : ".s) %c
237+ %and %result="" %else %result=s
238 %end
239 !-----------------------------------------------------------------------
240 %routine dump(%integer on,rn,bn,dn)
241 %if sp>sl %then fault("PROGRAM TOO BIG") %and %stop
242 s(sp)=mon<<31!tron<<30!troff<<29!on<<24!rn<<20!bn<<16!dn
243 sp=sp+1
244 mon=0
245 tron=0
246 troff=0
247 %end
248 !-----------------------------------------------------------------------
249 %integerfn intstr(%string(15) s)
250 %integer value,d,i
251 value=0
252 %for i=1,1,length(s) %cycle
253 d=charno(s,i)
254 %unless '0'<=d<='9' %then %result=-1
255 value=value*10+d-'0'
256 %repeat
257 %result=value
258 %end
259 !-----------------------------------------------------------------------
260 %string(15)%fn strint(%integer n)
261 %string(15) r
262 %string(1) s
263 r=""
264 %if n<0 %then n=-n %and s="-" %else s=""
265 r=tostring(n-n//10*10+'0').r %and n=n//10 %until n=0
266 %result=s.r
267 %end
268 !-----------------------------------------------------------------------
269 %routine fault(%string(255) mess)
270 print string("
271+ ".strint(sp)."$ ".mess."
272+ ")
273 faults=faults+1
274 %end
275 !-----------------------------------------------------------------------
276 %routine fail(%string(255) mess)
277 print string("
278+ * ".mess."
279+ PC=".strint(tracepc)."$")
280 monitor
281 %stop
282 %end
283 !-----------------------------------------------------------------------
284 %routine monitor
285 %string(15) v
286 %integer i,j
287 newline
288 %if rp>1 %then %start
289 %for i=1,1,rp-1 %cycle
290 print string(rmn(i)." ".strint(r(i))."
291+ ")
292 %repeat
293 %finish
294 %if stpr#0 %and r(stpr)>stackb %then %start
295 write(stackb,2)
296 print symbol('$')
297 j=1
298 %for i=stackb,1,r(stpr)+17 %cycle
299 %if s(i)=16_80000000 %then print string(" ?") %c
300+ %else write(s(i),3)
301 j=j+1
302 %if j=16 %then newline %and j=0
303 %repeat
304 newline
305 %finish
306 %end
?V unused
307 %endofprogram
Code 9580 bytes Glap 1456 bytes Diags 1428 bytes Total size 12464 bytes
299 statements compiled