The IMP Intermediate Code
A Brief Summary
The IMP intermediate code may be considered a sequence of
instructions to a stack-oriented machine which generates
programs for specific computers. It is important to note
that the intermediate code describes the compilation process
necessary to generate an executable form of a program; it
does not directly describe the computation defined by the
program.
The machine which accepts the intermediate code has two
main components:
1 A Descriptor area. This is used to hold
descriptors containing machine-dependent
definitions of the objects the program is to
manipulate. This area is maintained in a
block-structured fashion, that is new descriptors
are added to the area during the definition of a
block and are removed from the area at the end of
the block.
2 A Stack. The stack holds copies of descriptors
taken from the descriptor area or the parameter
area, or created specially. Items on the stack
are modified by intermediate code control items
to reflect operations specified in the source
program. Such modifications may or may not
result in code being generated. From the point
of view of this definition stack elements are
considered to have at least three components:
i Type
ii Value
iii Access rule
The "Access rule" defines how the "Type" and
"Value" attributes are to interpreted in order to
locate the described object.
For example, the access rule for a constant could
be "Value contains the constant" while for a
variable it could be "Value contains the address
of the variable". Clearly, the access rules are
target-machine dependent. Descriptors may be
combined to give fairly complex access rules, as
in the case of applying "PLUS" to the stack when
the top two descriptors are for the variable X
and the constant 1 resulting in one descriptor
with the access rule "take the value in X and add
1 to it". The complexity of these access rules
may be restricted by a code-generator. In the
example above code could be generated to evaluate
X+1 resulting in an access rule "the value is in
register 1", say.
The importance of the code not describing the actual
computation which the source program specified but the
compilation process required is seen when attempting to use
the code for statements of the form:
A := if B = C then D else E;
This could not be encoded as:
PUSH A
PUSH B
PUSH C
JUMP # L1
PUSH D
BR L2
LOC L1
PUSH E
LOC L2
ASSVAL
The reason is that the items on the stack at the time of the
ASSVAL would be (from top to bottom) [E], [D], [A], because
no items were given which would remove them from the stack.
hence the ASSVAL would assign the value of E to D and then
leave A dangling on the stack.
Unless otherwise stated, all constants in the
intermediate code are represented in octal.
Descriptors
DEF TAG TEXT TYPE FORM SIZE SPEC PREFIX
This item causes a new descriptor to be generated
and placed in the descriptor area. On creation,
the various fields of the DEF are used to
construct the machine-dependent representation
required for the object.
TAG is an identification which will
be used subsequently to refer to
the descriptor.
TEXT is the source-language identifier
given to the object (a null
string if no identifier was
specified).
TYPE is the type of the object:
GENERAL, INTEGER, REAL, STRING,
RECORD, LABEL, SWITCH, FORMAT.
FORM is one of: SIMPLE, NAME, ROUTINE,
FN, MAP, PRED, ARRAY, NARRAY,
ARRAYN, NARRAYN.
SIZE is either the TAG of the
appropriate record format
descriptor for records, the
maximum length of a string
variable, or the precision of
numerical variables: DEFAULT,
BYTE, SHORT. LONG.
SPEC has the value SPEC or NONE
depending on whether or not the
item is a specification.
PREFIX is one of: NONE, OWN, CONST,
EXTERNAL, SYSTEM, DYNAMIC, PRIM,
PERM or SPECIAL. If SPECIAL is
given there will follow an
implementation-dependent
specification of the properties
of the object (such as that it is
to be a register, for example).
Parameters and Formats
The parameters for procedures and the elements of record
formats are defined by a list immediately following the
procedures or format descriptor definition:
START Start of definition list
FINISH End of definition list
ALTBEG Start of alternative sequence
ALT Alternative separator
ALTEND End of alternative sequence.
Blocks
BEGIN Start of BEGIN block
END End of BEGIN block or procedure
PUSH <tag> Push a copy of the descriptor <tag> onto
the stack.
PROC <tag> This is the same as PUSH except that the
descriptor being stacked represents a
procedure which is about to be called
(using ENTER).
PUSHI <n> Push a descriptor for the integer constant
<n> onto the stack.
PUSHR <r> Push a descriptor for the real
(floating-point) constant <r> onto the
stack.
PUSHS <s> Push a descriptor for the string constant
<s> onto the stack.
SELECT <tag> TOS will be a descriptor for a record.
Replace this descriptor with one describing
the sub-element <tag> of this record.
Assignment
ASSVAL Assign the value described by TOS to the
variable described by SOS. Both TOS and
SOS are popped from the stack.
ASSREF Assign a reference to (the address of) the
variable described by TOS to the pointer
variable described by SOS. Both TOS and
SOS are popped from the stack.
JAM This is the same as ASSVAL except that the
value being assigned will be truncated if
necessary.
ASSPAR Assign the actual parameter described by
TOS to the formal parameter described by
SOS. This is equivalent to either ASSVAL
(for value parameters) or ASSREF (for
reference parameters).
RESULT TOS describes the result of the enclosing
function. Following the processing of the
result code must be generated to return
from the function.
MAP Similar to RESULT except that TOS describes
the result of a MAP. Again a return must
be generated.
DEFAULT <n>
INIT <n> Create N data items, corresponding to the
last descriptor defined, and given them all
an initial (constant) value. The constant
is popped from the stack in the case of
INIT but DEFAULT causes the
machine-dependent default value to be used
(normally the UNASSIGNED pattern).
Binary operators
ADD Addition
SUB Subtraction
MUL Multiplication
QUOT Integer division
DIVIDE Real division
IEXP Integer exponentiation
REXP Real exponentiation
AND Logical AND
OR Logical inclusive OR
XOR Logical exclusive OR
LSH Logical left shift
RSH Logical right shift
CONC String concatenate
ADDA ++
SUBA --
The given operation is performed on TOS and SOS , both of
which are removed from the stack, and the result
(SOS op TOS) is pushed onto the stack.
e.g. A = B-C
PUSH A
PUSH B
PUSH C
SUB ASSVAL
Unary Operators
NEG Negate (unary minus)
NOT Logical NOT (complement)
MOD Modulus (absolute value)
The given operation is performed on TOS.
Arrays
DIM <d> <n> The stack will contain <d> pairs of
descriptors corresponding to the lower and
upper bounds for an array. This
information is used to construct <n> arrays
and any necessary accessing information for
use through the last <n> descriptors to
have been defined. All of these
descriptors will be for similar arrays.
INDEX SOS will be the descriptor for a
multi-dimensional array and TOS will be the
next non-terminal subscript. The stack is
popped.
ACCESS SOS will be the descriptor of an array and
TOS will be the final/only subscript. Both
descriptors are replaced by a descriptor
for the appropriate element of the array.
E.g. given arrays A(1:5) and B(1:4, 2:6),
and integers J,K:
A(J) = 0 K = B(J, K)
PUSH A PUSH K
PUSH J PUSH B
ACCESS PUSH J
PUSHC 0 INDEX
ASSVAL PUSH K
ACCESS
ASSIGN
Internal labels
Internal labels are those labels in the
intermediate code which have been created
by the process of translating from the
source program, and so do not appear
explicitly in the source program. The main
property of these labels is that they will
only be referred to once. This fact can be
used to re-use these labels, as, for
example, a forward reference to a
currently-defined label must cause its
redefinition.
LOCATE <1> define internal label <1>
GOTO <1> forward jump to internal label <1>
REPEAT <1> backward jump to internal label <1>
Conditional branches
These branches are always forward.
JUMPIF <cond> <label>
JUMPIFD <cond> <label>
JUMPIFA <cond> <label>
Where: <cond> ::= =, #,
<, <=,
>, >=,
TRUE, FALSE
The two items on the top of the stack are compared and a
jump is taken to <label> is the condition specified by
<cond> is true. In the case of <cond> being TRUE or FALSE
only one item is taken from the stack, and this represents a
boolean value to be tested.
User Labels
LABEL <d> locate label descriptor <d>
JUMP <d> Jump to the label described by <d>
CALL <d> Call the procedure described by <d>
Sundry Items
ON <e> <l> Start of event trap for events <e>.
Internal label <l> defines the end of the
event block.
EVENT <e> Signal event <e>
STOP stop
MONITOR monitor
RESOLVE <m> Perform a string resolution
FOR Start of a for loop
SLABEL <sd> Define switch label
SJUMP <sd> Select and jump to switch label
LINE <l> Set the current line number to <l>