The IMP Intermediate Code A Brief Summary The IMP intermediate code may be considered a sequence of instructions to a stack-oriented machine which generates programs for specific computers. It is important to note that the intermediate code describes the compilation process necessary to generate an executable form of a program; it does not directly describe the computation defined by the program. The machine which accepts the intermediate code has two main components: 1 A Descriptor area. This is used to hold descriptors containing machine-dependent definitions of the objects the program is to manipulate. This area is maintained in a block-structured fashion, that is new descriptors are added to the area during the definition of a block and are removed from the area at the end of the block. 2 A Stack. The stack holds copies of descriptors taken from the descriptor area or the parameter
area, or created specially. Items on the stack are modified by intermediate code control items to reflect operations specified in the source program. Such modifications may or may not result in code being generated. From the point of view of this definition stack elements are considered to have at least three components: i Type ii Value iii Access rule The "Access rule" defines how the "Type" and "Value" attributes are to interpreted in order to locate the described object. For example, the access rule for a constant could be "Value contains the constant" while for a variable it could be "Value contains the address of the variable". Clearly, the access rules are target-machine dependent. Descriptors may be combined to give fairly complex access rules, as in the case of applying "PLUS" to the stack when the top two descriptors are for the variable X and the constant 1 resulting in one descriptor with the access rule "take the value in X and add 1 to it". The complexity of these access rules may be restricted by a code-generator. In the example above code could be generated to evaluate X+1 resulting in an access rule "the value is in register 1", say.
The importance of the code not describing the actual computation which the source program specified but the compilation process required is seen when attempting to use the code for statements of the form: A := if B = C then D else E; This could not be encoded as: PUSH A PUSH B PUSH C JUMP # L1 PUSH D BR L2 LOC L1 PUSH E LOC L2 ASSVAL The reason is that the items on the stack at the time of the ASSVAL would be (from top to bottom) [E], [D], [A], because no items were given which would remove them from the stack. hence the ASSVAL would assign the value of E to D and then leave A dangling on the stack. Unless otherwise stated, all constants in the intermediate code are represented in octal.
Descriptors DEF TAG TEXT TYPE FORM SIZE SPEC PREFIX This item causes a new descriptor to be generated and placed in the descriptor area. On creation, the various fields of the DEF are used to construct the machine-dependent representation required for the object. TAG is an identification which will be used subsequently to refer to the descriptor. TEXT is the source-language identifier given to the object (a null string if no identifier was specified). TYPE is the type of the object: GENERAL, INTEGER, REAL, STRING, RECORD, LABEL, SWITCH, FORMAT. FORM is one of: SIMPLE, NAME, ROUTINE, FN, MAP, PRED, ARRAY, NARRAY, ARRAYN, NARRAYN. SIZE is either the TAG of the appropriate record format descriptor for records, the maximum length of a string variable, or the precision of numerical variables: DEFAULT, BYTE, SHORT. LONG.
SPEC has the value SPEC or NONE depending on whether or not the item is a specification. PREFIX is one of: NONE, OWN, CONST, EXTERNAL, SYSTEM, DYNAMIC, PRIM, PERM or SPECIAL. If SPECIAL is given there will follow an implementation-dependent specification of the properties of the object (such as that it is to be a register, for example).
Parameters and Formats The parameters for procedures and the elements of record formats are defined by a list immediately following the procedures or format descriptor definition: START Start of definition list FINISH End of definition list ALTBEG Start of alternative sequence ALT Alternative separator ALTEND End of alternative sequence. Blocks BEGIN Start of BEGIN block END End of BEGIN block or procedure
PUSH <tag> Push a copy of the descriptor <tag> onto the stack. PROC <tag> This is the same as PUSH except that the descriptor being stacked represents a procedure which is about to be called (using ENTER). PUSHI <n> Push a descriptor for the integer constant <n> onto the stack. PUSHR <r> Push a descriptor for the real (floating-point) constant <r> onto the stack. PUSHS <s> Push a descriptor for the string constant <s> onto the stack. SELECT <tag> TOS will be a descriptor for a record. Replace this descriptor with one describing the sub-element <tag> of this record.
Assignment ASSVAL Assign the value described by TOS to the variable described by SOS. Both TOS and SOS are popped from the stack. ASSREF Assign a reference to (the address of) the variable described by TOS to the pointer variable described by SOS. Both TOS and SOS are popped from the stack. JAM This is the same as ASSVAL except that the value being assigned will be truncated if necessary. ASSPAR Assign the actual parameter described by TOS to the formal parameter described by SOS. This is equivalent to either ASSVAL (for value parameters) or ASSREF (for reference parameters). RESULT TOS describes the result of the enclosing function. Following the processing of the result code must be generated to return from the function.
MAP Similar to RESULT except that TOS describes the result of a MAP. Again a return must be generated. DEFAULT <n> INIT <n> Create N data items, corresponding to the last descriptor defined, and given them all an initial (constant) value. The constant is popped from the stack in the case of INIT but DEFAULT causes the machine-dependent default value to be used (normally the UNASSIGNED pattern).
Binary operators ADD Addition SUB Subtraction MUL Multiplication QUOT Integer division DIVIDE Real division IEXP Integer exponentiation REXP Real exponentiation AND Logical AND OR Logical inclusive OR XOR Logical exclusive OR LSH Logical left shift RSH Logical right shift CONC String concatenate ADDA ++ SUBA -- The given operation is performed on TOS and SOS , both of which are removed from the stack, and the result (SOS op TOS) is pushed onto the stack. e.g. A = B-C PUSH A PUSH B PUSH C SUB ASSVAL
Unary Operators NEG Negate (unary minus) NOT Logical NOT (complement) MOD Modulus (absolute value) The given operation is performed on TOS.
Arrays DIM <d> <n> The stack will contain <d> pairs of descriptors corresponding to the lower and upper bounds for an array. This information is used to construct <n> arrays and any necessary accessing information for use through the last <n> descriptors to have been defined. All of these descriptors will be for similar arrays. INDEX SOS will be the descriptor for a multi-dimensional array and TOS will be the next non-terminal subscript. The stack is popped. ACCESS SOS will be the descriptor of an array and TOS will be the final/only subscript. Both descriptors are replaced by a descriptor for the appropriate element of the array. E.g. given arrays A(1:5) and B(1:4, 2:6), and integers J,K: A(J) = 0 K = B(J, K) PUSH A PUSH K PUSH J PUSH B ACCESS PUSH J PUSHC 0 INDEX ASSVAL PUSH K ACCESS ASSIGN
Internal labels Internal labels are those labels in the intermediate code which have been created by the process of translating from the source program, and so do not appear explicitly in the source program. The main property of these labels is that they will only be referred to once. This fact can be used to re-use these labels, as, for example, a forward reference to a currently-defined label must cause its redefinition. LOCATE <1> define internal label <1> GOTO <1> forward jump to internal label <1> REPEAT <1> backward jump to internal label <1>
Conditional branches These branches are always forward. JUMPIF <cond> <label> JUMPIFD <cond> <label> JUMPIFA <cond> <label> Where: <cond> ::= =, #, <, <=, >, >=, TRUE, FALSE The two items on the top of the stack are compared and a jump is taken to <label> is the condition specified by <cond> is true. In the case of <cond> being TRUE or FALSE only one item is taken from the stack, and this represents a boolean value to be tested. User Labels LABEL <d> locate label descriptor <d> JUMP <d> Jump to the label described by <d> CALL <d> Call the procedure described by <d>
Sundry Items ON <e> <l> Start of event trap for events <e>. Internal label <l> defines the end of the event block. EVENT <e> Signal event <e> STOP stop MONITOR monitor RESOLVE <m> Perform a string resolution FOR Start of a for loop SLABEL <sd> Define switch label SJUMP <sd> Select and jump to switch label LINE <l> Set the current line number to <l>