The IMP intermediate code Working Notes Peter S. Robertson June 1979 THE IMP INTERMEDIATE CODE The IMP Intermediate Code may be considered the order-code of a stack-oriented machine which manipulates objects exclusively through descriptors. The code is made up from ASCII characters (See Appendix 2 for a complete list of valid characters). Note that the symbols 'SPACE' and 'NEWLINE' are not valid characters; their inclusion in the examples is purely for ease of reading, they do not occur in the actual code. The workspace for this machine comprises three areas: 1. Descriptor Area This is used to hold all descriptors which are currently defined. 2. Parameter Area This is used to hold the descriptors for procedure (formal) parameters and record format elements. 3. Stack This is used to hold copies of descriptors which are being processed. Note that in general these copies are removed whenever they are accessed. The code is input by a program which "simulates" the operations specified and thereby generates code for the target machine. The code is not intended to be interpreted. It is important to note that the intermediate code is a set of commands to a code-generator and that the operations described are compile-time operations. This means that, for example, the operation '@' (q.v.) cannot be used to stack a descriptor as the result of a condition. Hence the ALGOL statement: A := if B=C then D else E could not be encoded as: @a @b @c ? #1 @d F2 :1 @e :2 S NOTATION [V] - The descriptor for the entity with name V. @V - Stack the descriptor for V. Unless otherwise stated all numerical constants in the intermediate code are in octal. 1 DESCRIPTOR DEFINITION $NNNCCC,TF,LF,FLAG This item introduces the definition of a descriptor, where: NNN - descriptor identifier (octal tag) The tags are defined in numerical order. (see ';') CCC - descriptor name (IMP type name). IF - TYPE<<4+FORM TYPE FORM 0 - GENERAL 0 - UNDEFINED 1 - INTEGER 1 - SIMPLE (E.G. INTEGER) 2 - REAL 2 - NAME (E.G. INTEGERNAME) 3 - STRING 3 - LABEL 4 - RECORD 4 - FORMAT 5 - UNDEFINED 6 - SWITCH 7 - ROUTINE 10 - FUNCTION 11 - MAP 12 - PREDICATE 13 - ARRAY 14 - ARRAYNAME 15 - NAMEARRAY 16 - NAMEARRAYNAME 17 - UNDEFINED LF - STRINGS : MAXIMUM LENGTH (ZERO = NONE GIVEN) - RECORDS : FORMAT TAG (ZERO = NONE GIVEN) - OTHER : 1 - DEFAULT SIZE 2 - BYTE 3 - SHORT 4 - LONG FLAG - SCOPE + SPEC<<3 + PROT<<4 + EXTRA<<8 SCOPE SPEC PROT 0 - NULL 0 - NO SPEC 0 - NONE 1 - OWN 1 - SPEC &1 - FROZEN... 2 - CONSTANT &2 - ..FROZENNAME 3 - EXTERNAL &4 - ..ARRAYFROZENNAME 4 - SYSTEM 5 - DYNAMIC EXTRA - FORM DEPENDENT 6 - PRIMITIVE (PRIM) 7 - PERMANENT (PERM) @tag Stack a copy of the descriptor with identifier tag. (see RECORDS for negative tag) 2 PROCEDURES E Call the procedure described on top of the stack and remove the descriptor. A new descriptor describing the result is pushed onto the stack if the procedure is a function or a map. p Assign the next parameter, described on the top of the stack, to the procedure described next on the stack. The parameter descriptor is removed leaving the procedure descriptor. e.g. @7@1p@2pE R(X, Y) Procedure Definition $12PROC,7,1,0 procedure descriptor { start of parameters ........... parameter descriptors } end of parameters The parameter descriptors are copied into the Parameter Area; they are also removed from the Descriptor Area if this is a spec. N.B. {} also define record formats. 3 H Start of begin block. ; end of procedure or begin block. All local descriptors are destroyed. R Return from routine. (see Conditional statements for exits from predicates). e.g. $10R,7,1,0 routine R(integer X, Y) {11X,21,1,0$12Y,21,1,0} R return ; end V Assign the result of the current function and return. M Assign the result of the current map and return. The left-hand side of the assignment is implied by type and form of the enclosing procedure; the right-hand side is popped from the stack. Note the form in which procedure type parameters are specified: routine R(routine S(integer X), integer Y) $10R,7,1,0{$11S,7,1,0$12Y,21,1,0} $11S,7,1,10{$13X,21,1,0} If a procedure type parameter itself has parameters which are procedures the parameters for those procedures are not specified. E.g. routine A(routine B(routine C(integer D))) $50A,7,1,0,{$51B,7,1,0} $51B,7,1,10,{$52C,7,1,0} 4 CONSTANTS N Stack a descriptor for an integer constant. The parameter is a constant in octal. ' Stack a descriptor for a string constant. The parameter is of the form: L,CCC where: L : length of the string. C : characters of the string. e.g. '3,MAX "MAX" '4,DATA "DATA" '0, "" Note the form of the null string ('0,). D Stack a descriptor for a real constant. The parameter is of the form: L,CCC where: L : number of digits (including '.'). C : decimal digits, including one optional decimal point and one optional exponent sign '@' followed by the exponent as a signed octal tag (the tag counting as one digit). e.g. D4,10.4 10.4 D3,1@12 1@10 5 RECORDS Definition $20F,104,0,0 define format descriptor { start of element descriptors ....... } end of elements. 1. The tag is used to index into the format list. recordformat F(integer A, B, C) $24F,104,0,0,{$A,21,1,0$B,21,1,0,$C,21,1,0} A is element -1 B is element -2 C is element -3 2. The format descriptors are removed from the Descriptor Area and placed in the Parameter Area. 3. Note that the descriptors in the format definition do not have an explicit tag. 4. A recordformat can only be defined within a recordformat definition when a list of alternative formats has been specified in the source, that is a statement of the form:- recordformat FORM123(FORM1 or FORM2 or FORM3) $60FORM123,104,0,0{$FORM1,104,40,0 $FORM2,104,41,0 $FORM3,104,42,0} Variables $21R,101,20,0 record(F) R $22RN,102,20,0 record(F)name RN Access A negative tag for '@' specifies that the tag is to be used to modify the descriptor on the top of the stack (a record descriptor) to describe a sub-field of that record. n Redefine the descriptor on the top of the stack to have the format corresponding to the n'th descriptor in its current format definition (c.f. negative tags to '@'). 6 ARRAYS Declaration d dim,n Define array This completes the definition of 'n' arrays of dimension 'dim'. The stack contains 'dim' pairs of bounds, and tne arrays are the last 'n' descriptors to have been defined. for example integerarray A,B(1:5) $1A,33,1,0$2B,33,1,0 N1N5 d1,2 Array Access i Evaluate (non-terminal) array index. The top two elements on the stack are the index descriptor and the array descriptor. The index descriptor is removed from the stack a Perform an array access. The stack contains the final/only index and the array desriptor. These two are removed from the stack and replaced by a descriptor of the array element accessed. e.g. A(J) @5 stack descriptor for 'C' @1 stack descriptor for 'J' i index @2 stack descriptor for 'K' a array access leaving [C(J, K)] on the stack. 7 OWN VARIABLES A tag If the stack is empty the adsign the default value to the last descriptor to have been defined. Otherwise, assign the constant desribed by the item on the stack to the last descriptor to have been defined. 'tag' specifies the number of elements to be assigned the value, zero meaning that the value is to be ignored; 'tag' is 1 if the object being initialised is not an array. N1$1J,21,1,1A1 owninteger J=1 $2K,21,1,1A1N3$3L,21,1,1A1 owninteger K,L=3 N24b $4A,33,1,1N1A3 ownintegerarray A(2:4)=1(3) l Initialise the whole of the last (own array) descriptor defined to the default value. b Construct a constant dope-vector for an own array or switch vector from the last two constants described on the top of the stack. e.g. ownintegerarray Q(2:4) N2N4b define bounds $6Q,33,1,1 define 'Q' l initialise to default value ownintegerarray S(3:6) = 0, 1, 2 N3N6b define bounds $10S,33,1,1 define 'S' N0A1N1A1N2A1 assign initial values 8 ASSIGNMENT S Assign value (=) Z Assign address (==) j Jam transfer (<-) g Global assignment (** not generated yet **) The top two elements on the stack are descriptors for the right-hand and left-hand sides of the assignment. Both descriptors are removed from the stack. @1@2S N = V @1@2Z N == V @1@2j N <- V The operator 'g' is used to specify that the assignment is to be performed at compile/load time if possible, otherwise at run-time. The sort of assignment required (S or Z) being determined by the form of the left-hand side, i.e. integer uses S, integername uses Z - c.f. 'p'. 9 BINARY OPERATORS + Add - Subtract * Multiply / Integer Divide Q Real Divide % Logical Exclusive OR ! Logical Inclusive OR & Logical AND [ Logical LEFT SHIFT ] Logical RIGHT SHIFT X Integer exponentiation x Real exponentiation . String Concatenation The given operation is performed on the elements described by the top two descriptors on the stack. These descriptors are replaced by a single descriptor defining the result. UNARY OPERATORS U Unary minus \ Logical NOT (complement) v Modulus The operation is performed on the object described by the descriptor on the top of the stack and that descriptor is replaced by a descriptor of the result. 10 LABEL DEFINITION : label Define (compiler) label 'label' to be here. NOTE: these labels may be redefined. 'label' is not a 'tag' and does not refer to a descriptor. L tag Define (user) label 'tag' to be here. NOTE: these labels may not be redefined. UNCONDITIONAL BRANCHES B label Branch backwards (repeat) F label Branch forwards J tag Jump to user-defined label (forwards or backwards) In the case of user-defined labels (J), 'tag' corresponds to a label defined using 'L' and hence may have the same value as a label defined using ':'. :3 cycle F4 exit J11 ->LAB F5 continue :5B3:4 repeat L11 LAB: NOTE: the label 0 (zero) is reserved for jumps around procedures. This is to enable the jumps around several consecutive procedures to be collapsed into one. 11 CONDITIONAL STATEMENTS AND JUMPS Conditional statements manipulate values to give a Condition Code. This code may have six values: EQUAL GREATER THAN TRUE NOT EQUAL LESS THAN FALSE These values are set by five operators: ? Compare the values described by the top two descriptors on the stack and set the Condition Code appropriately (EQUAL, NOT EQUAL, GREATER or LESS). The two descriptors are removed from the stack (but see '"'). T Set the Condition Code to TRUE (and return from the enclosing predicate). K Set the Condition Code to FALSE (and return from the enclosing predicate). C The same as '?' except that the addresses of the items described are compared. NOTE: The two items compared by '?' will be of similar type i.e. string:string, integer:integer, integer:real etc. Hence the only type conversions necessary are from integer to real, byteinteger to integer etc. " As for '?' but indicates that this is the first comparison of a double-sided condition and that the second descriptor (the middle operand) is to be restacked ready for the second comparison. e.g. if A=B=C start @1@2"#4@3?#4 12 CONDITIONAL BRANCHES = label branch equal # label branch not equal < label branch less than > label branch greater than ( label branch less than or equal ) label branch greater than or equal t label branch true k label branch false The Condition Code is tested and the branch is taken if the appropriate condition is satisfied, otherwise the next instruction in sequence is taken. e.g. @q@p?#4@r@sS:4 R = S if Q = P @q@p?(8 if Q > P start NOTE: The branch (if taken) is always in the forward direction and to a compiler defined label. This means that the compiler defined labels may be (and are) redefined without ambiguity. 13 f label for clause This item defines the start of a for loop. The descriptors on the top of the stack are:- +-----+ | | INITIAL VALUE | [F] | | | |-----| | | FINAL VALUE | [L] | | | |-----| | | INCREMENT | [I] | | | |-----| | | CONTROL VARIABLE | [C] | | | |-----| | | . . 'label' is the repeat label. i.e. the label jumped to on repeat 'label+1' is the exit label. i.e. the label jumped to on exit or the completion of the cycle. 'label+2' is never generated and so may be used by the code generator. e.g. @1N2N5N1f3 for J = 1,2,5 cycle F4 exit F5 continue :5B3:4 repeat 14 SWITCHES Declaration switch SW(1:5) N1N5b define bounds $7SW,6,1,0 Define switch descriptor Label Definition SW(3): N3 Stack index _7 Define label Note that a switch label may be redefined later if the first definition corresponded to a default definition e.g. switch T(1:3) N1N3 b $5T,6,1,0 T(2): N2_5 T(*): N1_5 N3_5 T(3): N3_5 Switch Jumps ->SW(L) @4 Stack index descriptor W7 Jump to the switch '7', index on top of stack. NOTE: Both '_' and 'W' remove the index from the top of the stack. 15 OTHER CONTROL ITEMS s STOP U tag Set the current line number to 'tag' NOTE: The stack should be empty whenever this item is found. This item should also be used to initiate consistency checks during assembly. z tag The value of the tag may be used to set control bits in the assembler. The value is generated by the statement: control n e.g. control 16_1010 y tag As for 'z' but generated by diagnose n. e tag Signal Event. The tag is a constant describing the event to be signalled - one bit set corresponding to the event; the bits are numbered from zero (least significant) to fifteen (most significant). If the sub-event and extra information fields have been specified their values will be described on the top of the stack. e.g. signal event 2 e4 signal event 2, X @xe4 signal event 2, X, Y @x@ye4 o v,lab On Event The first parameter, v, is a 16-bit value with a bit set (2\\n) for every event trapped. The second parameter, lab, is the internal label defining the finish for this event clause. e.g. on event 1,2 start o6,10 . . finish :10 G n,text The text is a string constant to be used as the external alias for the next descriptor to be defined. The parameters to G are the same as those for '. 16 r flag String Resolution. Flag is a three bit number, the bits representing: 4 - conditional resolution. 2 - first destination present. 1 - second destination present. The resolution sets the condition code to true for successful resolution, and false for failure. The various operands are on the top of the stack. e.g. S -> A.(B).C @s@a@b@c r3 S -> A.(B) @s@a@b r1 stop if S -> (B).C @s@b@c r5k4s:4 l tag Language Flag (current values) 1 IMP 2 ALGOL 60 3 PASCAL 4 SIMULA 5 FORTRAN m Call Monitor (for diagnostics) This is a no-op if run-time diagnostics are not provided. c op Stack condition. This is similar to '?' except that the resulting condition-code is tested by 'op' and a suitable value is stacked depending on the truth of the comparison (0 for TRUE and -1 (\0) for FALSE, for example). The values of 'op' are =, #, <, >, (, and ) as described under CONDITIONAL BRANCHES. e.g. BOOL := A=B @bool@a@b c= S 17 MACHINE CODE P Plant the constants desribed on the stack as code values (in the order in which they were stacked). The size of each value is implementation dependent. w Construct an instruction and plant it. The item takes the general form: w{opcode}_{operand?}; e.g. wMOVE_23(1),3; wADD_@6(3),(1)+; The operand is the operand as specified in the source with the following changes: 1 Spaces are removed. 2 Constants are output in octal. 3 References to descriptors are passed through as "space tag" e.g. *L_1,X wL_1, 23; *MOVE_X,Y wMOVE_ 23, 24; 18 APPENDIX 2 CONTROL ITEMS ITEM PAGE USE ! 10 Logical OR " 12 Compare values (double sided). # 13 Branch not equal $ 2 Define descriptor % 10 Logical EXCLUSIVE OR & 10 Logical AND ' 5 Stack string constant ( 13 Branch less than or equal ) 13 Branch greater than or equal * 10 Multiply + 10 Add , ***illegal*** - 10 Subtract . 10 String concatenation / 10 Integer divide 0 ***illegal*** 1 ***illegal*** 2 ***illegal*** 3 ***illegal*** 4 ***illegal*** 5 ***illegal*** 6 ***illegal*** 7 ***illegal*** 8 ***illegal*** 9 ***illegal*** : 11 Define compiler label ; 4 End of block or procedure < 13 Branch less than = 13 Branch equal > 13 Branch greater than ? 12 Compare values @ 2 Stack descriptor A 8 Assign own value B 11 Backward jump C 12 Compare addresses D 5 Stack real constant E 3 Call procedure F 11 Branch forwards G 16 Alias H 4 Start of begin block I 8 Clear own array J 11 Jump to user-defined label K 12 false return from predicate L 11 Define user label M 4 Map result N 5 Stack integer constant O 16 Set line number P 18 Plant machine code Q 10 Real Divide R 4 return S 9 Assign value T 12 true result from predicate U 10 Unary minus V 4 Function result W 15 Switch jump X 10 Integer exponentiation Y Jump through LABEL variable (ALGOL 60 & Pascal) Z 9 Assign address [ 10 Logical LEFT SHIFT \ 10 Logical NOT ] 10 Logical RIGHT SHIFT ^ Unassigned _ 15 Define switch label ` Unassigned a 7 Array access b 8,15 Define constant-bounded dope-vector c 17 Apply comparator (ALGOL 60 & PASCAL) d 7 Define dope-vector e 16 Signal Event f 14 for loop g 9 Global assignment h Assign and restack destination (algol 60) i 7 Array index j 9 Jam transfer k 13 Branch FALSE l 17 Language flag m 17 Monitor n 6 Override format o 16 On Event p 3 Assign parameter q Unassigned r 17 String resolution s 16 stop t 13 Branch TRUE u Unassigned v 10 Modulus w 18 Machine code x 10 Real Exponentiation y 16 Diagnostic Control z 16 Control Information { 3,6 Start of formal parameters } 3,6 End of formal parameters