The Compiler translates programs written in the high-level programming language IMP to M68000 native code. Compilation is a single pass translation from source to executable code.
Reference descriptions of the IMP language are contained in The IMP80 Language (John Murison, Edinburgh Regional Computing Centre) and The IMP-77 Language (Peter S. Robertson, Edinburgh University Computer Science Department and Lattice Logic Ltd; 1977 edition, with an alternative 1986 edition). The dialect of IMP accepted by the Compiler is closely similar to IMP-77, and this document provides only an outline summary of features which are fully covered by the IMP-77 Report.
A dagger in the margin is used to flag features which may not be present in IMP-77, or may be present in different form.
In addition to some general-purpose language extensions, a number of extra features have been included in the present implementation to extend the capability of the language for system programming applications. By their nature these extensions should be avoided in software which is intended to be portable.
Hamish Dewar
CLAN Systems
February 1986
External specifications (defining imports) must appear at the outermost textual level, that is, before the first begin or in the main program block. An external spec may be satisfied by a procedure occurring later in the same module, that is, it may turn out not to be an import after all. This, in a rather unsatisfactory way, allows the same definitions to be used by importing and exporting modules.
External definitions (defining exports) must appear in the source file before the main block (if any).
A program file containing just external procedures but no main block, or an include file, is terminated by end of file.
End of file may be either the physical end of the file or the statement %end %of %file.
Main program files may also contain external procedures.
If either of the statements %end %of %program or %end %of %file is used to terminate a file, the Compiler does not process any subsequent text which there may be in the file.
%include "........"
for example %include "GRAPHICS"
The quoted string specifies the name of the file to be included. Line numbering in included files operates independently from the main file, and any Options (see later section) which are specified within the included file are localised to that file. Included files may be nested to a maximum depth of three.
A compile-time condition is introduced by the directive $IF (dollar-sign followed by IF). It may only appear at a point in the program text at which a source statement may begin. The directive is followed by any valid IMP condition involving only literal expressions. The effect is that during compilation the condition is evaluated: if it is true, the following source lines are compiled; if it is false, they are ignored. The scope of the condition extends to a matching $ELSE or $FINISH directive. In the $ELSE case, the following source lines are compiled if the condition was false, and ignored if it was true.
Compile-time conditions may be nested within one another. However, it should be noted that excessively elaborate or lengthy conditional sections can make the flow of a program for one particular case difficult to follow. It may be preferable to place long sections of code which are peculiar to one version in a separate file, and conditionally include this file.
All conditional-compilation directives must appear at the start of a source line, apart from any leading spaces. Source lines which are not selected for compilation are not processed in any way. They do not require to be valid IMP. Consequently quotation and line continuation conventions are not applied to material which is being skipped.
In Compiler listings, lines which have been skipped are listed with a minus following the line number as a flag symbol.
Selected material must constitute a number of complete source statements, but there are no structural restrictions. In particular, the normal block and statement structure of the language does not require to be properly nested with respect to the conditional compilation structure.
Simple example:
%constant %integer resident=1, loaded=2 {variant values}
%constant %integer variant=resident {that or LOADED}
.................
.................
$IF variant = loaded {is main program}
%begin
%constant %integer max=20000
$ELSE {is external}
%external %routine EDIT(%string(255) cliparam)
%constant %integer max=5000
$FINISH
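The selection rule described above can be sketched as a small model. This is an illustration in Python, not the Compiler's own code: the function names are hypothetical, the condition evaluator handles only a single equality test, and it is assumed that nested directives inside a skipped region are still paired up so that the matching $FINISH can be found.

```python
def simple_condition(text, constants):
    """Evaluate 'a = b' where a and b are named constants or integer literals.

    Simplification: the real directive accepts any IMP condition over
    literal expressions; only equality is modelled here.
    """
    left, right = (part.strip() for part in text.split("="))

    def value(token):
        return constants[token] if token in constants else int(token)

    return value(left) == value(right)


def select_lines(source, constants):
    """Return the lines of `source` that would be passed on for compilation."""
    selected = []
    stack = []     # one (enclosing_active, condition_true) pair per open $IF
    active = True  # True while lines are being compiled rather than skipped
    for line in source.splitlines():
        text = line.lstrip()  # directives may follow leading spaces
        keyword = text.split(None, 1)[0].upper() if text else ""
        if keyword == "$IF":
            condition = text[3:].split("{")[0]  # drop any {comment}
            # Inside a skipped region the condition is never evaluated.
            truth = active and simple_condition(condition, constants)
            stack.append((active, truth))
            active = truth
        elif keyword == "$ELSE":
            enclosing, truth = stack[-1]
            active = enclosing and not truth
        elif keyword == "$FINISH":
            enclosing, _ = stack.pop()
            active = enclosing
        elif active:
            selected.append(line)
    return selected


EXAMPLE = """\
$IF variant = loaded {is main program}
%begin
%constant %integer max=20000
$ELSE {is external}
%external %routine EDIT(%string(255) cliparam)
%constant %integer max=5000
$FINISH
"""

as_resident = select_lines(EXAMPLE, {"resident": 1, "loaded": 2, "variant": 1})
as_loaded = select_lines(EXAMPLE, {"resident": 1, "loaded": 2, "variant": 2})
```

With variant set to resident, only the two lines of the external version are selected; with variant set to loaded, only the two lines of the main program version are selected.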
Initialisation values for static variables must be literals. A static name variable may be initialised to NIL by the form '==NIL', or to point to an absolute location by the use of an appropriate store mapping function with literal argument, for example '%integer %name MODE==INTEGER(126)'.
† Own variables for which no initial value is specified are not initialised to zero. In general, for a normally loaded program, this means that they are unassigned, although in special cases they may inherit values from the program environment.
External constant or const identifiers of scalar numeric types, but not strings or structures, may be used in literal expressions.
In a constant declaration declaring multiple identifiers of ordinal type, it is permitted to omit the assignment part (equals-sign and expression) for any identifier apart from the first. The effect is that the identifier takes the next value in sequence.
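The sequencing rule can be modelled as follows. This is a sketch in Python for integer constants only; the function name and the colour identifiers are hypothetical, and "next value in sequence" is taken to mean the previous value plus one.

```python
def expand_constants(declarations):
    """Fill in omitted values for a multi-identifier constant declaration.

    `declarations` is a list of (name, value) pairs in source order,
    with value None where the assignment part was omitted.
    """
    values = {}
    previous = None
    for name, value in declarations:
        if value is None:
            if previous is None:
                # The first identifier must carry an explicit value.
                raise ValueError("first identifier needs an explicit value")
            value = previous + 1  # assumption: next ordinal value is +1
        values[name] = value
        previous = value
    return values


colours = expand_constants(
    [("red", 1), ("green", None), ("blue", None), ("black", 10), ("white", None)]
)
```

Here green and blue follow on from red, and white follows on from the explicit value given for black.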
Low-level mode allows variables to be declared at (@) explicitly specified addresses. The normal alignment requirements which are enforced on automatically allocated variables are suspended for the first (or sole) variable declared with this form of mode specification.
It takes either of two forms:
@126 %byte MODE
@-32(A5) %integer ELAPSED
@16_90001 %record (duartf) DUART
%record %format PRINTERF(- %byte MODE, %writeonly %byte DATA, %volatile %half TIMER, %volatile %readonly %byte STATUS)
A %readonly attribute indicates that a variable may only be read, not written, by the program. It is useful mainly for purposes of documentation, particularly in application to fields of a record characterising a memory-mapped device.
A %writeonly attribute indicates that a variable may only be written to, not read, by the program. As well as serving documentation purposes, it indicates that the variable should not be updated using instructions which imply a read as well as a write (eg CLR on the M68000).
For a variable, the %volatile attribute indicates that the designated location is subject to change independently of the execution of the program (for example, a timer or device status register). For a %function, %map or %predicate, the attribute indicates that the result is not a pure function of its arguments. One of the implications is that the Compiler should not optimise references to such variables and procedures.
The %register attribute is valid only for a simple dynamic variable. It requests that the variable should be allocated to one of the machine registers available for this purpose, rather than a storage location, for the sake of efficiency. At run-time, the content of the selected register is saved at the point of allocation and restored on block exit.
The %register attribute is ignored if monitoring diagnostics are in force. Certain restrictions apply to variables allocated to registers: in particular they may not be used in a context where the address of the variable is required, and their values should be regarded as undefined after event trapping if they have been re-declared in any of the procedures from which exit has been forced.
The form with an explicit register, e.g. %register (a2) %mite %name pos, is a low-level variant of the preceding attribute in which the register to be used is specified explicitly. No saving/restoring of the register content is effected.
Subsequent to this declaration the register is no longer treated as eligible for use either as a temporary or user allocatable register.
This form is valid for parameters to procedures where it indicates that the parameter is to be passed, as well as being accessed, in the designated register.
This is a very low-level feature which should be employed only where the implications for Compiler register usage have been fully taken into account.
%long %integer
accepted as equivalent to integer
%short
integer in range -32768 to 32767; 16 bits
† %half
integer in range 0 to 65535; 16 bits
† %mite
integer in range -128 to 127; 8 bits
%byte
integer in range 0 to 255; 8 bits
† %integer (lo:hi)
integer in range lo to hi; 8,16,32 bits
† %char
character; 8 bits
IMP does not distinguish characters from integers. The use of the %char type permits more appropriate diagnostics, automatic type transfer, and better exploitation of over-loading. It also provides compatibility with Pascal. A value of type %char may be used wherever a value of type %string is required, by virtue of automatic type conversion from character to string of length 1. A single character between string quotes is treated as of type %char, rather than %string, but like any other character, is subject to automatic type conversion.
† %boolean
boolean (%false,%true); 8 bits
procedure class
%predicate is treated as equivalent to %boolean %function.
An expression of type %boolean is a valid form of condition (but
conditions are not boolean expressions). Again this type provides
compatibility with Pascal.
%long %real
treated as %real, apart from type-checking
%string (maxsize)
character string; maxsize+1 bytes
In a name or map, maxsize may be given as '*' to indicate compatibility with any string variable, irrespective of its declared maximum size. This is also permitted in the case of %string (*) %array %name parameters to procedures.
For a constant string, but not a constant string array, maxsize (and enclosing parentheses) may be omitted, for example:
%constant %string message="TOO MANY NAMES".
%record %format record-type-ident (.....)
Format declaration introducing a record type
%record (record-type-ident) record-ident, record-ident, ....
The component values may be presented either positionally or using the record component names, preceded by the sub character (_), as keywords.
Example - Format declaration:
%record %format f1( %integer lo,hi, %string (7) code, %byte %name where, %short weight)
Example - Record declarations with initialisation:
%own %record (f1) r1=0
meaning: numeric values = 0, strings = "", names == NIL
%own %record (f1) r2=f1(-100,100,"FACTOR",nil,1)
%own %record (f1) r3=f1(0,,"DUMMY")
by enumeration, any omitted values unassigned
%own %record (f1) r4=f1(_code="UNDEF",_weight=99)
%own %record (f1) r5=f1(_hi=99999,_code="MAX")
by selection using field names, any omitted values unassigned
† The array type information defining the bounds may be placed after the
keyword array or after the list of identifiers. For example:
%integer %array (1:5,1:4) table or %integer %array table (1:5,1:4)
%integer %array (0:*) %name a or %integer %array %name a (0:*)
%short %array (1:%byte hi) %name c or %short %array %name c (1:%byte hi)
The former position is mandatory for the implicit case.
† As in Pascal, a multi-dimensional array is regarded as an array of arrays, and inner slices of it may be used as arrays of lower dimensionality. For example, if TABLE is an array with bounds (0:3,1:40), then TABLE(0) is an array with bounds (1:40).
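The array-of-arrays view can be illustrated with a small model. This is a Python sketch with a hypothetical class name; it assumes that an inner slice refers to the same storage as the parent array, which the text implies but does not state outright.

```python
class ImpArray:
    """Array with explicit bounds, built as an array of lower-dimension arrays."""

    def __init__(self, *bounds):
        (self.lo, self.hi), inner = bounds[0], bounds[1:]
        count = self.hi - self.lo + 1
        # Each cell is either a scalar (0) or an array over the remaining bounds.
        self.cells = [ImpArray(*inner) if inner else 0 for _ in range(count)]

    def _offset(self, index):
        if not self.lo <= index <= self.hi:
            raise IndexError(f"index {index} outside ({self.lo}:{self.hi})")
        return index - self.lo

    def get(self, *indices):
        item = self.cells[self._offset(indices[0])]
        return item.get(*indices[1:]) if indices[1:] else item

    def set(self, value, *indices):
        if len(indices) == 1:
            self.cells[self._offset(indices[0])] = value
        else:
            self.cells[self._offset(indices[0])].set(value, *indices[1:])


table = ImpArray((0, 3), (1, 40))  # TABLE with bounds (0:3,1:40)
row = table.get(0)                 # TABLE(0): an array with bounds (1:40)
row.set(99, 5)                     # writing through the slice ...
through_parent = table.get(0, 5)   # ... is visible as TABLE(0,5)
```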
%byte %name pos
%own %string (*) %name details
%record (cellinfo) %name head,next,last
The combinations %name %array, %array %name and %name %array %name are all permissible, but not any other combinations.
%routine dump store(%name from, %integer bytes)
Pointers of arbitrary type may be name-assigned to such variables.
Untyped pointer variables are characterised simply by an address; they have no type or size information associated with them. The pre-defined function SIZEOF is not applicable to such variables, nor to variables of type %string (*) %name or %record (*) %name.
Storage is allocated in the usual way for the anonymous variable. The main use of this facility is in record formats describing memory-mapped devices with non-contiguous fields.
The application of the rule to external procedures implies that a procedure passed as a parameter to an external procedure should itself be external. In this way, the parameter satisfies the requirement that it could be called directly (through the external linkage mechanism).
%routine put(%string(255) s)
: : : : : : : :
%end
%routine %alias put(%real x)
: : : : : : : :
%end
%routine %alias put(%integer i)
: : : : : : : :
%end
%routine %alias put(%char i)
: : : : : : : :
%end
In this case each call on PUT is treated as a call on the appropriate individual procedure determined by the type of the argument provided in the call. The order of matching is on the basis: most recently declared first. Hence in the examples above, it is relevant that the integer case occurs after the real case, since an integer is a valid argument for a real value parameter, under automatic type conversion. Similarly, it is relevant that the char case follows the string case, since a character is a valid string, under automatic type conversion.
The procedures do not all require to be in the same block, and some or all of them may be external. In the case of external procedures, each case will require to have a unique name for loading (through external name aliasing).
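The matching order can be sketched as below. This Python model is illustrative only: the table of acceptable conversions covers just the two automatic conversions named in the text (integer to real, character to string), and the function and table names are hypothetical.

```python
# Argument types acceptable for each parameter type, including the
# automatic conversions mentioned in the text.
ACCEPTS = {
    "string": {"string", "char"},
    "real": {"real", "integer"},
    "integer": {"integer"},
    "char": {"char"},
}


def resolve(declared, argument_type):
    """Pick the overload for a call: most recently declared first.

    `declared` lists the parameter type of each version of the
    procedure in declaration order.
    """
    for parameter_type in reversed(declared):
        if argument_type in ACCEPTS[parameter_type]:
            return parameter_type
    raise TypeError(f"no version accepts an argument of type {argument_type}")


AS_DECLARED = ["string", "real", "integer", "char"]   # order used for PUT above
SWAPPED = ["string", "integer", "real", "char"]       # integer before real

chosen = resolve(AS_DECLARED, "integer")   # the integer version
shadowed = resolve(SWAPPED, "integer")     # the real version hides it
```

With the declarations in the order shown for PUT, an integer argument selects the integer version. If the integer version were declared before the real version, the real version would be tried first and would accept the integer argument, so the integer version would never be selected.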
get(i;j;k)
put(n;" cases processed in ";t/1000;" seconds";snl)
This applies only to routines with parameters, not to other kinds of procedure.
† The effect of %continue is to pass control to the head of the containing loop, where any %while or %for control clause is tested; any %until clause attached to the %repeat at the end of the loop is ignored.
Caution: this is a low-level facility. There is no check on the validity of the references. Use of the facility is, however, preferable to the use of store-mapping functions in a similar role, because it preserves the type identity of the objects denoted. (The facility is referred to as 'Address modification' in the system dependent section of the IMP-77 Report).
For example, with the following definitions and assignment:
%record %format PAIR(%real x,y)
%record(pair)%array A(1:100)
%record(pair)%name P
P == A(7)
the following would hold: P[2] would denote A(9)
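The arithmetic behind this example can be sketched in terms of addresses. The Python model below rests on stated assumptions: a %real is taken to occupy 4 bytes, array elements are taken to be contiguous, and the base address is an arbitrary hypothetical value. The displacement is scaled by the size of the referenced object, which is how the type identity is preserved.

```python
REAL_BYTES = 4               # assumption: size of a %real
PAIR_BYTES = 2 * REAL_BYTES  # record PAIR holds two reals, x and y
A_BASE = 0x1000              # hypothetical address of A(1)
A_LOW = 1                    # lower bound of A(1:100)


def address_of_a(index):
    """Address of element A(index), assuming contiguous elements."""
    return A_BASE + (index - A_LOW) * PAIR_BYTES


def index_in_a(address):
    """Inverse of address_of_a: which element of A starts at this address."""
    offset = address - A_BASE
    if offset % PAIR_BYTES:
        raise ValueError("address is not the start of an element")
    return A_LOW + offset // PAIR_BYTES


def modified(pointer_address, n):
    """Address denoted by P[n]: displaced by n objects, not n bytes."""
    return pointer_address + n * PAIR_BYTES


p = address_of_a(7)                  # P == A(7)
denoted = index_in_a(modified(p, 2)) # P[2]
```

As the Caution above notes, nothing in the real facility checks that the resulting address still lies within the array.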
The other, optional, positional parameters are:
second input: pre-definition file to be included before main file
first output: object code file
second output: listing file
%option "-NOWARN-NOLIST"
The quoted string consists of a sequence of Option keywords each
preceded by a dash (hyphen).
The effect of most options is localised to the current procedure or block. That is, at the end of the block, the Options current at the start of the block are re-instated. In addition, the use of the option directive without a quoted string causes the state to revert to what it was immediately before the last option directive. These points do not apply to the global Options -EDIT, -LOG, -RUN and -FORCE.
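The scoping behaviour of block-local options can be modelled as a pair of save/restore mechanisms. The Python sketch below is illustrative: the class and method names are hypothetical, the starting values are arbitrary, only a few of the documented keywords are recognised, and the global Options are not modelled.

```python
KNOWN = {"WARN", "LIST", "ARR", "OVER"}  # a few of the documented keywords


class OptionState:
    def __init__(self, start):
        self.current = dict(start)
        self.saved_at_block_entry = []     # one snapshot per open block
        self.before_last_directive = None  # state prior to the last option directive

    def enter_block(self):
        self.saved_at_block_entry.append(dict(self.current))

    def exit_block(self):
        # Options current at the start of the block are re-instated.
        self.current = self.saved_at_block_entry.pop()

    def option(self, text=None):
        if text is None:
            # Directive without a quoted string: revert the last directive.
            if self.before_last_directive is not None:
                self.current = self.before_last_directive
                self.before_last_directive = None
            return
        self.before_last_directive = dict(self.current)
        for word in filter(None, text.upper().split("-")):
            if word in KNOWN:
                self.current[word] = True
            elif word.startswith("NO") and word[2:] in KNOWN:
                self.current[word[2:]] = False
            else:
                raise ValueError(f"unknown option keyword: {word}")


START = {"WARN": True, "LIST": True, "ARR": True, "OVER": False}
state = OptionState(START)
state.enter_block()
state.option("-NOWARN-NOLIST")
inside = dict(state.current)
state.option()                   # revert the directive above
reverted = dict(state.current)
state.option("-OVER")
state.exit_block()               # block end undoes -OVER as well
after_block = dict(state.current)
```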
Caution: indiscriminate variation of checking and diagnostic options within a program can create confusion for debugging.
The listing file consists of the text of the source file with added line numbers, and any fault or warning messages produced during compilation. Line numbering runs from 1 at the start of the file and includes blank lines and comment lines. Lines in included files are numbered independently, with an ampersand as an indicator.
Selecting any of the listing options also causes the Compiler to output UNUSED and UNDERUSED warnings for identifiers which are declared but not utilised. These warnings are not produced for certain classes of identifier, such as constant scalars and external specifications, which are often included as a complete package. A variable is regarded as underused if (a) being readable it is never read, or (b) being writeable it is never written to.
-LIST produce source file listing
listing name is derived from the source file-name (without the extension
.IMP if present). The extension .LIS is applied.
-TTList send listing to terminal
-MAP program map
information at end of each procedure, indicating size of code, etc.
-LOG print log
statistics at end of compilation indicating number of statements, atoms
per statement, identifiers per statement, time taken, etc.
The remaining listing options may not be available in all releases:
-CODE code listing
hybrid assembly language interpretation of code generated for each
statement. This is before address fix-up and branch shortening, and so
does not fully reflect the final code.
-DICT print dictionary entries for identifiers
-LOOP Undocumented
-EXP Undocumented
The compiled code is native Motorola 68000 machine-code, ready for direct execution. The code file starts with a header which identifies the import and export lists, the code section with main and reset entry-points, and the diagnostic sections.
If no name is provided explicitly, as the first output parameter, the name is derived from the source file-name (without the extension .IMP if present). The extension .MOB (for Motorola object) is applied to the object file-name.
The generation of an object file may be suppressed by specifying the null file-name as the first output parameter.
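The naming rule for derived file-names can be sketched as follows. This is a Python illustration with a hypothetical function name; matching the .IMP extension without regard to case is an assumption, and a source file-name with any other extension is left intact, following a literal reading of the rule.

```python
def derived_name(source_name, extension):
    """Derive an output file-name from the source file-name.

    The .IMP extension is dropped if present (case-insensitive match
    is an assumption), then the given extension is applied.
    """
    if source_name.upper().endswith(".IMP"):
        source_name = source_name[:-4]
    return source_name + extension


object_name = derived_name("EDIT.IMP", ".MOB")   # object file
listing_name = derived_name("EDIT", ".LIS")      # listing file under -LIST
```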
-FORCE produce object file even if program faulty
-RUN run program after compilation (if supported)
-ASS include unassigned check on full integers
-STRASS include unassigned check on strings
-SASS include unassigned check on 16-bit values
-BASS include unassigned check on 8-bit values
The unassigned check is implemented by standardising all newly
declared dynamic variables to a fixed pattern which is an improbable
integer value. The check is defaulted off for variables occupying less
than 32 bits because there is a greater possibility of the special
pattern occurring as a genuine value. Even so, the probability remains
quite low, so that most programs can benefit from the additional checking
implied by specifying -SASS and -BASS.
Defaults for unassigned checks: -ASS -STRASS -NOSASS -NOBASS
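The mechanism, and the reason for the differing defaults, can be illustrated with a small model. In this Python sketch the pattern value 16_80808080 is purely hypothetical (this document does not state the actual pattern), the class and function names are invented for the illustration, and the clash estimate assumes that genuine values are spread uniformly over the range.

```python
PATTERN_32 = 0x80808080  # hypothetical; the actual pattern is not documented here


def pattern(bits):
    """The unassigned pattern truncated to the width of the variable."""
    return PATTERN_32 & ((1 << bits) - 1)


class Unassigned(Exception):
    pass


class CheckedVariable:
    def __init__(self, bits):
        self.bits = bits
        self.raw = pattern(bits)  # newly declared: standardised to the pattern

    def write(self, value):
        self.raw = value & ((1 << self.bits) - 1)

    def read(self):
        if self.raw == pattern(self.bits):
            raise Unassigned("variable read before assignment")
        return self.raw


def is_flagged(variable):
    """True if reading the variable trips the unassigned check."""
    try:
        variable.read()
    except Unassigned:
        return True
    return False


def clash_chance(bits):
    """Chance that a genuine value equals the pattern (uniform values assumed)."""
    return 1 / 2 ** bits


fresh = CheckedVariable(32)
assigned = CheckedVariable(32)
assigned.write(5)
small = CheckedVariable(8)
small.write(0x80)  # a genuine 8-bit value that happens to equal the pattern
```

In this model the variable small is reported as unassigned even though it has been written to. On the uniform-values assumption the chance of such a clash is 1 in 256 for 8-bit values and 1 in 65536 for 16-bit values, against about 1 in 4 thousand million for full integers.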
For the following three cases, violations which involve only literal operands are detected and rejected at compile-time rather than run-time.
-ARR include array bound checking
-OVER include overflow check
-CAP include capacity checking on assignment
-STACK include stack over-run check
Defaults: -ARR -NOOVER -CAP -STACK
-SYS suppress all checks apart from stack over-run
-NOCHECK suppress all checks
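As an illustration of what the capacity check guards against, the following Python sketch tests a value against the range of the destination type. The ranges for the sub-integer types are those given in the type summary earlier; the 32-bit range for %integer, the reading of the capacity check as a simple range test, and the function name are assumptions.

```python
RANGES = {
    "integer": (-2**31, 2**31 - 1),  # assumption: 32-bit two's complement
    "short": (-32768, 32767),
    "half": (0, 65535),
    "mite": (-128, 127),
    "byte": (0, 255),
}


def fits(type_name, value):
    """True if `value` can be assigned to a variable of the given type."""
    low, high = RANGES[type_name]
    return low <= value <= high


byte_ok = fits("byte", 255)
byte_fault = fits("byte", 256)    # would be faulted under -CAP
mite_fault = fits("mite", 200)    # valid for %byte, out of range for %mite
```

When both the destination type and the value are known at compile-time, the text above indicates that the violation is rejected during compilation rather than at run-time.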
-LINE generate line number identification information
-MON generate variable monitoring information
This option requests that tables of identifier information should be included
in the object file to allow the values of variables to be monitored. It
is effective at the point where a variable is declared and may be freely
switched on and off to select some variables and exclude others.
-TRACE enable tracing
-STEP causes code to be generated which allows the program to be executed one line at a time under the control of the Software Front Panel
Default: -DIAG -LINE -MON -NOTRACE
-KBytes allow for codesize nK (default 64k, max 128k)
-LOW enable use of low-level features
-VOLatile tells the Compiler to assume that functions and predicates are volatile by default
Default: -NONONS -NOSTRICT -NOLOW -VOL -NOHALF -NOSHORT -WARN -EDIT
For example, it can occur as a result of a static (%own) data storage requirement in a single module which exceeds 32 kilobytes. It is sometimes possible to overcome this limitation for a single large array by declaring it as the final %own, but note that external specs also require static storage, and the space for such a reference is allocated when and if used, not at the point of declaration.
For references to the code area, including stored constants, and references to dynamic variables, the Compiler uses various techniques to circumvent the +/- 32 kilobyte relative addressing limit of the M68000. There is a possibility in exceptional cases that these do not succeed, leading to an 'out-of-reach' report. If a name is given in the report, it is that of a procedure which is too distant from one of its calls. If no name is given, the problem is access to a constant or variable.
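The raw limit which the Compiler has to work around can be sketched as a test on a signed 16-bit displacement. The Python below is an illustration only: the function names and addresses are hypothetical, and it shows the unassisted limit, not the techniques the Compiler uses to circumvent it.

```python
def in_reach(displacement):
    """True if the displacement fits a signed 16-bit relative offset."""
    return -32768 <= displacement <= 32767


def out_of_reach_calls(procedures, calls):
    """Names of procedures too distant from at least one of their calls.

    `procedures` maps a name to its address; `calls` is a list of
    (call_address, procedure_name) pairs.
    """
    distant = {
        name
        for call_address, name in calls
        if not in_reach(procedures[name] - call_address)
    }
    return sorted(distant)


procedures = {"NEAR": 0x1000, "FAR": 0x20000}   # hypothetical layout
calls = [(0x1100, "NEAR"), (0x1100, "FAR")]
reported = out_of_reach_calls(procedures, calls)
```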
For most errors, detection at run-time depends on whether particular checking options were selected at compile-time. The default compile-time options include most checks, but not unassigned checks on variables occupying less than 32 bits nor overflow checks.
For the fullest level of checking, the options -SASS -BASS -OVER should be enabled, over and above the default checks.
If relevant checks are not enabled, an error may manifest itself in the form of a hardware detected error, such as Bus Error or Illegal Instruction.
The level, and amount, of diagnostic information produced depends on the diagnostic Options applied to the program at compilation (and to any modules to which it is linked).
In all cases, if a variable appears to be unassigned, only its name is displayed, with no value.
Note:
There is always the possibility, when a program goes wrong, that it will corrupt the diagnostic information or the program structure on which diagnosis depends. Similarly, when diagnostics are entered by operator intervention, it is possible for the program to be suspended in a state which prevents valid diagnostics from being produced. The diagnostic interpretation routines attempt to detect abnormalities of this kind and report on any corruption. Fortunately, in practice, these are rare occurrences, but the possibility should be borne in mind.
Another potential source of erroneous diagnostics is when programs and linked modules have been compiled with diagnostic options switched off in some parts and on in others. In general, diagnostics cannot be guaranteed to succeed if suspension occurs in a section compiled with diagnostic options disabled.
%integer %map INTEGER(%integer a)
%real %map REAL(%integer a)
%string(*)%map STRING(%integer a)
%record(*)%map RECORD(%integer a)
%byte %map BYTE or BYTEINTEGER(%integer a)
%short %map SHORT or SHORTINTEGER(%integer a)
%mite %map MITE or MITEINTEGER(%integer a)
%half %map HALF or HALFINTEGER(%integer a)
%byte %map LENGTH(%string(*)%name s)
%byte %map CHARNO(%string(*)%name s, %integer n)
%integer %function ADDR(%name n)
%string(1)%function TOSTRING(%integer k)
%string(255)%function SUBSTRING(%string(255) s, %integer from,to)
%integer %function REM(%integer a,b)
%integer %function MULDIV(%integer a,b,c) {A*B//C without overflow}
%integer %function INTPT(%real x)
%integer %function INT(%real x)
%real %function FRACPT(%real x)
%real %function SQRT(%real x)
%integer %function CPUTIME {in milliseconds}
%name NIL
%record (*) %map NEW(%record (*) %name n)
%routine DISPOSE(%record (*) %name n)
%record %format EVENTFM( %byte event, sub, %short line, %integer extra, %string(255) message)
%record(eventfm) EVENT
%constant %integer NL
%constant %char SNL
%integer %function NEXTSYMBOL
%routine READSYMBOL(%name n)
%routine PRINTSYMBOL(%integer k)
%routine SKIPSYMBOL
%routine PRINTSTRING(%string (255) s)
%routine READ(%name n)
%routine WRITE(%integer m, n)
%routine PRINT(%real x, %integer n,m)
%routine PRINTFL(%real x, %integer n)
%routine NEWLINE
%routine NEWLINES(%integer i)
%routine SPACE
%routine SPACES(%integer i)
%routine SELECT INPUT(%integer n)
%routine SELECT OUTPUT(%integer n)
%routine CLOSE INPUT
%routine CLOSE OUTPUT
%routine SET INPUT(%integer pos)
%routine SET OUTPUT(%integer pos)
%routine RESET INPUT {equivalent to SET INPUT(0)}
%routine RESET OUTPUT {equivalent to SET OUTPUT(0)}
%integer %function INSTREAM
%integer %function OUTSTREAM
%routine OPEN INPUT(%integer n, %string(255) S)
%routine OPEN OUTPUT(%integer n, %string(255) S)
%string(255)%function CLIPARAM
%routine PROMPT(%string(255) S)
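The comment on MULDIV ("A*B//C without overflow") can be illustrated by a model in which the intermediate product is held at full precision and only the final result must fit a 32-bit integer. This Python sketch is not the library's implementation: the helper names are hypothetical, and it is assumed that // truncates towards zero and that integers are 32 bits wide.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1  # assumption: 32-bit integers


def truncating_divide(a, b):
    """Integer division truncating towards zero (assumed meaning of //)."""
    quotient = abs(a) // abs(b)
    return quotient if (a >= 0) == (b >= 0) else -quotient


def muldiv(a, b, c):
    """A*B//C with the product A*B held at full precision."""
    result = truncating_divide(a * b, c)  # Python integers do not overflow
    if not INT_MIN <= result <= INT_MAX:
        raise OverflowError("result does not fit a 32-bit integer")
    return result


scaled = muldiv(100000, 100000, 1000)        # product alone exceeds 32 bits
product_overflows = 100000 * 100000 > INT_MAX
```

A direct evaluation of A*B//C in 32-bit arithmetic would overflow on the multiplication here, even though the final result is well within range.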
Machine-level operands are specified using the mnemonics D0-D7 and A0-A7 (or SP), and the standard syntax for effective addressing (<ea>) modes. Note, however, that any declaration of IMP identifiers which coincide with the register mnemonics takes precedence. Immediate, Quick, and Address register variants of op-codes are selected automatically. In comparison with full Assembler there are the following omissions:
MTCCR <ea> for MOVE <ea>,CCR
MTSR <ea> for MOVE <ea>,SR
MFSR <ea> for MOVE SR,<ea>
MTUSP An for MOVE An,USP
MFUSP An for MOVE USP,An
ATCCR #<data> for AND #<data>,CCR
ATSR #<data> for AND #<data>,SR
ETCCR #<data> for EOR #<data>,CCR
ETSR #<data> for EOR #<data>,SR
OTCCR #<data> for OR #<data>,CCR
OTSR #<data> for OR #<data>,SR
Labels and procedure names may be referenced in the Branch group of instructions, including BSR. The only alternative form of operand for these instructions is immediate (eg *BLT #-4), the value specified being the machine-level displacement. Short and long branches are handled automatically (though the Compiler's CODE listing does not show this). To access a forward label in an assembly language instruction, it is necessary for the label to have been declared by means of the IMP declaration %label label-ident.
The registers defined as available for use as temporaries by the Compiler (by default D0-D4 and A0-A3), with the exception of D4, may be freely used within assembly sections, but no assumptions can be made about their contents after execution of most IMP statements. Other registers may be used only on the basis that their values are restored before reverting to IMP. In particular this applies to the stack pointer SP. In addition, the accessing of local variables may depend on SP; if SP is changed other than by Compiler-generated code, the addressing of these variables will be rendered erroneous. The addressing of global and own variables depends on A4, so that any modification of this register rules out access to non-local variables.
declared as own or
declared at the outermost level or
declared at the current level or
declared by means of an explicit address (@) declaration or
declared as registers
Array identifiers are directly addressable only if they meet the above requirements and their bounds are literal and their size is 'moderate'.
Variables declared at intermediate levels, and external objects, should not be used since an indirection may be involved in accessing them. In the case of a name variable the effective operand is the pointer value (32-bit address) rather than the referenced object.