Department of Computer Science

Memorandum                 IMP CONVERGENCE II

To       G.E. Thomas, P.D. Stephens, G.E. Millard (ERCC)
         J.P. Gray, P.S. Robertson, I.A. Young    (Lattice Logic)
         P.D. Schofield, S. Michaelson, D.J. Rees (Computer Science)

From H. Dewar

Date 26th October, 1982.

B̲a̲c̲k̲g̲r̲o̲u̲n̲d̲

The first IMP convergence exercise, which started in 1980
and has been reported on in Alan Anderson's documents of 20/1/81 and 31/3/81,
made substantial progress towards eliminating the obstacles to moving
IMP programs between one implementation and another.  A number of
residual differences remain; these are not on the whole as major as
those that were overcome in the first exercise, but they can pose
significant problems for IMP program portability. The production of two
new compilers for the language, in the Regional Centre for the Perq and
in the Computer Science Department for the Motorola 68000, makes it
timely to seek further progress.

The purpose of this document is to set out as many of the significant
points of difference between the main existing compilers as I have been
able to discover.

In order to do so, I have drawn on Peter Robertson's
latest manual covering IMP77 for Vax/VMS (hereafter referred to as Vax
IMP) and John Murison's preliminary manual for IMP80 on the 2972 
(hereafter Emas IMP), as well as Alan Anderson's reports.  However, as
there have been some developments since these documents were produced, I
have used my own experience and specially conducted tests to try to
produce an up-to-date picture of the current releases of these two
implementations.  The picture is unlikely to be totally accurate, but I
hope that the number of omissions and mis-representations is small.

As well as identifying the points of difference, I have put forward
specific proposals for almost every case as a basis for discussion.
These are in line with the course I should like to follow in the M68000
version.


O̲b̲j̲e̲c̲t̲i̲v̲e̲s̲

The proposals in this document are directed to those who wish IMP
well.  Opinions differ on the suitability of IMP as a general-purpose
programming language, and on the desirability of using it for
applications at large even if it is deemed suitable. What seems to me
not open to serious question is that it has been of considerable value
as a language for implementing a wide range of system software, from
operating systems, through compilers and basic packages, to standard
utilities like editors and formatters.  The advantages which have been
derived from this approach at Edinburgh deserve to be more firmly
emphasised. It is still typically the case for systems in general that
basic software is written either in Assembler or in a mixture of various
languages; in either case, the development and support process is much
more onerous than is the case when a single high-level implementation
language is employed.  The Emas papers present some of the considerable
advantages.

There can be an additional bonus. It can prove possible to move some
of these software components (those which are not inevitably too system
dependent) from the system for which they were developed to another one.
When users (quite reasonably) comment on the non-portability of IMP
programs to other sites, because the language is not in widespread use,
they might also reflect that it has in fact enabled a number of pieces
of software, not normally regarded as portable, to be transferred
between different systems in Edinburgh.

One of the most important factors in opening up this possibility is
that the system implementation language, which must by definition permit
some of the normal constraints of high-level languages to be broken so
that access may be permitted to machine and system features, should
nonetheless offer ways of doing so which limit and localise absolute
machine or system dependence.  For example, the availability of typed
pointer variables is an aspect in which IMP is superior to, say, BCPL.
Our understanding of how to specify operations in a well-defined fashion
(as opposed to a system-defined fashion), and of what facilities can be
implemented efficiently, has increased with experience, so that there is
now less reason to tolerate in the language features which are logically
insupportable.  If we are prepared to make it a high enough priority, we
should be able to achieve even greater portability of system software in
the future, and at the same time improve IMP's qualities as a
general-purpose language.

Of course, the objective of greater portability is not a matter of
language (or programming discipline) alone.  It also depends critically
on the availability from the underlying operating systems of a broadly
comparable set of capabilities (and the absence of certain crucial
restrictions).  I believe that it would be a most worthwhile enterprise
at this stage to try to define a standard s̲y̲s̲t̲e̲m̲ i̲n̲t̲e̲r̲f̲a̲c̲e̲ which could
be supported across all the main machines at Edinburgh.  This, however,
goes beyond the scope of the present document.


T̲h̲e̲ I̲M̲P̲ l̲a̲n̲g̲u̲a̲g̲e̲

There is room for only one IMP language.  This is not to say that a
single specification should be frozen for all time, since it is one
of the compensating virtues of a 'local' language that it can be
refined and evolved.  Nor is it to say that there can never be
compelling reasons for the existence of differences between
implementations.  What it does mean is that there must be an
independent definition of the language itself, not a new language
manual for each implementation.

Proposal

The production of a proper IMP language manual should be given high
priority. It should be a joint project between (at least) ERCC and
EUCSD; it should be aimed at a somewhat higher conceptual level than
any of the existing manuals; and there should be a prohibition on any
mention in the body of the text of specific implementation
restrictions and differences.  It would be desirable to aim for
publication by EUP.

SPECIFIC PROBLEM AREAS

P̲r̲e̲d̲i̲c̲a̲t̲e̲s̲

Vax IMP extends the class of procedures to include predicates as well
as routines, functions and maps; Emas IMP does not. There is no
doubt that for certain kinds of program which make extensive use of
procedure calls for carrying out tests, the use of predicates makes
the conditional clauses much neater and more readable.  Of course,
the same effect can be achieved by using functions and testing the
value of the call to be zero or non-zero, but this is clumsier and
less clear.

Proposal

Either predicates should be included in the language as such or the
definition of condition should be modified to the effect that a
single arithmetic expression by itself should be permitted as a
condition, with the effect of a test non-zero.  The latter does not
involve an extra procedure type and is a more general facility, at
the expense of some loss of error-detecting redundancy.
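
By way of illustration only, the two alternatives in this proposal can
be contrasted in a short sketch.  Python is used purely as a neutral
notation; the names is_digit and digit_flag are invented for the
example and belong to neither compiler.

```python
def is_digit(symbol):
    """Predicate style: the procedure itself answers true or false."""
    return "0" <= symbol <= "9"


def digit_flag(symbol):
    """Function style: returns 1 for a digit and 0 otherwise."""
    return 1 if "0" <= symbol <= "9" else 0


def count_with_predicate(text):
    # Reads as: if IS DIGIT(SYMBOL) then ...
    return sum(1 for symbol in text if is_digit(symbol))


def count_with_explicit_test(text):
    # The present workaround: call a function and compare with zero.
    return sum(1 for symbol in text if digit_flag(symbol) != 0)


def count_with_bare_expression(text):
    # The alternative proposal: an arithmetic expression standing alone
    # as a condition is taken to mean "non-zero".
    return sum(1 for symbol in text if digit_flag(symbol))
```

All three forms give the same count.  The third shows the loss of
redundancy mentioned above: any integer expression is accepted as a
condition, so a comparison omitted by mistake would pass unnoticed.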


I̲n̲t̲e̲g̲e̲r̲ r̲a̲n̲g̲e̲s̲

Current compilers directly reflect the specific capabilities of their
target processor in their choice of integer ranges.  For 16-bit
operands, Vax IMP provides the signed range of -32768 to 32767
(s̲h̲o̲r̲t̲); Emas IMP provides the unsigned range of 0 to 65535 (h̲a̲l̲f̲).
This is proving to be one of the major sources of difficulty in
transferring programs, not least in cases where the values involved
are always within the common subrange 0 to 32767 or where, for other
reasons, the distinction is unimportant. Such difficulties will
multiply with the introduction of microprocessors like the Motorola
68000 which directly support signed 8-bit values. Tad Pinkerton long
ago urged the desirability of permitting the programmer to specify in
a declaration the precise range of values that a variable could
assume.  Not only is this superior in terms of program specification,
but it leaves it up to the compiler to determine the appropriate
storage format.  There is a known difficulty in this approach,
arising from the fact that all current compilers are obliged to treat
range differences as type differences for the matching of n̲a̲m̲e̲
parameters, but it might be no bad thing to put some pressure on the
removal of that restriction.  Even in a partially-supported form, it
seems to be the only way forward.

Proposals

(a) It should be permitted to specify an explicit range within an
    integer declaration in the form:
             i̲n̲t̲e̲g̲e̲r̲ (LOWER:UPPER) ......

(b) Man-sized compilers should be expected to support as storage
    formats both signed and unsigned 16-bit values and both signed
    and unsigned 8-bit values, albeit less efficiently for the cases
    not covered by hardware;

(c) The keywords s̲h̲o̲r̲t̲, h̲a̲l̲f̲ and b̲y̲t̲e̲ should continue to be available
    as shorthand for familiar ranges, along, perhaps, with m̲i̲t̲e̲ (for
    signed 8-bit range) and b̲i̲t̲.
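
A minimal sketch of how a compiler might map a declared range onto a
storage format under proposals (a) and (b) is given here.  It is an
illustration in Python, not a description of any existing compiler;
the table of formats and the order of preference are assumptions.

```python
# Candidate storage formats, smallest first.  The order of preference
# between signed and unsigned forms of the same width is an assumption;
# a real compiler would choose whichever its hardware handles best.
FORMATS = [
    ("unsigned 8-bit", 0, 255),
    ("signed 8-bit", -128, 127),
    ("unsigned 16-bit", 0, 65535),
    ("signed 16-bit", -32768, 32767),
    ("signed 32-bit", -2**31, 2**31 - 1),
]


def storage_format(lower, upper):
    """Return the first format able to hold every value in lower..upper."""
    if lower > upper:
        raise ValueError("empty range")
    for name, low, high in FORMATS:
        if low <= lower and upper <= high:
            return name
    raise ValueError("range too wide for any supported format")
```

A variable declared with the common subrange 0 to 32767 fits either
16-bit format, so the same source text would be acceptable to both
compilers whichever format each preferred.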


R̲e̲c̲o̲r̲d̲s̲

Records got off to a bad start in IMP, with an unfortunate choice of
syntax for their declaration. They are still in some ways
second-class types (in varying ways in the two compilers), and this
should be remedied in what are, on the whole, obvious ways.

A significant deficiency at present is the absence of a facility for
initialising o̲w̲n̲ and c̲o̲n̲s̲t̲a̲n̲t̲ records and of a facility to construct
a record from individual components.  The main problem is in devising
a satisfactory syntax, especially taking account of the complexity
introduced by alternative formats.  Peter Robertson has suggested a
form in which the individual component name is specified with each
value; this has the advantage of precision and flexibility, but may
be regarded as excessively cumbersome for simple cases. It could be
that an unadorned list of values, with enforced use of first
alternatives, would meet the need.

Proposals

(a) Some method of achieving partial record assignment (absent from
    Vax IMP) should be included (see discussion of Jam Assignment below);

(b) Record maps and functions, record format specs, and the star
    specifier for record names and maps (absent from Emas IMP) should
    be included;

(c) Initialisation of o̲w̲n̲ and c̲o̲n̲s̲t̲a̲n̲t̲ records should be permitted in
    a form to be agreed;

(d) The name of a record format should be usable as a constructor
    function, the exact syntax to be agreed;

(e) The convenient Emas IMP facility whereby a record may be defined
    in terms of an explicit specification of component types or by
    cross-reference to another record should be included;

(f) Comparison of records for equality and inequality should be
    permitted;

(g) The type checks on record assignment statements (using "=" and
    "==") should be tightened in Emas IMP.

S̲t̲r̲i̲n̲g̲s̲

Vax IMP is a stickler for length specification in all string
declarations; Emas IMP is more relaxed.  Certainly the need to 
specify a length for a c̲o̲n̲s̲t̲a̲n̲t̲ scalar string is rather an
imposition, but in all other contexts a string declaration is
incomplete without a length.  Emas IMP permits the use of the star
specifier for functions as well as pointer variables and maps; it is
not appropriate for functions any more than it is for ordinary string
variables (or c̲o̲n̲s̲t̲a̲n̲t̲ strings).

Proposals

(a) Vax IMP should permit omission of the length specifier in
    c̲o̲n̲s̲t̲a̲n̲t̲ scalar string declarations;

(b) Emas IMP should require a length specifier in all other contexts
    and should disallow the star specifier for functions;

(c) Emas IMP should take proper note of length specification for
    pointer variables in type-matching and implementation;
    
(d) To avoid possible future problems, the use of LENGTH other than
    as a function should be discouraged.


I̲n̲i̲t̲i̲a̲l̲i̲s̲a̲t̲i̲o̲n̲ o̲f̲ d̲y̲n̲a̲m̲i̲c̲ v̲a̲r̲i̲a̲b̲l̲e̲s̲

Vax IMP allows declaration statements for dynamic variables to
include an initial assignment; Emas IMP does not.  Although perhaps
marginal, the facility is quite a convenient one, and it has the
added advantage of making manifest for the variables involved that
they are always assigned a value. All IMP compilers have to be able
to handle run-time evaluation within declarations, so that there is
no significant implementation difficulty.

Proposal

Initialisation of dynamic scalar variables should be permitted.


S̲c̲o̲p̲e̲ r̲u̲l̲e̲s̲

As a result of its internal organisation, the present Emas IMP
compiler, unlike earlier ones, permits the programmer under certain
conditions to violate the basic scope rules of IMP by using variables
before they are declared.  This can result in masking of errors,
which are revealed on attempting to compile the program through a
compiler which applies the standard rules.

Proposal

Compilers should deal with declarations in such a way that the
standard scope rules are applied.


L̲o̲o̲p̲s̲

Emas IMP purports to understand what is meant by specifying that a
for loop is to be executed a negative number of times;  Vax IMP
reckons that zero is the lowest valid number, and treats the negative
case as an error.  Emas IMP does not guarantee that on for loop
termination the control variable is equal to the end value; Vax IMP
does.  The Emas approach is motivated by efficiency considerations;
the Vax approach is better justified in logical terms.

Vax IMP permits a cycle introduced by a w̲h̲i̲l̲e̲ clause to have an u̲n̲t̲i̲l̲
clause on the associated r̲e̲p̲e̲a̲t̲; Emas IMP insists on one or the other
but not both.  Vax IMP also permits a cycle introduced by a f̲o̲r̲
clause to have an u̲n̲t̲i̲l̲ on the associated r̲e̲p̲e̲a̲t̲; again Emas does not.
In this case, it seems to me that the Emas approach is better
justified in terms of propriety -- refusing to complicate a simple
distinction among types of loop.  There could be a small loss of
efficiency in cases where the programmer is obliged to use an e̲x̲i̲t̲
immediately before a r̲e̲p̲e̲a̲t̲, but many compilers could catch this
anyway.  A compromise would be to allow u̲n̲t̲i̲l̲ after w̲h̲i̲l̲e̲ but not
after f̲o̲r̲, on the grounds that the latter case is a more tightly
packaged concept.

Proposals

(a) the language definition should define the f̲o̲r̲ loop in a logically
    consistent way;
    
(b) the combination of u̲n̲t̲i̲l̲ clauses with other loop constructs
    should not be permitted.
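
One logically consistent reading of the for loop, along the lines
attributed to Vax IMP above, can be sketched as follows.  The sketch is
in Python and makes two assumptions which the text does not state:
that the step must divide the span exactly, and that the control
variable starts one step before the first value.

```python
def run_for_loop(first, step, last, body):
    """Execute body(i) for i = first, first+step, ..., last.

    Returns the value of the control variable on termination.
    """
    if step == 0:
        raise ValueError("zero step")
    span = last - first
    if span % step != 0:
        # Assumption: the end value must be reachable exactly.
        raise ValueError("step does not divide the span")
    count = span // step + 1
    if count < 0:
        # A negative number of executions is treated as an error.
        raise ValueError("negative repetition count")
    control = first - step
    for _ in range(count):
        control += step
        body(control)
    # In every valid case, including zero executions, control == last.
    return control
```

With these rules the control variable always equals the end value on
termination: a loop executed zero times has last = first - step, which
is exactly where the control variable starts.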


T̲y̲p̲o̲g̲r̲a̲p̲h̲i̲c̲a̲l̲ a̲n̲d̲ s̲y̲n̲t̲a̲c̲t̲i̲c̲ v̲a̲r̲i̲a̲t̲i̲o̲n̲s̲

There are a number of minor differences in the ways in which the
compilers process source programs which affect the detail of what is
and is not typographically acceptable -- spurious percent-signs,
breaking of identifiers across lines, and the like.  Where these
differences affect only pathological cases, my view is that compilers
should be free to make efficiency of processing the paramount
consideration.  However, where there is a significant loss of either
convenience or error-detection, the differences need to be removed.

Emas IMP applies line continuation conventions (comma and "%C") to
comments; Vax IMP does not. This can result in executable statements
being ignored when a program is transferred from Vax to Emas -- which
can be particularly awkward to detect.  The Vax approach is simpler
to apply when conventional source stream processing is used; the Emas
approach presumably has advantages when special hardware stream
processing facilities are exploited. At present, Emas must
nonetheless make a special case of quoted strings, but with the
proposals below regarding literal expressions, line-breaks in strings
could be dis-allowed, so that this special case would disappear. (The
choice would be less affected by implementation considerations if
"%C" were less awkward to recognise; a hyphen (minus-sign) would be
much simpler, and more aesthetic.)
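
The hazard described above, of an executable statement vanishing into
a comment, can be shown with a small sketch of the line-joining step.
It is written in Python and simplifies heavily: a comment is assumed
to be any statement beginning with "!", quoted strings are ignored,
and the function name is invented.

```python
def executable_statements(lines, continue_comments):
    """Join continued lines and return the non-comment statements.

    continue_comments=True models the Emas-like rule (continuation
    applies inside comments); False models the Vax-like rule.
    """
    statements = []
    pending = None  # text of a statement awaiting its continuation
    for raw in lines:
        line = raw.strip()
        if pending is not None:
            line = pending + " " + line
            pending = None
        is_comment = line.startswith("!")
        ends_with_c = line.upper().endswith("%C")
        continues = ends_with_c or line.endswith(",")
        if continues and (continue_comments or not is_comment):
            # Drop the "%C" marker itself; a trailing comma is kept.
            pending = line[:-2].rstrip() if ends_with_c else line
            continue
        if not is_comment:
            statements.append(line)
    return statements


SOURCE = [
    "! reset the counters,",
    "count = 0",
    "total = total + 1",
]
```

Under the Vax-like rule the comment ends at the line break and
"count = 0" is compiled; under the Emas-like rule the trailing comma
draws "count = 0" into the comment and it is silently lost.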

Vax IMP allows matching f̲i̲n̲i̲s̲h̲ and s̲t̲a̲r̲t̲ atoms in a single statement
(as in "f̲i̲n̲i̲s̲h̲ e̲l̲s̲e̲ i̲f̲ symbol = 'A' s̲t̲a̲r̲t̲ ") to be omitted.  Although
a minor point, it does have its own logic and its convenience is
shown by the fact that it is widely used.

Proposals

(a) One or the other of the approaches to continuation of comments
    should be adopted as standard;

(b) The dropping of "f̲i̲n̲i̲s̲h̲ ... s̲t̲a̲r̲t̲ " should be permitted;

(c) Other variations, such as the Vax IMP n̲a̲m̲e̲ f̲u̲n̲c̲t̲i̲o̲n̲ as a variant
    for m̲a̲p̲, should be dropped.


O̲p̲e̲r̲a̲t̲o̲r̲s̲ a̲n̲d̲ l̲i̲t̲e̲r̲a̲l̲ e̲x̲p̲r̲e̲s̲s̲i̲o̲n̲s̲

Limited character sets make it difficult to find appropriate symbols
for as many operators as it would be desirable to have in the basic
language.  This is not a vast number (cf APL), but it is tedious and
clumsy to have to use a parenthesised function notation for common
cases which all the compilers treat as built-in anyway.  The general
availability of the full 95-character ASCII set, instead of the
64-character subset, has eased the situation and Vax IMP has elected
to re-introduce an operator for modulus (absolute value) using the
unambiguous vertical bar symbol instead of the rightly banished
exclamation mark.

There is room for further consideration of the issue of operators v.
functions.  The present concept of 'intrinsic' procedures is woolly,
these being defined pragmatically as those which a particular
compiler picks off and implements directly, in some cases solely for
efficiency, in others for more basic reasons. This leads to problems
of compatibility, since there are differences in the ways intrinsic
functions can be employed, compared with ordinary functions.  On the
other hand, they are not as good as operators (even leaving aside
questions of convenience), since they cannot be used in the 
construction of literal expressions.  Thus the quotient of two
literal values counts as a legitimate literal (operator / or //), but
the remainder from the division of two literal values does not
(function REM).  It may be that it is necessary to consider
permitting certain intrinsic functions to appear in literal
expressions, though the unfortunate choice of syntax for the
repetition count in literal lists may pose problems.  Certainly it
would be simpler all round if the need for the concept could be
removed, with new operators taking the place of those intrinsics
considered to be essential, and compilers hiding the other cases.

There is a particularly pressing need for an operator, or other
syntactic device, as a substitute for the TOSTRING function.  This is
a clumsy notation for manipulating a tiny character at the best of
times, but the absence of a means of incorporating control characters
in literal strings is becoming increasingly awkward, and an operator
for integer-to-string (ie symbol-to-string) conversion would solve
the problem. 


Proposals

(a) The use of vertical bars for the modulus operator should be permitted;

(b) A unary prefix operator (perhaps "$"), or some other device,
    should be introduced as a substitute for the TOSTRING operation,
    and should be valid, along with concatenation, in literal expressions.


N̲a̲m̲e̲ a̲r̲r̲a̲y̲s̲

Vax IMP allows the declaration and use of name arrays (as distinct
from array names). Some Computer Science users strongly favour the
inclusion of this facility as a language feature.  My own view is
that, though it is useful, it is not acceptable, since it is not
capable of consistent extension.  The same effect can be achieved
through existing language features, by declaring records with single
name-type components, and this approach extends in a controlled way
to more complex forms of indirection.  The objection to this approach
is that it is clumsier; perhaps some consideration could be given to
defining contexts in which the sole constituent of a single-component
record could be denoted by the record identifier alone.

Proposal

Name arrays should not be included.


J̲a̲m̲-t̲r̲a̲n̲s̲f̲e̲r̲ a̲s̲s̲i̲g̲n̲m̲e̲n̲t̲

The jam-transfer assignment operator is somewhat unevenly
implemented, both in terms of where it is permitted and in terms of
what it is understood to mean.  In o̲w̲n̲ and c̲o̲n̲s̲t̲a̲n̲t̲ declarations, Vax
IMP accepts it for scalar initialisation, but not for array
initialisation; Emas IMP does not accept it in initialisation
statements at all.  Both compilers accept it for record assignment,
but interpret it quite differently. Emas IMP uses it to provide the
very valuable facility of record truncation, but inextricably coupled
with loss of type-checking.

The concept is a curious one, both in conceptual terms and in its
implications for implementation.  In regard to implementation, it
implies either less work than usual by the omitting of certain checks
or more work than usual by the requirement to coerce an operand to
fit (the latter most obviously in the case of strings).
Conceptually, it can hardly be presented as a single concept at all,
and a purist approach would demand its abolition in favour of an
explicit statement of what is required, as, for example,
    %BYTE B=K&255                rather than %BYTE B<-K
    %STRING(7) S=TRUNCATE(T,7)   rather than %STRING(7) S<-T

The disadvantage of the explicit approach is that it is clumsier,
impossibly so for the awkward cases of coercion to signed short
formats.

However, there are several considerations which make it worth
considering a move in the direction of the purist approach (apart
from purism).  One is that it is difficult to see how any definition
of jam assignment can be produced to cover the case of integers of
arbitrary ranges, which will not conflict with the existing
interpretations.  Another is the simple consideration that not every
context in which this type of capability may usefully be exercised
involves an assignment operator to carry the distinction. The most
obvious example is parameter passing.

There are other similar phenomena which might reasonably be included
within some more general approach, such as type conversion and type
aliasing.  The latter is currently handled by means of the store
mapping functions, as in REAL(ADDR(I)), although what is involved has
little to do with store mapping and the ascent (or descent) to the
address level is unfortunate.
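
The explicit forms contrasted with jam assignment in this section can
be spelled out in a short sketch.  It is in Python, with invented
function names, and it assumes two's complement wrap-round for the
signed case.  The aliasing example uses the IEEE single format of the
host on which the sketch runs, not the real format of any of the
machines under discussion.

```python
import struct


def jam_to_byte(k):
    """B <- K for a byte: keep the low-order 8 bits (K & 255)."""
    return k & 255


def jam_to_string(t, max_length):
    """S <- T for a string: truncate to the declared length."""
    return t[:max_length]


def jam_to_signed_short(k):
    """Coercion to a signed 16-bit range, assuming wrap-round.

    This is the awkward case: the explicit form needs an offset, a
    mask and a second offset.
    """
    return ((k + 32768) & 65535) - 32768


def integer_as_real(i):
    """Type aliasing in the manner of REAL(ADDR(I)): the same 32 bits
    reinterpreted as a real, with no conversion of value."""
    return struct.unpack("<f", struct.pack("<i", i))[0]
```

The first two functions correspond directly to K&255 and
TRUNCATE(T,7); the third shows why the explicit approach becomes
clumsy for signed short formats.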

Proposal

An analysis of what is currently covered by jam assignment and other
similar operations should be carried out with a view to handling them
in a more flexible and better classified fashion.


P̲o̲i̲n̲t̲e̲r̲ (n̲a̲m̲e̲) v̲a̲r̲i̲a̲b̲l̲e̲s̲

Both Emas IMP and Vax IMP permit initialisation of o̲w̲n̲ and c̲o̲n̲s̲t̲a̲n̲t̲
n̲a̲m̲e̲ variables, Emas in the form "= literal" and Vax in the form "==
literal".  The syntax is uneasy in both cases: neither takes the form
of a valid assignment statement for a variable of that type.  But to
require that would imply permitting the store-mapping functions to
appear in literal expressions, which is probably not desirable (see
earlier discussion).  The semantics is also dubious, with the 
implication that an address is nothing more than an integer value.

What I have noticed about the examples of this locution which I have
come across, is that the use of name variables is (literally) an
indirect way of expressing what is really wanted.  If on some machine
the clock is at location 4 and a program requires to have a handle on
it, the most direct way of doing so is to have a facility for
declaring a variable of appropriate type at that location, not a n̲a̲m̲e̲
variable which points to it.  An extension of the a̲l̲i̲a̲s̲ mechanism
might provide the appropriate formalism to implement this facility,
by what would be an e̲x̲t̲e̲r̲n̲a̲l̲ s̲p̲e̲c̲ which is self-satisfying at
compile-time.

In many programs which employ some kind of list processing there is a
requirement to have a unique NIL value which may be name-assigned to
pointer variables to represent end-of-list.  Such programs presently
contain their own definitions of NIL, using a variety of forms (as
discussed above).  It would be a distinct improvement to have NIL as
a pre-defined identifier, not least because compilers could then
guarantee that it is efficiently defined for that implementation and
because it may, in fact, have to be defined in a way that is not
representable in a standard declaration.

The role of the untyped (generic) n̲a̲m̲e̲ variable in IMP is also
problematic.  It has been used mainly to overcome range differences
between integers and the type difference between integers and reals,
but Vax IMP has pushed it further by, for example, including strings
among the cases covered by the READ(N̲A̲M̲E̲ X) routine and offering
TYPEOF and SIZEOF enquiry functions.  This has the merit of
attempting to force to the surface something which would otherwise be
handled in a completely ad hoc fashion, but it is, in my view,
misguided.  The r̲e̲c̲o̲r̲d̲ concept in IMP opens the door to, in effect,
infinitely many types, so that the idea of procedures which can
handle arbitrary kinds of operand in a type-specific way becomes
unrealistic.  I think that the way ahead here is to encourage
reserving the untyped pointer variable for operations which are not
type-specific.  For these, a SIZEOF function can be useful, but it
would be shortsighted to define this as returning a number of bytes,
rather than bits.  By contrast, the way to handle genuine type
differences is almost certainly through overloading of procedure
identifiers.  The problem of differing ranges of integers might
eventually be covered by a parameter type i̲n̲t̲e̲g̲e̲r̲ (*) n̲a̲m̲e̲, with
lower and upper bounds accessible somehow.

Vax IMP does not permit a c̲o̲n̲s̲t̲a̲n̲t̲ scalar or an element of a c̲o̲n̲s̲t̲a̲n̲t̲
array to be used by reference (eg to be passed as a n̲a̲m̲e̲ parameter).
Emas IMP does permit constant array elements, but not scalars (except
as parameter to some intrinsic functions), to be so used.  The
restrictions, particularly those of Vax IMP, are a nuisance, and
reflect a confusion in the language between the concepts of constant
(invariant) and literal (explicit value), which shows itself in the
circumstance that some identifiers declared as c̲o̲n̲s̲t̲a̲n̲t̲ may be used
in literal contexts but not others. It can be argued that it is
desirable that the language should guarantee constancy of constants
(rather than relying on hardware protection), but, of course, the
fact that a parameter is passed by reference by no means always
implies that the called procedure attempts to alter it.  There have
been suggestions to allow parameter declarations to indicate whether
or not this is done, and there may be a case for re-considering
these.

Granted that the indiscriminate use of pointer variables is bad
programming practice, they can be used to good effect in carrying out
quite low-level operations without descending to the depths.  It is
generally true that where the alternative is to employ the
store-mapping functions in conjunction with integer addresses, the
use of pointer variables is distinctly preferable, being both better
controlled and more efficiently implementable on some machines --
those which use descriptors or special-purpose address registers for
example.  A number of facilities have been added to IMP to facilitate
this use, "==" comparison for example.  One addition which has been
implemented in some Computer Science compilers and which is of value,
is a general way of specifying a type-dependent displacement from a
pointer variable.  A possible syntax is for the displacement
expression to be placed in square brackets following the pointer
variable: thus, for example, P[1] would denote an element of the type
of P, one element on, and P[-2] would denote a similar element two
elements back from P.  As well as supporting general relative
references, there are the obvious special cases of:
           P == P[1]       and      P == P[-1]
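
The intended meaning of a type-dependent displacement can be sketched
as follows.  The sketch is in Python; the class Pointer, its fields
and the use of a byte store with little-endian 16-bit elements are all
assumptions made for the illustration.

```python
import struct


class Pointer:
    """A typed pointer into a byte store; fmt gives the element type."""

    def __init__(self, store, address, fmt):
        self.store = store
        self.address = address
        self.fmt = fmt
        self.size = struct.calcsize(fmt)  # bytes per element

    def __getitem__(self, displacement):
        # P[n]: n elements on from P, scaled by the size of P's type.
        return Pointer(self.store,
                       self.address + displacement * self.size,
                       self.fmt)

    @property
    def value(self):
        return struct.unpack_from(self.fmt, self.store, self.address)[0]


# Four 16-bit integers laid out consecutively in store.
STORE = bytearray(struct.pack("<4h", 10, 20, 30, 40))
P = Pointer(STORE, 2, "<h")  # P refers to the second element
```

Here P[1] refers to 30 and P[-1] to 10, the displacement being counted
in elements rather than in address units.  Re-assigning P to P[1]
corresponds to the special case P == P[1].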
           
Proposals

(a) Further thought should be given to the choice of syntax for n̲a̲m̲e̲
    variable initialisation, and the suggestions made above regarding
    NIL and direct declaration via a̲l̲i̲a̲s̲ should be considered as ways
    of dispensing with a number of common cases;
(b) The language should permit c̲o̲n̲s̲t̲a̲n̲t̲ arrays and elements of
    c̲o̲n̲s̲t̲a̲n̲t̲ arrays, but not scalars, to be passed by reference;
(c) The use of untyped n̲a̲m̲e̲ variables to provide generic capability
    should be discouraged and a TYPEOF function should not be
    supported;
(d) A SIZEOF function should be supported, the unit preferably being
    bits;
(e) A mechanism for specifying a type-dependent displacement from a
    n̲a̲m̲e̲ variable should be introduced.


A̲r̲r̲a̲y̲s̲

Emas IMP supports multi-dimensional own and constant arrays; Vax IMP
does not.  This is a matter of extent of implementation and can be
expected to be made good by inclusion of the facility in Vax IMP.
It carries with it the implication that the order in which the elements
of a multi-dimensional array are laid out must be defined as part of
the language.  Fortunately both the compilers follow the Fortran
convention (counter-intuitive though many find it).

Proposal

Multi-dimensional own and constant arrays should be included.


A̲r̲r̲a̲y̲ n̲a̲m̲e̲s̲ a̲n̲d̲ m̲a̲p̲s̲

The forms used to declare multi-dimensional a̲r̲r̲a̲y̲n̲a̲m̲e̲ variables
differ in the two implementations.  Emas IMP provides array formats
and a pre-defined mapping function ARRAY to allow an arbitrary part
of the address space to be treated as an array; Vax IMP lacks this
facility.

These differences partly reflect some indeterminacy in the IMP
concept of what an array is.  In effect, IMP implies that the values
of the bounds and the dimensionality of an array are intrinsic
properties of the array.  But it does not capitalise on this to any
extent, by defining enquiry functions for the bounds, for example.
Particularly in a system implementation language, it can be argued
that this view of arrays is too complex and inflexible.  It is part
of the reason why most compilers evade the issue of array mapping.
Emas IMP does confront this important need, but finds itself fighting
the language to provide what is required.

A more appropriate basic building block would be the concept of an
array as a vector characterised by a starting position and simply a
number of elements of defined type (or perhaps even a size in a
defined unit).  The standard IMP array declaration is then understood
to decompose into two operations: declaring such an object and
defining a particular mode of access to it.  On this view, declaring
an a̲r̲r̲a̲y̲n̲a̲m̲e̲ is seen as involving the second of these operations
only.  Thus, to take the easy case first, an example of a legitimate
a̲r̲r̲a̲y̲n̲a̲m̲e̲ declaration would be:
       %INTEGERARRAYNAME TABLE(1:4,1:25)
       
and a valid actual parameter for this case would be any array with
number of elements known to be 100. ('Known to be' rather than just
'happening to be' for obvious advantages in checking and efficiency).

Extension of this approach to the more general case, covering arrays
of unknown and differing sizes, is less obvious.  One possibility
would be to allow the number of elements of the actual parameter to
be cited as an operand in the bounds part of the a̲r̲r̲a̲y̲n̲a̲m̲e̲
declaration, or to have one unspecified lower or upper bound, as, for
example:
       %INTEGERARRAYNAME TABLE(1:4,1:#ELEMENTS//4)
or     %INTEGERARRAYNAME TABLE(1:4,1:*)

On this approach, assignment to the a̲r̲r̲a̲y̲n̲a̲m̲e̲ potentially involves
more work than the present approach, but careful attention to the
precise choice of syntax and the detail of implementation can make
this small.  However, it has the following advantages: a̲r̲r̲a̲y̲n̲a̲m̲e̲s are
properly defined at the time of declaration; the need for the awkward
construct of array formats is removed; and the array map ARRAY
becomes a simple function of the two characterising properties of a
vector described above -- starting position and number of elements
(or size).
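
The decomposition suggested above, a plain vector plus a separately
declared mode of access, can be sketched as follows.  The sketch is in
Python; the class ArrayView is invented, None stands for the
unspecified "*" bound, and elements are taken to be laid out in the
Fortran order (first subscript varying fastest).

```python
class ArrayView:
    """A mode of access, given as (lower, upper) bounds, imposed on a
    vector known only by its start and its number of elements."""

    def __init__(self, vector, bounds):
        sizes = [None if hi is None else hi - lo + 1 for lo, hi in bounds]
        if sizes.count(None) > 1:
            raise ValueError("only one bound may be left unspecified")
        known = 1
        for size in sizes:
            if size is not None:
                known *= size
        if None in sizes:
            # The "*" bound is deduced from the number of elements.
            if known == 0 or len(vector) % known != 0:
                raise ValueError("vector does not fit the bounds")
            sizes[sizes.index(None)] = len(vector) // known
        elif known != len(vector):
            raise ValueError("vector does not fit the bounds")
        self.vector = vector
        self.lowers = [lo for lo, _ in bounds]
        self.sizes = sizes

    def offset(self, *subscripts):
        if len(subscripts) != len(self.sizes):
            raise IndexError("wrong number of subscripts")
        index, stride = 0, 1
        for sub, lo, size in zip(subscripts, self.lowers, self.sizes):
            if not 0 <= sub - lo < size:
                raise IndexError("subscript out of bounds")
            index += (sub - lo) * stride
            stride *= size
        return index

    def __getitem__(self, subscripts):
        if not isinstance(subscripts, tuple):
            subscripts = (subscripts,)
        return self.vector[self.offset(*subscripts)]
```

A vector of 100 elements viewed with bounds (1:4,1:25) and one viewed
with (1:4,1:*) give the same mapping; a vector of 99 elements is
rejected in both cases, which is the check that 'known to be' implies.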

Proposal

The forms for specifying multi-dimensional a̲r̲r̲a̲y̲n̲a̲m̲e̲s must be
standardised and the valuable facility of array mapping should be
included in the language.  Further consideration should be given to
ways and means, perhaps using the above tentative suggestions as a
starting-point.


N̲o̲n̲-d̲e̲c̲i̲m̲a̲l̲ n̲u̲m̲e̲r̲a̲l̲s̲

Vax IMP permits real numbers to be specified using the facilities for
representing numerals to bases other than 10.  Peter Robertson and
Ian Young have pointed out to me that this is not simply a case of
pathological generality, but permits real values to be specified with
a precision which may not be achievable with a decimal
representation.  Emas IMP has an ad hoc mechanism to cope with a
particular case of the problem.
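
The precision point can be made concrete with a small sketch, in
Python rather than IMP.  It assumes a numeral written as base,
underscore, digits, with an optional point (for example 16_1F.C); that
spelling and the function names are assumptions made for illustration,
not a statement of either compiler's syntax.  A value written in base
16 maps digit for digit onto a binary fraction, whereas decimal 0.1
has no finite binary expansion and must be rounded on conversion.

```python
from fractions import Fraction

DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def digit_value(ch: str, base: int) -> int:
    """Return the value of one digit, rejecting digits too large for the base."""
    value = DIGITS.index(ch.upper())  # raises ValueError for an unknown character
    if value >= base:
        raise ValueError(f"digit {ch!r} is not valid in base {base}")
    return value


def parse_based_real(text: str) -> Fraction:
    """Convert 'base_digits.digits' into an exact rational value."""
    base_text, _, numeral = text.partition("_")
    base = int(base_text)
    whole, _, fraction = numeral.partition(".")
    value = Fraction(0)
    for ch in whole:
        value = value * base + digit_value(ch, base)
    scale = Fraction(1, base)
    for ch in fraction:
        value += digit_value(ch, base) * scale
        scale /= base
    return value
```

Because the result is an exact rational, converting it to a
floating-point value involves no rounding whenever the digits written
fit within the precision of the target representation.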

Proposal

The facility to represent real numbers in non-decimal bases should be
included.


C̲o̲m̲p̲i̲l̲e̲r̲ c̲o̲n̲t̲r̲o̲l̲

Both compilers have a c̲o̲n̲t̲r̲o̲l̲ statement to modify the operation of
the compiler in ways which are generally known only to the
compiler-writer, by specifying magic numeric values. Vax IMP has an
additional o̲p̲t̲i̲o̲n̲ statement taking a literal string as argument.
This at least has the merit of presenting the argument in clear
rather than code.

Emas IMP guarantees to evaluate literal comparisons at compile-time
and trades on this to provide a limited form of conditional
compilation.  Conditional compilation is an important last resort to
minimise compiler differences (as well as for other purposes), but
the Emas capability is not a great deal of use in that way.  It
cannot be used to circumvent unwanted declarations, for example, or
statements that are syntactically unacceptable to the compiler.

Proposal

(a) Consideration should be given to the inclusion of o̲p̲t̲i̲o̲n̲ as a 
    language feature and an attempt should be made to agree standard
    mnemonics for the frequently required cases (suppressing checks
    in particular);
    
(b) Consideration should be given to the inclusion of a simple
    textually (not structurally) based conditional compilation
    facility.
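
What "textually based" might mean in (b) can be sketched as a filter
applied to the source before the compiler proper sees it.  The sketch
is in Python, and the directive spellings $IF, $ELSE and $END are
hypothetical, chosen only to avoid confusion with IMP keywords; no
existing compiler is claimed to accept them.  Because whole lines are
dropped before parsing, the rejected text may contain declarations or
statements that the compiler in use would not accept.

```python
def select_lines(source: str, defined: set) -> str:
    """Keep only the lines whose enclosing $IF conditions all hold."""
    kept = []
    active = []  # one flag per open $IF: is its current branch selected?
    for line in source.splitlines():
        words = line.split()
        keyword = words[0] if words else ""
        if keyword == "$IF":
            active.append(len(words) > 1 and words[1] in defined)
        elif keyword == "$ELSE":
            if not active:
                raise ValueError("$ELSE without $IF")
            active[-1] = not active[-1]
        elif keyword == "$END":
            if not active:
                raise ValueError("$END without $IF")
            active.pop()
        elif all(active):
            # An ordinary line survives only if every enclosing branch is live.
            kept.append(line)
    if active:
        raise ValueError("missing $END")
    return "\n".join(kept)
```

Selecting with the set {"VAX"} keeps the lines between "$IF VAX" and
the matching $ELSE and discards the alternative; the directive lines
themselves never reach the output.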


E̲v̲e̲n̲t̲s̲

The event mechanism for signalling and trapping exceptions is a
particularly valuable part of IMP, providing a capability for which
there is no effective substitute in terms of other language
facilities.  In the nature of things, it cannot be expected that
there will be complete identity of the facility as it appears in
different implementations, since it must handle some highly
system-dependent circumstances.  There are, however, a number of
minor differences of detail between the Vax and Emas implementations
which could be eliminated or reduced.

Vax IMP defines a global r̲e̲c̲o̲r̲d̲ (or r̲e̲c̲o̲r̲d̲ m̲a̲p̲) EVENT, from which
information relating to the event can be retrieved.  Emas IMP defines
a function EVENTINFO which returns two values relating to the event
(combined in barbarous fashion); it also applies scope rules to this
function name which, though intended to be helpful, are non-standard.
The advantage of the Vax approach (and a general advantage of the use
of records with respect to portability) is that it enables certain
fields to be defined universally, while others are available for
local use, and possible later pervasion.

Emas IMP restricts event numbers to the range 1-14, while Vax IMP
provides the range 0-15, with zero corresponding to an event
signalled on execution of a s̲t̲o̲p̲ instruction.  It can be useful for a
calling procedure to be able to trap a s̲t̲o̲p̲ in a called procedure,
but whether the top of the range is 14 or 15 is a matter of little
importance.

Vax IMP permits a trap to be specified in the form "o̲n̲ e̲v̲e̲n̲t̲ *".
This seems pointless, perhaps even harmful: it is hardly ever
sensible to trap every event class without exception and in the rare
cases which may exist it is not unreasonable to expect the list to be
spelled out in full.

Proposals

(a) Event information should be provided via a record EVENT, with at
    least the components EVENT, SUB, EXTRA and MESSAGE being regarded
    as standard fields;

(b) There should be as much uniformity as possible in the choice of
    event numbers and sub-numbers;

(c) Wherever possible, s̲t̲o̲p̲ should be implemented as signalling event
    zero;
    
(d) The upper limit to event numbers should be taken to be 14;

(e) The use of the star specifier in trapping events should not be
    permitted.
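
Taken together, proposals (a), (c), (d) and (e) can be modelled by the
following sketch, which uses Python exceptions to stand in for the
IMP event mechanism.  The field names follow proposal (a); their
types, the defaults, and the helper names (signal, stop,
call_trapping) are assumptions made for illustration only.

```python
from dataclasses import dataclass

MAX_EVENT = 14  # upper limit taken from proposal (d)


@dataclass
class EventRecord:
    """The standard fields of proposal (a); the types are assumed."""
    event: int
    sub: int = 0
    extra: int = 0
    message: str = ""


class Signal(Exception):
    """Carries an EventRecord from the signalling point to the trap."""

    def __init__(self, record: EventRecord):
        super().__init__(record.message)
        self.record = record


def signal(event: int, sub: int = 0, extra: int = 0, message: str = "") -> None:
    if not 0 <= event <= MAX_EVENT:
        raise ValueError("event number out of range")
    raise Signal(EventRecord(event, sub, extra, message))


def stop() -> None:
    # Proposal (c): stop is modelled as signalling event zero.
    signal(0, message="stop")


def call_trapping(procedure, events):
    """Run procedure, trapping only the event numbers explicitly listed.

    Returns the EventRecord if a listed event was signalled, otherwise None.
    There is no 'trap everything' form, in line with proposal (e); an
    unlisted event continues to propagate to the caller.
    """
    try:
        procedure()
    except Signal as caught:
        if caught.record.event in events:
            return caught.record
        raise
    return None
```

A caller that wishes to intercept a stop in a called procedure lists
event 0 explicitly, as in call_trapping(stop, {0}), and then inspects
the fields of the record returned.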


P̲e̲r̲m̲a̲n̲e̲n̲t̲ p̲r̲o̲c̲e̲d̲u̲r̲e̲s̲

As noted earlier, the question of permanent procedures and system
libraries goes beyond the matter of language compatibility into the
larger issue of compatibility of the system interface.  It is,
however, worth making the point that, although in certain respects
the distinction between features which are defined into a language
and features which are expected to be provided by a permanent or
system library is an important one and reflects a sensible philosophy
of language design, it is nonetheless true that compatibility in
terms of provision or specification of system library procedures can
be just as important for portability as compatibility in language
features.

Here, mention is made of only a few minor points which are known to
cause fairly acute problems for portability. The kinds of
differences which pose problems include:

(i)   major differences of function -- clearly an extreme problem,
      exemplified by the situation of the most basic input/output
      primitives of the language, namely those which handle
      individual characters.  Vax READ SYMBOL and PRINT SYMBOL
      correspond to Emas READ CH and PRINT CH, the two
      implementations having an idiosyncratic interpretation of the
      other pair;

(ii)  systematic variation introduced by differences in system
      conventions.  The format of file-names is an obvious, and
      probably inevitable, example; much more of a nuisance is the
      difference in the interpretation of stream numbers between Emas
      and Vax;

(iii) detailed differences in the typing or interpretation of
      arguments and results (for example, READ and the second
      argument for WRITE);

(iv)  differences in status -- whether external, system, or
      'permanent' (for example, PROMPT).

CPUTIME is a good example of a compounding of these variations:
                           Emas                    Vax
its status is              e̲x̲t̲e̲r̲n̲a̲l̲               permanent
the type of its result is  r̲e̲a̲l̲                   i̲n̲t̲e̲g̲e̲r̲
the units are              seconds                milliseconds


Of course, the virtue of the fact that these are procedures, rather
than built-in language features, is that there are ways in principle
to pick them up from libraries other than the system standard one.
But this may imply considerable loss of efficiency if applied to very
basic procedures, it is always tedious, and it is a major barrier to
achieving the end of true portability -- being able to compile and
run identical programs on different systems.
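
The CPUTIME case shows what such a substitute library has to do.  The
sketch below, in Python, covers only the conversion between the two
conventions tabulated above (a real result in seconds and an integer
result in milliseconds); the function names are hypothetical and the
choice of truncation rather than rounding is an assumption.

```python
def seconds_to_milliseconds(seconds: float) -> int:
    """Present a real result in seconds as an integer count of milliseconds."""
    # Truncation is assumed; either compiler may in fact round differently.
    return int(seconds * 1000)


def milliseconds_to_seconds(milliseconds: int) -> float:
    """Present an integer count of milliseconds as a real result in seconds."""
    return milliseconds / 1000
```

Even this trivial wrapper loses information in one direction, since
any fraction of a millisecond in the real result is discarded, which
is one reason for preferring an agreed standard to local conversion
layers.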

Proposal

A working group should be set up to examine the differences in the
permanent procedures provided in the various implementations and draw
up a proposed standard library.