The Production of Optimised Machine-Code
            for High-Level Languages using
         Machine-Independent Intermediate Codes.

             Peter Salkeld Robertson

                    Ph. D.
            University of Edinburgh
                     1981

                                                                           
                                                                                         
                    ABSTRACT
                    
The aim of this work was to investigate the problems
associated with using machine-independent intermediate
codes in the translation from a high-level language into
machine code, with emphasis on minimising code size and
providing good run-time diagnostic capabilities.

The main result was a machine-independent intermediate
code, I-code, which has been used successfully to develop
optimising and diagnostic compilers for the IMP77 language
on a large number of different computer systems. In
addition, the work has been used to lay the foundations
for a project to develop an intermediate code for portable
SIMULA compilers.

The major conclusions of the research were that
carefully designed machine-independent intermediate codes
can be used to generate viable optimising and diagnostic
compilers, and that the commonality introduced into
different code generators processing the code for
different machines simplifies the tasks of creating new
compilers and maintaining old ones.

                                                                                         
                         Contents

1 Introduction
2 Intermediate codes
  2.1 Uncol
  2.2 Janus
  2.3 OCODE
  2.4 P-code
  2.5 Z-code
  2.6 Summary and conclusions
      2.6.1 Error checking and reporting
      2.6.2 Efficiency
      2.6.3 Assumptions
      2.6.4 Interpretation
3 Optimisations
  3.1 Classification of optimisations
      3.1.1 Universal optimisations
      3.1.2 Local optimisations
            3.1.2.1 Remembering
            3.1.2.2 Delaying
            3.1.2.3 Inaccessible code removal
            3.1.2.4 Peephole optimisation
            3.1.2.5 Special cases
            3.1.2.6 Algebraic manipulation                                                                                     
      3.1.3 Global optimisations
            3.1.3.1 Restructuring
            3.1.3.2 Merging
                    3.1.3.2.1 Forward merging
                    3.1.3.2.2 Backward merging
            3.1.3.3 Advancing
            3.1.3.4 Factoring
            3.1.3.5 Loop optimisations
                    3.1.3.5.1 Iteration
                    3.1.3.5.2 Holding
                    3.1.3.5.3 Removal of invariants
            3.1.3.6 Expansion
            3.1.3.7 Addressing optimisations
      3.1.4 Source optimisations
  3.2 Combination of optimisations
4 The design of the compiler
  4.1 General structure
  4.2 The intermediate code
      4.2.1 Objectives
            4.2.1.1 Scope
            4.2.1.2 Information preservation
            4.2.1.3 Target machine independence
            4.2.1.4 Decision binding
            4.2.1.5 Simplification
            4.2.1.6 Redundancy
            4.2.1.7 Ease of use                                                                                                
  4.3 Code layout and addressing
      4.3.1 Nested procedure definitions
      4.3.2 Paged machines
      4.3.3 Events
  4.4 Data layout and addressing
  4.5 Procedure entry and exit
      4.5.1 User-defined procedures
      4.5.2 External procedures
      4.5.3 Permanent procedures
      4.5.4 Primitive procedures
  4.6 Language-specified and compiler-generated objects
      4.6.1 Internal labels
      4.6.2 Temporary objects
  4.7 Object-file generation
      4.7.1 Reordering
      4.7.2 Jumps and branches
      4.7.3 Procedures
      4.7.4 External linkage
      4.7.5 In-line constants
      4.7.6 Object-file format
  4.8 Summary
5 Review of the overall structure
  5.1 Division of function
  5.2 Testing and development
  5.3 Diagnostics
      5.3.1 Line numbers
      5.3.2 Diagnostic tables
      5.3.3 Run-time checks                                                                                              
6 Observations
  6.1 Suitability of I-code for optimisation
  6.2 Performance
  6.3 Cost of optimisation
      6.3.1 Compile time
      6.3.2 Space requirement
      6.3.3 Logical complexity
  6.4 Comments on the results
      6.4.1 Register remembering
      6.4.2 Remembering environments
      6.4.3 Array allocation and use
      6.4.4 Common operands
      6.4.5 Parameters in registers
      6.4.6 Condition-code remembering
      6.4.7 Merging
  6.5 Criticisms and benefits of the technique
      6.5.1 Complexity
      6.5.2 I/O overhead
      6.5.3 Lack of gains
      6.5.4 Flexibility
  6.6 Comments on Instruction sets and Optimisation                                                                  
7 Conclusions
  7.1 Viability of the technique
  7.2 Ease of portability
  7.3 Nature of optimisations

Appendix A1   Simplified I-code definition
         A2   I-code internal representation
         A3   Results

References                                                                                                         

                   1 Introduction

Compilers for high-level languages form a significant
part of most computer systems, and with an ever increasing
number and variety of machine architectures on the market
the problems of compiler development, testing, and
maintenance consume more and more manpower and computer
time. Moreover, as computer technology is improving and
changing rapidly it is becoming evident that software costs
will increasingly dominate the total cost of a system.
Indeed, it may not be long before the lifetime of software
regularly exceeds that of the hardware on which it was
originally implemented, a state of affairs quite different
from that envisaged by Halpern when he concluded that "the
importance of the entire question of machine-independence is
diminishing .." [Halpern, 1965]. In addition, there is a
need to encourage the slowly-developing trend to write the
majority of software in high-level languages. Even though
the advantages of such an approach are many, a large number
of users still have a love of machine-code, usually fostered
by thoughts of "machine efficiency". Clearly, techniques
must be developed to simplify the production of usable
compilers which can "optimise" the match between the
executing program and the user's requirements, be they for
fast execution, small program size, reasonable execution
time but with good run-time diagnostics, or whatever.                                                              
One popular method for reducing the complexity of a
compiler is to partition it into two major phases: one
language-dependent and the other machine-dependent. The
idea is that the language-dependent phase inputs the source
program and deals with all the syntactic niceties of the
language, finally generating a new representation of the
program, an intermediate code. This is then input by a
second phase which uses it to generate machine-code for the
target computer. In this way it should be possible to
produce a compiler to generate code for a different machine
by taking the existing first phase and writing a new second
phase. This ability to move a large portion of the compiler
from machine to machine has led to such compilers being
referred to as "portable compilers" even though the term is
perhaps misleading, as only part of the complete compiler
can be moved without change. In practice many existing
compilers generate intermediate representations of the
program which are passed around within the compiler, for
example the "analysis records" produced by the syntactic
phase of compilation, but for the purposes of this work it
is only when these representations are machine-independent
and are made available outwith the compiler that they will
be termed intermediate codes.
Much of the emphasis in designing intermediate codes has
been on enabling a compiler to be bootstrapped quickly onto
a new machine - either by interpreting the intermediate
code, or by using a macro generator to expand it into a
machine-code [Brown, 1977]. Once this has been done the
intention is that the quality of the code so produced can be
improved at leisure. While this approach has been very
successful and relatively error-free, it has been the
experience of several implementors that it is difficult to
adapt the scheme to produce highly optimised code [Russell,
1974]; apparently considerations of portability and
machine-independence have caused the problems of
optimisation to be overlooked. The aspect of
intermediate-code design which has received most debate
concerns the level of the code: low-level with a fairly
simple code-generator, or high-level with a more complex
code-generator [Brown, 1972].
This thesis attempts to put machine-independence and
optimisation on an equal footing, and describes the use of
an intermediate code which takes a novel view of the
process. Instead of the intermediate code describing the
computation to be performed, it describes the operation of a
code-generator which will produce a program to perform the
required computation. This effectively adds an extra level
of indirection into the compilation, weakening any linkage
between the form of the intermediate code and the object
code required for a particular implementation.
In essence I-code attempts to describe the results required
in a way which does not constrain the method of achieving
those results.                                                                                                
In particular it should be noted that the code described,
I-code, was designed specifically for the language IMP-77, a
systems implementation language which contains many of the
constructions which pose problems for optimisation
[Robertson, 1979]. It in no way attempts to be a
"universal" intermediate code. Notwithstanding, the code,
with a small number of minor extensions to cover non-IMP
features, has been used successfully in an ALGOL 60 compiler
and is currently proving viable in projects for writing
Pascal and Fortran 77 compilers.
The intermediate code as finally designed is completely
machine independent, except inasmuch as the source program
it describes is machine dependent, demonstrating that the
problems may not be as intractable as thought by Branquart
et al. who state that "clearly complete machine independency
is never reached" [Branquart, 1973].
In addition to the problems of machine independence there
is also the question of operating system independence, as
nowadays it is common for machines to have several systems
available. For this reason the task of producing a compiler
is far from finished when it can generate machine code
[Richards, 1977]. To simplify the generation of versions of
a compiler for different operating systems, a third phase of
compilation was added, although it soon became clear that
the extra phase could be used for other purposes as well, as
will be shown in section 4.                                                                                                

Throughout the text, examples are given of the code
produced by compilers written to demonstrate the power of
the intermediate code. The examples of the intermediate
code are couched in terms of mnemonics for the various code
items, although the production compilers use a compacted
representation. The code and its representations are
described in Appendix A1 and Appendix A2.
In the examples of code generated for various
constructions, it should be appreciated that the exact
instructions and machine features used will depend very much
on the context in which the code is produced, and so only
typical code sequences can be given.
The machines for which code is demonstrated are indicated
by the following abbreviations in parentheses:
(Nova) Data General NOVA
(PDP10) Digital Equipment Corporation PDP10
(PDP11) Digital Equipment Corporation PDP11
(VAX) Digital Equipment Corporation VAX 11/780
(GEC4080) General Electric Company 4080
(ICL2900) International Computers Limited 2900
(4/75) International Computers Limited 4/75
(7/16) Interdata 7/16
(7/32) Interdata 7/32
(PE3200) Perkin Elmer 3200                                                                                                


                2 Intermediate codes

This section gives a brief account of the more important
intermediate codes which have been discussed and have had an
influence on the design of I-code.

2.1 Uncol

UNCOL, UNiversal Computer Orientated Language, [Mock,
1958], was an early attempt to specify a means for solving
the M*N problem of producing compilers for M languages to
run on N machines. It was proposed that an intermediate
language, UNCOL, be defined which would be able to express
the constructs from any language, and which could itself be
translated into code for any machine, resulting in the need
for only M+N compilers. Indeed it was even suggested that
programs would be written directly in UNCOL rather than in
machine code.
These ideas were very ambitious, but were presented without
any concrete examples of what UNCOL might look like.
Proposals were made for an UNCOL in [Steel, 1961] but the
work was abandoned before anything like a complete
specification had been produced.
An UNCOL-like technique which has been used extensively
is to compile for a known type of machine, such as the IBM
360, and then emulate that machine on the target machine.
Unfortunately, to give this any chance of being efficient,
microcode support will be necessary and this is rarely
available to compiler writers.                                                                                                

2.2 Janus

The first attempt at generating an UNCOL which seems to
have been at least partially successful was JANUS [Coleman,
1974]. The approach was effectively to enumerate all the
mechanisms found in current programming languages and the
techniques used to implement them. From this large list was
defined a set of primitive data-types and operations upon
them. These primitives were then put together to model the
objects in the source language. Once JANUS code had been
produced the intention was that it would either be
interpreted or compiled into machine code by a macro generator.

2.3 OCODE

Of all the languages which claim to be portable, perhaps
the most successful has been BCPL [Richards, 1971]. The
BCPL compiler generates the intermediate code OCODE which
can either be interpreted or translated into machine code
for direct execution. As BCPL is a fairly low-level
language with only one data type, the word, many of the
difficulties in designing intermediate codes do not arise.
This means that the code can be pitched at a low level and
be "semantically weak" without compromising the efficiency
of the compiled code to any great extent.                                                                                                
The OCODE machine works by manipulating single-word objects
held on a stack, into which there are several pointers.
e.g.  R(1, 2, 3)

      STACK 3        adjust the top of stack to leave two
                     cells free for linkage information.
      LN 1           stack the constant 1.
      LN 2           stack the constant 2.
      LN 3           stack the constant 3.
      LL L6          stack the address of label L6 (the entry
                     to the routine).
      RTAP 5         enter the procedure adjusting the stack
                     frame pointer by 5 locations.

      ENTRY 1 L6 'R'
                     entry point for the routine R.
      SAVE 5         set the top of stack pointer to be 5
                     locations from the stack frame pointer.
      RTRN           return.
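
The stack discipline shown above is easy to model. The
following Python fragment is a minimal, illustrative sketch
of an OCODE-like stack machine driven by (opcode, operand)
pairs; the opcode semantics and the FINISH halt are
simplified assumptions made for illustration, not the OCODE
definition, and the routine body is reduced to a bare return.

   # Illustrative sketch only: a toy stack machine in the spirit of
   # the OCODE fragment above.  The opcode semantics are simplified
   # assumptions, not the OCODE definition.

   def run(program, labels):
       stack, frame, pc, links = [], 0, 0, []
       while pc < len(program):
           op, arg = program[pc]
           if op == "STACK":              # reserve 'arg' cells above the frame
               del stack[frame + arg:]
               stack.extend([0] * (frame + arg - len(stack)))
           elif op == "LN":               # push a constant
               stack.append(arg)
           elif op == "LL":               # push the address of a label
               stack.append(labels[arg])
           elif op == "RTAP":             # call: new frame 'arg' cells up
               entry = stack.pop()
               links.append((pc, frame))  # save return point and old frame
               frame += arg
               pc = entry
               continue
           elif op == "RTRN":             # return to the saved state
               pc, frame = links.pop()
           elif op == "FINISH":           # end of the calling code
               break
           pc += 1
       return stack

   # The routine call R(1, 2, 3) from the text, with a routine body
   # that simply returns (label L6 marks its entry point).
   code = [("STACK", 3), ("LN", 1), ("LN", 2), ("LN", 3),
           ("LL", "L6"), ("RTAP", 5),
           ("FINISH", None),
           ("RTRN", None)]                # body of R, at index 7
   run(code, {"L6": 7})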

2.4 P-code

P-code is the intermediate code used by the PASCAL compiler
[Nori, 1976; Jensen, 1976] and was designed with the aim of
porting PASCAL quickly by means of an interpreter. In this
respect it has been very successful, especially on
microprocessor-based systems. The code is similar to OCODE
but has a greater range of instructions to handle objects of
differing types.

   procedure ERROR(VAL:INTEGER);
   begin                            0: ENT 4
     TOTAL := TOTAL+1;              1: LDO 138    Stack TOTAL
                                    2: LDCI 1     Stack 1
                                    3: ADDI       Integer add
                                    4: SRO 138    Store into TOTAL
     if INDEX >= 9 then begin       5: LDO 139
                                    6: LDCI 9
                                    7: GEQI       Compare top elements
                                    8: FJP 17     Jump if false
       LIST[10].NUM := 255          9: LAO 140    Stack base of LIST
                                   10: LDCI 10
                                   11: DEC 1      Subtract 1
                                   12: IXA 2      Index*2+base
                                   13: INC 1      Add 1
                                   14: LDCI 255
                                   15: STO
     end else begin                16: UJP 28
       INDEX := INDEX+1;           17: LDO 139
                                   18: LDCI 1
                                   19: ADDI
                                   20: SRO 139
       LIST[INDEX].NUM := VAL      21: LAO 140
                                   22: LDO 139
                                   23: DEC 1
                                   24: IXA 2
                                   25: INC 1
                                   26: LOD 0, 4
                                   27: STO
     end
   end;                            28: RETP      Return

2.5 Z-code

Z-code [Bourne, 1975] is the intermediate code produced by
the ALGOL68C compiler, the main feature of which is the
ability for the user to parameterise the first phase to
modify the Z-code to suit the target machine, an idea
previously investigated in SLANG [Sibley, 1961]. A set of
eight registers is assumed by the code and others may be
specified explicitly for each transfer. The memory with
which the code works is assumed to be "a linear data store
that is addressed by consecutive integers", addresses taking
the form of base+displacement pairs. Intermingled with the
instructions are directives which control the translation of
the code into machine orders. Two of these directives are
used to divide the code into "basic blocks" or
"straight-line segments", and describe the usage of
registers on entry to and exit from the blocks, although
little use seems to be made of them at present.
As an example here is the Z-code generated by the PDP10
version of the compiler [Gardner, 1977]:

   int X := 2, Y := 3, Z := 2
    1: F000 10 0 +2        load 2
       F040 10 6 +144      store in X
       F000 10 0 +3        load 3
       F040 10 6 +145      store in Y
    5: F000 10 0 +2        load 2
       F040 10 6 +146      store in Z
   proc P = (int A, B) int : begin
    7: S715 p'Z
       T246 677 712
   if A > B
    9: R0
   10: F020 10 5 +4        load A
   11: F022 10 5 +5        subtract B
   12: F113 10 0 P713      ->L713 if <=
   then A
   13: F020 10 5 +4        load A
   else B
   14: H116 0 P714         ->L714
   15: L713
   16: F020 10 5 +5        load B
   17: L714
   18: R1 10 1
   19: R1 10 1
   20: T247 667 712        end of P

2.6 Summary and conclusions

2.6.1 Error checking and reporting

The UNCOL approach of having one code for all languages and
machines may well simplify the generation of some sort of
compiler, but has the major disadvantage that the
optimisation of error checking and the reporting of run-time
errors cannot be left to the code generator - many errors
are language-dependent and the code generator cannot know
how to handle all of them. Instead the checks must be
programmed into the intermediate representation explicitly.
As will be shown later (5.3) this can inhibit several very
powerful and effective optimisations. Sadly, this problem
can result in the absence of all but the most trivial of
run-time checks in the compiled code. Even when checking is
provided in the intermediate code, as in the case of P-code
with its CHK instruction for range testing, it is rare for
the code to contain enough information to permit the error
to be reported in source program terms: line numbers,
procedure names, variable names and values etc.
As an example, many P-code interpreters locate run-time
errors in terms of 'P-code instruction addresses' which are
of negligible benefit to most users.

2.6.2 Efficiency

Commonly, little attention is paid to questions of run-time
efficiency in the generation of intermediate code. An
exception to this is Z-code, which is parameterised in order
that the match between the code and the target machine can
be improved. In particular, the machine-independent phase
is intended to perform simple register optimisation,
although as the example in 2.5 shows, the insistence on
repeatedly using one register will minimise any gains from
remembering register contents. However, this is probably
just a failure on the part of the current compilers and
could be corrected at a later date.
Unfortunately, the fact that the compiler purports to
optimise the intermediate code inhibits the code generator
from attempting any but the most trivial peephole
optimisations, as may be seen in the example by considering
instructions 10-12. On many machines the subtract operation
is not a good choice for value comparison as firstly it may
fail with overflow, and secondly it will corrupt a register.
A better implementation would be to replace the subtract
with a suitable COMPARE, leaving the register untouched and
available for later use. This cannot be done by the code
generator as it cannot know that the intermediate code does
not go on to use the result of the subtraction later.
Similarly, if Z-code had chosen to use a COMPARE instruction
in the first place, a machine without a compare would have
to work hard to make sure all registers involved in the
necessary subtract were restored to their initial values
before the intermediate code goes on to use them.

2.6.3 Assumptions

Most machine-independent codes have been designed, at least
initially, assuming a linear store with one address
increment corresponding to one basic object. In the case of
OCODE this is a direct result of the language definition,
but in languages such as PASCAL it has led to a great loss
of information, as the rich information about data types
cannot be expressed. The problems associated with putting
languages onto machines with different addressing schemes
have resulted in some intermediate code generators being
updated to accept a limited form of parameterisation to
define the gross appearance of the target machine. Typical
of the limitations of these codes is P-code where, although
the basic types of object can have differing sizes of
machine representation, objects with enumerated types will
always be given a 'fullword' even though the host machine
could easily support a smaller item.
A typical assumption is that the difference between objects
explicitly specified in the original source and those
created by the intermediate code generator for its own
purposes is insignificant. As will be shown in section 4.6,
this is not necessarily the case.

2.6.4 Interpretation

The vast majority of machine-independent intermediate codes
in current use have been designed in such a way as to permit
execution by interpretation. This immediately imposes
constraints on the form of the code, as, for example, it
will need to be possible to pre-process the code into some
consistent and manageable internal form for the benefit of
the interpreter. In order to give some sort of efficiency
to the interpretation process, the intermediate code of
necessity must become like the order code of a 'real'
machine.
This results in code-generation being seen as fitting the
target machine to the intermediate code, rather than fitting
the intermediate code to the target machine, which is
clearly the better strategy for optimisation.


                    3 Optimisations

The task of any compiler for a high-level language is to fit
programs written in that language onto a specific computer
system so that the required computations may be performed.
Optimisation may be described as the process by which the
fit is improved. Usually the quality of the optimisation is
measured in terms of two parameters: the size of the running
program, and, more commonly, the speed at which it executes.
While it is possible in some cases to make a program smaller
and increase its speed of execution, it is well-known that,
in general, speed and size are complementary. For example,
the following code fragments have the same effect, but the
first will probably be smaller than the second, which will
execute faster than the first:

   --------------------------     -----------------------
                                    A(1) = K
                                    A(2) = K
     for J = 1,1,8 cycle            A(3) = K
       A(J) = K                     A(4) = K
     repeat                         A(5) = K
                                    A(6) = K
                                    A(7) = K
                                    A(8) = K
                                    J = 8
   --------------------------     -----------------------

3.1 Classification of optimisations

With a subject as complex as optimisation it is difficult to
give a useful and definitive classification of the various
possibilities for improving programs. In addition,
different authors have used many different terms to describe
optimisations which have been attempted [Aho, 1974; Lowry,
1969; Wichmann, 1977]. However most optimisations fall into
one of the following four groups: Universal, Local, Global
and Source.

3.1.1 Universal optimisations

These are the optimisations which are independent of any
particular program, but which depend on the complete
environment in which the program is to be compiled or
executed. They are the policy decisions taken by the
compiler writer during the initial design of the compiler,
and include such things as the fixed use of registers (stack
pointers, code pointers, link registers etc), the
representations of language-defined objects (arrays,
records, strings etc), and the standards for communication
with external objects. In addition, universal optimisation
must take into account such questions as:

i   Compilation speed or execution speed?
    If the compiler is to be used in an environment where
    programs are compiled roughly as often as they are
    executed, such as in a teaching environment, execution
    time can be sacrificed for a decrease in compilation
    time, as the latter will commonly greatly exceed the
    former.

ii  Diagnostics?
    If the compiler is to produce code which will provide
    extensive checking and will give diagnostic information
    in the event of program failure, allowance must be made
    for the efficient checking of the program's behaviour
    and the maintenance of the recovery information used by
    the diagnostics. If highly optimised code is required
    these constraints may not apply.

In the current state of the art universal optimisation is
done by experience and guesswork; attempts at producing
compiler-compilers which can approach the quality of
hand-constructed compilers have not met with great success
[Brooker, 1967; Feldman, 1966; Trout, 1967]. As will be
shown later (4.5), minor changes in the universal
optimisation can result in major changes in the form of the
generated code, and so rules made at this stage should be as
flexible as possible to permit changes to be made in the
light of experience.
From the point of view of measurement, universal
optimisation provides the base level from which other
optimisations are investigated. Roughly, the better the
universal optimisation the less effective the other
optimisations appear to be.

3.1.2 Local optimisations

Local optimisations may be defined as those optimisations
which are performed during a sequential scan of the program,
using only knowledge of statements already processed. Not
only are these optimisations reasonably simple to perform
but they can have a major effect on the generated code.
Indeed Wulf et al. state that "In the final analysis the
quality of the local code has a greater impact on both the
size and speed of the final program than any other
optimisation" [Wulf, 1975].

3.1.2.1 Remembering

Remembering optimisations are those which can be applied to
single statements in the light of information gathered
during the compilation of previous statements. These
optimisations depend on remembering the current state of the
machine and applying this knowledge to subsequent
statements. Their chief characteristic is that they are
applied during a sequential scan of the program, and as such
are reasonably cheap to implement and execute. For example:

   -----------------
   X = Y
   if X = 0 start
   -----------------

on the PDP11 would generate:

   -----------------
   MOV Y,X
   BNE $1       remembering that the previous line
                sets the condition code.
   -----------------

The most powerful of the remembering optimisations is that
whereby the correspondence between values in registers and
values in store is remembered and references to the store
value are replaced by references to the register's value,
register operations usually being smaller and faster than
their store equivalents. Unfortunately there are several
cases where this leads to worse code than the "obvious"
version. For example, on the (PE3200) the code on the right
is larger and slower than that on the left:

   X = 2
   P = P<<2

   LIS  3,2        LIS  3,2        pick up 2
   ST   3,X        ST   3,X        store it in X
   L    1,P        L    1,P        pick up P
   SLLS 1,2        SLL  1,0(3)     shift it by 2
   ST   1,P        ST   1,P        store it in P

In addition to keeping track of the changes in the state of
the machine from the compilation of one statement to
another, remembering also includes preserving this state or
environment for later use when a label is encountered,
either by merging the current environment with the
environment saved at the jump to the label, or simply by
restoring that latter environment when it is not possible
for control to "fall through" from the statements
immediately preceding the label.
In all forms of remembering it is vital to be able to keep
the information up-to-date, invalidating knowledge when it
becomes false, a process which is exacerbated when it is
possible for an object to be known by two or more apparently
different descriptions, as in the following code:

   ---------------------------
   integer J, K
   integerarray A(1:12)
   integername P
   P == J
   J = 1; K = 1
   ---------------------------

At this point P and J refer to the same location, as do A(J)
and A(K). Except in the most simple of cases all that can
be done is to assume the worst and forget anything
potentially dangerous after writing to unknown addresses.

3.1.2.2 Delaying

Delaying is the process of generating instructions but not
planting them in the code sequence until it is absolutely
necessary. This is of advantage if it is discovered that
such "pending" instructions are not needed, or can be
combined with other instructions.
The two common cases are illustrated below:

   ----------------------------------
   integerfn F(integername X)
     integer T
     T = X
     T = 0 if T < 0
     X = 1
     result = T
   end
   ----------------------------------

The obvious code for the body of this function is (PE3200):

   -------------------
   L 3,X          address of parameter
   L 0,0(3)       value of parameter
   ST 0,T
   BGE $1         -> if T >= 0
   SR 0,0
   ST 0,T         T = 0
   $1:LIS 2,1
   ST 2,0(3)      X = 1
   LR 1,0         load result
   {return}
   -------------------

By delaying the first store into T until after the
conditional statement, and delaying the second store into T
until after the label, both instructions can be combined,
resulting in the code:

   -------------------
   L 3,X
   L 0,0(3)
   BGE $1
   SR 0,0
   $1:ST 0,T
   LIS 2,1
   ST 2,0(3)
   LR 1,0
   {return}
   -------------------

This store itself can now be delayed until the return from
the function, at which point, as T is local to the function
and will be destroyed, the instruction can be deleted
altogether. Section 3.2 gives a description of one way in
which this sort of optimisation has been achieved.

3.1.2.3 Inaccessible code removal

In several cases compilers can generate code which will
never be executed. The common causes of this are either
user-specified conditions whose truth is constant and which
are used to achieve some sort of "conditional compilation",
or structural statements following unconditional jumps as
below:

   ----------------------
   if X = 0 start
     ERROR (1)
     return
   else
     V = V-X
   finish
   ----------------------

Here the branch instruction usually generated at the end of
the if clause to take control past the else clause can never
be executed. Such inaccessible code can be eliminated to
shorten the program, but without directly affecting its
speed of execution.

3.1.2.4 Peephole optimisation

Peephole optimisation [McKeeman, 1965] is the technique of
examining small sections of generated code to make fairly
obvious, but ad hoc, improvements. Many of the gains from
the optimisation come by simplifying code sequences created
through the juxtaposition of code areas which were produced
separately. For example (PE3200):

       Before                After
   ----------------     -------------------
   ST  4,X               ST  4,X
   L   4,X               AR  1,2
   AHI 1,48              AHI 1,48(2)
   ----------------     -------------------

3.1.2.5 Special cases

Special-case optimisations are those which make use of the
particular structure and features of the target machine to
control the way in which certain statements are implemented.
For example:

                        Obvious           Optimised
   ----------------------------------------------------
   X = 0     (PDP11)    MOV #0,X          CLR X
   X = X+1   (PDP11)    ADD #1,X          INC X
   S = ""    (PE3200)   LHI 1,NULL        SR  0,0
                        LHI 2,S           STB 0,S
                        BAL 15,MOVE
   ----------------------------------------------------

These optimisations are very similar to peephole
optimisations but are distinguished because they actively
control the generation of code rather than passively alter
code which has already been produced. In particular they
avoid one of the drawbacks of peephole optimisation, namely
that even though it can reduce fairly complex instruction
sequences to a simpler form, the side-effects of generating
the long form in the first place often degrade the code. In
the example above of setting a string variable to the null
string, the optimised form uses only one register, the value
of which can be remembered. In the non-optimised version
three registers are immediately altered and the knowledge of
the contents of all of the registers may need to be
forgotten, unless the code generator knows how the MOVE
routine works and can forget only those registers which it
uses.
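
The register remembering described in 3.1.2.1, and the
selective forgetting just mentioned, amount to maintaining a
small table relating store locations to the registers
currently known to hold their values. The following Python
fragment is an illustrative sketch of such a table; the
class and operation names are invented for the example and
are not taken from the IMP77 code generators.

   # Illustrative sketch: a "remembering" table for a code generator.
   # Names and operations are invented for the example.

   class Remembered:
       def __init__(self):
           self.in_register = {}        # variable name -> register number

       def clobbered(self, reg):
           """A new value has been computed into 'reg':
              old associations for that register die."""
           for v in [v for v, r in self.in_register.items() if r == reg]:
               del self.in_register[v]

       def holds(self, var, reg):
           """Record that 'reg' currently holds the value of 'var'."""
           self.in_register[var] = reg

       def register_for(self, var):
           """Return a register already holding 'var', or None."""
           return self.in_register.get(var)

       def forget(self, names=None):
           """Invalidate knowledge: everything, or just the named
              variables.  Called after a store through an unknown
              address, or after calling a routine (such as MOVE above)
              whose effect on store is unknown."""
           if names is None:
               self.in_register.clear()
           else:
               for v in names:
                   self.in_register.pop(v, None)

   # e.g. compiling "X = Y" through register 1 records clobbered(1),
   # holds("Y", 1) and holds("X", 1); later references to X or Y may
   # then use register 1 instead of a store access, until forget() is
   # called.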
3.1.2.6 Algebraic manipulations

Algebraic optimisations are improvements brought about by
using the algebraic properties of operators and operands,
and include:

   Folding, or compile-time evaluation    1+2  is replaced by 3
   Removal of null operations             A+0  is replaced by A
   Using commutative properties           -B+A is replaced by A-B

3.1.3 Global optimisations

Global optimisations may be defined as those improvements to
the code which require knowledge of substantial parts of the
program. In effect they are performed by examining numbers
of statements in parallel, in contrast to the sequential
scan required by local optimisation.

3.1.3.1 Restructuring

Restructuring optimisations are those optimisations which
may be brought about by changing the order in which the code
is laid out in memory without otherwise changing the nature
of the code. As will be discussed later (section 4.3),
there are many reasons why programs can be improved by
changing the order of chunks of the code. A common reason
is that many machines have conditional branch instructions
with limited range while the unconditional branches have a
much larger range. Hence if {A} represents a large number
of statements and {B} represents a small number of
statements, the program:

   ----------------
   if X = 0 start
     {A}
   else
     {B}
   finish
   ----------------

could be improved by reordering as on the right (PDP11):

       original                reordered
   ------------------      ----------------
       MOV X,R0                MOV X,R0
       BEQ $1                  BEQ $1
       JMP $2                  {B}
   $1: {A}                     JMP $2
       BR  $3              $1: {A}
   $2: {B}                 $2:
   $3:
   ------------------      ----------------

3.1.3.2 Merging

3.1.3.2.1 Forward merging

Forward merging, also somewhat confusingly referred to as
"cross jumping" [Wulf, 1975], is the process whereby the
point of convergence of two or more code sequences is moved
back over common sub-sequences, thus removing one of the
sub-sequences, as in the case below.

   ----------------------
   if X > Y start
     TEST(X, Y)
   else
     TEST(Y, X)
   finish
   ----------------------

     obvious code (VAX)          after merging
   ---------------------      --------------------
         CMPL  X,Y                  CMPL  X,Y
         BLE   $1                   BLE   $1
         PUSHL X                    PUSHL X
         PUSHL Y                    PUSHL Y
         CALLS 2,TEST               BRB   $3
   P1 -> BRB   $2               $1: PUSHL Y
   $1:   PUSHL Y                    PUSHL X
         PUSHL X                $3: CALLS 2,TEST
         CALLS 2,TEST           $2:
   P2 -> $2:
   ---------------------      --------------------

The simplest way to perform this optimisation is to take the
code sequence about the point of a label and a reference to
that label, and set two pointers: one, P1, to the
unconditional jump and the other, P2, to the label. If the
instructions immediately preceding the pointers are
identical both pointers are moved back over that
instruction. The label is redefined at the new position of
P2 and the instruction passed over by P1 is deleted. The
process is repeated until either another label is found or
two different instructions are encountered. The
redefinition of the label involves creating a completely new
label, leaving the old one untouched. This both prevents
trouble with multiple references to the label and permits
the optimisation to be attempted on those references.
As this optimisation simply causes the sharing of execution
paths there is no direct gain in execution speed, but as the
code size is reduced an indirect improvement may be achieved
if the shorter code moves the label close enough to the
reference to it for a shorter and usually faster jump
instruction to be used. The optimisation obviously must be
performed while labels and jumps are in a symbolic form,
that is before code addresses have been resolved.
This permits the merging of instructions which will
eventually have program-counter relative operands and
consequently be position dependent.

3.1.3.2.2 Backward merging

A second, but much more difficult, form of merging involves
moving instructions back over the preceding branch code
which generates the two paths being considered.

     Original (PE3200)           Optimised
   -------------------       -------------------
         L   1,X                   L   2,R
                                   L   1,X
         BNE $1                    BNE $1
   P1 -> L   2,R
         LIS 3,1                   LIS 3,1
         ST  3,A(2)                ST  3,A(2)
         B   $2                    B   $2
   P2 -> $1: L 2,R             $1:
         LIS 3,3                   LIS 3,3
         ST  3,B(2)                ST  3,B(2)
   $2:                         $2:
   -------------------       -------------------

The difficulty with this optimisation is that it requires
the branch and the associated condition testing code to be
treated as a single unit, so that merged instructions do not
split the test and the use of the result. Also the testing
instructions must be checked to ensure that they are not
able to modify the operands of the merged instructions.
This information is easily available to the code-generator
as in IMP77 only procedure calls and string resolution can
have such side-effects. In a way similar to the other form
of merging, the two pointers P1 and P2 are set and adjusted:
P1 being moved forward over common code carrying the branch
sequence with it (L & BNE), and P2 being advanced, deleting
the code it passes over.

3.1.3.3 Advancing

Advancing is the process of moving operations back in the
instruction stream so that they are executed earlier and
pave the way for improving subsequent statements. On many
machines the statements:

   ---------------
   X = X-1
   A(X) = P
   X = X-1
   A(X) = Q
   ---------------

could be compiled to more efficient code if rewritten:

   ---------------
   X = X-2
   A(X+1) = P
   A(X) = Q
   ---------------

as only one calculation will need to be done to address both
A(X) and A(X+1), the constant, suitably scaled, being added
into the displacement field of the appropriate instruction
(PDP11):

   SUB #2,X
   MOV X,R1
   ADD R1,R1
   ADD A,R1
   MOV P,2(R1)
   MOV Q,(R1)

3.1.3.4 Factoring

Factoring is the generalisation of merging and involves the
removal of common sections of code. Included under this
heading is the elimination of common sub-expressions. At
the source level this can be seen in changes such as:

   -----------------------------
   D = SIN(X^2) + COS(X^2)
   -----------------------------

being replaced by

   ----------------------------
   real T
   T = X^2
   D = SIN(T) + COS(T)
   ----------------------------

At the machine level the optimisation is often available as
the result of address arithmetic in the case of simple
arrays:

   A(J) = B(J)

     Original (PE3200)       Optimised
   -------------------     -------------------
   L    1,J                 L    1,J
   SLLS 1,2                 SLLS 1,2
   AR   1,LNB               AR   1,LNB
   L    3,J                 L    0,B(1)
   SLLS 3,2                 ST   0,A(1)
   AR   3,LNB
   L    0,B(3)
   ST   0,A(1)
   -------------------     -------------------

In this case, as the code-generator is in complete control,
the optimisation can be very simple, although rather
specific. The techniques for handling common
sub-expressions have been investigated at length by several
authors, but measurements indicate that in most programs
expressions are so trivial that the expense in finding
common sub-expressions is not repaid by the resulting
improvement in the generated code [Knuth, 1971].
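
The kind of source-level factoring shown for
SIN(X^2) + COS(X^2) can be sketched very simply: the
expression tree is scanned for repeated sub-trees, and each
repeat is replaced by a reference to a compiler-created
temporary. The Python fragment below is an illustrative
sketch of that idea only; the representation of expressions
as nested tuples is an assumption made for the example, not
the form used by I-code.

   # Illustrative sketch of source-level common sub-expression
   # factoring.  Expressions are nested tuples, invented for the
   # example, e.g. ("+", ("SIN", ("^", "X", 2)),
   #                     ("COS", ("^", "X", 2))).

   def factor(expr):
       counts = {}

       def count(e):
           if isinstance(e, tuple):
               counts[e] = counts.get(e, 0) + 1
               for part in e[1:]:
                   count(part)

       count(expr)
       temps = {}                   # sub-expression -> temporary name

       def rewrite(e):
           if not isinstance(e, tuple):
               return e
           if counts[e] > 1:        # seen more than once: use a temporary
               if e not in temps:
                   temps[e] = "T%d" % (len(temps) + 1)
               return temps[e]
           return (e[0],) + tuple(rewrite(part) for part in e[1:])

       new_expr = rewrite(expr)
       assignments = [(name, sub) for sub, name in temps.items()]
       return assignments, new_expr

   # factor(("+", ("SIN", ("^", "X", 2)), ("COS", ("^", "X", 2))))
   # gives ([("T1", ("^", "X", 2))],
   #        ("+", ("SIN", "T1"), ("COS", "T1")))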
The more general form of factoring can be seen in the
transformation of the following statements from:

   ---------------------------------------
   if X = 0 then A = 3 else B = 2
   if X = 0 then C = 3 else D = 4
   ---------------------------------------

into:

   ---------------------------------------
   if X = 0 start
     A = 3
     C = 3
   else
     B = 2
     D = 4
   finish
   ---------------------------------------

3.1.3.5 Loop optimisations

3.1.3.5.1 Iteration

Iteration is the process whereby the values in variables
from previous iterations of a loop are used to calculate the
new values for the current iteration, rather than
calculating those values from scratch each time. One of the
effects of this optimisation can be the reduction in
strength of operations, such as changing multiplications
into additions. In this context the IMP77 operators "++"
and "--" may be used to great effect. Their action is to
adjust the reference on the left by the number of items to
the right; hence if X is an integer then X++1 is the next
integer and X--2 is the integer two integers before X.

   -------------------------------
   for J = 1,1,1000 cycle
     A(J) = J
   repeat
   -------------------------------

can be optimised to:

   -------------------------------
   integername T
   T == A(J)--1
   for J = 1,1,1000 cycle
     T == T++1
     T = J
   repeat
   -------------------------------

3.1.3.5.2 Holding

Holding is the process of preloading values used in a loop
into registers or other such temporaries, using those
temporaries within the loop and finally storing the values
back into the required variables at the end of the loop, if
necessary. In the previous example the value in T, the
current address of the array element being considered, could
be loaded into a register before the start of the loop. In
this case, as T is a temporary created by another
optimisation, the final value in the register need not be
stored once the loop terminates.
The application of most other optimisations will, at worst,
have little or no effect on any particular program; however,
the danger of holding is that it assumes that the values
loaded outside the loop will be required within the loop,
and this assumption could well be invalid. For example,
consider the following equivalent programs:

            A                           B
   -------------------       ---------------------
                              TEMP = P//Q
   while X > 0 cycle          while X > 0 cycle
     W(X) = P//Q                W(X) = TEMP
     X = X-1                    X = X-1
   repeat                     repeat
   -------------------       ---------------------

B will be faster than A if the loop is executed at least
twice. If the loop is not executed at all (X <= 0) B will
be much slower than A (by an alarming 80 microseconds on the
7/32).

3.1.3.5.3 Removal of invariants

This is the process whereby complex sub-expressions, which
do not change their values as the loop progresses, are
evaluated outside the loop and held in temporaries:

   --------------------------------
   for J = 1, 1, 1000 cycle
     A(J) = LIMIT-MARGIN
   repeat
   --------------------------------

can be optimised to:

   -------------------------------
   TEMP = LIMIT-MARGIN
   for J = 1,1,1000 cycle
     A(J) = TEMP
   repeat
   -------------------------------

It is simply a special case of Holding.

3.1.3.6 Expansion

Expansion is the process of rewriting compact
representations of parts of a program in a more explicit
form, usually resulting in faster execution but at the
expense of more code. The two main uses of expansion are to
reduce the overheads in loop control by repeating
(unrolling) the loop body and hence reducing the number of
iterations, and to replace calls on procedures by the body
of the procedure, with the necessary substitution for
parameters.
Extra gains can come from the interaction of the expanded
code with the enclosing code, as in the following example:

   -----------------------------
   for J = 1,1,100 cycle
     A(J) = 0
   repeat
   -----------------------------

This can be expanded into:

   -------------------------------
   for J = 2, 2, 100 cycle
     A(J-1) = 0
     A(J) = 0
   repeat
   -------------------------------

and can generate the following code (PDP11):

   ---------------------
      CLR J
   $1:ADD #2,J
      MOV J,R1
      ADD R1,R1
      ADD LNB,R1
      CLR A-2(R1)
      CLR A(R1)
      CMP J,#100.
      BNE $1
   ---------------------

3.1.4 Source optimisations

Source optimisations [Schneck, 1973] are those optimisations
which can be effected by changes in the source program.
They can be sub-divided into three categories:
machine-independent [Hecht, 1973; Kildall, 1973],
machine-dependent, and tentative. Tentative optimisations
are those which, while unlikely to make the code worse, may
improve it on some machines. For example, most machines
will handle the comparison "X<1" better if it is rewritten
as "X<=0", where X is an integer variable.

3.2 Combination of optimisations

Many of the optimisations described above result in an
improvement in the generated code not only by their own
effects but also by their interaction with other
optimisations, as one improvement often produces the
conditions needed for another. As an example consider the
compilation of the following, rather unlikely, statements on
the Data General NOVA:

   ---------------------
   A = (B&C)<<1
   A = D if A = 0
   ---------------------

The first statement can generate the obvious code:

   ---------------
   LDA 0,B
   LDA 1,C
   AND 0,1
   MOVZL 1,1
   STA 1,A
   ---------------

At this stage the value in accumulator 1 (A) can be
remembered, and the STA instruction marked as "pending" so
that it can be removed later if it is decided that deferring
the store will improve the code. With this knowledge the
second statement can be compiled to:

   ---------------
   MOV# 1,1,SZR
   JMP $1
   LDA 1,D
   STA 1,A
   $1:
   ---------------

Immediately before the label $1 it is known that once again
the value of A is in accumulator 1, and that the STA above
the label is marked "pending" as before. Following the
definition of the label, the environment before the jump to
that label can be combined with the environment just before
the label, to give the new environment following the label.
The information in this environment is that A is in
accumulator 1 and that the same store is pending from both
old environments. This allows the two marked store
instructions to be removed and one store placed after the
label (and once again marked as being "pending"). This
gives the following code:

   ---------------------
   LDA 0,B
   LDA 1,C
   AND 0,1
   MOVZL 1,1
   MOV# 1,1,SZR
   JMP $1
   LDA 1,D
   $1:STA 1,A
   ---------------------

A simple jump optimisation notices that the JMP passes over
just one instruction and can therefore be removed by
inverting the skip condition on the previous MOVe, giving:

   ---------------------
   LDA 0,B
   LDA 1,C
   AND 0,1
   MOVZL 1,1
   MOV# 1,1,SNR
   LDA 1,D
   STA 1,A
   ---------------------

Finally, peephole optimisation combines the AND with the
MOVZL giving ANDZL, and then combines this with the
following MOV# to give the complete code sequence as:

   ------------------
   LDA 0,B
   LDA 1,C
   ANDZL 0,1,SNR
   LDA 1,D
   STA 1,A
   ------------------

The most interesting thing to notice about this particular
sequence of optimisations is that, with the possible
exception of the removal of the marked STA instructions, the
final code can be generated very simply with local
optimisations.
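
The jump optimisation used in that walk-through (removing a
JMP that skips a single instruction by inverting the skip
condition of the preceding instruction) is typical of how
simply these local improvements can be expressed. The
Python fragment below is an illustrative sketch only,
operating on instructions held as text; the helper name and
the textual representation are assumptions for the example,
not the form used in the production code generators.

   # Illustrative sketch of the jump optimisation used above: a JMP
   # that passes over exactly one instruction is removed by inverting
   # the skip condition of the preceding instruction.  Instructions
   # are held as text purely for the example.

   INVERT = {",SZR": ",SNR", ",SNR": ",SZR"}

   def remove_short_jumps(code):
       out = []
       i = 0
       while i < len(code):
           if (i + 3 < len(code)
                   and code[i][-4:] in INVERT                  # a skipping instruction
                   and code[i + 1].startswith("JMP ")
                   and code[i + 3].startswith(code[i + 1][4:] + ":")):
               label = code[i + 1][4:]
               out.append(code[i][:-4] + INVERT[code[i][-4:]]) # invert the skip
               out.append(code[i + 2])                         # the one skipped instruction
               out.append(code[i + 3][len(label) + 1:])        # drop the label
               i += 4
           else:
               out.append(code[i])
               i += 1
       return out

   before = ["LDA 0,B", "LDA 1,C", "AND 0,1", "MOVZL 1,1",
             "MOV# 1,1,SZR", "JMP $1", "LDA 1,D", "$1:STA 1,A"]
   # remove_short_jumps(before) gives the seven-instruction sequence
   # shown above, with MOV# 1,1,SNR in place of the skip-and-jump pair.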
            4 The design of the compiler

This section describes the features of the compiler which
have had an influence on the form of the intermediate code.

4.1 General structure

One of the aims of this type of compilation strategy is to
simplify the production of compilers, and a successful
technique for simplifying programs is to divide them into
several communicating modules, each largely independent of
the others but with well-defined interfaces between them.
At the highest level, a compiler can be split up into three
major parts:

1 A language processor, which deals with the
  language-dependent parts such as parsing, semantic
  checking, and error reporting.

2 A code generator, which takes the decomposed form of the
  program as generated by 1 above, and constructs the
  appropriate code sequences to perform the required
  functions.

3 An object-file generator, which builds an object-file
  from the code sequences produced by 2, in the form
  required by the system which is to execute the program.

Commonly, the first two parts of this scheme are combined
into one program which generates as its output an
assembly-language source file corresponding to the original
program. The third part then becomes the standard system
assembler. This approach clearly simplifies the production
of the compiler, as one part, the assembler, is provided
already, and can ease the problems of checking the compiler
because the code it generates is presented in a well-known
form. Despite these advantages such a scheme was rejected
for the following reasons:

1 In order that assembly language can be generated, the
  compiler must have an internal form of the instructions,
  which is changed into text, processed by the assembler,
  and finally converted into the machine representation.
  These transformations can be eliminated if the compiler
  works directly with the machine representations.

2 In general, the system-provided assembler will be
  expecting to receive a much more powerful language than
  the rather stereotyped text produced by compilers. This
  will certainly degrade the performance of the assembler.
  A solution to this is to produce a cut-down version of
  the assembler which only recognises those constructs
  generated by the compiler. However, producing a new
  assembler removes one of the reasons for choosing this
  route, namely, not requiring extra work in writing the
  object-file generator.

3 As will be seen later (section 4.7), even after the code
  sequences have been produced there remain several
  optimisations which can be performed using knowledge
  gained during the production of those sequences, for
  example, generating short forms of jump instructions when
  the distance between the jump and its destination is
  small enough. While in certain cases these optimisations
  can be performed by a standard assembler it is unlikely
  that the structure of the code-generator would be as
  simple as if a special-purpose object-file generator were
  available.

The main interface in such a system is clearly that between
the language and machine dependencies, as most languages are
largely machine-independent. It is this interface between
the language-dependent and machine-dependent parts of the
compiler which is termed the INTERMEDIATE CODE.
In the following discussion it is assumed that the reader
has a reasonable understanding of the structure of the final
form of I-code, a definition of which may be found in
Appendix A2.

4.2 The intermediate code

Even while remaining independent of machine architecture,
codes can be designed at various levels of abstraction.
Roughly, the higher the level of the intermediate-code the
closer it is to the source language, and the lower the level
the closer it is to some (possibly hypothetical) processor's
instruction set. The choice as to the level of the
intermediate-code eventually comes down to a question of
where decisions are to be taken. If a low-level code is
chosen, more decisions will have to be made in the
language-dependent phase (making it more complicated) but
fewer choices will be left to the code-generator (making it
simpler, but removing chances for improving the code in the
light of particular machine features). If a high-level code
is chosen, decisions are left to the code-generator,
resulting in a simpler language processor but a more
complicated code-generator which is better able to adapt to
a particular processor.
The design of the intermediate code can also be influenced
by its intended role in the complete compiling system. If
the code is to be used in the compilation of just one
language on many machines, there may be an advantage in
increasing the complexity of the code if it results in
simpler code generators at the expense of a more
complicated, but unique, first phase. Conversely, if the
code is to be generated by several different language
processors, a simple intermediate code which is easy to
produce may well be more attractive.
As I-code was intended for optimisation, a high-level code
was chosen. In addition, as it was hoped that the code
could eventually be used in different language processors,
it was decided to keep the structure of I-code as simple as
possible.
The complete compilation process may be thought of as a
sequence of transformations working from the source program
to the final object program via a number of intermediate
representations. As the transformations are applied, the
representations become less dependent on the source language
and more dependent on the target machine. In order to
simplify the code-generator as much as possible the
intermediate code must lie as far from the source language
as is possible without straying from the objectives set out
below.

4.2.1 Objectives

One of the dangers in designing an intermediate code is that
of building into it old techniques and standard expansions
of source constructions, which while they may be tried and
tested cannot in any way be said to be "the only solutions"
or even "the best solutions". One of the intentions behind
the design of I-code was to permit the use of varied
implementation strategies. In the same way that the only
practical definition of a "good" programming language is
that it fits the style of the particular programmer using
it, so the measure of the power of an intermediate code must
include the ease with which it can adapt to an existing
style of code-generator writing.
Inevitably, practical constraints prevent total generality:
the most general form of a program is a canonical form of
itself, but this is little help in compiling it. It follows
that the intermediate code, while remaining true to the
original program and distant from "real" machines, must
provide enough simplification to make the task of
code-generation as easy as possible without inhibiting
optimisation. From the start it was appreciated that an
intermediate code suitable for use in optimising compilers
would necessarily require more processing than a code such
as OCODE which was aimed at a quick implementation.
The original hope was that although each machine-dependent
code generator would not be small, typically about 3000-4000
IMP statements, large portions of one could be taken as a
basis for a new implementation. This has proved to be the
case, and provision of an existing code-generator as a
template greatly simplifies the task of creating a new one
(section 6.4.1).

4.2.1.1 Scope

The first and most fundamental objective in the design of
I-code was that it should support the compilation of one
specific language, IMP-77, on many different machines.
Considerations of using the code to implement other
languages were secondary to this main aim, but were used to
bias the design when a choice had to be made from several
otherwise equally suitable possibilities. In retrospect, a
few areas of the code could have been made more general
without significant overheads in the code generators, mainly
in the area of data descriptor definitions, but a detailed
discussion of one intermediate code supporting several
languages is beyond the scope of this work.
In direct contrast to many intermediate codes, I-code was
not designed with the intention of making it convenient to
interpret; the prime aim was to permit compilation into
efficient machine-code. Nevertheless it is possible to
"compile" I-code into threaded code [Bell, 1973] or a form
suitable for interpretation, either by generating a
conventional interpretive code or by leaving the code in
more-or-less its current form but with labels resolved and
descriptors expanded into a more convenient representation.

4.2.1.2 Information preservation

As the translation of the source program into
intermediate-code is to be machine-independent it will not
be possible to know before code generation what details of
the program will be of interest to the code-generator. It
follows that any loss of information caused by the
translation is likely to reduce the scope for optimisation.
In addition, not only must the information present in the
source be available at the intermediate-code level, but also
it must be presented in a form in which it can be recognised
easily and used. For example, the following two program
fragments are semantically identical:

              A                          B
   ----------------------------------------------------
                                    P = 0
                                    cycle
   TEST for P = 1, 1, 10              P = P+1
                                      TEST
                                    repeat until P = 10
   ----------------------------------------------------

However, in "B" the information that the fragment contains a
simple for construction, while not completely lost, has been
scattered through the code, and this dilution of information
will increase the complexity of any code-generator wishing
to handle for loops specially.
To leave open all avenues for optimisation it is necessary,
therefore, that all of the semantic information in the
source program is preserved in a compact form in the I-code.
One sure way of achieving this property is to design the
code in such a way as to allow the regeneration of the
source program, or at least a canonical form of it which is
not significantly different from the original. In this
context insignificant differences are the removal of
comments and the standardisation of variant forms of
statements, such as:

   NEWLINE if COUNT = 0
and:
   if COUNT = 0 then NEWLINE

4.2.1.3 Target machine independence

Most existing intermediate codes are built around a model of
a machine which will perform the required computation, and
it is this machine which must be mapped onto the actual
target computer.
In order to simplify this mapping, certain assumptions are made, resulting in the machine being defined in terms of fixed-sized data objects, a fixed way of addressing them, and a fixed set of operations on them, usually involving some kind of stack. When compiling for machines which are similar to this intermediate code machine there is little problem in obtaining a reasonable match, but when there are major differences it becomes impossible to convert the code into an efficient machine representation. For these reasons it was decided to make I-code independent of actual machine representations: objects would be described once in high-level terms and then all uses would refer to that definition. This immediately removes any assumptions about the sizes of data objects and the ways in which they are addressed, other than those assumptions built in to the source language. One of the main difficulties with existing codes has been their insistence on the store containing a linear array of equally-sized objects, the difference between one object and the next being one address unit. When mapping such a structure onto real machines with (say) byte addressed stores, problems arise with arithmetic involving addresses as the codes frequently pun on addresses and integer values. Several later versions of such codes have attempted to solve these problems by parameterising the intermediate-code generator so that the characteristics of the target machine may be used to modify the code which is produced. However, they still have built in to them assumptions about how the objects can be addressed. There are so many constraints which can be imposed on the code to be generated, such as operating system requirements and conventions for communicating with the run-time environment, that a parameterised first phase could not be expected to generate code which was well-suited to every installation. The authors of JANUS [Coleman, 1974] write that they believe that the approach of using a parameterised intermediate code "... is a dead end, and that the adaptability must come in the translation from the intermediate language to machine code".

4.2.1.4 Simplification

For the complexity of the machine-dependent phases of compilation to be kept as low as possible, the machine-independent phase must do as much work as possible while keeping within the constraints imposed by the previous objectives. One way of simplifying the intermediate code is for certain high-level constructions to be expanded into lower-level constructions, but only when there is just one expansion possible under the rules of the language, and that expansion does not scatter information which may be of later use. The most obvious case of such expansion is in dealing with complex conditional clauses such as:

   -----------------------------------------------------
   if (A=B and C#D) or (E<F and G>H) then X else Y
   -----------------------------------------------------

IMP-77 specifies that the condition will only be evaluated as far as is necessary to determine the inevitable truth or falsity of the condition, and so, bearing in mind the modifications to be discussed in section 4.6.1, the statement can be represented more simply as:

   ------------------------------
       if A # B then ->L1
       if C # D then ->L2
   L1: if E >= F then ->L3
       if G <= H then ->L3
   L2: X
       ->L4
   L3: Y
   L4:
   ------------------------------

This expansion is tricky and notoriously error prone, and therefore is best done once and for all in the common phase.
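For illustration only, the following sketch (written in C, with invented node and label names; it is not the IMP77 first phase itself) performs the same kind of short-circuit expansion for a condition tree of and/or/comparison nodes, producing one compare-and-branch line per comparison:

   ----------------------------------------------------
   /* Sketch of short-circuit expansion of a condition tree into
      compare-and-branch form.  Names and layout are illustrative only. */
   #include <stdio.h>

   typedef enum { AND, OR, CMP } Kind;

   typedef struct Node {
       Kind kind;
       struct Node *left, *right;     /* for AND / OR                       */
       const char *a, *rel, *b;       /* for CMP: operand, relation, operand */
   } Node;

   static int next_label = 1;
   static int new_label(void) { return next_label++; }

   static void jump_if_false(Node *n, int target);

   /* Generate code which jumps to 'target' when 'n' is TRUE. */
   static void jump_if_true(Node *n, int target)
   {
       switch (n->kind) {
       case CMP:
           printf("    if %s %s %s then ->L%d\n", n->a, n->rel, n->b, target);
           break;
       case OR:                           /* either side true: jump */
           jump_if_true(n->left, target);
           jump_if_true(n->right, target);
           break;
       case AND: {                        /* left false: whole thing is false */
           int skip = new_label();
           jump_if_false(n->left, skip);
           jump_if_true(n->right, target);
           printf("L%d:\n", skip);
           break;
       }
       }
   }

   /* Generate code which jumps to 'target' when 'n' is FALSE. */
   static void jump_if_false(Node *n, int target)
   {
       switch (n->kind) {
       case CMP: {
           /* invert the relation: only the relations used below are handled */
           const char *inv = n->rel[0] == '=' ? "#"  :
                             n->rel[0] == '#' ? "="  :
                             n->rel[0] == '<' ? ">=" : "<=";
           printf("    if %s %s %s then ->L%d\n", n->a, inv, n->b, target);
           break;
       }
       case AND:                          /* either side false: jump */
           jump_if_false(n->left, target);
           jump_if_false(n->right, target);
           break;
       case OR: {                         /* left true: whole thing is true */
           int skip = new_label();
           jump_if_true(n->left, skip);
           jump_if_false(n->right, target);
           printf("L%d:\n", skip);
           break;
       }
       }
   }

   int main(void)
   {
       /* (A=B and C#D) or (E<F and G>H) */
       Node c1 = { CMP, 0, 0, "A", "=", "B" }, c2 = { CMP, 0, 0, "C", "#", "D" };
       Node c3 = { CMP, 0, 0, "E", "<", "F" }, c4 = { CMP, 0, 0, "G", ">", "H" };
       Node l  = { AND, &c1, &c2 }, r = { AND, &c3, &c4 };
       Node cond = { OR, &l, &r };

       int Lelse = new_label(), Lend = new_label();
       jump_if_false(&cond, Lelse);           /* fall through into the THEN part */
       printf("    X\n    ->L%d\nL%d:\n    Y\nL%d:\n", Lend, Lelse, Lend);
       return 0;
   }
   ----------------------------------------------------

The labels chosen differ from those in the hand expansion above, but the shape of the generated sequence is the same, which is the point: once the expansion is written and tested it need never be repeated in a code-generator.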
Similarly it is possible to expand all simple control structures into their equivalent labels and jumps, providing that the structural information is not lost thereby.

4.2.1.5 Decision binding

In any program there will be various options open to a code generator and at some stage in the compilation decisions must be made as to the particular code sequences to be generated. Inevitably these decisions will influence the code which is produced subsequently. On the PDP11, for example, there are two obvious ways of assigning the value in X to the variable Y: either MOVe the value in directly, or move the value into a register first and then assign the register. If the latter way is chosen the value of X will be available in the register for subsequent use, although the former way is better if the value is not required in the near future. In order to make use of information which may well be presented later, it is necessary to be able to defer taking irrevocable decisions until the last possible moment. The structure of I-code permits this delaying in the binding of decisions as it only specifies what needs to be done in abstract terms (using descriptors of arbitrary structure and complexity), and does not give instructions as to how particular results are to be achieved.

4.2.1.6 Ease of use

Of prime importance in the design of the code is the ease with which it may be used to generate good object code. Obviously a high-level code will by its nature be more difficult to handle than a low-level code, but this need not be serious if the code is consistent and results in a convenient expression of the original source. In particular the code should be designed to permit extensive checking to be performed during the compilation process to catch errors in both the intermediate code and the machine-code generator before those errors are passed on to the users. Low-level codes are at a serious disadvantage in this respect as they have lost much of the redundancy present in the source.

4.3 Code layout and addressing

4.3.1 Nested procedure definitions

A common feature of programming languages is the ability to nest the definition of a procedure within another procedure. In addition, several languages imply the definition of procedures within single statements, as in the case of name parameters in ALGOL-60, where the parameter which is actually passed can be a reference to a "thunk", a procedure to evaluate the parameter. With such nesting, provision must be made for preventing the flow of execution from "falling through" into the procedure from the preceding statements, and this is usually accomplished by planting at the start of the procedure a jump to the statement following the end. While this is simple to implement it does introduce extra instructions which are not strictly necessary. With user-defined procedures the overhead can be minimised when a number of procedures is defined, as one jump instruction can be used to skip them all. Unfortunately thunks will be generated throughout the code in a more-or-less random way, giving little opportunity to coalesce the jumps. Even if the extra execution time caused by these jumps is insignificant (the jumps round thunks defined in loops get executed repeatedly), the code which they are skipping stretches the code in which they are nested.
On machines with fixed-size jump instructions which can cover the whole machine, such as the DEC PDP10, the stretching causes no problems, but if the addressing is limited, or if several different sizes of jump instruction are provided, the presence of the nested procedure can result in more code being produced later in the generation of large jumps. 4.3.2 Paged machines On paged machines the overall performance of a program does not depend solely on the efficiency of the code produced by the compiler but includes a factor depending on the locality of references made by the executing program. Traditionally this locality has been improved by monitoring the execution of the program and then re-ordering parts of it in the light of the measurements. Unfortunately not all operating systems provide the user with convenient tools to enable the measurement to be done, leaving only ad hoc methods or intuition for guidance. Without careful control it is all too easy to move one procedure to improve references to it and thereby cause another piece of code to cross page boundaries and counteract any gains in paging performance. Even if the user can obtain the necessary information, a slight change in the program can invalidate the changes. Notwithstanding these problems, it is evident that by careful structuring of a program significant gains in paging behaviour can be obtained and so this option should not be pre-empted by the intermediate-code (as does Z-CODE which automatically reorders the definitions of procedures). The possibility of automatic improvement of paging behaviour was investigated by Pavelin who showed that the paging characteristics of a program can be improved by an automatic reordering of the code [Pavelin, 1970]. Pavelin's thesis describes the breaking-up of a program into "chunks", defined by branches and the destinations of branches. At each chunk boundary, extra instructions are planted to cause the updating of a "similarity array" which records the dynamic characteristics of the program. After several runs the similarity arrays are merged and the result is used to specify a reordering of the chunks which should improve the paging performance. In test cases the working-set size of the code was reduced by as much as 40%. The thesis also went on to say that the various compilation problems associated with this "... can be alleviated by operating on an intermediate code which is machine independent with symbolic code addresses". 4.3.3 Events IMP provides a mechanism for signalling the occurrence of synchronous "events" during the execution of a program. These events are either generated automatically as the result of a program error, or are signalled explicitly by the program. The signalling of the event causes control to be passed back through the dynamic chain of currently active blocks until one is found which has specified a trap for the particular event which has occurred. Execution then continues from a point in that block determined by the trap. In order for this to be implemented it is necessary that the signal routine be able to "unwind" the stack and recover the environment of the block containing the trap. If the entry and exit sequences of all blocks are identical, as, for example, in the standard procedure entry mechanism specified for the DEC VAX 11/780, the unwinding is fairly trivial. More commonly, however, the recovery is dependent on factors such as the textual level of the procedure and whether it has been optimised or not. 
In such cases the unwinding can be very expensive or even impossible unless extra information is provided. For example, on the INTERDATA 7/16 a procedure at the outermost textual level uses register 15 to access its local stack frame, giving the exit sequence:

   -------------------
   LM   7, 4(15)
   BFCR 0, 8
   -------------------

but a procedure nested within this would use register 14 thus:

   -------------------
   LM   7, 4(14)
   BFCR 0, 8
   -------------------

It follows that the signal routine must be told which base register to use at each stage of the recovery. This can be done either by planting code in the entry and exit sequences of each procedure, or by keeping a static table associating procedure start and finish addresses with the appropriate base register. The first method is poor as it imposes a run-time overhead on all procedures, whether they trap events or not. The second method is better but can be complicated if procedures are nested as the start-finish addresses alone no longer uniquely define the procedure. One solution is to cause all procedures which use the same exit sequence to be loaded into distinct areas, and to associate the recovery information for the signal routine with each area. This reduces the static overhead to a few words per area, rather than a few words per procedure.

4.4 Data addressing

One of the most important problems which faces the compiler is the addressing of the various data objects used by the program. As an example of the difficulties which can arise, consider the IMP declarations:

   -------------------------------------
   integer X
   integer array V(0:999)
   integer Y
   -------------------------------------

On a machine such as the INTERDATA 7/16 which uses base+displacement addressing with a 16-bit displacement, the whole of the available storage (64K bytes) can be addressed with a single instruction. In this case the most efficient implementation of the array is as a row of one thousand integers (halfwords) addressed directly via a local name base (LNB):

   LNB {Local Name Base}
    |
    |    a     a+2     a+4            a+2000    a+2002
    v
    .---.------.------.             .--------.---.
  --| X | V(0) | V(1) |  - - - - -  | V(999) | Y | - -
    .---.------.------.             .--------.---.

This implementation has several points in its favour:

i   As the size of the array is known at compile-time, no special code is required to create it at run-time; the necessary storage can be claimed on entry to the block along with that for simple variables, return addresses etc.

ii  Array references with constant subscripts need no address calculations at run-time. For example using V as declared above, the element V(2) is immediately addressable as the halfword with displacement "a+2 + 2*2" from LNB.

iii In certain more general cases when the subscript is a variable, access can be simplified by remembering previous calculations.
For example, the address of the array element V(X) is ------------------------------------- addr(V(0)) + X*size of each element ------------------------------------- In the example above this becomes: ------------------------------------- LNB+a+2 + X*size of each element ------------------------------------- which can be rearranged to: ------------------------------------------- a+2 + (X*size of each element+LNB) ------------------------------------------- Hence the following code could be produced (7/16): -------------------- V(X) = 0 LH 1,X(LNB) pick up X AHR 1,1 *2 (2 bytes per integer) AHR 1,LNB add in LNB SHR 0,0 get zero STH 0,a+2(1) store in V(X) -------------------- Noting that the value now in register 1 (X*size+LNB) only depends on the size of each element, X, and the local name base, it is clear that register 1 can be used to address the X'th element of any integer array of one dimension and constant bounds declared at the current level. Hence if the array W(1:12) were declared immediately after Y in the example above, while register 1 is not changed W(X) can be addressed as a+2002(1). On the other hand, a machine with limited store cover, such as the Data General NOVA which only has an eight-bit displacement, will almost certainly force the array to be implemented as an immediately addressable pointer which is initialised at run-time to the address of storage claimed explicitly. LNB | | v .---.---.---. . - | X | V | Y | - - - .---.---.---. | | | .------.------. .--------. +->| V(0) | V(1) | - - - - | V(999) | - - - .------.------. .--------. With this organisation the address of V(X) will be: ---------------------------- V + X*size of each element ---------------------------- and there is little that can be done by rearranging the expression to improve on the "obvious" code (7/16): ---------------- V(X) = 0 LH 1,X pick up X AHR 1,1 double it AH 1,V add in addr(v(0)) SHR 0,0 STH 0,0(1) ---------------- Not only is this second code sequence longer than the first by two bytes, but it will execute more slowly as the second addition involves a store reference whereas the equivalent instruction in the first sequence uses a register. In both cases, however, some simplification can be done if the subscript is an expression of the form: ------------------------------ X plus or minus CONSTANT ------------------------------ in which case the constant can be removed from the subscript expression evaluation and added into the final displacement. For example (7/16): ------------------------- V(X-7) = 0 LH 1,X pick up X AHR 1,1 double it AHR 1,LNB add in LNB SHR 0,0 get zero STH 0,a+2+(-7)*2(1) ------------------------- Unfortunately even this optimisation may not be available. For example, the ICL 2900 series performs array accesses through a DESCRIPTOR REGISTER, and the extra displacement cannot be added into the instruction. Also some machines, such as the IBM 360, only permit positive displacements in instructions. The examples above pose the following problem: If the intermediate-code is to know nothing of the target machine it cannot know the best way to declare the array, nor the best way to access it. Therefore the code must always produce the same sequences for array declarations and array accesses. It follows that these sequences must remain quite close to the original source and not include any explicit address calculations. As another example, the DEC PDP11 range has a hardware stack which grows with decreasing store addresses. 
Because of this it could be convenient to allocate storage for variables in that order, from large addresses to small addresses. However, in several cases it may be necessary to force objects to be created in order of increasing addresses, such as when program structures are to be mapped onto hardware-defined structures in memory, resulting in an implementation which requires to be able to create similar objects in different ways depending on the context. Finally, some machines provide instructions in which the displacement of the operand is scaled before use, depending on the size of that operand. The GEC 4080 is such a machine, with instructions such as: LDB 1 load byte <1> LD 1 load halfword, bytes <2> & <3> LDW 1 load fullword, bytes <4>,<5>,<6> & <7> When producing code for such machines it is convenient to allocate all the local objects of the same size in particular areas, and then arrange the areas in increasing order of the size of the objects they contain. This permits better use of the available displacement field in the instructions. The solution to these problems which was chosen in I-code was to define a DESCRIPTOR for each object to be manipulated. On input to the code-generator descriptors are converted from their machine-independent form to a new form appropriate to the target machine. As all subsequent reference to the object will be through descriptors the code produced will automatically reflect the decisions made at the time the descriptors were created. As will be discussed in section 4.5, it may be possible to remove the overhead in setting up addressability for local variables and parameters if the parameters can be held in registers and the local variables are never referenced. After examining many procedures which do use local variables it is clear that a large number of them do not need the complete overhead in setting up a local frame base as they could use the workspace pointer (stack pointer) instead. The criterion is that the position of the locals relative to the workspace pointer must be known at compile time. This reduces to the procedure not having any objects with computed sizes (arrays with computed bounds, for example) and no calls on procedures which access those locals as their global variables. Consider the compilation of the following procedure on the PDP11: ---------------------------------------- routine MARK(record(cellfm)name CHAIN) integer N N = 0 while not CHAIN == NULL cycle N = N+1 CHAIN_INDEX = N CHAIN == CHAIN LINK repeat end ---------------------------------------- The code normally produced for this routine would be: MOV LNB,-(SP) remember old LNB MOV DS,-(SP) remember DS MOV R0,(DS)+ save the parameter MOV DS,LNB set up local addressing ADD #20,DS reserve local space CLR 10 (LNB) N = 0 $1: MOV -2(LNB),R1 test CHAIN BEQ $2 branch if NULL INC 10(LNB) N = N+1 MOV 10(LNB),2(R1) CHAIN_INDEX = N MOV (R1),-2(LNB) CHAIN == CHAIN_LINK BR $1 repeat $2: MOV (SP)+,DS restore DS MOV (SP)+,LNB restore LNB RTS PC return However, by using workspace pointer (DS) relative addressing this reduces to: -------------------------- MOV R0,(DS)+ TST (DS)+ reserve local space CLR -2(DS) N = 0 $1: MOV -4(DS),R1 test CHAIN BEQ $2 INC -2(DS) N = N+1 MOV -2(DS),2(R1) CHAIN_INDEX = N MOV (R1),-4(DS) CHAIN == CHAIN_LINK BR $1 $2: SUB #4, DS restore DS RTS PC return -------------------------- This optimisation can be performed quite simply by the third phase of compilation. 
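The mechanism actually used in the compiler is described next; as a rough sketch of the idea, the following C fragment (with invented item names and an invented frame size) shows a third phase rewriting LNB-relative displacements as DS-relative ones once a procedure has been marked as safe:

   ----------------------------------------------------
   /* Sketch only: <type, value> items flow from the second phase to the
      third; LOCAL_DISP values are displacements from LNB, and DS_MOVED
      items record explicit changes to the workspace pointer. */
   #include <stdio.h>

   enum item_type { CODE, LOCAL_DISP, DS_MOVED, END_PROC };
   struct item { enum item_type type; int value; };

   static void third_phase(struct item *items, int n, int safe_procedure, int frame_size)
   {
       int ds = frame_size;                  /* DS - LNB once the locals are claimed */
       for (int i = 0; i < n; i++) {
           switch (items[i].type) {
           case DS_MOVED:                    /* e.g. a parameter stacked with MOV ??,(DS)+ */
               ds += items[i].value;
               break;
           case LOCAL_DISP:
               if (safe_procedure)           /* rewrite as a DS-relative displacement */
                   printf("operand %d(DS)\n", items[i].value - ds);
               else                          /* default: treat exactly like a plain value */
                   printf("operand %d(LNB)\n", items[i].value);
               break;
           case CODE:
               printf("code item %d\n", items[i].value);
               break;
           case END_PROC:
               return;
           }
       }
   }

   int main(void)
   {
       /* N lives at 10(LNB); with a 12-byte frame it becomes -2(DS). */
       struct item proc[] = { { CODE, 0 }, { LOCAL_DISP, 10 }, { END_PROC, 0 } };
       third_phase(proc, 3, 1, 12);
       return 0;
   }
   ----------------------------------------------------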
In the interface between the second and third phases, the code sequences generated by the second phase are made up of items of the form <type, value>, where the type describes where the value is to be put, for example in the code area or in the private data area. To achieve the workspace-pointer-relative addressing, extra types are introduced which specify that the associated value is the displacement of a local variable from LNB. Other codes are needed to be able to modify the operation part of the instruction which uses the displacements but these will be ignored here as they cause no difficulty and would just obscure the discussion. In addition, an extra item is output whenever DS is explicitly altered (as when parameters are stacked using MOV ??,(DS)+). By default the third phase will treat these extra types as being exactly equivalent to the normal types, and will generate the first sequence of code. However, if, when the end of the procedure is processed, the second phase discovers that no dynamic objects or dangerous procedure calls were generated, it marks the end of the procedure accordingly (in the same way as described in section 4.7.2). This mark instructs the third phase to relocate all values with the appropriate type so as to make them relative to DS. The items recording explicit changes to DS are used to keep the third phase's idea of the current position of DS in step with reality.

4.5 Procedure entry and exit

IMP is heavily based on the use of procedures, indeed the only method of communicating with the controlling environment is by means of procedure calls. Also the techniques of structured programming result in the extensive use of procedures. Clearly when writing a compiler for such languages much thought must be given to making procedure entry and exit (and the associated passing of parameters) as efficient as possible.

4.5.1 User-defined procedures

The usual technique for procedure entry and exit is to have standard preludes and postludes which cover all the different types of procedure. For example the EMAS IMP code sequences [Stephens, 1974] are (ICL4/75):

   --------------------------
        STM  4,14,16(WSP)    save the current environment
        BAL  15,PROC         enter the procedure
        .
   PROC ST   15,60(WSP)      save the return address
        LR   LNB,WSP         set up local stack frame
        LA   WSP,***(WSP)    claim local space
        BALR 10,0            set up code addressability
        .
        .
        LM   4,15,16(LNB)    restore calling environment
        BCR  15,15           return
   --------------------------

While this has proved to be convenient to generate and efficient to execute it has one major problem: part of the housekeeping of the procedure entry is performed at the call itself. This seems undesirable for two reasons:

i   Procedures are generally called more often than they are defined. If part of the housekeeping of procedure entry is done at the call that code will be duplicated at each call, thus increasing the size of the program. Putting that code within the procedure reduces the size overhead.

ii  If the knowledge of what housekeeping needs to be done for procedure entry is needed outside the procedure it becomes impossible to alter the entry and exit sequences to suit the actual procedure. In particular, on certain machines it is possible to remove the entry and exit sequences altogether when the procedures are simple enough.

If the 4/75 compiler moved the environment-saving STM instruction into the body of the procedure, the storing of the return address would be performed automatically:

   -------------------------
        BAL 15,PROC
        _ _
   PROC STM 4,15,16(WSP)
        LR  8,WSP
        ..
        ..
------------------------- This not only saves four bytes per call, very important on a machine with a very severely limited immediate addressing range, but also reduces the overhead in entering the procedure by one instruction. A further modification would be to pass one or more of the parameters in the registers, leaving the way open for remembering that fact inside the procedure. Hence a call could be reduced from: ------------------------ L 1,X ST 1,64(WSP) L 2,Y ST 2,68(WSP) BAL 15,PROC PROC(X, Y) _ _ PROC STM 4,15,16(WSP) ------------------------ to: ----------------------- L 0,X L 1,Y BAL 15,PROC PROC STM 4,1,16(WSP) ----------------------- The ability to determine exactly how parameters are to be passed can be of crucial importance in the efficiency of the procedure mechanism. When compiling for the PDP11 the obvious calling sequence for a procedure with two integer value parameters would be: ------------------- MOV X,-(SP) MOV Y,-(SP) JSR PC,PROC ------------------- Unfortunately this produces problems inside the procedure as the return address, stacked by JSR, is too far down the stack to permit the use of the RTS instruction to return, for this would leave on the stack the space used by the parameters. Neither can the stack be adjusted before the return, which would then be made indirectly through a location beyond the stack pointer, as space there must be considered volatile, being used by interrupt handling. Extra instructions are needed either at the call or inside the procedure to adjust the stack; the JSR instruction may well not be "a beauty" as claimed by some implementors [Bron, 1976]. A MARK instruction has been introduced in an attempt to overcome this problem, but it is far from helpful as it imposes an arbitrary register convention and puts all of the overhead on the call rather than on the procedure itself. On the other hand, if all of the parameters can be passed in registers, the JSR will put the return address on a clear stack, permitting the use of RTS for the return. As in practice most procedures have few parameters, usually only one or two, this can give a large saving. 
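As a rough illustration of the choice just discussed, the following C sketch prints the two PDP11 calling sequences side by side; the code generator, the register convention (R0, R1 for the first two parameters) and the caller-side stack adjustment are all assumptions made for the example, not part of any IMP77 compiler:

   ----------------------------------------------------
   /* Sketch: stacked parameters force extra stack adjustment,
      register parameters leave a clean stack for RTS PC. */
   #include <stdio.h>

   static void emit_call(const char *name, const char *args[], int nargs, int use_registers)
   {
       if (use_registers) {
           for (int i = 0; i < nargs; i++)
               printf("    MOV %s,R%d\n", args[i], i);   /* parameters in R0, R1, ... */
           printf("    JSR PC,%s\n", name);              /* return address on a clean stack */
       } else {
           for (int i = 0; i < nargs; i++)
               printf("    MOV %s,-(SP)\n", args[i]);    /* parameters pushed onto the stack */
           printf("    JSR PC,%s\n", name);
           printf("    ADD #%d,SP\n", 2 * nargs);        /* caller discards them afterwards */
       }
   }

   int main(void)
   {
       const char *args[] = { "X", "Y" };
       puts("; parameters on the stack:");   emit_call("PROC", args, 2, 0);
       puts("; parameters in registers:");   emit_call("PROC", args, 2, 1);
       return 0;
   }
   ----------------------------------------------------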
As an example of the power of being able to alter entry and exit sequences, consider a recursive implementation of the IMP routine SPACES: ----------------------------- routine SPACES(integer N) return if N <= 0 SPACES(N-1) SPACE end ----------------------------- On the PDP10 the straightforward coding for this would be: -------------------------- MOVE 0, X pick up X MOVEM 0, 3(SP) assign the parameter PUSHJ SP, SPACES call SPACES _ _ SPACES: MOVEM LNB,1(SP) save old frame base MOVE LNB,SP pick up new frame base ADDI SP,3 reserve stack space SKIPLE 1,2(LNB) load, test & skip if X<=0 JRST LAB1 jump to LAB1 MOVE SP,LNB restore stack pointer MOVE LNB,1(SP) restore old frame base POPJ SP return LAB1: SOJ 1, 0 X-1 -> ACC1 MOVEM 1,3(SP) assign parameter PUSHJ SP,SPACES call SPACES PUSHJ SP,SPACE call SPACE MOVE SP,LNB restore stack pointer MOVE LNB,1(SP) restore old frame base POPJ SP return -------------------------- By applying the optimisations of passing the parameter in an accumulator (called ARG) and remembering that the parameter is in this accumulator on entry to the procedure, the code reduces to: MOVE ARG,X pick up X PUSHJ SP, SPACES call SPACES SPACES: MOVEM LNB, 1(SP) MOVEM ARG, 2(SP) assign the parameter ADDI SP, 3 JUMPG ARG, LAB1 ->LAB1 if ARG > 0 MOVE SP, LNB MOVE LNB, 1(SP) POPJ SP LAB1: SOJ ARG, 0 parameter = ARG-1 PUSHJ SP,SPACES PUSHJ SP,SPACE MOVE SP, LNB MOVE LNB, 1(SP) POPJ SP On inspection it is clear that the local stack frame (pointed at by LNB) is never used within the procedure except by the entry and exit sequences. Hence by reducing those sequences to the absolute minimum, the code becomes: --------------------------- MOVE ARG, X PUSHJ SP, SPACES - - SPACES: JUMPG ARG, LAB1 POPJ SP LAB1: SOJ ARG, 0 PUSHJ SP, SPACES PUSHJ SP, SPACE POPJ SP --------------------------- Finally, an opportunistic optimisation may be performed [Knuth, 1974; Spier, 1976] by noticing that the final two instructions may be combined so that the procedure SPACE uses the return address pushed onto the stack for the return from SPACES. This results in the tightest form of the code: ---------------------------- MOVE ARG, X PUSHJ SP, SPACES SPACES: JUMPG ARG, LAB1 POPJ SP LAB1: SOJ ARG, 0 PUSHJ SP, SPACES JRST SPACE ---------------------------- The final steps in this optimisation can only be performed once the body of the procedure has been compiled. In order that the correct (in this case non-existent) entry sequence can be used, an extra pass over the object code is necessary. This pass can be combined with the process of adjusting labels and jumps which is carried out in the third phase of compilation described in section 4.7. The code generator can mark the position where an extra sequence is required and at the end of the procedure can inform the third phase of any salient features found in the body. The third phase can then decide on the best entry and exit sequences to use. This ability to tailor the "housekeeping" parts of procedures can be used in many circumstances to limit the inclusion of code which is needed to handle rare constructions to those procedures which use the feature. As an example of this consider the ICL 2900 series. The machines of the series are designed around a hardware stack, which resides in one, and only one, segment of the user's virtual memory, and thus limits this data space to 255K bytes. In order to be able to handle programs using very large arrays, space must be available off-stack in another segment or set of consecutive segments. 
The maintenance of this extra data space will require instructions to be executed on entry to and on exit from procedures which claim space from it, but not from those which only use space from the stack. These extra instructions can be added to the procedure in a simple manner by the third phase as it now controls the form of the procedure when all the necessary information is available. For these optimisations to be performed the intermediate code must not lay down rules for procedure entry and exit, rather it should simply mark the points at which suitable code is required. An additional consideration in the design of the I-code for procedure entry and exit is the requirement of some machines for a "pre-call" to be made to prepare a hardware stack for parameters prior to their evaluation and assignment. For example (ICL2900):

   -------------------
   PROC(1, 2, 3)
   _ _
   PRCL 4          pre-call
   LSS  1          load 1
   SLSS 2          stack it and load 2
   SLSS 3          stack it and load 3
   ST   TOS        store it on Top Of Stack
   RALN 8          raise the Local Name Base to point to the new frame
   CALL PROC       enter the procedure
   -------------------

Following these considerations the form of procedure call chosen for I-code was:

   -----------------
   PROC P           stack procedure descriptor
   {stack param}    \
                     :  repeated for each parameter
   ASSPAR           /
   ENTER            enter the procedure
   -----------------

ASSPAR causes the value described on the top of the stack to be assigned to the next parameter, identified by the procedure descriptor second on the stack, using either ASSVAL or ASSREF as appropriate. In order to pass some of the parameters in registers all that need be done is for the initial processing of the descriptors for those parameters to define them as the appropriate registers. PROC can then "claim" those registers, the parameter assignment will load them, and finally ENTER can release them for subsequent re-use on return from the procedure.

4.5.2 External procedures

Most useful languages provide means for compiling files of procedures (and less commonly, data objects) which can be accessed from other modules. Also, systems usually provide extensive libraries of procedures which users of high-level languages will want to access. In general an external procedure is identified by a vector of quantities including at least the entry address and a description of the environment in which the procedure is to execute. Depending on the type of operating system in question, the number of quantities in this vector will change. When the system requires a "store image" which has all the addresses fixed before execution, only the entry address is required, as the code of the procedure can be relocated in order to define its environment. As this method demotes code-sharing to a limited facility (making programs shareable is often a privileged operation), several systems have selected a more flexible scheme whereby executing programs have a writeable "linkage area" into which are placed the entry vectors for procedures. The code of these procedures may now be made read-only and shared, with only the linkage areas being unique to each user. These vectors are filled in with the references to the externals either prior to program execution, or dynamically when the procedure is first called. Finally, it must be noted that the compiler writer will have little or no control over the standards required by external procedures unless they have been generated with the same compiler. In particular the parameter passing mechanisms may be different from those used in the intermediate code.
In order to cope with these and other considerations any intermediate code which permits access to external procedures must be sufficiently flexible to allow the variations to be handled efficiently. 4.5.3 Permanent procedures Most languages define a set of procedures which will be available on any implementation without explicit action by the user (such as the IMP procedures ITOS, REM, READSYMBOL, and READ). Such procedures are termed "permanent procedures". It is common for intermediate codes to provide specific code items to invoke permanent procedures, but this has the problem that the code-generator must know about all such procedures, and the language-dependent phase must be changed and the intermediate-code extended if an implementation wishes to make efficient use of procedures which can be compiled in-line on particular machines. For example many machines provide an instruction for moving blocks of store around and it could be advantageous to have a procedure which invoked this instruction directly. Before investigating ways of improving the implementation of permanent procedures it is useful to examine in some detail the properties of the procedures mentioned above, which were chosen because they typify the main problems in this area. ITOS is a fairly complicated string function which returns as its result the decimal character-string representation of the integer value passed to it as a parameter. Because of its complexity this procedure is almost always best implemented as an external procedure which is linked into the program along with any other external entities required. REM is an integer function which returns the remainder of dividing the first integer parameter by the second, and on many machines can be efficiently compiled in-line, as most integer divide instructions provide both the quotient and the remainder. However, when compiling for machines such as the DATA GENERAL NOVA or the DEC PDP11 when they do not have the optional divide instructions, division has to be performed by a complicated subroutine, suggesting that REM itself should be an external procedure like ITOS. READSYMBOL falls somewhere between the two, mainly because it is defined to have a general name parameter, that is, the parameter may be a reference to any type of entity: integer, real, byteinteger, etc. To implement READSYMBOL as an external procedure it would have to be passed the general name parameter (comprising both the address of the actual parameter and information about its type and precision), and would have to interpret that parameter in order to be able to store the character, suitably converted, in the appropriate way. A much more efficient implementation is to convert the statement: --------------- READSYMBOL(S) --------------- into the equivalent form: ------------------ S = F$READSYMBOL ------------------ where F$READSYMBOL is a function which returns as its result the character value that READSYMBOL would have placed into its parameter. Once this is done, conversions and the choice of store operation can be left to the usual assignment part of the compiler. A further complication can arise if, as in the case of the INTERDATA 7/16 operating system, ISYS [Dewar, 1975], several permanent procedures map directly onto system-provided facilities: the function F$READSYMBOL can be replaced by the supervisor call "SVC 8,0", SELECT INPUT by "SVC 6" etc. The difficulty caused by READ is mainly one of space. 
As READ can input an integer value, a real value, or a string value depending on the type of its (general name type) argument, it is going to be fairly large, especially if the hardware on which it runs does not provide floating-point instructions, forcing those functions to be performed by subroutine. It follows that on small systems it may be convenient to replace calls on READ by calls on smaller procedures, chosen at compile-time by examining the type of the parameter given to READ, which input solely integer, real, or string values. Finally it should be noted that the substitutions and modifications discussed above may only be generated as replacements for direct calls on the procedure; if the procedure is passed as a parameter to another procedure no alterations are possible and a "pure" version must be available. As passing a procedure as a parameter is totally distinct from calling the procedure this case does not prevent the improvements being carried out where possible. It should now be clear that the efficient implementation of permanent procedures will differ greatly from the implementation of user-defined procedures, and will also differ from machine to machine. Hence the intermediate-code must make no assumptions about either which permanent procedures are available or how they are to be implemented. As a side-effect of removing any built-in properties from permanent procedures it becomes possible for a simple code-generator to ignore any possibility of producing special code and compile them all as externals. These transformations of procedures can only be applied when the procedures are invoked (called) directly. In the case of procedures passed as parameters all calls will of necessity be the same and hence either it will not be possible to pass some permanent procedures as parameters, an unfortunate limitation imposed by several languages, or there must be a "pure" form of the procedures available. This latter can be done very simply using I-code. The permanent procedure descriptors are defined exactly as if the procedures were truly external, but with an extra marker showing them to be "permanent". The only time that this marker is used is in the procedure-call generating section of the compiler. If the procedure is being passed as a parameter this section of the compiler is not entered and so the procedure will be passed as an external. All that is now necessary is for there to be an external manifestation available when the program executes. This method has the added advantage that there is no compile-time overhead, especially important considering that passing procedures as parameters is one of the least-used features of IMP77.

4.5.4 Primitive Procedures

It is rare for machines to provide simple instructions which can deal directly with all of the requirements of high-level languages and so several constructions will have to be handled by subroutines. The code generator may then refer to these "primitive procedures" as though they were machine instructions. The cases in which such procedures are required commonly include exponentiation, string manipulation, and array declaration and access. Given these procedures, the code-generator has a choice between calling them as closed subroutines or expanding them in-line. The former produces dense code but will execute more slowly than the latter (and possibly suffer from not knowing what is corrupted by the routine and therefore having to forget everything it knows).
On the other hand while the expansion of primitive procedures in-line will improve the execution speed of the program, it becomes necessary for the code-generator to be able to create the appropriate code sequences and thereby become more bulky. Once again the choice must be left to the code-generator as the benefits of a particular decision will depend on both the target machine and the use to which the compiler is to be put. If the compiler is to be used for large mathematical problems it is likely that the gains made by putting exponentiation in-line will outweigh the disadvantage of the extra code size, whereas in operating-system work, as exponentiation is probably never needed, the extra complexity of the code generator to expand the routine would not be desirable. Given that some of the primitive procedures will be referenced often (checked array access, for example) it is important that entry to them is made as efficient as possible and in this area the ability to reorder code can be used to great effect. In the original Interdata 7/32 IMP77 compiler the primitive routines were gathered together at the end of the user's code, as it was only then that it was known which procedures were required. <- CODE BASE (register 14) USER CODE PRIM PROCS With this scheme programs of 16Kbytes or less can reference the primitive procedures with 32-bit instructions (program-counter relative addressing). Unfortunately once the program grew beyond this limit the larger and slower 48-bit form of the instructions had to be used in order to achieve addressability. In the IMP77 code generator there were 352 such large instructions. In the new compiler the object code is reordered to place the primitive procedures at the head of the user's code where they can be addressed relative to CODE BASE. <- CODE BASE (register 14) PRIM PROCS USER CODE The immediate disadvantage of this is that it will push the user's procedures further away from CODE BASE and hence increase the chances of a user procedure reference requiring a long (48-bit) instruction. However in practice this is not a problem as the total size of the primitive procedures is usually quite small, typically less than 800 bytes on the 7/32. The IMP77 code generator mentioned above now needs no long references at all, saving 724 bytes of code, out of about 40Kbytes. The compression of the code so achieved can be enhanced slightly by bringing the destinations of more jumps into the short-jump range, giving an extra saving of 20 bytes the case above. In addition, now that a register (CODE BASE) is pointing to the first primitive procedure, the list of procedures required can be reordered to place the most frequently referenced one first and thereby reduce references to it to 16-bit instructions (BALR LINK,CODEBASE). When compiling with checks on, by far the most commonly referenced primitive procedure is the routine which checks for the use of an unassigned variable (over 2000 references to it in the code generator), and this trivial optimisation results in a saving of more than 4000 bytes. 4.6 Language-specified and compiler-generated objects During compilation, various objects will be manipulated in order to generate code. Some of these objects have a direct representation in the source program and are referred to as "language-specified" objects, whereas others are created by the compilation process itself and are referred to as "compiler-generated" objects. 
The fact that the compiler-generated objects will be (or can be constrained to be) used in a stereotyped and well-behaved fashion can be used to great advantage to give simple means for optimising parts of the program. 4.6.1 Internal labels Using most intermediate codes the following program parts would translate into effectively identical sequences: ---------------------------------------------------- ->LAB if X = 0 | if X # 0 start Y = 3 | Y = 3 LAB: | finish ---------------------------------------------------- At first glance this is as it should be, for the two program fragments are semantically identical and could therefore be implemented by the same object code, for example on the PERKIN-ELMER 3200: ----------------- L 1,X pick up X and set the condition code BZ $1 branch equal (to zero) LIS 0,3 pick up 3 ST 0,Y store it in Y $1: define label $1 ----------------- However, if it is known that the label $1 will only ever be used once, the code-generator may remember that the current value of the variable X will still be in register 1 following the label, and thus remove the need for it to be loaded again if it is required before register 1 gets altered. In the case of user-defined labels no statement can be made about the number of uses of each label without a complete analysis of the parts of the program where the label is in scope. This suggests that I-code should maintain a clear distinction between user-defined and compiler-generated labels. Also, by making the rule that compiler-generated labels may only be used once, the internal representations of labels may be reused by the code-generator, removing the necessity for large tables of label definitions in this phase of compilation. This now leaves the question of how to represent conditional jumps in the intermediate code. The first observation is that user-specified jumps need never be conditional, as they can always be surrounded by appropriate compiler-generated conditional jumps. This can be used to restrict the processing of conditions and tests to the compiler-generated jumps. The second observation is that in IMP77 conditionals are always associated with the comparison of two values or the testing of an implied boolean variable (predicates and string resolution). There are currently three main ways in which processors handle this: 1 "compare" instructions are used to set flags or condition-codes which represent the relationship between two values (one of which is frequently an implied value of zero). These condition-codes are later used to control the execution of conditional branch instructions. This method is used in the PDP11: COMP, BNE etc. 2 Instructions are provided which compare two values as above but instead of setting condition-codes they skip one or more subsequent instructions depending on a specified relationship. By skipping unconditional branches in this way conditional branch sequences may be generated. This method is used in the PDP10: SKIPE etc. 3 Instructions are provided which compare two values and branch to a specified label if a given relationship holds. This method is used in the PDP10: JUMPNE etc. P-code uses compare instructions to set the boolean value TRUE or FALSE on the stack and then uses this value either as an operand in an expression or to condition a branch (a variant of technique 1 above). Z-code tests the value in a register against zero and branches accordingly (technique 3 above). 
These three techniques have fairly obvious possible representations in I-code:

   if X = Y start

   1) PUSH X
      PUSH Y
      COMP         {set condition code}
      BNE 1        {branch not equal}

   2) PUSH X
      PUSH Y
      SKIPE        {compare and skip if equal}
      GOTO 1

   3) PUSH X
      PUSH Y
      JUMP # 1     {compare and branch if not equal}

All three of these representations have been tried in different versions of I-code. Technique 2) was rejected as it proved cumbersome to implement effectively, especially on machines which did not use skips; either the code-generator had to "look ahead" to be able to locate the destination of the skip (which is dependent on the instruction being skipped) or to check before each instruction whether or not a skip had been processed earlier whose destination had not yet been resolved. Technique 1) was perfect for machines with condition-codes but required look-ahead over subsequent jumps on machines which used skips. Both 1) and 2) had the additional problem that to generate conditional branches, two separate I-code instructions had to be given. In the case of 1) condition-codes are usually altered by many instructions not directly involved in comparison and hence the compare and its associated branch must be made adjacent. With 2) there is the possibility of generating meaningless constructions such as skipping a line-number definition instruction. These difficulties add complexity to the definition of the intermediate code and require extra checks in the code generator. Thus the third form was chosen as the most convenient, even though all three forms can be suitably defined to be totally equivalent. In particular the third technique provides all the relevant information to the code-generator in one instruction, and has proved to be simple and effective as a basis for generating code for both condition-code and skip sequences. Using these ideas the following is the expansion of the statements given at the start of section 4.6.1.

   ------------------          ----------------
   PUSH X                      PUSH X
   PUSHI 0                     PUSHI 0
   COMP # 1                    COMP = 1
   JUMP LAB
   LOCATE 1
   PUSH Y                      PUSH Y
   PUSHI 3                     PUSHI 3
   ASSVAL                      ASSVAL
   LABEL LAB                   LOCATE 1
   ------------------          ----------------

4.6.2 Temporary objects

During the compilation of high-level languages it often becomes necessary to create temporary objects which are not present in the source program. The most common need for temporaries is in the evaluation of expressions. Regardless of the number of accumulators or registers available it is always possible to construct an expression which will require one more. To obtain this register, a register currently in use must be selected and the value currently in it must be saved in a temporary location. One apparent exception to this is a machine in which expressions are evaluated using a stack (e.g. ICL 2900) but in this case the operands are always in temporaries. Temporary variables may also be required to implement certain high-level constructions, such as the IMP for statement:

   -----------------------
   for V = A, B, C cycle
   -----------------------

which is defined so that the initial values of B and C, and the initial address of the control variable, V, are to be used to control the loop regardless of any assignments to V, B and C. While it is possible for a machine-independent optimiser to discover whether these variables are modified in the loop or not, in the simple case where little optimisation is required the code generator must use temporaries.
In the case of expression evaluation, however, the machine independent phase cannot know how many temporaries will be required. Even giving the first phase knowledge of the number of registers available is not adequate for several reasons. Firstly, the use of registers is commonly tied to the operations being performed, as in the case of integer multiplication on several machines which requires a pair of registers. For a machine-independent first phase to be able to cope with this sort of limitation would require great flexibility of parameterisation. Secondly, the first phase would have to be given details of the problems encountered in statements such as: ---------------------------- LEFT = REM(A,5) + REM(B,7) ---------------------------- On a PDP11 equipped with the EIS option, a divide instruction is available which provides both the quotient and the remainder. Hence the statement could be compiled into: ----------------- MOV A,R1 SXT R0 propagate the sign of A DIV R0,#5 remainder to R1 MOV B,R3 SXT R2 DIV R2,#7 remainder to R3 ADD R2,R0 MOV R0,LEFT ----------------- In this case no temporary store locations are required. However, if the EIS option is not present, no DIV instruction is available and so a subroutine must be used instead. The code becomes: ----------------- MOV A,R1 MOV #5,R2 JSR PC,DIV result back in R1 MOV R1,T1 preserve remainder MOV B,R1 MOV #7,R2 JSR PC,DIV result in R1 ADD T1,R1 MOV R1,LEFT ----------------- As the subroutine REM uses R1 (for one of its arguments and to return its result) the result of the first call on REM must be saved in a temporary, T1. Of course, the function REM could be written so as to preserve the value in, say, R2 and this could be used instead of T1, but this would increase the cost of REM when it is likely that the value in R2 will not be of use as most expressions are trivial [Knuth, 1971]. Unless the machine-independent phase is given intimate knowledge of the target machine (something of a contradiction) it cannot know how many temporaries to use nor when to use them. The solution adopted by most intermediate codes is to base the code around a stack, thus providing an unlimited number of temporaries which are handled automatically. While this in itself does not hinder the compilation for a machine without a hardware stack, as the code-generator can always simulate the stack internally, its presence invariably results in other parts of the code using it, for example to pass parameters to procedures where the receiving procedure contains built-in knowledge of the layout of the stack. As a stack does not require the explicit mention of temporaries it has been adopted by I-code, but purely as a descriptive mechanism. Because I-code does not specify the computation but the compilation process needed to produce a program which will perform the computation, this internal stack need have no existence when the final program executes. The implementors of SIMPL-T describe an intermediate code with some properties similar to I-code, but based on "quadruples" of operators and operands rather than an internal stack [Basili, 1975]. The stack approach was rejected by them because "quads allow more flexibility in the design of the code generator since, for example, no stack is required". The exact meaning of this is not clear but it suggests the misconception that a stack-based intermediate code forces a stack-based object code representation. 
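To illustrate that a stack-based intermediate code need not force a stack-based object representation, the following C sketch drives a toy code generator from stack-style items while emitting two-address register code; the only stack involved is the compile-time stack of descriptors. All names are invented and ASSVAL is simplified here to take its destination directly:

   ----------------------------------------------------
   /* Sketch: the "stack" is purely compile-time bookkeeping. */
   #include <stdio.h>
   #include <string.h>

   struct desc { char text[16]; int in_reg; };   /* an operand: a name or a register */

   static struct desc stack[16];
   static int sp = 0, next_reg = 1;

   static void push(const char *name)            /* PUSH / PUSHI */
   {
       strcpy(stack[sp].text, name);
       stack[sp].in_reg = 0;
       sp++;
   }

   static void add(void)                         /* ADD: top two descriptors become one */
   {
       struct desc b = stack[--sp], a = stack[--sp];
       if (!a.in_reg) {                          /* load the left operand if necessary */
           printf("    MOV %s,R%d\n", a.text, next_reg);
           sprintf(a.text, "R%d", next_reg++);
           a.in_reg = 1;
       }
       printf("    ADD %s,%s\n", b.text, a.text);
       stack[sp++] = a;
   }

   static void assval(const char *dest)          /* simplified ASSVAL: store top of stack */
   {
       struct desc v = stack[--sp];
       printf("    MOV %s,%s\n", v.text, dest);
   }

   int main(void)
   {
       /* Y = X + 3:   PUSH X; PUSHI 3; ADD; ASSVAL Y */
       push("X"); push("#3"); add(); assval("Y");
       return 0;
   }
   ----------------------------------------------------

The emitted sequence (MOV X,R1; ADD #3,R1; MOV R1,Y) contains no run-time stack at all, even though the input was expressed entirely in stack terms.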
Regardless of the exact structure of the code generator or the input it takes, some form of internal stack is invariably required for operations such as protecting intermediate values in registers which are needed for other purposes, and it seems reasonable to make this stack more explicit if so doing will simplify the intermediate code and its processing.

4.7 Object file generation

Once a program has been compiled into sequences of machine code instructions, there still remains the task of producing an object file in a form suitable for processing by the operating system (if any) under which the program is to be executed. This task was separated from the main part of code generation (the second phase) and has become the third phase of compilation for the following reasons:

i   The particular format required in the final object file will vary on any particular machine depending on the operating system in use. As this is to a large extent independent of the code sequences needed to implement the program, it was thought sensible to keep the processes separate.

ii  Even following the generation of the code by the second phase there remain many opportunities for further optimisation, both global and structural, which require information only available once the complete program has been compiled. Rather than build global analysis into the second phase these optimisations were left to a third phase.

The third phase takes as its input two data streams generated by the second phase. These streams are:

i   The object stream, a sequence of items of the form <type, value> defining the code sequences required in the object file.

ii  The directive stream, a sequence of items defining the logical structure of the object stream, that is a specification of label definitions and label references, and details of various code groupings (blocks, procedures etc.).

The third phase starts by taking in the directive stream and constructing a linkage map describing the whole program. This linkage map is processed and then used to control the generation of the final object file from the object stream. The operations performed using the map are:

4.7.1 Reordering

As discussed previously in section 4.3, there are several gains to be made by having the ability to output instructions in an order different from that in which they were implied by the linear structure of the source program. This reordering is performed on the linkage map in a manner controlled by items in the directive stream. In the simplest case of exbedding procedures (section 4.3.1) this only entails allocating code addresses to the items in the map each time an "end-of-block" control item is input, resulting in the procedures being laid out in "gfl" order. To facilitate evaluating references to the reordered areas, all references in the object stream are made relative to the start of the appropriate area. As this process does not cause the physical moving of the various areas there is an implicit assumption that either the subsequent processing of the object stream can do the reordering (for example by writing its output to specific sections of a direct-access file), or that the object file format can instruct the loader or linker to do the shuffling. With the linkage map available it becomes possible to make a preliminary pass over the object stream performing structural modifications which require knowledge of the generated code and which alter its size and general appearance.
These modifications may be made by passing the object stream through a buffer which is scanned and modified under the control of the linkage table. In this way merging common code sequences and reordering the arms of conditional sequences may be achieved quite simply.

4.7.2 Jumps and Branches

Following the construction of the linkage map, structural optimisations may be performed on jumps. The three optimisations which are currently applied are:

i Use of the smallest instruction

A common feature of machines is that they provide a variety of sizes of jump instruction, depending on the reason for the jump (conditional or unconditional) and the distance to be jumped. e.g.

PDP11:
   BEQ  (2 byte instruction)   conditional jump up to 256 bytes in either direction.
   JMP  (4 byte instruction)   unconditional jump to anywhere.

Perkin-Elmer 3200:
   BFFS, BFBS  (2 byte instructions)  conditional jump forward (F) or backward (B) up to 32 bytes away.
   BFC  (4 byte instruction)   conditional jump to within 16 Kbytes of the current instruction.
   BFC  (6 byte variant)       conditional jump to anywhere.

In typical programs the frequency of occurrence of such jumps is:

           PDP11    PE3200
  --------------------------
  2 byte    88%       28%
  4 byte     8%       71%
  6 byte     2%       <1%

It has been suggested [Brown, 1977] that the problem of deciding which form of jump to use can be eased on certain machines by specifying a "distance" parameter with the intermediate code, e.g. "GOTO LAB,80" informing the code generator that the label LAB is 80 instructions ahead. It is difficult to think of any case in which this could be of any use, as it requires the code generator to be able to predict the amount of target machine-code which will be generated for each intermediate code instruction.

The solution adopted by the IMP compilers has been for the code generator to assume that all jumps are the minimum size, and to let the third phase stretch them where necessary. The Perkin-Elmer CAL assembler [Interdata, 1974] makes the opposite assumption, namely that jumps are long until proven short. This was rejected as the size of one jump is often dependent on another, so that one of them will be short if and only if both of them are short. By assuming them long either they will never be found to be short, or the process will have to examine all the jumps repeatedly, trying each jump in turn to see if it can be "squeezed". Commonly, enabling the "SQUEZ" option in the CAL assembler can double or treble the time to assemble programs. With the assumption that all jumps start short and then grow, all truly short jumps will be found with no possibility of infinite loops, as the process must terminate, in the worst case when all the jumps have been made long. Several methods for achieving this optimisation have been described [Szymanski, 1978; Williams, 1978].

The technique used by the third phase of the IMP77 compilers for stretching jumps is as follows. Once the linkage map has been constructed and addresses provisionally allocated, all labels and references to them are grouped according to the block in which they occurred. This is to take advantage of the fact that most references will be local. A procedure STRETCH is now defined which repeatedly attempts to lengthen each reference within a particular group. If a reference is found which must be stretched, the entry in the linkage map is updated and all subsequent entries are suitably modified to take account of the increased size of the code. The process is repeated until no alterations have been made.
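The essence of this "start short and stretch" process can be sketched as follows (in Python, for illustration only); the instruction sizes, the reach of a short jump and the program representation are assumptions and not the compiler's actual tables.

----------------------------------------------------------------------
# Each jump starts at its minimum size; any jump whose target turns out
# to lie beyond the short range is lengthened and all later addresses
# shift.  The loop repeats until nothing changes, so it must terminate,
# at worst with every jump long.
SHORT, LONG = 2, 4          # assumed jump sizes in bytes
SHORT_RANGE = 128           # assumed reach of a short jump

def stretch(items):
    """items: list of ('code', size), ('jump', label) or ('label', name)."""
    size = {i: SHORT for i, it in enumerate(items) if it[0] == 'jump'}
    changed = True
    while changed:
        changed = False
        # lay out addresses under the current size assumptions
        addr, pos = {}, 0
        for i, it in enumerate(items):
            addr[i] = pos
            pos += it[1] if it[0] == 'code' else size.get(i, 0)
        label_at = {it[1]: addr[i]
                    for i, it in enumerate(items) if it[0] == 'label'}
        # lengthen any short jump which cannot reach its target
        for i, it in enumerate(items):
            if it[0] == 'jump' and size[i] == SHORT:
                if abs(label_at[it[1]] - addr[i]) > SHORT_RANGE:
                    size[i] = LONG
                    changed = True
    return size

prog = [('label', 'L'), ('code', 100), ('jump', 'L'),   # reaches L: stays short
        ('code', 300), ('jump', 'L')]                    # too far: stretched
print(stretch(prog))        # {2: 2, 4: 4}
----------------------------------------------------------------------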
STRETCH is first called once for each group of references in the program. This "local stretch" commonly resolves up to 80% of the references. A final call on STRETCH is then made with all the references lumped together as one group, in order to resolve references between blocks and any local references which, although processed by the local stretch, have become invalidated by changes made by the "global stretch".

The use of a local and a global stretch has a considerable effect on the performance of the compiler: if the calls on "local stretch" are taken out, "global stretch" has to do all the work in ignorance of the block structure of the labels. This involves repeated searching of the complete label and reference lists in order that changes in the position of these items may be recorded. On the Interdata 7/32 this increases the stretching time for 1968 branches from 2.3 seconds, out of a total compilation time of 146 seconds, up to 35 seconds! The time taken to perform the stretching using both local and global stretch is on average just over 1% of the total compilation time, excluding the time for input and output.

Wulf et al. describe an optimisation on the PDP11 which attempts to shorten otherwise long conditional jumps by making them jump to suitable existing jumps to the same destination, as this is smaller and faster than the six byte instructions which would be generated by default [Wulf, 1975]. This was tried but eventually removed from the PDP11 compiler, as finding suitable jumps was a tedious task and, of the 2% of jumps which on average were long, only one case was found in compiling many programs where the optimisation could be applied. That case was in a program specially constructed to test the optimisation.

At the same time that jumps and labels are being processed, certain operations which depend on the flow of control may be inserted into the code. The GEC 4080 provides a good example of this problem, which can be handled elegantly by the third phase. The machine provides arithmetic instructions which take either fixed point or floating point operands depending on the state of a processor status bit. This bit must be altered by the instructions SIM (Set Integer Mode) and SFM (Set Floating Mode). During code generation, when a label is encountered the state of the status bit will not in general be known, and so a suitable mode switching instruction will need to be planted; frequently this instruction will be redundant. Given the presence of the third phase, the second phase merely needs to mark jumps with the current state of the bit, and to mark labels with the required state (and the previous state of the bit if control can "fall through" past the label). During the process of expanding jumps, these mark bits can be checked. If all references to a label have the same mode, no action needs to be taken, but if the bits differ the appropriate instruction must be added. As an extra improvement, if only one jump to a label is from the wrong mode, the mode switching instruction can be planted before that jump rather than after its destination label, so shortening the execution paths when no change is required.

ii Conflating jumps to jumps

Nested conditional structures in high-level languages often generate jumps which take control directly to another jump. If the second jump can be shown always to be taken whenever the first is, the first can be redefined as jumping directly to the destination of the second. e.g.
-----------------------------------
while N > 0 cycle
   N = N-1
   if N > 5 then TEST1 else TEST2
repeat
-----------------------------------

In this program, following the call on TEST1 the else causes a jump to be taken to the repeat. This statement is simply a jump back to the previous cycle. Hence the following code can be generated (PE3200):

-------------------
$1: L   1,N
    BLE $3
    SIS 1,1
    ST  1,N
    CHI 1,5
    BLE $2
    BAL 15,TEST1
    B   $1
$2: BAL 15,TEST2
    B   $1
$3:
-------------------

The danger with this optimisation is that an otherwise short jump can be expanded to a long jump, as the following program demonstrates:

---------------------
if X = 1 start
   if Y = 1 start
      {A}
   else
      {B}
   finish
else
   {C}
finish
---------------------

The else following the sequence {A} causes a jump to the next else, which jumps past the finish. In that form, the first jump only has to skip {B} and is likely to be a short jump. If it is made to jump directly to the second finish it has to cover {B} and {C}, so reducing the chances of its being short. Equally, the position can be reversed, resulting in the optimised jump being short when the original was long. If this problem is considered serious the third phase can check the sort of jump which would be generated and act accordingly.

iii Removal of jumps round jumps

Statements such as:

------------------
->LABEL if X = Y
------------------

are common, either in the explicit form as given above or in some higher-level representation such as:

---------------
exit if X = Y
---------------

The simple code sequence generated for this would be similar to (PE3200):

--------------
    L   1,X       pick up X
    C   1,Y       compare with Y
    BNE $1        branch not equal
    B   LABEL     jump to LABEL
$1:
--------------

By combining the two branches the code can be reduced to:

--------------
    L  1,X
    C  1,Y
    BE LABEL
--------------

While it is possible for the code generator to do this immediately, it was decided to leave the optimisation to the third phase for four reasons:

1 The third phase can perform this optimisation simply, almost as a side-effect of constructing the linkage map.

2 There are several cases where the optimisation can be extended in ways which would be awkward for the second phase to deal with. In particular, it would have either to look ahead or to be able to modify code sequences already generated. With a third phase, however, the optimisation reduces to a straightforward inspection of the linkage map. For example:

------------------
exit if X = Y
repeat
------------------

in which case the optimisation may be applied twice to reduce the code to two instructions.

3 Leaving the optimisation to a later phase simplifies the second phase, which is the most complicated part of the compiler.

4 On several machines, if the destination of the jump is too far away the original "jump round a jump" may be the only form available (e.g. PDP11). The distance to be jumped will only be known exactly when all labels have been processed.

4.7.5 In-line constants

When compiling for machines such as the Data General NOVA which have a limited direct addressing range and no full-length immediate operands, it is useful if constants can be planted in the code sequence and addressed as program-counter-relative operands. The simplest technique for doing this is for the code generator to maintain a list of required constants and to dump them in-line at a suitable opportunity before the limit of addressability has been exceeded.
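As an illustration of this simple list-and-dump scheme, the following sketch (Python, illustrative only) records pending constants and flushes them into the code stream before the oldest outstanding reference would pass out of range. The range limit and the textual form of the reference are assumptions, and the question of protecting the dumped constants from being executed is ignored here; it is taken up immediately below.

----------------------------------------------------------------------
MAX_REACH = 256        # assumed PC-relative addressing range, in bytes

class ConstantPool:
    def __init__(self):
        self.out = []          # generated items: ('instr', text) or ('const', value)
        self.pc = 0            # current code address
        self.pending = []      # (value, address of first outstanding reference)

    def emit(self, text, size=2):
        self.out.append(('instr', text))
        self.pc += size
        self._flush_if_needed()

    def use_constant(self, value):
        # reuse an already-pending constant rather than listing it twice
        if not any(v == value for v, _ in self.pending):
            self.pending.append((value, self.pc))
        self.emit(f"L  1,=({value})")      # placeholder PC-relative reference

    def _flush_if_needed(self):
        # dump before the first outstanding reference goes out of range
        if self.pending and self.pc - self.pending[0][1] > MAX_REACH - 16:
            self.flush()

    def flush(self):
        for value, _ in self.pending:
            self.out.append(('const', value))
            self.pc += 4
        self.pending = []

pool = ConstantPool()
pool.use_constant(99)
for _ in range(100):
    pool.emit("NOP")
pool.use_constant(99)          # still addressable: no second copy is dumped
pool.flush()
print(sum(1 for kind, _ in pool.out if kind == 'const'))   # 1
----------------------------------------------------------------------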
Such constants will need to be protected from being executed and so will need to have a jump round them, or will have to be planted in a "hole" in the code, that is between an unconditional jump and the next label. As holes occur frequently in high-level languages (for example following every else or repeat) and do not require extra code to be planted round the constants, they are the preferred position for the constants. In order to minimise the number of constants planted it is necessary to delay the dumping of them until the last possible moment, placing them as near as possible to the forward limit of addressability of the first outstanding reference. This increases the chance of a subsequent reference to the constant being able to address the previous location. This poses problems if the second phase is to handle the constants, as it cannot know the optimum position for the constants in advance of producing the code (especially if the code is to be reordered). A convenient solution is to utilise the linkage table in the third phase and include in it references to constants and the locations of holes and "forced" holes, that is places where an extra jump is required. Following the initial resolution of jumps (4.7.2) the list of constants can be examined and holes allocated. The labels are processed again to take account of the extra code and any alignment limitations. During the processing of the object stream the constants are infiltrated into the object file.

4.8 Summary

The major decisions about the design of the compiler were:

a) All information present in the source program should be easily visible in the intermediate code.

b) The intermediate code should be as machine-independent as the source language.

c) The code generator should be split into two distinct phases joined by a stream of code fragments and a linkage map defining the connections between them.

d) The intermediate code should handle objects in terms of language-dependent descriptors which are converted into appropriate machine-dependent descriptors by the second phase.

e) The intermediate code should distinguish clearly between objects explicitly specified in the source program and those implied by the translation.

f) All decisions about code and data addressing must be left to the code-generator.

5 Review of the overall structure

5.1 Division of function

The division of the machine-dependent phase into two parts was motivated by three main considerations:

i to localise the changes necessary to produce different object-file formats,

ii to permit the reordering of sections of the code,

iii to enable the production of short jumps whenever possible.

In addition it turns out that on all of the machines for which this technique has currently been applied, points (ii) and (iii) can be handled by almost identical pieces of code, making this phase of compilation machine-independent to a large extent and therefore easing the task of creating new compilers. Against this must be set the overheads incurred by separating the compilation into two parts which have to communicate. The interface between phases two and three comprises the object file and the directive file, and the third phase needs to process the whole of the directive file before starting to look at the object file. The ways in which these 'files' will be implemented, and consequently the cost of the communication, will in general vary from system to system.
If large virtual memories are available the data may be held in memory as mapped files or arrays, and accessed much more efficiently than on simpler systems using the conventional approach of 'true' files with their more cumbersome transfer operations.

5.2 Testing and development

Although the initial reason for choosing a multi-phase approach to compiling was that of simplifying the generation of new compilers, an extra advantage arose in that the task of checking the compilers so produced, and of diagnosing faults in them, was very much simplified. This was because of two features of the technique. Firstly, the programs corresponding to the phases were of manageable size, varying from about one thousand statements up to four thousand statements. Secondly, the phases communicated with each other using well-defined interfaces which could be monitored to narrow down errors to a particular phase and even to specific parts of that phase.

In addition, as the structure of the intermediate code inevitably suggests the general techniques to apply in code generation, many of the complete compilers on different machines had great similarities; usually only the lowest levels of code production and machine-specific optimisation were appreciably different. This gave rise to three convenient properties with regard to testing and development:

i An error in one compiler will frequently give notice of similar faults in others. Clearly, any faults in the common first phase will be present in all the compilers and only one correction will be required.

ii An improvement in the performance of one compiler, or the code it generates, can suggest similar improvements in others.

iii The third effect on reflection seems obvious, yet was noted with some surprise. The systems on which most of the investigation was done are run with very different operating systems and used by different types of user. These two factors together caused a great spread in the demands placed upon the compiler, resulting in more parts of the compiler being thoroughly tested than would happen when running on one particular system, where users tend to be more stereotyped. Questions of "proper practice" aside, it is a fact of life that all software gets a better testing in the field than at the hands of its creator.

5.3 Diagnostics

As mentioned previously, optimisation is not just a process of improving the storage requirements and speed of a program but also involves fitting a program into the overall framework of the run-time environment. In many applications the provision of extensive run-time checks and post-mortem traces can be of great importance. The ability to generate such diagnostic code has certain implications for the features in the intermediate code.

5.3.1 Line numbers

When producing information about the state of a computation, whether it be an error report following a run-time fault or an execution trace [Satterthwaite, 1972], the data must be presented in a form which is meaningful to the user in terms of the source program. The commonly-provided dump of the machine state, registers, code addresses etc., is a complete failure in this respect, as the correspondence between this and the program state depends on the workings of the compiler and other factors of which the user should not need to be aware. The simplest way of specifying the point of interest in a program is to give its line number.
There are two common techniques for providing line number information at run-time, the choice of which depends on the uses to which the compiler is to be put. The first is to plant instructions which dynamically update a variable with the current line number whenever it changes. This has the significant advantages that it is extremely cheap to implement and the line number is always immediately available. Its obvious disadvantages are that it increases the execution time for the program and, more significantly, it increases the size of the program, typically by about 6K bytes on the Interdata 7/32 for a 1000 line program, approximately a 50% increase. The second technique is to build a table giving the correspondence between line numbers and the addresses of the associated code sequences. While this imposes a greater burden on the compiler and takes more time to extract the line number, it has the advantage that it does not increase the code size of the program, nor does it alter its execution speed. Indeed it may even be possible to keep the table out of main memory until it is required.

The choice of technique will have implications on the compiled code. If the line number table approach is used, error procedures must have available the address of the point of the error. The effects of this can be seen in the following example of the sort of code generated for unassigned variable checking on the 7/32 using both methods:

-----------
17   X = Y
-----------

Updating a line-number variable:

-------------
LHI 0,17      update line no
ST  0,LINE
L   1,Y
C   1,UV      check value
BE  ERROR     give the error
ST  1,X
..
-------------

Using a line-number table:

-------------
L   1,Y
C   1,UV      check value
BAL 8,TU      test for error
ST  1,X
..
..
TU: BNER 8    return if OK
    B   ERROR give the error
-------------

As the generated code depends on the method in use it cannot be specified in the intermediate code, and so the latter must simply indicate the points in the program at which the line number changes.

5.3.2 Diagnostic tables

In the event of program failure, or when explicitly requested by the user, a trace of the current state of a program, including the values in active variables and the execution history, can be of immense value. For such a trace to be provided the intermediate code must contain the identifiers used in the source program for all the variables, and a source-dependent description of those variables. This latter is needed so that the machine representations may be interpreted in the correct way when giving the values in variables. In I-code all this information is presented in the definitions of descriptors and may be used or discarded at will.

5.3.3 Run time checks

Most languages define circumstances under which a program is to be considered in error and its execution terminated. These errors include creating a value too large to be represented (overflow), division by zero, use of an array index which is outwith the declared bounds, and so on. There is a natural division of these errors into those which are detected automatically by the machine and those which must be detected by explicit checks in the program. Commonly, machines catch division by zero automatically but do not provide such a feature for checking array subscripts. The "hardware-detected" errors may be further divided into those which on detection cause the normal flow of control to be interrupted, and those which simply make knowledge of the occurrence of the error available to the program, for example by setting a condition-code bit.
For the purposes of this discussion the second form of hardware-detected error may be considered an error which is not detected automatically, as it still requires explicit instructions to test for the error and to transfer control accordingly. Clearly, the more errors that fall into the automatic category the better, as they do not cause the user's program to grow with sequences of instructions which, in a correct program, will always be testing for conditions which never arise. These differences complicate the design of intermediate codes as the classification differs from machine to machine: with the VAX all forms of overflow can be made to generate automatic interrupts, but the PDP11 only sets a condition-code bit on some overflows. There are two basic ways of handling this in the intermediate code: firstly the code can contain explicit requests for the checks to be performed, and secondly the code can be designed in such a way as to give the code-generator enough information to be able to decide where checks are necessary. Two specific examples can indicate which of these ways should be adopted.

Testing for arithmetic overflow is currently handled by machines in three main ways:

1. An interrupt is generated whenever overflow occurs. This is by far the best method as it requires no overheads in the checked code.

2. A bit is set on overflow and is only cleared when it is tested. This requires explicit checks in the code, but several tests may be conflated into a single test at an appropriate point, for example before the final result is stored.

3. A bit is set on overflow, but is cleared by the next arithmetic operation. This again requires explicit checking code, but the tests must be inserted after every operation.

For the intermediate code to indicate where overflow testing is to be performed it would have to choose the worst case from the three above, namely case 3. This would result in a test being requested after every arithmetic instruction, which test may just as well be incorporated into the definition of the instructions themselves.

The other area of low-level testing is in implied type conversions, such as storing from a 32-bit integer into a 16-bit integer. The VAX provides an instruction which combines the test for truncation with the store (CVTLW). The 7/32 has an instruction (CVHR) which can test the value before assignment, and the 4/75 can most efficiently test following the assignment (CH). If the request for the check is a separate intermediate code item, the 7/32 case is simple but the other machines will require much more work to be able to generate the efficient check. The problem can be simplified by introducing new assignment instructions which also perform the test, but this adds many new instructions to the code, as one instruction will be required for every valid combination of types and every sort of assignment.

The high-level checks such as array bound checking are usually so complicated that the most efficient implementations depend greatly on the particular hardware, so much so that it would be foolish to attempt to express them in the intermediate code. The simplest solution is to ensure that the intermediate code provides enough information to let the code generator decide where and what checks are necessary. The inclusion of checks against the use of unassigned variables provides a good example of the power of leaving the checking to the code-generator. In a simple-minded approach the code-generator tests every suitable value loaded from store.
A minor improvement to this is to mark the descriptor for every local variable in a block when it is first assigned, inhibiting the marking after the first jump. Subsequently, marked objects need not be checked. A much better improvement may be obtained by making a trivial extension to the register remembering mechanism. If an object is 'known' it must have been used previously, and hence it will have been checked if necessary. Even after the register which held the value of the object has been altered, and hence the association between the register and the object lost, if the compiler remembers that the value was known it can suppress any unassigned checks on future references.

At this point a useful property of IMP77 may be used to great effect: once a variable has been assigned it cannot become unassigned. This is not true in many languages; for example, in ALGOL60 the control variable of a for loop is undefined (unassigned) at the end of the loop. This means that in IMP77 the 'was known' property of variables may be preserved across procedure calls, even though all the register content information must be forgotten. This technique when applied on the 7/32 compiler results in a reduction of 33% in the code required for checking. While it is possible for the unassigned checks to be placed in the intermediate code and for the first phase to remove redundant checks, this suppression would require a duplication of the remembering logic which must, in any case, reside in the machine-dependent phase.

6 Observations

6.1 Suitability of I-code for Optimisation

When considering the use of I-code for global optimisation there are two techniques available. Firstly, the optimisations can be performed using the I-code and going straight into object code, possibly via a third phase. In this case the only real constraint on I-code is that it be powerful enough to be able to carry all the information available in the source and to present it in a compact form. Secondly, the optimisations can be seen as an extra phase introduced between the first phase (the I-code generator) and what is normally the second phase (the code generator). The optimiser takes in I-code and produces as its output a new I-code stream which can be fed into the code generator. In this case not only must the I-code carry all the source information but it must be able to describe the generation of an optimised program. Clearly the code must be able to reflect the structure of the target machine in some way and hence must be able to lose its machine independence.

The second technique is the more interesting as not only does it permit the optional inclusion of the global optimising without affecting the structure of the other phases, but it removes the optimisations from the low-level details of code production and provides a means for separating the machine-independent and machine-dependent optimisations. In particular, in the same way as much of the code generator can be built from a standard "kit" with a few special machine-specific parts, so the global optimiser can utilise code from other optimisers. The way in which the optimiser can influence the operation of the code generator is by making use of the fact that the intermediate code does not describe a computation but a compilation process.
This compilation is driven by the descriptors, which are normally translated by the code generator from the machine-independent form in the I-code into the appropriate machine-dependent representation, reflecting the target machine architecture: registers, stacks, memory etc. By short-circuiting this translation a global optimiser can force the use of specific machine features. For example, consider the following fragment of an integer function:

--------------------
integer X
X = A(J)
X = 0 if X < 0
result = X
--------------------

The standard I-code produced for this fragment would have the form:

------------------------
DEF 12 "X" INTEGER SIMPLE DEFAULT NONE NONE
PUSH 12        - X
PUSH 6         - A
PUSH 7         - J
ACCESS
ASSVAL
PUSH 12        - X
PUSHI 0
COMP >= 1
PUSH 12        - X
PUSHI 0
ASSVAL
LOC 1
PUSH 12        - X
RESULT
------------------------

On the PDP11 the code generated for this could be:

-----------------
    MOV J,R2
    ADD R2,R2      Scale the index
    ADD A,R2       Add in ADDR(A(0))
    MOV (R2),X     X = A(J)
    BGE $1         ->$1 if X >= 0
    CLR X          X = 0
$1: MOV X,R1       assign result register
    {return}
-----------------

Here the obvious optimisation is to note that the local variable, X, is eventually to be used as the result of the function and so needs to end up in register 1. By changing the definition of X in the I-code into:

------------------------------------------------
DEF 12 "X" INTEGER SIMPLE DEFAULT NONE SPECIAL R1
------------------------------------------------

and making no other changes, the code generator will produce code of the form:

--------------------
    MOV J,R2
    ADD R2,R2
    ADD A,R2
    MOV (R2),R1
    BGE $1
    CLR R1
$1: {return}
--------------------

As this process necessitates the I-code becoming more and more intimately involved with the structure of the target machine, in that it starts referring directly to registers and the like, it is necessary that a new control item be added so that the code generator may be prevented from pre-empting resources which the optimiser is manipulating. The new item is RELEASE and it is used in conjunction with the definition of machine-dependent descriptors. When such a descriptor is introduced (using DEF) the associated target machine component is considered to have been claimed and may only be used in response to explicit direction from the I-code. On receipt of the corresponding RELEASE the component is once again made available for implicit use by the code generator (for temporaries etc.). This mechanism is an exact parallel to the way in which memory locations are claimed by the definition of descriptors and released by the END of the enclosing block.

The main assumption about this style of optimisation is that the code generator has the ability to generate any required instruction, provided that the pertinent information is available at the required time. As an example, the VAX 11/780 provides addressing modes in which the value in a register may be scaled and added into the effective operand address before the operand is used, hence the following code:

------------------------
integerarray A(1:9)
-
-
A(J) = 0

MOVL J,R5          pick up J
CLRL 12(R3)[R5]    A(J) = 0
------------------------

The operand address generated by the CLRL instruction is:

------------------
12 + R3 + R5*4
------------------

as there are 4 bytes (address units) to a longword. This instruction can be generated naturally during the non-optimised evaluation of array subscripts, and so the optimiser can assume that the index mode of operand will be used whenever a register operand is specified as an array index.
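The following sketch (Python, illustrative only) shows an I-code-to-I-code pass of this general kind: it finds the descriptor pushed for RESULT, rewrites its DEF to carry a SPECIAL register binding, and plants a RELEASE at the end of the enclosing block. The textual item format, the choice of R1, and the assumption that RELEASE names the descriptor's tag are assumptions made for the example and not part of the I-code definition given here.

----------------------------------------------------------------------
def bind_result_to_register(icode, reg="R1"):
    """icode: list of strings, one (simplified) I-code item per entry."""
    out = list(icode)
    # find the descriptor pushed immediately before RESULT
    for i, item in enumerate(out):
        if item == "RESULT" and i > 0 and out[i - 1].startswith("PUSH "):
            tag = out[i - 1].split()[1]
            break
    else:
        return out                            # nothing to do
    for i, item in enumerate(out):
        if item.startswith(f"DEF {tag} ") and item.endswith(" NONE"):
            # replace the trailing PREFIX field with a register binding
            out[i] = item[:-len("NONE")] + f"SPECIAL {reg}"
        if item == "END":                     # end of enclosing block
            out.insert(i, f"RELEASE {tag}")   # free the claimed component
            break
    return out

icode = ['DEF 12 "X" INTEGER SIMPLE DEFAULT NONE NONE',
         'PUSH 12', 'PUSH 6', 'PUSH 7', 'ACCESS', 'ASSVAL',
         'PUSH 12', 'RESULT', 'END']
print("\n".join(bind_result_to_register(icode)))
----------------------------------------------------------------------

The rewritten stream is still ordinary I-code; a code generator with no knowledge of the optimiser simply obeys the new descriptor definition.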
The procedure has the added advantage that in the worst case, when the code generator will not produce the instructions that the optimiser hoped for, as long as the optimised I-code still describes the required compilation, the code generator will simply produce a more long-winded, but equally valid, version of the program. In other words, as long as some choice is available and some temporary objects are left at the disposal of the code generator, the optimiser cannot force it into a state where working code cannot be produced. In the example above, even if the code generator does not produce index mode operands, it can still generate sequences of the form:

----------------------
MULL3 R5,#4,R1       R5*4 -> R1
ADDL2 R3,R1          R3+R1 -> R1
CLRL  12(R1)         0 -> (12(R1))
----------------------

6.2 Performance

The figures in appendix A3 are the results of measuring the effect of various optimisations on the Interdata 7/32 and the DEC PDP11/45. One problem in choosing programs to be measured is that heavy use of particular language features will increase the overall effect of certain optimisations. As a trivial example of this consider the following "program":

--------------------------------
begin
   integerarray A(1:1000)
   A(1) = 0
endofprogram
--------------------------------

With all array optimisations enabled, on the 7/32 this generates 30 bytes of code, whereas without the optimisation it results in 170 bytes of code, largely due to the procedure for declaring the array. Clearly a reduction of 82% is not to be expected on more typical programs. Similarly the absence of features will bias the results. In particular the smaller programs will not demonstrate the power of the optimisations which only take effect when various size limits have been exceeded, the most obvious such limits being addressing restrictions caused by the size of address fields in instructions.

The major difficulty in producing results which are of any real value is that the effects of the optimisations depend on the individual style in which the programs under consideration were written. Inevitably users get a "feel" for the sort of statement for which the compiler generates good code and they often modify their style of programming accordingly. If at some stage in its development a compiler produces poor code for a particular construction, users will tend to avoid that construction, even long after the compiler has been improved and can compile it effectively. This well-known phenomenon [Whitfield, 1973] argues strongly that users should never see the object code generated by the compilers they are using.

The effects of many optimisations are difficult if not impossible to measure with any degree of accuracy, as they interact with other optimisations to a great degree. The most obvious interaction is that between the size of jump instruction required and most of the other optimisations. The size of jump is determined by the amount of code generated between the jump and the label it references. If any other optimisation is inhibited this volume of code is likely to increase, decreasing the chances of being able to use the shorter forms of the jump. Some optimisations depend almost totally on others; it is unlikely that the optimisation of reducing or removing the entry and exit code sequences associated with procedures (section 4.5.1) would have much effect if the parameters were not passed in registers and references to them in the procedures were replaced by references to those registers.
In particular, it must be noted that it is always possible to generate programs which will benefit greatly from those optimisations which do not appear to be of much use from the figures given. However, the test programs used to derive the figures are typical of the programs processed by the compiler, and it is hoped that they give a more realistic and balanced view of the improvements which may be achieved in 'real' cases. Under some circumstances it may be advantageous to apply all optimisations, even though some may appear to give little benefit, since this 'squeezing of the pips' frequently removes one or two instructions from critical loops in a program. Yet again this shows the difficulty in quantifying the usefulness of optimisations, as they are so dependent on the particular circumstances.

One area of measurement has been deliberately omitted from the figures, namely the effect of the optimisations on execution time. This was for several reasons:

1. On the systems used it was impossible to get reliable timing measurements with any accuracy greater than about plus or minus 5%.

2. For the reasons given previously, many programs could benefit greatly from fortuitous optimisations which removed just one crucial instruction, optimisations which could not be expected in every program.

3. Programs which executed for long enough to improve the accuracy of the measurements invariably lost this accuracy through spending much time in the system-provided procedures, mainly for input and output. This point in particular suggests that, as this overhead is beyond the control of the general user, the savings in code space may be much more important. Even with ever-growing store sizes, virtual memory systems will continue to treat smaller programs better than larger ones.

4. Some of the optimisations, particularly passing parameters in registers, prevent the compiled program from running unless the controlling environment is modified in a parallel way. This would invalidate the timings as the environment is not usually under the control of the compiler.

From the crude measures which were obtained there is a suggestion that the decrease in execution time roughly parallels the decrease in code size.

6.3 Cost of optimisation

The cost of an optimisation is, in general, very difficult to measure, as may be seen by considering the three relevant areas: compile time, space requirement, and logical complexity.

6.3.1 Compile time

In order to generate good code, the compiler must spend time looking for the cases which are amenable to improvement. If no optimisation is performed this time is not used and so the compilation should take less time. However, the non-optimised version commonly requires the production of more code than the optimised version, frequently over fifty percent more when comparing fully diagnostic code with fully optimised code. On all the compilers written so far, the time saved by not having to generate these extra instructions more than outweighs the time spent in deciding not to generate them.

6.3.2 Space requirement

Several optimisations increase the requirement for workspace, notably all the remembering optimisations. On most machines available at present, the number of things which may be remembered is fairly small: sixteen registers and one condition-code is probably the maximum.
Even if this number is increased by remembering several facts about each thing, the total amount of space needed will be small when compared with the space needed to hold the information about user-defined objects, information which is required whether optimisation is being performed or not. On large machines the extra memory required will be cheap; on small machines the need for the optimisation will have to be balanced against the size of the largest program which must be compiled.

6.3.3 Logical complexity

The cost of providing an optimisation includes a non-recurrent component, which is the difficulty of performing the optimisation at all because of the logical complexity of discovering the necessary circumstances. In a system which is aimed at portability this cost can often be shared over a number of implementations, the techniques used in one being applicable to others, perhaps after minor modifications.

6.4 Comments on the results

6.4.1 Register remembering

Of all the optimisations tested, a simple remembering of values in registers provided by far the greatest improvement in code size. One problem in implementing this optimisation is deciding what to remember, as shown by the following code sequence:

-----------
X = Y          L  1,Y
               ST 1,X
-----------

Following this sequence register 1 will contain both the value in X and the value in Y; should the compiler remember X or Y or both? The measurements show that the gain in remembering both ('2 uses') as opposed to just one ('1 use') is quite small. The algorithm used to determine what to remember in the '1 use' case was simply to remember a new piece of information only if nothing else was known about the register in question. This gives the best results in cases such as:

A = 0; B = 0; C = 0

where the value '0' will be remembered, but will perform badly with the more contorted case:

A = 0; B = A; C = B

as again only the value '0' will be remembered. Unless very tight code is required, the cost of maintaining multiple sets of information about each register and searching for particular values will probably rule out such extended remembering optimisations.

Perhaps a surprising result is that the PDP11 on average gains about as much from this optimisation as the 7/32. This is the result of two interacting effects. Firstly, the 7/32 dedicates up to five registers to address local variables in the last five levels of procedure nesting, and locks three for other fixed purposes, leaving about ten for intermediate calculations. The PDP11, however, uses a display in store to access intermediate levels, and has to load the address of a particular level each time it is required. In addition the PDP11 implementation fixes the use of four registers, leaving only four for intermediate calculations. Secondly, the 7/32 needs to use at least one register to move values around while the PDP11 often requires none. These two effects give a fairly large number of transient values in the registers of the 7/32, and a smaller number of more frequently used values (addresses) in the registers of the PDP11. On average it appears that the number of times necessary values are found is roughly equal in the two cases.

6.4.2 Remembering environments

An environment is the complete knowledge maintained by the compiler at any time. By remembering and merging environments while compiling IF-THEN-ELSE constructions, the effects of the implied labels and jumps on the remembering optimisations can be minimised.
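The idea can be sketched as follows (Python, illustrative only); the representation of an environment as a simple register-to-value mapping, and the way the two arms are compiled, are assumptions made for the example.

----------------------------------------------------------------------
def merge(env_then, env_else):
    """Keep only register contents known identically on both arms."""
    return {reg: val for reg, val in env_then.items()
            if env_else.get(reg) == val}

def load(env, code, reg, var):
    """Emit a load only if the register is not already known to hold var."""
    if env.get(reg) != var:
        code.append(f"L  {reg},{var}")
        env[reg] = var

# Compile something like:  if ... then A = X else B = X;  C = X
env, code = {}, []
saved = dict(env)                  # environment remembered at the implied label

then_env = dict(saved)             # THEN arm
load(then_env, code, 1, "X"); code.append("ST 1,A")

else_env = dict(saved)             # ELSE arm: THEN-arm knowledge is not valid here
load(else_env, code, 1, "X"); code.append("ST 1,B")

env = merge(then_env, else_env)    # join point: X is in register 1 on both paths
load(env, code, 1, "X"); code.append("ST 1,C")    # third load is suppressed

print(code)    # ['L  1,X', 'ST 1,A', 'L  1,X', 'ST 1,B', 'ST 1,C']
----------------------------------------------------------------------

Without the merge, everything would have to be forgotten at the implied label following the conditional, and the load for the final statement would be generated again.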
The measurements show that the gains achieved by remembering more and more environments fall off very quickly; two environments seem to be about the best. However, the overhead in providing more than one environment is simply compiler table space, and so a compiler which can handle one environment can easily handle more to get a very small but cheap gain. One clear result is the difference between the effects on the two machines (sometimes an order of magnitude). This is almost entirely due to the difference in the number of available registers.

6.4.3 Array allocation and use

From monitoring service versions of the compilers it is clear that in IMP77 the vast majority of arrays have constant bounds. Allocating these arrays on the local stack frame at compile time is a simple operation and can save a fair amount of code, much of which would only be executed once, as most arrays are declared in the outermost block. Remembering array address calculations can reduce the code by about five percent, but it commonly has little effect and is quite tedious and expensive to achieve.

The small increase in code size for a few cases is a side-effect of the register allocation mechanism. Registers are chosen by giving priority to those about which the least is known, and then by selecting the least recently used such register. Hence, which register will be used depends on the compilation of previous statements. When a value is required in a specific register, for example during parameter transmission, occasionally it will already be in that register purely by chance. A minor change in the generated code, such as not requiring a new register for an array access, can result in the value not being in the correct register later on. This instability seems to be undesirable, but alternative strategies, such as biasing the allocation towards or away from particular registers, on average result in worse code.

6.4.4 Common operands

On the 7/32 the only instruction which can be used to simplify statements of the form:

X = X op Y

is the AM (add to memory) instruction. It is therefore somewhat surprising that its use frequently saves over two percent of the code. The two possible expansions of a suitable addition statement are:

------------        -----------
L  1,Y              L  1,X
AM 1,X              A  1,Y
                    ST 1,X
------------        -----------

The first saves four bytes and leaves the increment in the register. Even if the incremented value is required immediately afterwards, the extra load instruction will only increase the code size to that of the alternative sequence. As the PDP11 has many instructions which can be used in this way it is hardly surprising that it benefits much more.

6.4.5 Parameters in registers

This optimisation gives another significant saving in code at little cost to the compiler, simply by moving the store instructions for parameter assignment from the many calls to the unique procedure definitions. The effect is more pronounced on the 7/32 as all assignments require two instructions, a load and a store, whereas the PDP11 can usually make do with one MOV instruction. In the latter case the saving comes from the ability to reduce the size of the procedure entry and exit sequences if all of the parameters can be passed in registers.

6.4.6 Condition-code remembering

On machines with condition codes many instructions set the result of a comparison with zero as a side-effect. Knowledge of this can be used to inhibit explicit code to compare values with zero.
However, the small benefit so gained suggests that it is not worth doing, even though it is a very cheap test.

6.4.7 Merging

The large difference between the effect of forward merging on the 7/32 and the PDP11 is mainly due to the addressing modes available on the machines. On the PDP11 statements of the form "A=B" can be compiled into a single instruction "MOV B,A", ignoring any extra instructions which may be needed to make A and B addressable. However, on the 7/32 all values must be moved via the registers, resulting in two instructions for the same statement:

-------------
L  1,B
ST 1,A
-------------

Hence the following code:

------------------------------
if X=0 then Y=1 else Y=12
------------------------------

     7/32                    PDP11
-------------------     -------------------
    L   1,X                  TST X
    BNE $1                   BNE $1
    LIS 2,1
    ST  2,Y                  MOV #1,Y
    B   $2                   BR  $2
$1: LIS 2,12             $1:
    ST  2,Y                  MOV #12.,Y
$2:                      $2:
-------------------     -------------------

With the 7/32 code, merging can reduce the sequence by one instruction, a "STore", while with the PDP11 no such improvement is possible. As the techniques for merging and delaying are quite expensive, but not complicated, and have a major influence on the design of the code-generator, the small gains achieved are probably not worth the trouble, unless the last drop of efficiency is required at all costs.

6.5 Criticisms and benefits of the technique

6.5.1 Complexity

The main argument against the use of high-level intermediate codes is that they move the complexity of code generation from the common machine-independent phase into the machine-dependent phase, forcing work to be repeated each time a new compiler is required. While this is undoubtedly true, the overheads are not as great as they may at first appear. The extra complexity of the code generators may be split into two parts: an organisational part which builds and maintains the structures used during the compilation, and processes the intermediate code, using it to drive the second part; an operational part which uses the structures to generate code as instructed by the organisational part. The changes in the organisational part when moving to a new machine are small enough to permit the use of large sections of code from old compilers. Even when considering the operational part, much will be similar from machine to machine; in particular the communication between the second and third phases and the bulk of that latter phase can be taken without change. From examining the compilers produced using I-code it appears that about 60% of the source of the machine-dependent parts is common, 20% can be considered as being selected from a standard "kit" of parts, and the final 20% is unique to the host machine.

6.5.2 I/O overhead

One of the disadvantages of dividing a compiler into several distinct phases is that it results in an additional cost in communicating between consecutive phases. As discussed in section 5.1 this cost depends on the operating system running the compiler. Even in the worst case where communication is achieved using conventional files the overhead may not be too serious.
The time spent doing input and output on the Interdata 7/32 compiler is about 27% of the total compilation time, and for the PDP11 is about 22%, breaking down as follows:

--------------------------------------------------------
             ----        ----        ----
            |    |      |    |----->|    |
Source ---->| P1 |----->| P2 |      | P3 |----> Object
            |    |      |    |----->|    |
             ----        ----        ----
  7/32:    7%          7%          10%          3%
          (4%)        (4%)         (5%)        (3%)
  PDP11:   9.4%        11%          0.6%        0.5%
--------------------------------------------------------

The figures in parentheses give the percentage of time taken when the input and output requests are made directly to the file manager rather than via the standard subsystem procedures, thus reducing the internal I/O overhead to about 10% of the total compilation time.

6.5.3 Lack of Gains

It has been argued that the improvements brought about by adopting a high-level code as opposed to a low-level one are not worth the increased effort involved in processing it. Depending on the uses to which the compiler is to be put, small increases in code efficiency can outweigh a reasonable increase in the cost of producing the compiler and using it. A 5% improvement in the execution speed of the compiler itself is not insignificant when the number of times it is used and the cost of each use are considered. However, it cannot be denied that a careful redesign of critical parts of a program can have a greater effect on its performance than any amount of automatic optimisation. Notwithstanding, it seems reasonable that programmers should be able to concentrate on the large-scale efficiencies of program design and have the detailed improvements left to the compiler. Also it should be noted that measurements indicate that the compilers execute faster when performing certain optimisations than when not performing them, for example passing parameters in registers.

If low-level codes are needed for some reason, the complexity saved from the machine-independent phase can be moved into a new phase which converts the high-level code into a low-level one. This provides the low-level code for those who want it while preserving the high-level interface for use when good code is required. One important gain in using such intermediate codes is that they can ease the difficulties associated with maintaining a number of compilers for different machines, when those compilers are self-compiling. For several reasons it may not be desirable to permit sites to have the source of the machine-independent phase: commonly to give freedom of choice for the form of the language in which that phase is written and to prevent local "improvements" which rapidly lead to non-standard language definitions. In such cases the intermediate-code generator can be maintained at one site and updated versions can be distributed in intermediate-code form without fear of compromising the quality of the object code generated from it. Such a technique is currently being used in the production of portable SIMULA compilers [Krogdahl, 1980].

6.5.4 Flexibility

At some stage in producing a compiler, the needs of the end user must be considered. The flexibility afforded by the high-level nature of the intermediate code allows the compiler to be adapted to fit its environment. If the compiler is to be used for teaching, the quality of the code it produces can be sacrificed for compilation speed and high-quality diagnostics, particularly as compilation time may well be an order of magnitude greater than the execution time; indeed many of the programs will fail to compile and never reach execution.
If the application is for compiling programs that will be compiled once and then executed many times, more effort can be expended in producing fast code, although this is not to say that diagnostics and fast code must be kept separate, as the longer a program runs without failing the more trouble will be caused when it fails without convenient diagnostics.

6.6 Comments

Instruction sets and compilation

Following the production of IMP compilers for several different processors, various features of instruction sets have become evident which influence the generation of code.

i The instruction set should be complete, that is, where an instruction is available for one data type it should be available for all data types for which it is well-defined. Similarly, instruction formats used by one operation should be available for all similar operations. The best example of such an instruction set is that provided by the DEC PDP10. Unfortunately the majority of machines are not so helpful. As an example of the sorts of thing which go wrong, consider the Perkin-Elmer 3200 series. These machines provide three integer data types: fullword (32 bits, signed), halfword (16 bits, signed), and byte (8 bits, unsigned). There are "add fullword" (A) and "add halfword" (AH) instructions but no "add byte" instruction. There are "add immediate short" and "subtract immediate short" instructions, but multiply, divide, and, or etc. do not have short immediate operands.

ii The instructions should be consistent, that is, logically similar instructions should behave in similar fashions. Again, on the Perkin-Elmer 3200: load fullword (L) and load halfword (LH) set the condition code but load byte (LB) does not. Most register-store instructions can be replaced by a load of the appropriate type followed by a register-register instruction, e.g.

--------------        --------------
CH 1,X                LH 0,X
                      CR 1,0
--------------        --------------

both result in the same setting of the condition code, but

--------------        --------------
CLB 1,B               LB  0,B
                      CLR 1,0
--------------        --------------

could result in different settings of the condition code, as CLR compares two unsigned 32-bit quantities whereas CLB compares a zero-extended byte from store with the zero-extended least significant byte of register 1. For consistency, either compare halfword (CH) should use the sign-extended less significant half of the register, or better, CLB should not tamper with the value in the register.

iii Complex instructions should be avoided. There are two reasons for this. Firstly, it is easier for a compiler to break down statements into simple operations than it is to build them up into complex ones [Stockton-Gaines, 1965]. Secondly, if the complex instructions do not perform the exact function required by the language, more instructions will be needed to "prepare" for the complex instruction and to "undo" its unwanted effects. As an example, the DEC VAX11/780 is full of complex instructions which seem to be well-suited to high-level languages at first glance, but on closer inspection they are not so useful. A CASE instruction is provided which indexes into a table of displacements and adds the selected value to the program counter. This would seem ideal for compiling switch jumps. Unfortunately, as the table of displacements follows the CASE instruction it would be very expensive to use it each time a jump occurred using a particular switch. Instead all references to the switch must jump to a common CASE instruction.
Even this does not help, as in the event of an attempted jump to a non-existent switch label, the diagnostics or the event mechanism will see the error as having occurred at the wrong place in the program. Although this problem can be "programmed around" it turns out that it is faster to implement switches using sequences of simpler instructions.

iv Machine designers should investigate carefully the full consequences of building in special fixed uses of machine features. One of the best examples of a clear oversight which causes grief to compiler writers is found in the Data General NOVA multiplication instruction. This instruction multiplies the value in register 1 by the value in register 2 and places the double-length result in registers 0 and 1. As only registers 2 and 3 may be used for addressing, and as register 3 is always used for subroutine linkage, it follows that register 2 must be used for addressing the local stack frame, but this is exactly the register which must be corrupted in order to use the multiply instruction!

Although specific machines have been used in the examples, similar problems abound in all machines. Indeed it is clear that machines are most commonly designed for programmers writing in assembler or FORTRAN, and furthermore writing their programs in a particular style. While it is clear that the problems could be called "mere details" and that they are not difficult to surmount, it remains that they complicate otherwise simple code-generation algorithms, making compilers larger, slower, and correspondingly more difficult to write, debug, and maintain. In conclusion it appears that the machine most suited to supporting high-level languages should have a small but complete set of very simple instructions, their simplicity permitting rapid execution and great flexibility.

7 Conclusions

7.1 Viability of the technique

The techniques described above have been used to create several IMP77 compilers which are in regular use on a number of systems. In terms of total memory space required for a compilation, about 80K bytes on the 7/32, they compare favourably with other compilers. The major weakness seems to be execution time, which can vary from twice as long as other compilers in the worst case, to half as long in the best case. As most of the effort in writing the compilers was spent in investigating the techniques involved and not in minimising compile time, and as the compilers which ran much faster were either totally or partially written in machine code (the IMP77 compilers are all written exclusively in IMP77), it seems that the technique can be used to produce acceptable service compilers.

7.2 Ease of portability

Although using I-code does not permit compilers to be written in as short a time as with P-code and OCODE, the large amount of code which is common to all of the compilers written so far means that, given a working code generator as a template, a new optimising compiler can be written in the space of a few months, with the final result producing code of high quality.

7.3 Nature of optimisations

During the course of the investigation it became clear that one of the difficulties of optimisation is that gains are achieved by applying a large number of ad hoc rules, especially where peephole optimisations are concerned. As instruction sets become more complicated and rich, there is a corresponding increase in the variety of ways of implementing high-level language features.
This increases the possibilities for optimisation and, in consequence, the complexity of compilers. By using high-level intermediate codes, such as I-code, it should be possible to concentrate on machine-independent optimisations, knowing that the resulting intermediate code can be used to generate efficient code for current machines. Eventually, when better instruction sets are available, hopefully with only one way of doing things and no opportunities for non-trivial optimisation, the same intermediate code can be used to drive code generators which are much simpler and more directly portable.

Appendix A1  The IMP Intermediate Code - A Brief Summary

The IMP intermediate code may be considered as a sequence of instructions to a stack-oriented machine which generates programs for specific computers. It is important to note that the intermediate code describes the compilation process necessary to generate an executable form of a program; it does not directly describe the computation defined by the program.

The machine which accepts the intermediate code has two main components:

1  A Descriptor area. This is used to hold descriptors containing machine-dependent definitions of the objects the program is to manipulate. The area is maintained in a block-structured fashion; that is, new descriptors are added to the area during the definition of a block and are removed from it at the end of the block.

2  A Stack. The stack holds copies of descriptors taken from the descriptor area or created specially. Items on the stack are modified by intermediate-code control items to reflect operations specified in the source program. Such modifications may or may not result in code being generated.

From the point of view of this definition, stack elements are considered to have at least three components:

i    Type
ii   Value
iii  Access rule

The "Access rule" defines how the "Type" and "Value" attributes are to be interpreted in order to locate the described object. For example, the access rule for a constant could be "Value contains the constant", while for a variable it could be "Value contains the address of the variable". Clearly, the access rules are target-machine dependent. Descriptors may be combined to give fairly complex access rules, as in the case of applying "PLUS" to the stack when the top two descriptors are for the variable X and the constant 1, resulting in one descriptor with the access rule "take the value in X and add 1 to it". The complexity of these access rules may be restricted by a code generator. In the example above, code could be generated to evaluate X+1, resulting in the access rule "the value is in register 1", say.

The importance of the code describing not the actual computation which the source program specifies, but the compilation process required, is seen when attempting to use the code for statements of the form:

    A := if B=C then D else E;

This could not be encoded as:

    PUSH A
    PUSH B
    PUSH C
    JUMP # L1
    PUSH D
    BR L2
    LOC L1
    PUSH E
    LOC L2
    ASSVAL

The reason is that the items on the stack at the time of the ASSVAL would be (from top to bottom) [E], [D], [A], because no items were given which would remove them from the stack. Hence the ASSVAL would assign the value of E to D and then leave A dangling on the stack.

Unless otherwise stated, all constants in the intermediate code are represented in octal.

Descriptors

DEF TAG TEXT TYPE FORM SIZE SPEC PREFIX
    This item causes a new descriptor to be generated and placed in the descriptor area.
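    As a purely illustrative aside, a code generator might hold such descriptors in a record of roughly the following shape. The sketch is in C, the field names simply mirror the fields of the DEF item, and none of it is taken from the actual IMP77 code generators (which are written in IMP77).

        /* Hypothetical descriptor record for a code generator; the
           machine-dependent part is only suggested by the last field. */
        struct descriptor {
            int  tag;         /* TAG: identification used by later references      */
            char text[32];    /* TEXT: source-language identifier, "" if none      */
            int  type;        /* TYPE: GENERAL, INTEGER, REAL, STRING, RECORD, ... */
            int  form;        /* FORM: SIMPLE, NAME, ROUTINE, FN, MAP, ARRAY, ...  */
            int  size;        /* SIZE: format tag, string length, or precision     */
            int  spec;        /* SPEC: specification or full definition            */
            int  prefix;      /* PREFIX: NONE, OWN, CONST, EXTERNAL, ...           */
            long address;     /* machine-dependent part, e.g. allocated address    */
        };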
On creation, the various fields of the DEF are used to construct the machine-dependent representation required for the object.

    TAG     is an identification which will be used subsequently to refer to the descriptor.
    TEXT    is the source-language identifier given to the object (a null string if no identifier was specified).
    TYPE    is the type of the object: GENERAL, INTEGER, REAL, STRING, RECORD, LABEL, SWITCH, FORMAT.
    FORM    is one of: SIMPLE, NAME, ROUTINE, FN, MAP, PRED, ARRAY, NARRAY, ARRAYN, NARRAYN.
    SIZE    is either the TAG of the appropriate record format descriptor (for records), the maximum length of a string variable, or the precision of numerical variables: DEFAULT, BYTE, SHORT, LONG.
    SPEC    has the value SPEC or NONE depending on whether or not the item is a specification.
    PREFIX  is one of: NONE, OWN, CONST, EXTERNAL, SYSTEM, DYNAMIC, PRIM, PERM or SPECIAL. If SPECIAL is given there will follow an implementation-dependent specification of the properties of the object (such as that it is to be a register, for example).

Parameters and Formats

The parameters for procedures and the elements of record formats are defined by a list immediately following the procedure or format descriptor definition:

    START    Start of definition list
    FINISH   End of definition list
    ALTBEG   Start of alternative sequence
    ALT      Alternative separator
    ALTEND   End of alternative sequence

Blocks

    BEGIN    Start of BEGIN block
    END      End of BEGIN block or procedure

    PUSH     Push a copy of the descriptor onto the stack.
    PROC     The same as PUSH except that the descriptor being stacked represents a procedure which is about to be called (using ENTER).
    PUSHI    Push a descriptor for the integer constant onto the stack.
    PUSHR    Push a descriptor for the real (floating-point) constant onto the stack.
    PUSHS    Push a descriptor for the string constant onto the stack.
    SELECT   TOS will be a descriptor for a record. Replace this descriptor with one describing the sub-element of this record.

Assignment

    ASSVAL   Assign the value described by TOS to the variable described by SOS. Both TOS and SOS are popped from the stack.
    ASSREF   Assign a reference to (the address of) the variable described by TOS to the pointer variable described by SOS. Both TOS and SOS are popped from the stack.
    JAM      The same as ASSVAL except that the value being assigned will be truncated if necessary.
    ASSPAR   Assign the actual parameter described by TOS to the formal parameter described by SOS. This is equivalent to either ASSVAL (for value parameters) or ASSREF (for reference parameters).
    RESULT   TOS describes the result of the enclosing function. Following the processing of the result, code must be generated to return from the function.
    MAP      Similar to RESULT except that TOS describes the result of a MAP. Again a return must be generated.
    DEFAULT, INIT
             Create N data items corresponding to the last descriptor defined, and give them all an initial (constant) value. The constant is popped from the stack in the case of INIT, but DEFAULT causes the machine-dependent default value to be used (normally the UNASSIGNED pattern).

Binary operators

    ADD      Addition
    SUB      Subtraction
    MUL      Multiplication
    QUOT     Integer division
    DIVIDE   Real division
    IEXP     Integer exponentiation
    REXP     Real exponentiation
    AND      Logical AND
    OR       Logical inclusive OR
    XOR      Logical exclusive OR
    LSH      Logical left shift
    RSH      Logical right shift
    CONC     String concatenate
    ADDA     ++
    SUBA     --

The given operation is performed on TOS and SOS, both of which are removed from the stack, and the result (SOS op TOS) is pushed onto the stack, e.g.
    A = B-C

        PUSH A
        PUSH B
        PUSH C
        SUB
        ASSVAL

Unary Operators

    NEG      Negate (unary minus)
    NOT      Logical NOT (complement)
    MOD      Modulus (absolute value)

The given operation is performed on TOS.

Arrays

    DIM      The stack will contain pairs of descriptors corresponding to the lower and upper bounds for an array. This information is used to construct arrays and any necessary accessing information, for use through the last descriptors to have been defined. All of these descriptors will be for similar arrays.
    INDEX    SOS will be the descriptor for a multi-dimensional array and TOS will be the next (non-terminal) subscript. The stack is popped.
    ACCESS   SOS will be the descriptor of an array and TOS will be the final (or only) subscript. Both descriptors are replaced by a descriptor for the appropriate element of the array.

E.g. given arrays A(1:5) and B(1:4, 2:6), and integers J, K:

    A(J) = 0            K = B(J, K)

    PUSH A              PUSH K
    PUSH J              PUSH B
    ACCESS              PUSH J
    PUSHC 0             INDEX
    ASSVAL              PUSH K
                        ACCESS
                        ASSIGN

Internal labels

Internal labels are those labels in the intermediate code which have been created by the process of translating from the source program, and so do not appear explicitly in the source program. The main property of these labels is that they will only be referred to once. This fact can be used to re-use these labels, as, for example, a forward reference to a currently-defined label must cause its redefinition.

    LOCATE <1>   Define internal label <1>
    GOTO <1>     Forward jump to internal label <1>
    REPEAT <1>   Backward jump to internal label <1>

Conditional branches

These branches are always forward.

    JUMPIF

Sundry Items

    ON <1>       Start of event trap for events. Internal label <1> defines the end of the event block.
    EVENT        Signal event
    STOP         Stop
    MONITOR      Monitor
    RESOLVE      Perform a string resolution
    FOR          Start of a for loop
    SLABEL       Define switch label
    SJUMP        Select and jump to switch label
    LINE <1>     Set the current line number to <1>

Appendix A2  The IMP77 Intermediate Code

Internal representation

In production compilers the mnemonics used in the text are output in an abbreviated form, each mnemonic being translated into a single ASCII printing character:

    !  OR         G  ALIAS      c  MCODE
    "  JUMPIFD    H  BEGIN      d  DIM
    #  BNE        I  unused     e  EVENT
    $  DEF        J  JUMP       f  FOR
    %  XOR        K  FALSE      g  unused
    &  AND        L  LABEL      h  ALTBEG
    '  PUSHS      M  MAP        i  INDEX
    (  unused     N  PUSHI      j  JAM
    )  unused     O  LINE       k  RELEASE
    *  MUL        P  PLANT      l  LANG
    +  ADD        Q  DIVIDE     m  MONITOR
    -  SUB        R  RETURN     n  SELECT
    .  CONCAT     S  ASSVAL     o  ON
    /  QUOT       T  TRUE       p  ASSPAR
    :  LOCATE     U  NEGATE     q  ALTEND
    ;  END        V  RESULT     r  RESOLVE
    <  unused     W  SJUMP      s  STOP
    =  unused     X  IEXP       t  unused
    >  unused     Y  DEFAULT    u  ADDA
    ?
JUMPIF Z ASSREF v MOD @ PUSH [ LSH w SUBA A INIT \ NOT x REXP B REPEAT ] RSH y DIAG C JUMPIFA ^ PROC z CONTROL D PUSHR _ SLABEL { START E CALL a ACCESS | ALT F GOTO b BOUNDS } FINISH Appendix A3 Results from the INTERDATA 7/32 and PDP11 In these results the various test programs are referred to by the following codes: PDP11 7/32 Program ----- ---- ------- P11.1 732.1 TAKEON The compiler's grammar processor P11.2 732.2 EDWIN A graphics package P11.3 732.3 LAYOUT A text formatting program P11.4 732.4 ECCE A text editor P11.5 732.5 PILOT A CAI interpreter P11.6 732.6 TIMETAB A schools' timetable generator P11.7 732.7 DRAFT A draughts program P11.8 732.8 SQUARE A least-squares fitting program P11.9 732.9 GPM A macro processor P11.10 732.10 OS32MT An operating system emulator P11.11 732.11 HAL A high-level assembler P11.12 732.12 DIRECT A file and directory handler Remembering values in registers Code Total Incremental Size Reduction Reduction ---- --------- ----------- P732.1 0 uses 9504 - - 1 use 8194 13.8% 13.8% 2 uses 8192 13.8% 0.0% P732.2 0 uses 6500 - - 1 use 6126 5.8% 5.8% 2 uses 6126 5.8% 0.0% P732.3 0 uses 10960 1 use 9968 9.0% 9.0% 2 uses 9956 9.2% 0.2% P732.4 0 uses 5288 1 use 4970 6.0% 6.0% 2 uses 4958 6.2% 0.2% P732.5 0 uses 5468 1 use 4990 8.7% 8.7% 2 uses 4986 8.8% 0.1% P732.6 0 uses 3424 1 use 3208 6.3% 6.3% 2 uses 3208 6.3% 0.0% P732.7 0 uses 10736 1 use 9880 8.0% 8.0% 2 uses 9874 8.0% 0.0% P732.8 0 uses 824 1 use 770 6.6% 6.6% 2 uses 770 6.6% 0.0% P732.9 0 uses 6448 1 use 6148 4.6% 4.6% 2 uses 6148 4.6% 0.0% P732.10 0 uses 22968 1 use 20656 10.1% 10.1% 2 uses 20650 10.1% 0.0% P732.11 0 uses 13996 1 use 12470 10.9% 10.9% 2 uses 12442 11.1% 0.2% P732.12 0 uses 32600 1 use 28532 12.5% 12.5% 2 uses 28392 12.9% 0.4% Code Total Incremental Size Reduction Reduction ---- --------- ----------- P11.1 0 uses 9060 - 1 use 7712 14.9% 14.9% 2 uses 7660 15.4% 0.5% P11.2 0 uses 6276 - - 1 use 6000 4.4% 4.4% 2 uses 6000 4.4% 0.0% P11.3 0 uses 9992 - - 1 use 9480 5.1% 5.1% 2 uses 9444 5.5% 0.4% P11.4 0 uses 5052 - - 1 use 4772 5.4% 5.4% 2 uses 4768 5.6% 0.2% P11.5 0 uses 5096 - - 1 use 4460 12.5% 12.5% 2 uses 4452 12.6% 0.1% P11.6 0 uses 3692 - - 1 use 3064 17.0% 17.0% 2 uses 3064 17.0% 0.0% P11.7 0 uses 7976 - - 1 use 7060 11.5% 11.5% 2 uses 7032 11.8% 0.3% P11.8 0 uses 668 - - 1 use 652 2.4% 2.4% 2 uses 624 6.6% 4.2% P11.9 0 uses 4888 - - 1 use 4492 8.1% 8.1% 2 uses 4484 8.3% 0.2% P11.10 0 uses 20318 - - 1 use 19120 5.9% 5.9% 2 uses 19120 5.9% 0.0% P11.11 0 uses 12938 - - 1 use 12162 6.0% 6.0% 2 uses 12148 6.1% 0.1% P11.12 0 uses 12068 - - 1 use 10594 12.2% 12.2% 2 uses 10584 12.3% 0.0% Remembering sets of registers (environments) Code Total Incremental Size Reduction Reduction ---- --------- ----------- P732.1 0 environments 8556 - - 1 environment 8316 2.8% 2.8% 2 environments 8238 3.7% 0.9% 3 environments 8232 3.8% 0.1% 4 environments 8222 3.9% 0.1% 5 environments 8218 4.0% 0.1% 6 environments 8192 4.2% 0.2% P732.2 0 environments 6202 - - 1 environment 6128 1.2% 1.2% 2 environments 6130 1.2% 0.0% 3 environments 6126 1.2% 0.0% 4 environments 6126 1.2% 0.0% 5 environments 6126 1.2% 0.0% 6 environments 6126 1.2% 0.0% P732.3 0 environments 10174 - - 1 environment 10062 1.1% 1.1% 2 environments 9968 2.0% 0.9% 3 environments 9966 2.0% 0.0% 4 environments 9964 2.1% 0.1% 5 environments 9956 2.1% 0.1% 6 environments 9956 2.1% 0.1% P732.4 0 environments 5068 - - 1 environment 4978 1.8% 1.8% 2 environments 4958 2.2% 0.4% 3 environments 4958 2.2% 0.0% 4 environments 4958 2.2% 0.0% 5 environments 4958 
2.2% 0.0% 6 environments 4958 2.2% 0.0% P732.6 0 environments 3262 - - 1 environment 3250 0.4% 0.4% 2 environments 3216 1.4% 1.0% 3 environments 3208 1.7% 0.3% 4 environments 3208 1.7% 0.0% 5 environments 3208 1.7% 0.0% 6 environments 3208 1.7% 0.0% P732.7 0 environments 10062 - - 1 environment 9970 0.9% 0.9% 2 environments 9894 1.7% 0.8% 3 environments 9880 1.8% 0.1% 4 environments 9874 1.9% 0.1% 5 environments 9874 1.9% 0.0% 6 environments 9874 1.9% 0.0% P732.8 0 environments 806 1 environment 782 3.0% 3.0% 2 environments 782 3.0% 0.0% 3 environments 770 4.5% 1.5% 4 environments 770 4.5% 0.0% 5 environments 770 4.5% 0.0% 6 environments 770 4.5% 0.0% P732.9 0 environments 6244 1 environment 6202 0.7% 0.7% 2 environments 6156 1.4% 0.7% 3 environments 6158 1.4% 0.0% 4 environments 6148 1.5% 0.1% 5 environments 6148 1.5% 0.0% 6 environments 6148 1.5% 0.0% P732.10 0 environments 21214 1 environment 20928 1.3% 1.3% 2 environments 20748 2.2% 0.9% 3 environments 20678 2.5% 0.3% 4 environments 20678 2.5% 0.0% 5 environments 20668 2.6% 0.1% 6 environments 20650 2.6% 0.0% P732.11 0 environments 12772 1 environment 12592 1.4% 1.4% 2 environments 12486 2.2% 0.8% 3 environments 12472 2.3% 0.1% 4 environments 12460 2.4% 0.1% 5 environments 12452 2.5% 0.1% 6 environments 12442 2.6% 0.1% P732.12 0 environments 11522 1 environment 11418 0.9% 0.9% 2 environments 11342 1.6% 0.7% 3 environments 11314 1.8% 0.2% 4 environments 11314 1.8% 0.0% 5 environments 11296 2.0% 0.2% 6 environments 11296 2.0% 0.0% P11.1 0 environments 7686 - - 1 environment 7670 0.2% 0.2% 2 environments 7660 0.3% 0.1% 3 environments 7660 0.3% 0.0% 4 environments 7660 0.3% 0.0% 5 environments 7660 0.3% 0.0% 6 environments 7660 0.3% 0.0% P11.2 0 environments 6012 - - 1 environment 6000 0.2% 0.2% 2 environments 6000 0.2% 0.0% 3 environments 6000 0.2% 0.0% 4 environments 6000 0.2% 0.0% 5 environments 6000 0.2% 0.0% 6 environments 6000 0.2% 0.0% P11.3 0 environments 9472 1 environment 9440 0.3% 0.3% 2 environments 9444 0.3% -0.0% 3 environments 9444 0.3% 0.0% 4 environments 9444 0.3% 0.0% 5 environments 9444 0.3% 0.0% 6 environments 9444 0.3% 0.0% P11.4 0 environments 4784 0.2% 0.2% 1 environment 4776 0.2% 0.0% 2 environments 4776 0.2% 0.0% 3 environments 4776 0.2% 0.0% 4 environments 4776 0.2% 0.0% 5 environments 4772 0.2% 0.0% 6 environments 4768 0.3% 0.1% P11.5 0 environments 4512 1 environment 4464 1.1% 1.1% 2 environments 4456 1.2% 0.1% 3 environments 4452 1.3% 0.1% 4 environments 4452 1.3% 0.0% 5 environments 4452 1.3% 0.0% 6 environments 4452 1.3% 0.0% P11.6 0 environments 3076 - - 1 environment 3070 0.2% 0.2% 2 environments 3064 0.4% 0.2% 3 environments 3064 0.4% 0.0% 4 environments 3064 0.4% 0.0% 5 environments 3064 0.4% 0.0% 6 environments 3064 0.4% 0.0% P11.7 0 environments 7104 - - 1 environment 7048 0.8% 0.8% 2 environments 7048 0.8% 0.0% 3 environments 7048 0.8% 0.0% 4 environments 7048 0.8% 0.0% 5 environments 7048 0.8% 0.0% 6 environments 7032 1.0% 0.2% P11.8 0 environments 640 - - 1 environment 624 2.5% 2.5% 2 environments 624 2.5% 0.0% 3 environments 624 2.5% 0.0% 4 environments 624 2.5% 0.0% 5 environments 624 2.5% 0.0% 6 environments 624 2.5% 0.0% P11.9 0 environments 4492 - - 1 environment 4484 0.2% 0.2% 2 environments 4484 0.2% 0.0% 3 environments 4484 0.2% 0.0% 4 environments 4484 0.2% 0.0% 5 environments 4484 0.2% 0.0% 6 environments 4484 0.2% 0.0% P11.10 0 environments 19332 - - 1 environment 19196 0.7% 0.7% 2 environments 19158 0.9% 0.2% 3 environments 19138 1.0% 0.1% 4 environments 19138 1.0% 0.0% 5 environments 
19120 1.1% 0.1% 6 environments 19120 1.1% 0.0% P11.11 0 environments 12280 - - 1 environment 12200 0.6% 0.6% 2 environments 12168 0.9% 0.3% 3 environments 12160 1.0% 0.1% 4 environments 12156 1.0% 0.0% 5 environments 12148 1.1% 0.1% 6 environments 12148 1.1% 0.0% P11.12 0 environments 10690 - - 1 environment 10616 0.7% 0.7% 2 environments 10604 0.8% 0.1% 3 environments 10604 0.8% 0.0% 4 environments 10594 0.9% 0.1% 5 environments 10584 1.0% 0.1% 6 environments 10584 1.0% 0.0% Simple allocation of arrays and remembering subscripts Allocation Remembering Neither Simple (gain) Subscripts (gain) ------- ------ ------ ----------- ------ P732.1 8596 8476 (1.4%) 8312 (3.3%) P732.2 6126 6126 (0.0%) 6126 (0.0%) P732.3 10450 10114 (3.2%) 10426 (0.2%) P732.4 5056 4958 (1.9%) 5056 (0.0%) P732.5 5306 5054 (4.7%) 5308 -(0.0%) P732.6 3384 3254 (3.8%) 3386 -(0.0%) P732.7 10346 10112 (2.3%) 10344 (0.0%) P732.8 806 806 (0.0%) 770 (4.5%) P732.9 6138 6138 (0.0%) 6148 -(0.2%) P732.10 20806 20684 (0.6%) 20776 (0.1%) P732.11 12442 12442 (0.0%) 12442 (0.0%) P732.12 11976 11946 (0.2%) 11326 (5.4%) Both optimisations Total gain ------------------ ---------- P732.1 8192 4.7% P732.2 6126 0.0% P732.3 9956 4.7% P732.4 4958 1.9% P732.5 4986 6.0% P732.6 3208 5.2% P732.7 9874 4.6% P732.8 770 4.5% P732.9 6148 -0.2% P732.10 20650 0.7% P732.11 12442 0.0% P732.12 11296 5.8% Allocation Remembering Neither Simple (gain) Subscripts (gain) ------- ------ ------ ---------- ------ P11.1 8572 8188 (4.5%) 7704 (10.1%) P11.2 6000 6000 (0.0%) 6000 (0.0%) P11.3 9764 9556 (2.1%) 9644 (1.2%) P11.4 4848 4776 (1.5%) 4848 (0.0%) P11.5 4656 4568 (1.9%) 4452 (4.4%) P11.6 3356 3202 (4.6%) 3218 (4.1%) P11.7 7844 7728 (1.4%) 7204 (8.2%) P11.8 644 624 (3.1%) 644 (0.0%) P11.9 4796 4796 (0.0%) 4484 (6.5%) P11.10 19236 19140 (0.5%) 19216 (0.1%) P11.11 12148 12148 (0.0%) 12148 (0.0%) P11.12 11094 11060 (0.3%) 10616 (4.3%) Both optimisations Total gain ------------------ ---------- P11.1 7660 10.6% P11.2 6000 0.0% P11.3 9444 3.3% P11.4 4768 1.6% P11.5 4452 4.4% P11.6 3064 8.7% P11.7 7032 10.4% P11.8 624 3.1% P11.9 4484 6.5% P11.10 19120 0.6% P11.11 12148 0.0% P11.12 10584 4.6% Simplifying: X = X op Y Code Code Without With Gain ------- ---- ---- P732.1 8292 8192 1.2% P732.2 6156 6126 0.5% P732.3 10068 9956 1.1% P732.4 5088 4958 2.6% P732.5 5180 4986 3.7% P732.6 3368 3208 4.8% P732.7 11438 11296 1.2% P732.8 772 770 0.2% P732.9 6214 6148 1.1% P732.10 21086 20650 2.1% P732.11 12590 12442 1.2% P732.12 11438 11296 1.2% P11.1 8284 7660 7.5% P11.2 6220 6000 3.5% P11.3 10040 9444 5.9% P11.4 5136 4768 7.2% P11.5 4800 4452 7.2% P11.6 3342 3064 8.3% P11.7 7596 7032 7.4% P11.8 668 624 6.6% P11.9 4724 4484 5.1% P11.10 20634 19128 7.3% P11.11 12892 12148 5.8% P11.12 11492 10584 7.9% Passing some parameters in registers Code Total Incremental Size Reduction Reduction ---- --------- ----------- P732.1 0 registers 8862 1 register 8360 5.7% 5.7% 2 registers 8192 7.6% 1.9% P732.2 0 registers 7196 1 register 6544 9.1% 9.1% 2 registers 6126 14.9% 5.8% P732.3 0 registers 10586 1 register 9976 5.8% 5.8% 2 registers 9956 6.0% 0.2% P732.4 0 registers 5126 1 register 4958 3.3% 3.3% 2 registers 4958 3.3% 0.0% P732.5 0 registers 5198 1 register 5022 3.4% 3.5% 2 registers 4986 4.1% 0.7% P732.6 0 registers 3402 1 register 3222 5.3% 5.3% 2 registers 3208 5.7% 0.4% P732.7 0 registers 10400 - 1 register 10048 3.4% 3.4% 2 registers 9874 5.0% 1.6% P732.8 0 registers 840 - 1 register 810 3.6% 3.6% 2 registers 770 8.3% 4.7% P732.9 0 registers 6404 - - 1 register 6172 3.6% 3.6% 2 registers 
              6148       4.0%      0.4%
P732.10   0 registers   21650        -         -
          1 register    20826      3.8%      3.8%
          2 registers   20650      4.6%      0.8%
P732.11   0 registers   13476        -         -
          1 register    12442      7.7%      7.7%
          2 registers   12442      7.7%      0.0%
P732.12   0 registers   11916        -         -
          1 register    11452      3.9%      3.9%
          2 registers   11296      5.2%      1.3%

Code                     Size      Total      Incremental
                                   Reduction  Reduction
----                     ----      ---------  -----------
P11.1     0 registers    7796        -         -
          1 register     7756      0.5%      0.5%
          2 registers    7660      1.7%      1.2%
P11.2     0 registers    6192        -         -
          1 register     6072      1.9%      1.9%
          2 registers    6000      3.1%      1.2%
P11.3     0 registers    9564        -         -
          1 register     9448      1.2%      1.2%
          2 registers    9444      1.2%      0.0%
P11.4     0 registers    4776        -         -
          1 register     4768      0.2%      0.2%
          2 registers    4768      0.2%      0.0%
P11.5     0 registers    4508        -         -
          1 register     4452      1.2%      1.2%
          2 registers    4452      1.2%      0.0%
P11.6     0 registers    3098        -         -
          1 register     3064      1.1%      1.1%
          2 registers    3064      1.1%      0.0%
P11.7     0 registers    7124        -         -
          1 register     7096      0.4%      0.4%
          2 registers    7032      1.3%      0.9%
P11.8     0 registers     624        -         -
          1 register      624      0.0%      0.0%
          2 registers     624      0.0%      0.0%
P11.9     0 registers    4520        -         -
          1 register     4488      0.7%      0.7%
          2 registers    4484      0.8%      0.1%
P11.10    0 registers   19302        -         -
          1 register    19166      0.7%      0.7%
          2 registers   19128      0.9%      0.2%
P11.11    0 registers   12364        -         -
          1 register    12152      1.7%      1.7%
          2 registers   12148      1.7%      0.0%
P11.12    0 registers   10734        -         -
          1 register    10648      0.8%      0.8%
          2 registers   10584      1.4%      0.6%

Remembering condition-codes

           Unknown    Remembered    Gain
           -------    ----------    ----
P732.1       8820        8192       0.3%
P732.2       6134        6126       0.1%
P732.3       9976        9956       0.2%
P732.4       4968        4958       0.2%
P732.5       4988        4986       0.0%
P732.6       3212        3208       0.1%
P732.7       9880        9874       0.1%
P732.8        770         770       0.0%
P732.9       6150        6148       0.0%
P732.10     20684       20650       0.2%
P732.11     12474       12442       0.2%
P732.12     11318       11296       0.2%
P11.1        7732        7660       0.9%
P11.2        6012        6000       0.3%
P11.3        9516        9444       0.8%
P11.4        4792        4768       0.5%
P11.5        4452        4452       0.0%
P11.6        3076        3064       0.4%
P11.7        7064        7032       0.4%
P11.8         624         624       0.0%
P11.9        4496        4484       0.3%
P11.10      19204       19128       0.4%
P11.11      12192       12148       0.4%
P11.12      10626       10584       0.4%

Forward merging and delaying

           Neither    Forward Merge    Delaying       Merge & Delay
           -------    -------------    --------       -------------
P732.1       8192      8172 (0.2%)     8160 (0.4%)     8136 (0.7%)
ALL optimisations None All Gain ---- --- ---- P732.1 11300 8136 28.0% P732.2 7520 6044 19.6% P732.3 12286 9864 19.7% P732.4 5782 4942 14.5% P732.5 6204 4966 19.9% P732.6 4004 3122 22.0% P732.7 11750 9812 16.5% P732.8 988 728 26.3% P732.9 6848 6120 10.6% P732.10 24722 20490 17.1% P732.11 14618 12306 15.8% P732.12 14064 11256 20.0% P11.1 9664 7660 20.7% P11.2 6588 5988 9.1% P11.3 11092 9434 14.9% P11.4 5540 4768 13.9% P11.5 5572 4448 20.2% P11.6 3666 3060 16.5% P11.7 8632 7004 18.7% P11.8 752 616 18.1% P11.9 5256 4484 14.7% P11.10 23940 19128 20.1% P11.11 14328 12124 15.4% P11.12 12816 10584 17.8% CPU time in input/output (as percentages of compile time) Phase 1 Phase 2 Phase 3 ---------- ---------- ---------- Total % Total % Total % CPU I/O CPU I/O CPU I/O ----- --- ----- --- ----- --- All optimisations: P732.1: 53% 8% 32% 6% 14% 5% P732.2: 53% 8% 35% 8% 12% 5% P732.3: 51% 7% 34% 7% 15% 5% P732.4: 54% 8% 32% 6% 14% 5% P732.5: 49% 12% 36% 5% 14% 5% P732.6: 50% 7% 37% 7% 12% 6% P732.7: 48% 7% 39% 7% 13% 7% P732.8: 54% 7% 36% 7% 10% 5% P732.9: 50% 8% 36% 8% 14% 6% P732.10: 54% 6% 28% 5% 17% 4% P732.11: 52% 7% 32% 6% 16% 5% P732.12: 52% 8% 32% 8% 17% 7% No optimisation: P732.1 50% 7% 30% 7% 19% 8% P732.2 55% 8% 31% 9% 14% 8% P732.3 54% 8% 28% 8% 18% 6% P732.4 57% 8% 26% 7% 16% 6% P732.5 52% 8% 30% 8% 17% 7% P732.6 53% 8% 32% 9% 15% 8% P732.7 49% 7% 31% 8% 19% 8% P732.8 56% 7% 32% 9% 12% 6% P732.9 55% 9% 29% 8% 16% 7% P732.10 55% 6% 23% 6% 21% 5% P732.11 55% 8% 26% 6% 20% 6% P732.12 54% 8% 27% 8% 19% 8% Overall CPU input/output Internal I/O = communication between phases. External I/O = source input & object file output. Internal I/O External I/O ------------ ------------ No Opts. All Opts. No Opts. All Opts. -------- --------- -------- --------- P732.1 16% 13% 7% 7% P732.2 16% 16% 8% 7% P732.3 15% 13% 7% 6% P732.4 14% 13% 7% 7% P732.5 16% 12% 7% 6% P732.6 17% 14% 8% 7% P732.7 17% 15% 6% 6% P732.8 18% 16% 5% 4% P732.9 16% 15% 8% 7% P732.10 13% 11% 6% 6% P732.11 12% 11% 6% 6% P732.12 16% 15% 8% 7% References Aho, A.V. & Ullman, J.P. (1972) "Optimisation of straight line programs". SIAM J. March 1972. Allen, F.E. & Cocke, J.A. (1971) "A catalogue of optimising techniques". Design and optimisation of compilers Prentice-Hall, R. Rustin (ed). 1971. Basili, V.R. & Turner, A.J. (1975) "A transportable extendable compiler". Software - Practice and Experience, Vol 5, 1975. Bell, G. (1973) "Threaded Code". CACM Vol 16, No 6, 1973. Berry, R.E. (1978) "Experience with the PASCAL P-compiler". Software - Practice and Experience Vol 8, No 5, 1978. Berthaud, M. & Griffiths, M. (1973) "Incremental compilation and conversational interpretation". Annual review in automatic programming, Vol 7, Part 2, 1973. Bourne, S.R., Birrell, A.D. & Walker, I.W. (1975) "Z code, an intermediate object code for ALGOL68". The Computer Laboratory, Cambridge, 1975. Branquart, P., Cardinael, I. & Lewi, J. (1973) "Optimised translation process, application to ALGOL68". Int. Comp. Symp. 1973. Bron, C. & De Vries, W. (1976) "A PASCAL compiler for PDP11 minicomputers". Software - Practice and Experience Vol 6, 1976. Brooker, R.A. et al. "Experience with the compiler-compiler". Comp. J. Vol 9, 1967. Brown, P. (1972) "Levels of language for portable software". CACM. 15, 1972. Brown, P. (1977) "Macro processors". Software Portability (ed. P. Brown). Cambridge University Press, 1977. Buckle, J.K. (1978) "The ICL 2900 Series". Macmillan Press Ltd., 1978. Cocke, J. (1970) "Global common subexpression elimination". 
ACM SIGPLAN Notices, Vol 5, No 7, 1970.

Cocke, J. & Schwartz, J.T. (1970) "Programming languages and their compilers". Courant Institute of Mathematical Sciences, New York University, 1970.

Coleman, S.S., Poole, P.C. & Waite, W.M. (1974) "The Mobile programming system, JANUS". Software - Practice and Experience, Vol 4, 1974.

Dewar, H. (1975) "The ISYS system". Internal memo, Department of Computer Science, University of Edinburgh, 1975.

Feldman, J.A. (1966) "A formal semantics for computer languages and its application in a compiler-compiler". CACM 9, 1966.

Feldman, J.A. & Gries, D. (1968) "Translator writing systems". CACM 11, 1968.

Gardner, P. (1977) "An ALGOL68C bootstrap". Memo CSM-20, University of Essex, 1977.

Gilmore, B.A.C. (1980) "DEIMOS User's Manual". Edinburgh Regional Computing Centre.

Grosse-Lindemann, C.O. & Nageli, H.H. (1976) "Postlude to a PASCAL compiler bootstrap on the DEC system 10". Software - Practice and Experience, Vol 6, 1976.

Haddon, B.K. & Waite, W.M. (1978) "Experience with the Universal Intermediate language JANUS". Software - Practice and Experience, Vol 8, No 5, 1978.

Halpern, M.I. (1965) "Machine Independence: its technology and economics". CACM Vol 8, No 12, 1965.

Hawkins, E.N. & Huxtable, D.H.R. (1963) "A multi-pass translation scheme for Algol 60". Annual review in automatic programming, No 3, 1963.

Hecht, M.S. & Ullman, J.D. (1975) "A simple algorithm for global data flow analysis problems". SIAM J. Vol 4, 1975.

Huxtable, D.H.R. (1964) "On writing an optimising translator for Algol 60". Introduction to System Programming, P. Wegner (ed.), Academic Press, 1964.

Interdata (1974) "Common Assembler Language (CAL) User's Manual". Interdata publication number 29-375R04, 1974.

Jensen, K. & Wirth, N. (1976) "PASCAL User Manual and Report". Springer-Verlag, 1976.

Kildall, G.A. (1973) "A unified approach to global program optimisation". ACM Principles of programming languages, 1973.

Klint, P. (1979) "Line numbers made cheap". CACM Vol 22, No 10, 1979.

Knuth, D.E. (1962) "History of writing compilers". Proc. ACM 17, 1962.

Knuth, D.E. (1971) "An empirical study of FORTRAN programs". Software - Practice and Experience, Vol 1, 1971.

Knuth, D.E. (1974) "Structured programming with GOTO statements". ACM Computing Surveys 6, No 4, 1974.

Krogdahl, S., Myhre, O., Robertson, P.S. & Syrrist, G. (1980) "The SCALA project: S-PORT". Norwegian Computing Centre working paper.

Lecarme, O. & Peyrolle-Thomas, M. (1978) "Self-compiling compilers: An appraisal of their implementation and portability". Software - Practice and Experience, Vol 8, No 2, 1978.

Lowry, E.S. & Medlock, C.W. (1969) "Object code optimisation". CACM Vol 12, 1969.

McKeeman, W.M. (1965) "Peephole Optimisation". CACM Vol 8, 1965.

Mock, O., Olsztyn, J. et al. (1958) "The problem of programming communication with changing machines". CACM Vol 1, No 8, 1958.

Nelson, P.A. (1979) "A comparison of PASCAL intermediate languages". ACM SIGPLAN Notices, Vol 14, No 8, 1979.

Nori, K.V., Ammann, U. et al. (1974 & 1976) "The PASCAL P-compiler: Implementation notes". Eidgenossische Technische Hochschule, Zurich, 1976.

Pavelin, C.J. (1970) "The improvement of program behaviour in paged computer systems". Ph.D. Thesis, University of Edinburgh, 1970.

Poole, P. (1974) "Portable and adaptable compilers". Advanced course on compiler construction, Springer-Verlag, 1974.

Richards, M. (1971) "The portability of the BCPL compiler". Software - Practice and Experience, Vol 1, 1971.

Richards, M. (1977) "Portable Compilers". Software Portability (ed. P. Brown), Cambridge University Press, 1977.

Robertson, P.S. (1977) "The IMP-77 Language". Internal Report CSR-19-77, Department of Computer Science, University of Edinburgh, 1977.

Russell, W. (1974) "A translator for PASCAL P-code". Final year project report, Department of Computer Science, University of Edinburgh, 1974.

Satterthwaite, E. (1972) "Debugging tools for high-level languages". Software - Practice and Experience, Vol 2, 1972.

Schneck, P.B. & Angel, E. (1973) "A FORTRAN to FORTRAN optimising compiler". Computer Journal Vol 16, 1973.

Sibley, R.A. (1961) "The SLANG System". CACM Vol 4, 1961.

Spier, M.J. (1976) "Software Malpractice - A distasteful experience". Software - Practice and Experience, Vol 6, 1976.

Steel, T.B. (1961) "A first version of UNCOL". Proc. AFIPS WJCC 19, 1961.

Stephens, P.D. (1974) "The IMP language and compiler". Computer Journal Vol 17, 1974.

Stockton-Gaines, R. (1965) "On the translation of machine language programs". CACM Vol 8, No 12, 1965.

Szymanski, T.G. (1978) "Assembling code for machines with span-dependent instructions". CACM 21, 1978.

Tanenbaum, A.S. (1976) "Structured Computer Organisation". Prentice/Hall International, 1976.

Thompson, K. & Ritchie, D.M. (1974) "The UNIX timesharing system". CACM Vol 17, No 7, 1974.

Trout, R.G. (1967) "A compiler-compiler system". Proc. ACM, 1967.

Waite, W.M. (1970) "The Mobile Programming System: STAGE 2". CACM Vol 13, 1970.

Welsh, J. & Quinn, C. (1972) "A PASCAL compiler for the ICL 1900 series". Software - Practice and Experience, Vol 2, 1972.

Whitfield, H. & Wight, A.S. (1973) "EMAS - The Edinburgh Multi-Access System". Computer Journal Vol 16, 1973.

Wichmann, B.A. (1977) "Optimisation". Software Portability (ed. P. Brown), Cambridge University Press, 1977.

Wichmann, B.A. (1977) "How to call procedures, or second thoughts on Ackermann's function". Software - Practice and Experience, Vol 7, No 3, 1977.

Williams, M.H. (1978) "Long/short address optimisation in assemblers". Software - Practice and Experience, Vol 9, No 3, 1978.

Wulf, W., Johnsson, R.K., Weinstock, C.B., Hobbs, S.O. & Geschke, C.M. (1975) "The design of an optimising compiler". Elsevier Computer Science Library, 1975.