RELEASE NOTES
=============

The imp77 implementation supplied by i2c (a back-end for the Imp77 portable compiler's intermediate code, which generates C code rather than architecture-specific machine code) is a prototype release with some known limitations. A second implementation is under development which will smooth out some of the 'rough edges' of this prototype, which was written to gain experience in using C as a portable back-end for Imp77.

This document lists the known issues with the current implementation, to allow a small group of trusted users to explore the limitations of this method and steer the course of the reimplementation. This code is not intended for public release other than to these trusted alpha testers.

* The associated built-in support routines (sometimes known as 'perms') are not fully developed; they are a placeholder, so errors caused by incomplete or faulty library calls are to be expected. Feedback on specifically which calls require improvement is welcome, more so if accompanied by corrected code. (For example I noticed today that some string handling code was not setting the length correctly. You should expect to come across several problems of this nature...)

* Imp77 is case-insensitive but Linux external linking is case-sensitive, and the implementation relies heavily on this: all routines in Imp77 appear as upper case to non-Imp77 code. If linking with system code, external references to procedures in other languages (such as C) should be specified using Imp77's '%alias' mechanism, giving the C-compatible name in the alias, in lower-case.
  For example,

      %externalroutinespec sleep %alias "sleep" (%integer seconds)

  would be necessary to access the Linux library routine of that name, whereas

      %externalroutine sleep (%integer seconds)

  would define a routine written in Imp77 which a C program (which is case-sensitive) would have to specify as

      extern void SLEEP(int seconds);

* Be aware that C does not define the order in which the parameters to a procedure are evaluated (in practice compilers often evaluate them from right to left); Imp77 is defined to evaluate parameters from left to right. This does not affect the way that external procedures are specified in Imp77, it is just a difference in the way the languages work that C programmers writing in Imp77 should be aware of. External procedures are specified in the same left to right order in both languages. For example, the C declaration:

      extern int rename(const char *oldpath, const char *newpath);

  might be declared in an Imp77 header file as:

      %external %integer %fn %spec rename %alias "rename" (%byte %array %name oldpath, %byte %array %name newpath)

* To make interfaces to operating system calls easier to write, they may be specified using "%system" rather than "%external". This will handle the case issues for you and avoid the need to use "%alias". For example, the above becomes:

      %system %integer %fn %spec rename(%byte %array %name oldpath, %byte %array %name newpath)

  In fact here's a complete program that calls a Linux routine:

      %system %integer %fn %spec getloadavg (%long %real %array %name loadavg, %integer nelem)
      %begin
        %integer rc
        %long %real %array LA(0:2)
        rc = getloadavg (LA, 3)
        print string("Load average: 1 min: ")
        print(LA(0), 4, 4)
        print string(" 5 mins: ")
        print(LA(1), 4, 4)
        print string(" 15 mins: ")
        print(LA(2), 4, 4)
        newline
      %endofprogram

* This implementation does not pass dope-vectors for own/const/external arrays (which are always 1-D) - the address of the notional element(0) is passed as the apparent base of the array, even if declared as an array with a non-zero lower bound.
  This means that within the called procedure, indexing works correctly (e.g. if %integerarray letter('A':'Z') is passed to "DO SOMETHING(letter)", then within the code of "DO SOMETHING", letter('A') will access the correct element as expected). However what will *not* be available is run-time checking that an array index is within the declared range. This is a shortcoming that will be corrected in the next release.

* It's possible that arrays of higher dimensions (e.g. 2D arrays) may have some issues, but for now they seem to be working, albeit with similar restrictions to the 1D case above (although the mechanism used is somewhat different in the C translation). This too is a shortcoming that should be corrected in the next release.

  Unlike ERCC Imp which requires a runtime dope vector (and which supports the ARRAY() map), Imp77 was designed, if I remember right, to be implementable without runtime dope-vectors. Unfortunately this meant that ARRAY() maps were not possible in Imp77, which restricted some of the things that could be expressed in Imp77, although I believe that like other Imp features from earlier implementations which were dropped from the language, they were considered not worth preserving because they were seldom if ever used (dynamic array bounds within records being one such feature, for example).

* Routines and functions passed as parameters to other routines or functions are not yet implemented. They will fail at compile time. This code is currently being added.

* Variant records (record formats using "%or" to overlay fields in store) are not fully supported. For now the overlaid fields are accepted but are treated as separate variables and not overlaid.
  If these were used in an Imp77 program simply to save space at runtime, the program should behave correctly; if they were used to deliberately pun on the storage (for example by writing to a %real field and reading the internal representation of that %real variable back as an integer by reading from an overlaid %integer field) then the code will not behave as intended. This is not just a lack of the feature being implemented; it is a subtle issue with the modern definition of the C language that the code is translated into. This will eventually be implemented.

* A %label declaration within a routine or function will cause a compile-time error in the generated C code, unless option --noline is specified. This is a minor niggle which unfortunately requires a disproportionate amount of coding to fix. (It has to do with the order of code vs declarations in C. The C code is currently generated in a single pass, and the fix requires code shuffling, i.e. two passes, or a second phase with the entire program being held in memory, which means a major restructuring of the code generator.) So use the --noline workaround for now until a major code refactoring can be performed.

* The translation of Imp variables to C in the current implementation is not centralised in a single place as it should have been, so several edge cases which work in the general case of expressions fail in some specific circumstances. The cause of these failures is well understood and the necessary restructuring of code is underway, but that rewrite is a big task and, at the time of writing, may take some weeks to complete. Fortunately the few places where these errors occur are all ones where the generated C code causes compile-time errors, so they are usually easily found. These errors are mostly C translation problems where indirection is involved, either "*" or "&", in addition to a few cases where data is cast inappropriately.
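To illustrate the storage-punning case from the variant-record note above, here is a minimal sketch in plain C of two ways to read the bits of a real as an integer. It is not code generated by i2c. The names float_bits, float_bits_union and real_or_int are hypothetical, a 32-bit IEEE 754 float is assumed, and connecting the 'subtle issue' to C's aliasing rules is an assumption rather than something these notes state.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration only: assumes float is 32-bit IEEE 754. */

/* Copying the object representation with memcpy is well defined in
   every C standard and is not affected by aliasing rules. */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* A C union is the closest analogue of an overlaid record field.
   Reading a member other than the one last written reinterprets the
   stored bytes; GCC documents this as supported. Doing the same thing
   through a cast pointer, e.g. *(uint32_t *)&f, breaks the aliasing
   rules and may be miscompiled when optimisation is enabled. */
typedef union {
    float    r;   /* plays the part of the %real field */
    uint32_t i;   /* plays the part of the overlaid %integer field */
} real_or_int;

uint32_t float_bits_union(float f)
{
    real_or_int v;
    v.r = f;
    return v.i;
}
```

Both functions return the same value for any given input; the memcpy form is the more conservative choice if the C code might ever be built with a compiler other than gcc.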
* %string variables currently always have capacity for 255 characters. Proper string length enforcement is pending. Most string handling works as expected at the cost of using excessive space, which is not really an issue in a modern environment given the size of all existing historical Imp77 programs.

* Be aware that Imp77 does not add a terminating '\0' to strings in the style of C, so interfacing with external C code will often require a 'shim' procedure to be written to append a '\0' to a string, as well as adjusting the address of a string (to skip the preceding length byte) when passing strings to C procedures.

* Imp I/O tries to follow the Linux model while retaining compatibility with original-era behaviour. Stream 0 is predefined to be the console and stream 1 is standard input and standard output.

* The common Imp "CLI PARAM" is supported but Linux-style 'ARGC' and 'ARGV(n)' are also available as built-in functions. Users should explicitly SELECT INPUT 0 or 1 at the start of a program, and of course explicit opening and selection of other streams with "OPEN INPUT" etc should work too.

* The perms supplied are a bit of a mish-mash of various Imp systems (not just Imp77) to increase the chances of being able to compile old code. Let me know if there is something missing which you think should be present. Unwanted perms should not be a problem because they are declared as "%perm", which is a level of scope above even that of "%external", so any external procedures you may write which happen to have the same name as a perm routine should not cause a linker clash.
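For the string-interfacing note above (a leading length byte, no terminating '\0'), a C-side shim could look something like the following sketch. The imp_string layout and the function names are hypothetical, chosen to match the description in these notes (one length byte followed by up to 255 characters); they are not taken from perms.c.

```c
#include <stddef.h>
#include <string.h>

/* Assumed layout of an Imp77 %string as seen from C: one length byte
   followed by up to 255 characters, with no terminating '\0'.
   Hypothetical type name, for illustration only. */
typedef struct {
    unsigned char length;
    char chars[255];
} imp_string;

/* Copy an Imp77-style string into a caller-supplied buffer and append
   the '\0' that C library routines expect. The result is truncated if
   the buffer is too small. Returns the number of characters copied,
   not counting the '\0'. */
size_t imp_to_cstring(const imp_string *s, char *buf, size_t bufsize)
{
    size_t n;
    if (bufsize == 0) return 0;
    n = s->length;
    if (n > bufsize - 1) n = bufsize - 1;
    memcpy(buf, s->chars, n);   /* s->chars skips the length byte */
    buf[n] = '\0';
    return n;
}

/* The reverse direction: build an Imp77-style string from a C string,
   truncating at 255 characters. */
void cstring_to_imp(const char *c, imp_string *s)
{
    size_t n = strlen(c);
    if (n > 255) n = 255;
    s->length = (unsigned char)n;
    memcpy(s->chars, c, n);
}
```

A caller would typically convert into a local 256-byte buffer just before calling the C routine, which is large enough for any string under the current fixed 255-character capacity.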
* As with historical Imp implementations, perm routines (which can potentially be implemented by the compiler using open inline code) do not necessarily have an actual callable procedure body, and therefore cannot be passed as parameters to other routines (when that facility is implemented!); however the standard Imp solution of writing a shim procedure to call a perm routine does work, and adds no runtime overhead. For example:

      %routine my write(%integer n, p)
        write(n, p)
      %end

* An approximation to Imp77's "%on %event" mechanism is supplied. It is not as extensive as traditional implementations, and although it will mostly work, there are some corner cases where we know it will not work (such as resignalling a caught exception and working back up a chain of nested procedure calls). It almost certainly does not currently undo event blocks correctly when the enclosing procedure returns. That code has not yet been written.

CALLING I2C
===========

The user interface to the i2c command is under development - the initial command that is supplied is "i2c" but the command-line options of that command are somewhat limited. A replacement driver for the compiler, which behaves much more like the C compiler's "cc" command, is under development. That newer version is currently called "i77" and is included in the distribution, and it calls "./i2c" to do the translation from Imp to C before invoking gcc to compile to an object or an executable file.

Users should understand that i2c is an Imp77 compiler which uses C as an intermediate step in producing a binary - it is *not* a source-to-source translator whose output is meant to be user-maintainable C; indeed the C code will normally be deleted as soon as the Imp77 program is compiled (though during development that is not being done).
If you need a high-level, maintainable translation of old Imp code to C, that requires a different project, "imptoc", which is not complete and which will not be worked on until i2c is fully released.

The i2c environment is expected to be Linux (using gcc). The suite may work on other systems (such as Windows) as long as they support gcc, but no support for those environments is envisaged at this point. Note that true gcc is a hard requirement - using clang in gcc-compatible mode is not an option as clang does not support nested procedure definitions and never will.

i2c programs currently require perms.c to be compiled and linked with them, and the Imp77 %include file "perms.inc" must be present in the 'current directory' when i2c is invoked. The new i77 command will fix these awkward restrictions when it is completed.

PORTABILITY
===========

The code compiles on both 32 bit and 64 bit Intel, and 32 bit and 64 bit ARM. Compiling on 64 bit Intel requires the "-m32" flag to gcc. For some reason compiling on 64 bit ARM works with the default options, and I'm not sure why! On 32 bit ARM you'll see a warning about:

    void *caller_addr = __builtin_return_address(1);

- ignore it. I know what it is.

Several programs actually run on 64 bit ARM but that's primarily just luck; as we've discussed on our mailing list, 64 bit addressing is problematic for Imp77 and the i2c implementation makes no effort to be 64 bit compatible, so do not expect code that makes use of ADDR() to work. However there is a way to run correctly on aarch64 every time: the following commands will install 32-bit compatible binaries and runtime on a Pi 4 or Pi 5...

    sudo apt install gcc-arm-linux-gnueabihf
    sudo dpkg --add-architecture armhf
    sudo apt-get update
    sudo apt-get install libc6:armhf

then change the "gcc" command in the makefile for i2c to "arm-linux-gnueabihf-gcc -std=c99"

At the moment I am not interested in Windows or OpenBSD testing. I already know it's going to be problematical.
This is a Linux suite for now. I *am* interested in any problems that show up under 32 bit ARM, which has not been significantly tested yet but in principle ought to work.

Here's an idea of how to download and test on a 32-bit system.

    mkdir -p ~/src/i2c-tmp
    cd ~/src/i2c-tmp
    wget http://www.gtoal.com/history.dcs.ed.ac.uk/archive/languages/imp77-tmp/gtoal/i2c.zip
    unzip i2c
    make
    ./REGRESSION-COMPILE.sh
    ./REGRESSION-RUN.sh

(The programs in REGRESSION-COMPILE.sh have not been ported and probably won't run; they're just for me to spot changes in the generated C output when I make changes to i2c. The programs in REGRESSION-RUN.sh, by contrast, should actually run.)

DIAGNOSTICS
===========

i2c attempts to provide the same level of runtime checking that historical Imp compilers offered; however those features have to be implemented somewhat differently to how a true compiler that outputs binary files directly would handle diagnostics. Some checks are handled by explicit tests within the generated C code, others rely on Linux mechanisms such as 'valgrind' and 'gdb' and the various "sanitize" options to the gcc compiler which enable specific checks.

Sometimes the valgrind checks are over-enthusiastic, and definitely confusing; so the Imp77 programs sometimes have to be run without those checks enabled (using the --nocheck option). Valgrind also adds a significant runtime overhead compared to the low overhead that traditional Imp compilers typically imposed. That said, all checks are on by default. Tuning the checks for efficiency and clarity is scheduled for after the generated code itself is fully implemented and reliable.

The code to invoke valgrind and several of the gcc compile-time options which depend on the version of gcc that you have available is somewhat hairy. It works on my system; I can't promise it will work elsewhere.
Unassigned variable testing in particular uses a mechanism that only exists in V12 and later of gcc and may need serious tweaking to avoid false positives with earlier versions of gcc. I'm in the process of making those tests conditional on a command-line option.

This alpha-tester release is via my personal web site. It will not be moved to GitHub until it is significantly further developed and tested. Please don't share outside our small group - there's no fate worse for a compiler than getting a reputation for being buggy in its early days of release (as we saw at Edinburgh a couple of times). So I don't want potential users other than the four of us to use i2c at this stage, as I know (and have documented above) that it definitely has bugs. This is alpha testing, not beta testing.

When you report problems, don't take offence if you get one of these canned replies :-)

* That's one of the known problems I mentioned. Fixing it is scheduled; it's just a case of me getting around to writing that code.

* Thank you indeed for the code to fix that problem or implement that library routine, I'll put it in right away!

* Oh shit. I hadn't spotted that one, thanks!

* That is going to be a bastard to fix, but I'll put it in the pending queue for once everything else is handled. (My job queue is ordered by how likely it is that a problem will come up in a real program. Currently procedure parameters are the first priority and refactoring to fix some translation issues is a close second.)

* Yeah, never going to change that, I'll just document it.

* That's actually correct behaviour.

(OK, I probably will respond with more than just those, but I do expect a significant number of things that you'll bump up against to be problems I'm already aware of.)

Contact me by email with comments, fixes, and advice: gtoal@gtoal.com

Graham