head 1.29; access; symbols; locks gtoal:1.29; strict; comment @# @; 1.29 date 2026.04.18.01.58.51; author gtoal; state Exp; branches; next 1.28; 1.28 date 2024.07.03.14.14.00; author gtoal; state Exp; branches; next 1.27; 1.27 date 2024.07.03.04.13.26; author gtoal; state Exp; branches; next 1.26; 1.26 date 2024.07.01.21.37.36; author gtoal; state Exp; branches; next 1.25; 1.25 date 2024.06.29.18.01.17; author gtoal; state Exp; branches; next 1.24; 1.24 date 2024.06.25.15.59.42; author gtoal; state Exp; branches; next 1.23; 1.23 date 2024.06.21.01.48.23; author gtoal; state Exp; branches; next 1.22; 1.22 date 2024.06.20.05.48.23; author gtoal; state Exp; branches; next 1.21; 1.21 date 2024.06.12.06.53.12; author gtoal; state Exp; branches; next 1.20; 1.20 date 2024.06.11.06.57.33; author gtoal; state Exp; branches; next 1.19; 1.19 date 2024.06.10.20.33.23; author gtoal; state Exp; branches; next 1.18; 1.18 date 2024.06.10.16.24.58; author gtoal; state Exp; branches; next 1.17; 1.17 date 2024.06.10.04.44.31; author gtoal; state Exp; branches; next 1.16; 1.16 date 2024.06.09.16.19.59; author gtoal; state Exp; branches; next 1.15; 1.15 date 2024.06.09.05.56.23; author gtoal; state Exp; branches; next 1.14; 1.14 date 2024.06.05.02.12.22; author gtoal; state Exp; branches; next 1.13; 1.13 date 2024.06.04.23.19.34; author gtoal; state Exp; branches; next 1.12; 1.12 date 2024.06.02.06.52.34; author gtoal; state Exp; branches; next 1.11; 1.11 date 2024.06.01.08.56.49; author gtoal; state Exp; branches; next 1.10; 1.10 date 2024.06.01.08.07.47; author gtoal; state Exp; branches; next 1.9; 1.9 date 2024.06.01.02.51.50; author gtoal; state Exp; branches; next 1.8; 1.8 date 2024.05.29.19.06.17; author gtoal; state Exp; branches; next 1.7; 1.7 date 2024.05.28.23.30.48; author gtoal; state Exp; branches; next 1.6; 1.6 date 2024.05.28.19.23.24; author gtoal; state Exp; branches; next 1.5; 1.5 date 2024.05.28.06.55.08; author gtoal; state Exp; branches; next 1.4; 1.4 date 2024.05.28.02.11.23; author gtoal; state Exp; branches; next 1.3; 1.3 date 2024.05.28.00.57.22; author gtoal; state Exp; branches; next 1.2; 1.2 date 2024.05.24.19.04.03; author gtoal; state Exp; branches; next 1.1; 1.1 date 2024.05.22.03.30.58; author gtoal; state Exp; branches; next ; desc @@ 1.29 log @bringing rcs files up to date @ text @# This grammar was derived from imp80-rde-2022/genps/ps86.dat # The parser generator used here is a hybrid of the ERCC style and yacc, where # code is embedded with the grammar. There is a description of the parser at # https://github.com/gtoal/uparse (which does need a little updating in light # of some recent additions). # Whereas in my Algol 60 test parser, I used a grammar which explicitly encoded # the operator precedence, in this Imp80 parser I have reverted to the ERCC style # where all operators appear to the grammar to be of equal precedence, with the # precedence being added later using the Shunting Yard algorithm just as was # done in the original ERCC compilers. There was no strong reason for me to do # it this way other than for the experience. I have to admit that I found it # easier just writing the code from first principles rather than from understanding # the TORP code in the ERCC Imp80 compiler. Anyway, it works. # Invoke as ./imp80 tests-imp80/pass2-tmp.imp 2> /dev/null | gtcpp -N -k -C -I- -P 2> /dev/null | clang-format | more # Currently the C code is generated on the fly as the AST below is traversed # using compile(). However I am about to modify it so that first the AST is # built completely (with some P_* items being replaced with AST_* tuples at # a higher level), and then the C code is generated from the updated AST. # The P_* tuples reflect the structure of the Imp code - the AST_* ones map # more closely to the domain of the generated C code. # Note that if a rule does not appear to do anything and does not explicitly "return" # then the control drops through to a default handler, which recursively # evaluates all the subphrases and returns a tuple with their results. # Ideally compile() would just return P_* tuples, and generate_c() would display # those tuples in C form, but in the early stage of development that separation # was barely adhered to and the majority of C output was done in compile() however # as the code developed, output first moved to generate_c() (whose parameters # were all P_* tuples) and eventually to handle_ast_phrase() where the inputs # should be all AST_* tuples. Once everything has migrated from generate_c() # to handle_ast_phrase(), I will remove generate_c() entirely. (And possibly # also rename handle_ast_phrase() as generate_c() ...) # As well as returning P_* tuples which exactly mirror the grammar structure, the # programmer may define, create, and return AST_* tuples from compile() which are # more traditional AST items that reflect the underlying structure, and perhaps # also the representation in C of the Imp80 code. # Built-in phrases are known to the parser and are introduced by B<...> # Note that the equivalent functionality in an ERCC parser would be # implemented by parse-time rules in this parser, however it looks like # we're not going to need either B<> *or* C<> rules for this one. I; # some code to handle initialisation, line reconstruction, and keyword stropping. # It was large and uninteresting, so has been split off into a separate grammar file. # (Include files are denoted using I<...>; Most support procedures are in that # file, though there is nothing stopping us placing more support code in *this* # file, which is done within anonymous {...} brackets. # P IS THE MAIN ENTRY POINT. P = { compile(P(1), depth+1); // perform initialisation compile(P(2), depth+1); // Read in the entire source and perform line reconstruction. int SS = compile(P(3), depth+1); // Get the tree representing the entire program compile(P(4), depth+1); // clean up at end. return SS; // Pass back parse tree for generate_c() to walk and output the C code. }; P = '+', '-', '&', '****', '**', '*', '!!', '!', '//', '/', '>>', '<<', '.', '\\\\', '\\', '^^', '^' { // Phrase is used to convert an OP into a more useful AST phrase. // (the default for char syms is separate P() items for each char, which is unweildy) // It would be cleaner to return the OP_xxx enum rather than the string. // I should do so later. (The string is decoded in PushBinOp()) // TO DO: every P_mktuple(AST_*) operation must synthesize the type of its // result from the types of the operands. if (alt == 0) { //\\ '+', t[1] = wstrtopool(L"+"); t[2] = OP_ADD; return t[0] = P_mktuple(AST_BINOP, alt, 2/*phrases*/, t); } else if (alt == 1) { //\\ '-', t[1] = wstrtopool(L"-"); t[2] = OP_SUB; return t[0] = P_mktuple(AST_BINOP, alt, 2/*phrases*/, t); } else if (alt == 2) { //\\ '&', t[1] = wstrtopool(L"&"); t[2] = OP_AND; return t[0] = P_mktuple(AST_BINOP, alt, 2/*phrases*/, t); } else if (alt == 3) { //\\ '*' '*' '*' '*', t[1] = wstrtopool(L"****"); t[2] = OP_IEXP; return t[0] = P_mktuple(AST_BINOP, alt, 2/*phrases*/, t); } else if (alt == 4) { //\\ '*' '*', t[1] = wstrtopool(L"**"); t[2] = OP_REXP; return t[0] = P_mktuple(AST_BINOP, alt, 2/*phrases*/, t); } else if (alt == 5) { //\\ '*', t[1] = wstrtopool(L"*"); t[2] = OP_MULT; return t[0] = P_mktuple(AST_BINOP, alt, 2/*phrases*/, t); } else if (alt == 6) { //\\ '!' '!', t[1] = wstrtopool(L"!!"); t[2] = OP_EOR; return t[0] = P_mktuple(AST_BINOP, alt, 2/*phrases*/, t); } else if (alt == 7) { //\\ '!', t[1] = wstrtopool(L"!"); t[2] = OP_OR; return t[0] = P_mktuple(AST_BINOP, alt, 2/*phrases*/, t); } else if (alt == 8) { //\\ '/' '/', t[1] = wstrtopool(L"//"); t[2] = OP_INTDIV; return t[0] = P_mktuple(AST_BINOP, alt, 2/*phrases*/, t); } else if (alt == 9) { //\\ '/', // %begin // %real r // r = 3/5 ;! beware: a naive conversion to C will cause a result of 1.0 (because both operands of '/' will be ints) // print(r,5) ;! 0.6 // %endofprogram t[1] = wstrtopool(L"/"); t[2] = OP_REALDIV; return t[0] = P_mktuple(AST_BINOP, alt, 2/*phrases*/, t); } else if (alt == 10) { //\\ '>' '>', t[1] = wstrtopool(L">>"); t[2] = OP_RSHIFT; return t[0] = P_mktuple(AST_BINOP, alt, 2/*phrases*/, t); } else if (alt == 11) { //\\ '<' '<', t[1] = wstrtopool(L"<<"); t[2] = OP_LSHIFT; return t[0] = P_mktuple(AST_BINOP, alt, 2/*phrases*/, t); } else if (alt == 12) { //\\ '.', t[1] = wstrtopool(L"."); t[2] = OP_CONCAT; return t[0] = P_mktuple(AST_BINOP, alt, 2/*phrases*/, t); } else if (alt == 13) { //\\ '\' '\', t[1] = wstrtopool(L"\\\\"); t[2] = OP_IEXP; return t[0] = P_mktuple(AST_BINOP, alt, 2/*phrases*/, t); } else if (alt == 14) { //\\ '\', t[1] = wstrtopool(L"\\"); t[2] = OP_REXP; return t[0] = P_mktuple(AST_BINOP, alt, 2/*phrases*/, t); } else if (alt == 15) { //\\ '^' '^', t[1] = wstrtopool(L"^^"); // NOTE: soap80 does not recognise ^^, only **** t[2] = OP_IEXP; return t[0] = P_mktuple(AST_BINOP, alt, 2/*phrases*/, t); } else { //\\ '^'; t[1] = wstrtopool(L"^"); t[2] = OP_REXP; return t[0] = P_mktuple(AST_BINOP, alt, 2/*phrases*/, t); } return -1; }; P = { PushBinOp(compile(P(1), depth+1)); // TORP return -1; }; P = '+', '-', '\\', '~', '' { if (alt == 4) return -1; if (alt == 0) { //\\ '+', // return -1; // simpler, but generated C would not retain the unary + t[1] = wstrtopool(L"+"); t[2] = OP_POS; } else if (alt == 1) { //\\ '-', t[1] = wstrtopool(L"-"); t[2] = OP_NEG; } else if (alt == 2) { //\\ '\\', t[1] = wstrtopool(L"\\"); t[2] = OP_NOT; } else if (alt == 3) { //\\ '~', t[1] = wstrtopool(L"~"); t[2] = OP_NOT; } return t[0] = P_mktuple(AST_MONOP, alt, 2/*phrases*/, t); }; P = { { int Expr = compile(P(1), depth+1); if (Expr != -1) PushMonOp(Expr); // TORP return -1; } }; # worry about these as RP later... P = '==', '=', '<-', '->' { if (alt == 0) { t[1] = wstrtopool(L"=="); t[2] = OP_ASSIGN_ADDR; } else if (alt == 1) { t[1] = wstrtopool(L"="); t[2] = OP_ASSIGN_VALUE; } else if (alt == 2) { t[1] = wstrtopool(L"<-"); t[2] = OP_JAM_TRANSFER; } else if (alt == 3) { t[1] = wstrtopool(L"->"); t[2] = OP_UNCOND_STR_RESOL; } return t[0] = P_mktuple(AST_ASSOP, alt, 2/*phrases*/, t); }; P = '==', '>=', '>', '##', '\\==', '<>', '<=', '<', '->', '=', '#', '\\=' { { // Some work required to distinguish address comparisons and conditional resolution from simple arithmetic comparisons or string comparisons const char *C_Comp[] = { "== &", ">=", ">", "!= &", "!= &", "!=", "<=", "<", "-> /*STRRES*/", "==", "!=", "!=" }; // (The '&'s above are temporary) codegen(" %s ", C_Comp[alt]); return -1; // TO DO: return one of several possible AST_something tuples } }; P = ; # <-------------------------------------------------------------------------- FIX! TO DO # is an *Existing* variable: P = «[A-Z][A-Z0-9]*» { { // We want to return an AST record with the VarDecl contents // (which includes the TypeDecl information) int VTagAtom = SubPhraseIdx(Ph, 2); wSVTag = Atom2WStr(VTagAtom); VTag=wstrtopool(wSVTag); // Use the scope tools to find it: // (Question: are there any separate namespaces in Imp as there are in C?) int VarIDX = lookup_by_strpoolidx("decl", VTag); // This (currently) just returns a raw index into the VarDecl[] array. Would be nice to strongly type it. if (VarIDX == -1) { fprintf(stderr, "* NAME NOT SET(#3): %ls\n", wSVTag); //exit(1); return -1; } t[1] = VarIDX; int rvalue = P_mktuple(AST_RVALUE, 0/*alt no*/, 1/*phrases*/, t); P_TYPEINFO(rvalue) = VarDecl[VarIDX].type; return t[0] = rvalue; } }; # is like except we do not look it up in the # name tables, because the caller will check to see if the name is # a field in the parent record. Just return the literal. P = «[A-Z][A-Z0-9]*» { { // check to see which is in the most accessible format: //Diagnose("", SubPhraseIdx(Ph, 2),0, debug_declarations); VTag = SubPhraseIdx(Ph, 2); //Diagnose("", VTag,0, debug_declarations); //if (SVTag != NULL) free(SVTag); SVTag=Atom2Str(VTag); // *short term* cache. Diags if (wSVTag != NULL) free(wSVTag); wSVTag=Atom2WStr(VTag); t[1] = VTag; return t[0] = P_mktuple(AST_TOKEN, 0/*alt no*/, 1/*phrases*/, t); } }; # All s were a single BIP in the ERCC grammar, but here I will handle them by # writing specific new phrases for each type of const - possibly accepting some invalid # items - and delaying error reporting until the constants are converted for storing # in the AST. # IMP80 includes the following types of constant: # # Decimal integer constants e.g. 2243, -16, 1 000 000 # Base constants e.g. 2_1001, 8_7720, 16_A06C, # B'1011', X'1A69' # Real constants e.g. 120.0, 120, 1.2@@2, 12@@1, # 1200@@-1 # Character constants e.g. 'A', 'a', '+', '"', ' # ', '''', '6' # Multi-character constants e.g. M'Four', M'MAX', M'1+1=', # M'"#%' # String constants e.g. "Here is a string", "A", # "123", "a ""good"" boy" # Named constants (also called e.g. %constant %integer PRICE=23 # "%constant variables") : # TOTAL COST = PRICE * NUMBER # Note that currently, base constants such as 2_1001, 8_7720, and 16_A06C are not implemented. # These were implemented in the ERCC compilers as BIPs. Here they will be parsed using # regular expressions and converted to C form by code in one of compile(), handle_ast_phrase(), # or generate_c(). [TBD] P = «'([^\']|'')*'»; # , , <-------------------- FIX! TO DO P = «''''», «''», «'.'»; # , , <---------------- FIX! TO DO P = '@@' , '@@' '-' , ''; # <------------------------------------------------------------------ FIX! TO DO # Not sure if ".123" or "23." are supposed to be acceptable or not. This grammar certainly doesn't # mention it, but then again it uses a BIP. The manual from which the summary above was taken # doesn't say either. However Imp77 and Bob's Imp80 *do* accept ".123" so I will add it. # Next style to check is "123@@2" for 12300.0 The syntax below *should* allow it. P = '.', '.', ''; # <------------------------------------------------------------------ FIX! TO DO P = «[0-9A-Za-z][0-9A-Za-z]*»; # will ignore for now examples like 16_10_ABCD P = '_', , '.', , , 'E', # EBCDIC «[MBKXRH]» { // <--------------------------------------------------------------------------------------------------- FIX! TO DO }; # Multi Binary oKtal heX Realhex (K'7777' not O'7777' - earlier error) # H is like R but without conversion (but not sure *what* conversion. See imp80pass1.imp ) P = { // Pushes an AST_ATOM_LIT: t[1] = compile(P(1), depth+1); t[2] = -1; // type info, TO DO. t[0] = P_mktuple(AST_CONST, 0/*alt no*/, 2/*phrases*/, t); //Diagnose("NEW CONSTANT: ", t[1], 0, TRUE); //{int child=P_P(t[1],1); Diagnose("CONSTANT CHILD: ", child, 0, TRUE);} PushConst(t[0]); // <--------------------------------------------------------------- FIX! TO DO }; P = «[0-9][0-9]*» ; # <--------------------------------------------------------------- FIX! TO DO P = '"' ; # <------------------------------------------------------------------------- FIX! TO DO P = «.» ; # <------------------------------------------------------------------------- FIX! TO DO P = , , ''; # <------------------------ FIX! TO DO # ^ comes through (as expected) as two adjacent strings (which is illegal in Imp. It's not C.). # But really it's a doubled-quote representing a single quote. P = ; # <----------------------------------------------------- FIX! TO DO P = ';' { return -1; }; P = '!' , '|' , "comment" { // TO DO: tweaking this to suppress a later error, but would be nice to know why compile() had not tagged the result properly // Add | AST_ATOM_LIT ? t[1] = -1; /* easier if I dont care how the comment was started... */ // compile(P(1), depth+1); // compiling literals through the default path causes them to be printed. This may be wanted at the initial // stages of bringing up a new program but by the time it's developed as far as we are now, we generally don't // want bits of text showing up at random! I think it's anything created by wlit() and can be suppressed by // redefining wlit() at the appropriate place. By the time this code is complete there should be no drop-throughs, // and any text that we want to be reserved should be passed back via AST_* tuples. So issues such as this rule // should go away. // Note that for now, line reconstruction loses all spaces in parsed comments, and the path of literal text // through the compiler loses embedded '{comments}' at the moment too, so comments are not really well handled // for now and will require some major restructuring of the line reconstruction phase. // The reason that spaces are stripped in "!" comments is that "!" is ambiguous with the "OR" operator (and // in previous compilers, with !MOD!) so it is not safe to stop stripping spaces just because a "!" has been // seen in the input. Because "LABEL: ! comment text" is valid, the line reconstruction *has to* interact // with the parser state, which is appalling language design and forces a specific implementation on the // compiler-writer. You cannot, for example, have a completely independent pre-processing stage, and you can't // have a parser that parses first to a tree and then performs all the actions on the parsed tree - the levels // are all mixed up together. Hence why this issue has not been handled earlier. I've been trying to keep my // options open as to how the parser interacts with the line reconstruction and with the decision to handle // declarations and scope at parse time versus at compile time from the fully-parsed program. (i.e. statement-at-a-time // compiling versus program-at-a-time compiling of the parse tree) t[2] = compile(P(2), depth+1); t[0] = P_mktuple(AST_COMMENT, 0/*alt no*/, 2/*phrases*/, t); generate_c(t[0], depth+1); return t[0]; }; # Although tempted to use a regexp with .* to match text to the end of a line, *don't*. # At least not without extreme testing. I suspect problems with regexps matching line endings. # maybe [^$;]* could work better? P = «.» , ''; # comment text up to a statement separator. (Differs from Imp77 comments) <-------------- FIX! TO DO # SHUNTING YARD: only UOP and OP and VAR push things. # The other phrases are only used to ensure that all the symbols are handled in left to right order. # FIXEDARRAY(INDEXES)_FIELD - cannot be a map as that is not valid as a record field P = '_', # TO DO!!!!! '' { // NOW PUNTED TO Compile_Var() so should not get called here during parse // but will be called later. if (alt == 0) { t[1] = -1; // reserve for parent // object t[2] = compile(P(2), depth+1); // field t[3] = compile(P(3), depth+1); // app t[4] = compile(P(4), depth+1); // subfields // AST entry name yet to be determined... return P_mktuple(AST_PENDING_APP_OR_FIELD, 0/*alt no*/, 4/*phrases*/, t); } else { return -1; } }; # This is the top level. A VAR could be a function call or access via a map. # If it's a map then it may be a recordmap in which case the Opt_Record_Field # may be present. Likewise if a record function. # If VARNAME is an array then Opt_Record_Field is valid if it's a record array. # If it's a function, then it needs to be handled similarly to # where calling procedures has been handled but I'm not at all sure if a %map # being assigned to is handled properly. And indeed after checking, I see it's not - # a map on the LHS with parameters is not being given the parameters. (It's also # not applying theindirection through the map but we know about that already, which # will be fixed once I have the logic to match up type information between an object # and how that object is used, whether as a parameter or the LHS of an assignment, # or as one of the arguments of an operator) # RECORDFN(PARAMS)_FIELD # RECORDMAP(PARAMS)_FIELD # ARRAY(INDEXES)_FIELD P = { { int varname = compile(P(1), depth+1); // RVALUE containing VarDecl[VarIDX] int app = compile(P(2), depth+1); // -1 or MkApp(Expr, Rest); - we don't yet know if an array or a fn call int subfields = compile(P(3), depth+1); // -1 or AST_PENDING_APP_OR_FIELD // (this will recursively create // tuples for any subfields) int Var = AST_mktuple(AST_PENDING_APP_OR_FIELD, varname, -1 /*field*/, app, subfields); PushVar(Var); // Punt to reverse-polish stack evaluation of precedence return -1; } }; P = , , '('')' { // shunting yard algorithm. Note need to distinguish between the use of // VAR and CONST in expressions (which are getting the shunting yard treatment) // and many other places in the grammar, where they are not. // NOTE: parentheses in Imp serve two purposes: 1) overriding the precedence of // operators, and 2) forcing the order of evaluation specifically to avoid // overflows in intermediate results. So although we normally output expressions // using minimal brackets and only insert them when needed due to differences // between Imp and C precedences, we will leave user-supplied parentheses in place // when outputting the C version of an expression to explicitly preserve option (2). if (alt == 2) { // '('')' // compile the expr and reduce it to a single irreducible object so it is treated // like a variable or a constant and its subexpressions are not examined when // handling C's operator precedence. // Compiling should return an AST_EXPRESSION // a + (b + c) was coming out as (a + b + c) - the TORP was picking up // the "a +" that was already on the stack and adding it to the Expr. What // was needed was to seal a false bottom on the stack so that Expr is // constructed independent of anything preceding it. [*** FIXED ***] t[1] = compile(P(2), depth+1); int AstTuple = P_mktuple(AST_USER_PARENS, 0, 1, t); PushParen(L"("); PushExpr(AstTuple); PushParen(L")"); // AstTuple is opaque to the RP code. It treats it like any atom. return -1; } // otherwise, compile or , which will push them on the TORP stack <--------------------------------- FIX! TO DO }; # NOTE!!! Imp80 has removed !X! and |X| in favour of MOD(X) (or IMOD(X)). This does solve the ambiguity # problem of "A = X !! Y !! Z" - is it equivalent to "A = X ! (!Y!) ! Z", or "A = X !! (Y !! Z)"? # but it will break older programs! (btw there was a *thrd* ambiguity in the old days when implicit # multiplication by abutment was allowed - !a! !b! was not imod(a)*imod(b) but rather imod(a!!b) ) # Next problem... in order to recognise a const variable *at parse time*, that variable already has to have been # added to the name table. But adding to the name table happens when executing the AST. Unless we have a shadow # set of name tables that are manipulated during parsing (which I refuse to do) I think the only way this can be # made to work is if we parse a statement at a time rather than the whole program. I.e. like the ERCC compilers. # I do *not* like that the language design precludes any specific parsing method such as whole program at a time. # The alternative is to remove all phrases that look for const expressions and just allow any expression in the # grammar, but fault it when it is used if the expression tree cannot be folded down to a constant. # If that is the option, then is just with folding done in TORP plus a check of the result. # (const expressions that are just numbers are not a problem - it's the ambiguity of const integers etc which # look like variables that makes this difficult.) # Actually the expression doesn't have to be folded, and in fact for a good translation to C we don't want to # fold it - all we want to do is have a flag in the AST objects which says that this item will evaluate to a # constant - without actually doing the evaluation. (The C compiler will do that for us) # (note that the problem of C const ints not being usable as literal constants, and the need for #define versions # of the same constants as well, is a separate issue.) P = { { // alt=0: symbolic constant // alt=1: literal constant // t[1] = compile(P(1), depth+1); // PushConst(P_mktuple(AST_RVALUE, 0/*alt no*/, 1/*phrases*/, t)); int ConstVar = compile(P(1), depth+1); if (P_alt(ConstVar) == 0 && P_P(ConstVar,1) == -1) { fprintf(stderr, "* NAME NOT SET(#4): %ls\n", wSVTag); // exit(1); } // if (..?) { // TO DO: Check the type and if it was a constant, mark it as so: // P_ISCONST(ConstVar) = 1; // PushConst(ConstVar); return -1; } }; P = , , '('')' ; # <---------------------- FIX! TO DO P = { { // CODE BELOW same as following P but uncommented. int Expr; int SaveOperBottom = OperStack_bottom; OperStack_bottom = OperStack_nextfree; int SaveDataBottom = DataStack_bottom; DataStack_bottom = DataStack_nextfree; t[2] = compile(P(1), depth+1); t[1] = ExPop(); DataStack_bottom = SaveDataBottom; OperStack_bottom = SaveOperBottom; // TO DO: Modify the tuple below to mark it as evaluating to a constant // compatible with contexts in C that require a constant expression: // *OR* return AST_CONSTANT_EXPRESSION tuple? (which does not yet exist) Expr = P_mktuple(AST_EXPRESSION, 0, 2, t); // converts to a single object // if (...?) { // TO DO: Test and confirm evaluates to a constant... // P_ISCONST(Expr) = 1; // } //Diagnose("CEXPR: ", Expr, 0, debug_declarations); return Expr; } }; P = ; # <------------------------------- FIX! TO DO P = , '*'; # <--------------------------------------------------------- FIX! TO DO P = ; # <-------------------------------- FIX! TO DO P = { { // CODE BELOW same as previous P but commented. int Expr; // Previously I had many calls to duplicating the 'false bottom' // code below, until I realised I could safely use it *everywhere* and // therefore placed a single instance here instead. int SaveOperBottom = OperStack_bottom; OperStack_bottom = OperStack_nextfree; int SaveDataBottom = DataStack_bottom; DataStack_bottom = DataStack_nextfree; t[2] = compile(P(1), depth+1); // walk the CST expression tree to process each atom sequentially left to right, as input to the shunting yard algorithm t[1] = ExPop(); // now convert the reverse-polish stack back into an AST tree, with precedence now magically applied. DataStack_bottom = SaveDataBottom; OperStack_bottom = SaveOperBottom; Expr = P_mktuple(AST_EXPRESSION, 0, 2, t); // converts to a single object // We do not print s immediately because a bracketed () which is a subexpression of a larger has to be reprocessed // in order to apply operator precedence. A consequence of this is that s in other contexts will have to be output via generate_c() // at the point of call. //Diagnose("EXPR: ", Expr, 0, debug_declarations); return Expr; } }; P = , ''; # <--------------------------------------------------------- FIX! TO DO P = , ''; # <--------------------------------------------------------- FIX! TO DO # Note that is used both for array element indexing, and for parameters to a procedure. # It might be better to convert the P_APP into AST_PROC_PARAMS and AST_ARRAY_INDICES (or whatever # they'll be called) when passing to generate_c rather than passing around P_APP and either # deferring that choice to a distant piece of code, or having one of them unpick the tuple # to avoid making the choice within generate_c ... P = '('')', '' { if (alt == 0) { int Expr = compile(P(2), depth+1); int Rest = compile(P(3), depth+1); return MkApp(Expr, Rest); } return -1; }; P = ',', '' { if (alt == 0) { int Expr = compile(P(3), depth+1); int Rest = compile(P(4), depth+1); return MkApp(Expr, Rest); } return -1; }; P = "if", "unless" { if (alt == 0) { codegen("if "); } else { codegen("unless "); } return -1; }; P = { codegen(" ("); compile(P(1), depth+1); compile(P(2), depth+1); codegen(") "); return -1; }; P = '='','',' { // %for J = I, -1, 1 // for (J = I; J < 1; J += -1) // if the step is positive use '<', if negative use '>', if unknown use '!='... codegen("for ("); int Control = compile(P(1), depth+1); // TO DO: check type of Control and also that it is not a literal constant! (P_alt(Control) == ?) if (Control == -1 || P_P(Control,1) == -1) { fprintf(stderr, "* NAME NOT SET(#5): %ls\n", wSVTag); // exit(1); return -1; } generate_c(Control, depth+1); int Expr1 = compile(P(3), depth+1); int Expr2 = compile(P(5), depth+1); int Expr3 = compile(P(7), depth+1); codegen(" = "); generate_c(Expr1, depth+1); codegen("; "); generate_c(compile(P(1), depth+1), depth+1); codegen(" != ("); generate_c(Expr3, depth+1); codegen(")+("); generate_c(Expr2, depth+1); codegen("); "); generate_c(compile(P(1), depth+1), depth+1); codegen(" += "); generate_c(Expr2, depth+1); codegen(") "); return -1; }; P = "while", "until", "for" { if (alt == 0) { codegen("while "); compile(P(2), depth+1); } else if (alt == 1) { // TO DO: restructure so that while is tested at end of loop rather than beginning! // Note that for now compiling these causes them to also be printed on the fly. codegen("until "); compile(P(2), depth+1); t[1] = P(2); int Cond = t[0] = P_mktuple(AST_DEFER_UNCOMPILED_PHRASE, /*alt*/0, /*count*/1, t); return Cond; } else if (alt == 2) { compile(P(2), depth+1); } return -1; }; # %externallongrealfnspec DSIN %alias "sin" (%longreal angle) # extern double DSIN(double ANGLE) asm("sin"); P = "alias", '' { return -1; }; P = ',' , '' { if (alt == 0) { //codegen(", "); fprintf(stderr, "? Warning: being compiled by default code - so no calls to Declaration() for each \n"); compile(P(3), depth+1); compile(P(4), depth+1); } return -1; }; # is used for new names in declarations. P = «[A-Z][A-Z0-9]*» { // check to see which is in the most accessible format: DTag = SubPhraseIdx(Ph, 2); // Set a global for use by return -1; }; P = "integer", '' { return -1; }; P = "real", "integer" { Basetype = (alt == 0 ? FLOAT : INTEGER); return -1; }; P = '('')', '' { // <---------------------------------------------------------------- FIX! TO DO // TO DO. }; # signed/unsigned extensions could go here. P = "integer", "real", "long" "long", "long", "byte", "string", "half", "short", "record"'('')' { switch (alt) { case 0: Signedness = SIGNED; Precision = WORD; Basetype = INTEGER; break; // integer case 1: Signedness = SIGNED; Precision = WORD; Basetype = FLOAT; break; // real case 2: Signedness = SIGNED; Precision = QUADWORD; break; // long long (integer | real) case 3: Signedness = SIGNED; Precision = LONGWORD; break; // long (integer | real) case 4: Signedness = SIGNED; Precision = BYTE; Basetype = INTEGER; break; // byte (integer)? (no %mite in this variant of Imp) case 5: Signedness = SIGNED; Precision = COMPOUND; Basetype = STRINGTYPE; break; // string case 6: Signedness = UNSIGNED; Precision = SHORT; Basetype = INTEGER; break; // half (integer)? case 7: Signedness = SIGNED; Precision = SHORT; Basetype = INTEGER; break; // short (integer)? case 8: Signedness = SIGNED; Precision = COMPOUND; Basetype = RECORD; compile(P(3), depth+1); // compiling assigns type to global "RecordTypeIDX" which is picked up by Declaration() // TO DO: Add RecordTypeIDX to the PUSH/POPDECL stack. // (Note: inline definition of RFREF not yet handled.) break; // record } if (debug_declarations > 1) fprintf(stderr, "[ Sign=%d Prec=%d Type=%d ]", Signedness, Precision, Basetype); // <---------------------------------------------------------------------------------------------- FIX! TO DO }; P = "routine", { if (alt == 0) { Proc = ROUTINE; if (debug_declarations > 1) fprintf(stderr, "[Proc=%d]", Proc); } // <---------------------------------------------------------------------------------------------- FIX! TO DO }; P = "fn", "map", "function" { if (alt == 1) Proc = MAP; else Proc = FN; if (debug_declarations > 1) fprintf(stderr, "[Proc=%d]", Proc); return -1; }; # ensure each formal parameter starts with a clean slate: P = ; # These are all the types that are allowed as parameters to a procedure. P = , , "name" { // Need to check I set up Object = here correctly. Was a bit of a rush late one night. Area = PARAMETER; if (alt == 0) { // a simple object // , Object = VAR; Proc = NONE; Linkage = AUTO; compile(P(1), depth+1); // < compile(P(2), depth+1); // compile(P(3), depth+1); // int ParamIDX = Declaration(depth+1, FALSE /* A, B, C form not allowed */); //codegen("/*A %s:%d*/",__FILE__,__LINE__); { int MoreDecls = P(4); // ',' , '' while (P_alt(MoreDecls) == 0) { codegen(", ");/*a*/ compile(P_P(MoreDecls, 3), depth+1); // int NextParamIDX = Declaration(depth+1, TRUE /* A, B, C format allowed */); //codegen("/*B %s:%d*/",__FILE__,__LINE__); MoreDecls = P_P(MoreDecls, 4); } } return -1; } else if (alt == 1) { // A procedure parameter // Object = CODE; // TO DO: Or should this be VAR? Not yet tested. compile(P(1), depth+1); // compile(P(2), depth+1); // compile(P(5), depth+1); // Reordered so FPP is handled before names compile(P(3), depth+1); // int ProcParamIDX = Declaration(depth+1, FALSE /* A, B, C form not allowed */); codegen("/*C %s:%d*/",__FILE__,__LINE__); { int MoreDecls = P(4); // ',' , '' while (P_alt(MoreDecls) == 0) { codegen(", /*b*/"); compile(P_P(MoreDecls, 3), depth+1); // int NextProcParamIDX = Declaration(depth+1, TRUE /* A, B, C format allowed */); codegen("/*D %s:%d*/",__FILE__,__LINE__); MoreDecls = P_P(MoreDecls, 4); } } return -1; } else if (alt == 2) { // "name" // NOTE: generated C might consider using GCC's "__attribute((may_alias))" to better replicate Imp semantics? Object = VAR; Basetype = GENERIC; NameInfo = OBJECTNAME; //IsFormat = UNINIT_ISFORMAT; Spec = NO_SPEC; //Signedness = UNINIT_SIGNEDNESS; //Precision = UNINIT_PRECISION; // or maybe COMPOUND Proc = NONE; // unless the %name parameter is a %fn perhaps? Haven't thought this one through yet. //IsArray = UNINIT_ISARRAY; //Linkage = UNINIT_LINKAGE; compile(P(2), depth+1); // int NameIDX = Declaration(depth+1, FALSE /* A, B, C form not allowed */); //codegen("/*E %s:%d*/",__FILE__,__LINE__); { int MoreDecls = P(4); // ',' , '' while (P_alt(MoreDecls) == 0) { codegen(", /*c*/"); compile(P_P(MoreDecls, 3), depth+1); // int NextNameIDX = Declaration(depth+1, TRUE /* A, B, C format allowed */); codegen("/*F %s:%d*/",__FILE__,__LINE__); MoreDecls = P_P(MoreDecls, 4); } } return -1; } }; P = "name", # for %routinename or %functionname parameters to other procedures - I think the %name part is irrelevant '' { return -1; }; P = "array""name", "name", '' { switch (alt) { case 0: NameInfo = ARRAYNAME; break; case 1: NameInfo = OBJECTNAME; break; case 2: NameInfo = NO_NAME; break; } if (debug_declarations > 1) fprintf(stderr, "[NameInfo=%d]", NameInfo); return -1; }; # below should have done an so I have no idea why the parameter 'INT' in procedure 'ONE' # in file other-tests/imptests/cnstiarr.imp is showing up in the debug info as Proc==PROC rather than Proc==NONE P = '('')', '' { if (alt == 1) codegen("void"); // proc with no parameters Area = PARAMETER; // <--------------------------------------------------------------------------------------------- FIX! TO DO }; # due to an uncharacteristic example of laziness in the original imp80 grammar, the # comma in a formal parameter list which separated the declaration of parameters with # different types was actually optional, and surprisingly, several ERCC Imp programs # actually took advantage of this, eg: # %externalintegerfnspec jim(%real a,b %integername c) # So I have replaced the literal ',' that was below with an optional comma in order to # match the real Imp80 compiler. Doesn't mean I like it. P = ',', '' { return -1; }; P = , '' { if (alt == 0) codegen(", "); // <-------------------------------------------------------------------------------------------- FIX! TO DO }; P = "ofprogram", "ofperm", # added to facilitate building compiler. Note prims do not have a file - they are in the compiler itself. So no %endofprim "offile", "oflist", # of Rt/Fn/Map or begin block: { // Need a routine to unwind the control stack and show what was expected. if (alt == 0) { if (debug_compiler) fprintf(stderr, "[end of program]\n"); if (ctrl_depth() != 1) { fprintf(stderr, "\n* Spurious %%ENDOFPROGRAM\n"); exit(1); } int EXPECTED = pop_ctrl(); if (EXPECTED != ENDOFPROGRAM) { fprintf(stderr, "\n* %%ENDOFPROGRAM not expected here (expecting %s)\n", ctrl_debug[EXPECTED]); exit(1); // A.k.a "%FINISH missing" or "%REPEAT missing" or "%END missing" ... also needed everywhere a pop_ctrl() doesn't match what's expected. } codegen("\n exit(0);\n} /* End of program */\n"); } else if (alt == 1) { if (debug_compiler) fprintf(stderr, "[end of perm]\n"); } else if (alt == 2) { if (debug_compiler) fprintf(stderr, "[end of file]\n"); if (ctrl_depth() != 0) { // might not be the case for an unbalanced %include file... fprintf(stderr, "\n* Spurious %%ENDOFFILE\n"); exit(1); } } else if (alt == 3) { if (debug_compiler) fprintf(stderr, "[end of list]\n"); } else if (alt == 4) { if (debug_compiler) fprintf(stderr, "[end of Rt/Fn/Map/begin block]\n"); if (ctrl_depth() < 1) { fprintf(stderr, "\n* Spurious %%END (ctrl_depth()=%d))\n", ctrl_depth()); exit(1); } int EXPECTED = pop_ctrl(); if (EXPECTED != END) { fprintf(stderr, "\n* %%END not expected here (expecting %s)\n", ctrl_debug[EXPECTED]); exit(1); } codegen("\n}\n"); // no semicolon } // <-------------------------------------------------------------------------------------------- FIX! TO DO }; P = "format", '' { if (alt == 0) Object = ARRAYFORMAT; //IsFormat = (alt == 0 ? IS_FORMAT : NO_FORMAT ); //if (debug_declarations > 1) fprintf(stderr, "[format=%d]", IsFormat); return -1; }; P = , '' { return -1; // let the parent level handle it. }; # conditional expressions (and assignments) do *NOT* use the TORP precedence code. P = , '('')', # DON'T PUT EXTRA BRACKETS AROUND THIS ONE, IE DON'T REPLACE WITH "not" { // %not is '!' in C if (alt == 0) { int Expr1 = compile(P(1), depth+1); generate_c(Expr1, depth+1); // TO DO: s need false bottom! compile(P(2), depth+1); int Expr2 = compile(P(3), depth+1); generate_c(Expr2, depth+1); int RestofSC = P(4); if (P_alt(RestofSC) == 0) { int Comp2 = P_P(RestofSC, 1); int Expr3 = P_P(RestofSC, 2); codegen(" && "); generate_c(Expr2, depth+1); compile(Comp2, depth+1); Expr3 = compile(Expr3, depth+1); generate_c(Expr3, depth+1); } compile(P(4), depth+1); // RESTOFSC } else if (alt == 1) { compile(P(2), depth+1); // SC compile(P(3), depth+1); // RESTOFCOND } else { // NOT SC codegen("!"); // not sure about priorities for booleans codegen("("); compile(P(2), depth+1); codegen(")"); } return -1; }; P = "and", # %and is '&&' in C "or", # %or is '||' in C '' { // TO DO: Only one of them needs to have () added around the other, because the // operator precedences of && and || in C are not equal. (&& is higher than || so we need () around "and" SC's in a RESTOFORC.) if (alt == 0) { codegen(" && "); compile(P(2), depth+1); compile(P(3), depth+1); } else if (alt == 1) { codegen(" || "); compile(P(2), depth+1); compile(P(3), depth+1); } return -1; }; P = "and", '' { if (alt == 0) { codegen(" && "); compile(P(2), depth+1); compile(P(3), depth+1); } return -1; }; P = "or", '' { if (alt == 0) { codegen(" || "); compile(P(2), depth+1); // NEED () AROUND THE IF IT CONTAINS AN "&&" compile(P(3), depth+1); } return -1; }; P = "spec", '' { if (alt == 0) { Spec = SPEC; } else if (alt == 1) { // Procedures of all kinds require a %end push_ctrl(END); // we will probably later distinguish between END and PROCEND etc. Spec = NO_SPEC; } if (debug_declarations > 1) fprintf(stderr, "[Spec=%d]", Spec); return -1; }; # a data spec, not a proc spec: P = "spec", '' { if (alt == 0) Spec = SPEC; if (debug_declarations > 1) fprintf(stderr, "[Spec=%d]", Spec); return -1; }; P = ','':', '' { if (alt == 0) { int LowIDX = compile(P(2), depth+1); int HighIDX = compile(P(4), depth+1); AddDims(LowIDX, HighIDX); compile(P(5), depth+1); } return -1; }; P = '('':'')' { { // TO DO: list of upper/lower bounds int LowIDX = compile(P(2), depth+1); int HighIDX = compile(P(4), depth+1); AddDims(LowIDX, HighIDX); compile(P(5), depth+1); // We now have ArrayDims of pairs of Low:High on the Dims stack, ready for Declaration(depth+1, ) to use. } return -1; }; P = ',', '' { if (alt == 0) compile(P(2), depth+1); return -1; }; P = { ArrayDims = 0; compile(P(3), depth+1); // compile(P(1), depth+1); // int ADeclIDX = Declaration(depth+1, FALSE /* A, B, C form not allowed */); // codegen("/*G %s:%d*/",__FILE__,__LINE__); codegen(";\n"); { int MoreDecls = P(2); // P = ',' , '' while (P_alt(MoreDecls) == 0) { //codegen(", /*e*/"); compile(P_P(MoreDecls, 3), depth+1); // int NextADeclIDX = Declaration(depth+1, FALSE /* A, B, C form not allowed */); //codegen("/*H %s:%d*/",__FILE__,__LINE__); // for now. Syntax of multiples is complex. codegen(";\n"); MoreDecls = P_P(MoreDecls, 4); // } } // more array declarations on the same line are effectively independent in terms of bounds (but not type, area etc) compile(P(4), depth+1); return -1; }; P = , "array" { if (alt == 0) { compile(P(1), depth+1); compile(P(2), depth+1); int DeclIDX = Declaration(depth+1, TRUE /* A, B, C form allowed */); //codegen("/*I %s:%d*/",__FILE__,__LINE__); { int MoreDecls = P(3); // ',' , '' while (P_alt(MoreDecls) == 0) { codegen("; ");/*d*/ compile(P_P(MoreDecls, 3), depth+1); // int NextDeclIDX = Declaration(depth+1, TRUE /* A, B, C format allowed */); //codegen("/*J %s:%d*/",__FILE__,__LINE__); MoreDecls = P_P(MoreDecls, 4); } } codegen(";\n"); return -1; } else if (alt == 1) { IsArray = ARRAY; compile(P(2), depth+1); compile(P(3), depth+1); return -1; } }; P = , "array" { if (debug_declarations > 1) fprintf(stderr, "[IsArray=%d]", IsArray); if (alt == 0) { compile(P(1), depth+1); // compile(P(2), depth+1); // compile(P(3), depth+1); // compile(P(4), depth+1); // int OwnDecIDX = Declaration(depth+1, FALSE /* A, B, C form not allowed */); //codegen("/*K %s:%d*/",__FILE__,__LINE__); compile(P(5), depth+1); // codegen(";\n"); int MoreDecTuple = compile(P(6), depth+1); // } else if (alt == 1) { IsArray = ARRAY; // Must be set to SCALAR somewhere for default. compile(P(2), depth+1); // compile(P(3), depth+1); // compile(P(4), depth+1); // ArrayDims = 0; int BPairTuple = compile(P(5), depth+1); // int ArrayDeclIDX = Declaration(depth+1, FALSE /* A, B, C form not allowed */); //codegen("/*L %s:%d*/",__FILE__,__LINE__); int ConstListTuple = compile(P(6), depth+1); // codegen(";\n"); } return -1; }; P = ',', '' { if (alt == 0) { // only for formatting... compile(P(2), depth+1); // compile(P(3), depth+1); // compile(P(4), depth+1); // (void)Declaration(depth+1, FALSE /* A, B, C form not allowed */); //codegen("/*M %s:%d*/",__FILE__,__LINE__); compile(P(5), depth+1); // codegen("; "); compile(P(6), depth+1); // } else if (alt == 1) { } return -1; }; P = "own", "external", "extrinsic", "constant", "const" { switch (alt) { case 0: Area = OWN; break; case 1: Area = EXTDATA; break; case 2: Area = EXTDATA; Spec = SPEC; break; case 3: Area = CONSTANT; break; case 4: Area = CONSTANT; break; } if (debug_declarations > 1) fprintf(stderr, "[Area=%d]", Area ); return -1; }; # single const initialiser, eg %const %integer MINUS SIX = -6 P = '=', '' { if (alt == 0) { int SaveOperBottom = OperStack_bottom; OperStack_bottom = OperStack_nextfree; int SaveDataBottom = DataStack_bottom; DataStack_bottom = DataStack_nextfree; compile(P(2), depth+1); compile(P(3), depth+1); compile(P(4), depth+1); t[1] = ExPop(); codegen(" = "); generate_c(t[1], depth+1); t[2] = -1; DataStack_bottom = SaveDataBottom; OperStack_bottom = SaveOperBottom; return -1; // t[0] = MkInit(AST_INITIALISER, /*alt*/0, /*count*/2, t); } }; P = '('')', '' { // <---------------------------------------------------------------- FIX! TO DO // TO DO. }; P = { { int SaveOperBottom = OperStack_bottom; OperStack_bottom = OperStack_nextfree; int SaveDataBottom = DataStack_bottom; DataStack_bottom = DataStack_nextfree; compile(P(1), depth+1); compile(P(2), depth+1); compile(P(3), depth+1); // TO DO: these are crashing with an array bound exceeded: if (P_alt(P(4)) == 0) { // initialiser includes a repeat count: // P = '('')', '' t[1] = ExPop(); //int Count = compile(P(4), depth+1); // We need this as an actual number, not as a string representing an expression. //t[2] = ExPop(); codegen("[ LOW ... HIGH ] = "); generate_c(t[2], depth+1); generate_c(t[1], depth+1); } else { t[1] = ExPop(); generate_c(t[1], depth+1); t[2] = -1; } DataStack_bottom = SaveDataBottom; OperStack_bottom = SaveOperBottom; return t[0] = -1; // MkInit(AST_INITIALISER, /*alt*/0, /*count*/2, t); } }; # eg %const %integer %array double(0:3) = 0, 2, 4, 6; # # https://gtoal.com/imp77/reference-manual/IMP9-to-IMP80.pdf # # 2) Own arrays can be multi-dimensional. As before, the bounds must # be constants or constant expressions. The order in which array # elements are assigned the initialising values is such that the # first subscript changes fastest. Thus, for an array A(1:2, 1:3) # the order of assignment would be A(1,1), A(2,1), A(1,2), A(2,2), # A(1,3), A(2,3) # # # https://www.nv5geospatialsoftware.com/docs/Columns__Rows__and_Array.html # # In computer science, the way array elements are mapped to memory is always # defined using the mathematical [row, column] notation. Much of the following # discussion utilizes the m x n array shown in the following figure, with m rows # and n columns: # # [ A0,0 A0,1 … A0,n-1 ] # | A1,0 A1,1 … A1,n-1 | # | … | # [ Am-1,0 Am-1,1 … Am-1,n-1 ] # # Given such a 2-dimensional matrix, there are two ways that such an array can # be represented in 1-dimensional linear memory, either row by row (row major), # or column by column (column major): # # Contiguous First Dimension (Column Major): In this approach, all elements of # the first dimension (m in this case) are stored contiguously in memory. The 1-D # linear address of element Ad1, d2 is therefore given by the formula (d2*m + d1). # As you move linearly through the memory of such an array, the first # (leftmost) dimension changes the fastest, with the second dimension (n, in this # case) incrementing every time you come to the end of the first dimension: # # A0,0, A1,0, …, Am-1,0, A0,1, A1,1, …, Am-1,1, … # # Computer languages that map multidimensional arrays in this manner are called # column major, following the mathematical [row, column] notation. IDL and # Fortran are both examples of column-major languages. # # Contiguous Second Dimension (Row Major): In this approach, all elements of # the second dimension (n, in this case) are stored contiguously in memory. The # 1-D linear address of element Ad1, d2 is therefore given by the formula (d1*n + # d2). As you move linearly through the memory of such an array, the second # dimension changes the fastest, with the first dimension (m in this case) # incrementing every time you come to the end of the second dimension: # # A0,0, A0,1, …, A0,n-1, A1,0, A1,1, …, A1,n-1, … # # Computer languages that map multidimensional arrays in this manner are known # as row major. C is a row-major language. Imp (both Imp77 and Imp80) is a # column-major language. P = '=', '' { // <----------------------- FIX! TO DO if (alt == 0) { codegen(" = {"); if (P_alt(P(2)) == 0) codegen("\n"); // copy newline if present in original compile(P(3), depth+1); compile(P(4), depth+1); codegen("}"); return -1; } else return -1; }; P = ',', '' { // <----------------------- FIX! TO DO if (alt == 0) { codegen(", "); if (P_alt(P(2)) == 0) codegen("\n"); // copy newline if present in original compile(P(3), depth+1); compile(P(4), depth+1); return -1; } else return -1; }; P = "on""start" { // Shouldn't this allow %on %event * %start ??? if (debug_compiler) fprintf(stderr, "[on event block]\n"); codegen("if (_imp_onevent("); generate_c(compile(P(3),depth+1), depth+1); generate_c(compile(P(4),depth+1), depth+1); codegen(")) {\n"); push_ctrl(FINISH); return -1; }; # Only used in %on %event list: P = ',', '' { if (alt == 0) { codegen(", "); generate_c(compile(P(2), depth+1), depth+1); compile(P(3), depth+1); compile(P(4), depth+1); } return -1; }; P = "event", '' { return -1; }; P = ',', '' { if (alt == 0) { codegen(", "); generate_c(compile(P(2), depth+1), depth+1); } return -1; }; P = "until", '' { { int EXPECTED = pop_ctrl(); if (EXPECTED != REPEAT && EXPECTED != REPEATUNTIL && EXPECTED != REPEATPUSHEDUNTIL) { fprintf(stderr, "\n* %%REPEAT not expected here (expecting %s)\n", ctrl_debug[EXPECTED]); exit(1); } if (EXPECTED == REPEATUNTIL) { // if this was a %cycle ... %repeat with a possible %until after it, the // %cycle could have been emitted as "do {" and this code must be either "} until (cond)" or "} while (1)" if (alt == 0) { if (debug_compiler) fprintf(stderr, "[until condition]\n"); codegen("} until "); compile(P(2), depth+1); codegen(";"); } else if (alt == 1) { codegen("} while (1);"); } } else if (EXPECTED == REPEATPUSHEDUNTIL) { int SavedUntilCond = pop_ctrl(); // mildly dirty pushing both codes, and conditions, on the same stack... if (alt == 0) { fprintf(stderr, "? WARNING: Having an %%until at both ends of a %%cycle is a bit dodgy, don't you think?\n"); // %until cond1 %cycle ... %repeat %until cond2?! codegen("} while (!"); compile(P(2), depth+1); // conds include the brackets codegen(" && !"); // or "||" ? compile(SavedUntilCond, depth+1); codegen(");"); } else if (alt == 1) { // %until cond %cycle ... %repeat if (debug_compiler) fprintf(stderr, "[deferred condition from earlier until cycle]\n"); codegen("} while (!"); compile(SavedUntilCond, depth+1); codegen(");"); } } else if (EXPECTED == REPEAT) { if (alt == 0) { fprintf(stderr, "* %%UNTIL is not allowed at the end of a %%WHILE or %%FOR cycle.\n"); exit(1); } codegen("}"); // no semicolon } return -1; } }; # Only need to check control stack for elses which follow a finish... P = "else", '' { { int EXPECTED = pop_ctrl(); if (alt == 0) { if (debug_compiler) fprintf(stderr, "[FINISHELSEQ]\n"); codegen(" else "); } else codegen("\n"); // following a %finish } // recurse on AFTERELSE <----------------------------------------------------------------- FIX! TO DO }; P = "start", , { if (alt == 0) { if (debug_compiler) fprintf(stderr, "[afterelse: start]\n"); push_ctrl(FINISH); codegen("{\n"); return -1; } else if (alt == 1) { if (debug_compiler) fprintf(stderr, "[afterelse: if or unless start]\n"); //push_ctrl(FINISHELSE); compile(P(1), depth+1); compile(P(2), depth+1); compile(P(3), depth+1); return -1; } else { if (debug_compiler) fprintf(stderr, "[afterelse: ]\n"); // recurse } // recursively handle alt 2 <------------------------------------------------------------- FIX! TO DO }; # Don't need to check control stack for elses which don't follow a finish... P = "else", '' { if (alt == 0) { if (debug_compiler) fprintf(stderr, "[ELSEQ]\n"); codegen(" else "); // recurse <--------------------------------------------------------------------------- FIX! TO DO } }; P = "start", "then""start", "then" { if (alt < 2) { if (debug_compiler) fprintf(stderr, "[then start]\n"); push_ctrl(FINISHELSE); codegen(" {\n"); return -1; } else { if (debug_compiler) fprintf(stderr, "[then UI]\n"); // recursively handle the rest <--------------------------------------------------------- FIX! TO DO } }; P = "external", "system", "dynamic", "prim", "perm", # prim and perm are for my benefit in writing the compiler. Not in the original grammar. '' { Linkage = alt+1; // %routine without one of the above is auto if nested but not sure what at external level if no %external keyword. if (debug_declarations > 1) fprintf(stderr, "[Linkage=%d]", Linkage); return -1; }; # record format statement P = "spec", '('')' { { // All that has been done prior to this is and Object = RECORDFORMAT; int RecFmtVarIDX; // a VarDecl index int RecFmtTypeIDX; // a TypeDecl index if (debug_compiler) fprintf(stderr, "[record format statement]\n"); // Object = RECORDFORMAT; // already set in if (alt == 0) { // "spec" Spec = SPEC; compile(P(2), depth+1); // set up the record format name RTag = DTag; // tag of record format name // Declaration() returns a VarDecl index, not a TypeDecl index!: RecFmtVarIDX = Declaration(depth+1, FALSE /* A, B, C form not allowed */); // RecordFormatIDX recfm RecFmtTypeIDX = VarDecl[RecFmtVarIDX].type; if (TypeDecl[RecFmtTypeIDX].recfm != -1) { fprintf(stderr, "? %%recordformat %ls is already defined.\n", Atom2WStr(DTag)); } // We do not create the anonymous table just for the %spec - the absence (indicated by recfm == -1) // is how we know the spec exists. codegen("typedef struct "); PrintAtom(RTag); codegen(" "); PrintAtom(RTag); codegen(";\n"); RTag = -1; } else { Spec = NO_SPEC; // '('')' compile(P(1), depth+1); // set up the record format name RTag = DTag; // Declaration() returns a VarDecl index, not a TypeDecl index!: RecFmtVarIDX = Declaration(depth+1, FALSE /* A, B, C form not allowed */); // RecordFormatIDX recfm RecFmtTypeIDX = VarDecl[RecFmtVarIDX].type; if (TypeDecl[RecFmtTypeIDX].recfm != -1) { fprintf(stderr, "* %%recordformat %ls is already defined.\n", Atom2WStr(DTag)); } if (TypeDecl[RecFmtTypeIDX].recfm == -1) { TypeDecl[RecFmtTypeIDX].recfm = anonymous_table(); // now prepare the container for the fields. RecFmTableIDX = TypeDecl[RecFmtTypeIDX].recfm; } // We have created a holder for the record format definition, which is probably added from within P // Add fields with: add_entry_to_anonymous_table(RecFmTableIDX, Atom2WStr(DTag), VarIDX) codegen("typedef struct "); PrintAtom(RTag); codegen(" {\n"); compile(P(3), depth+1); // , also calls // Get record fields compile(P(4), depth+1); // <-- this adds the field to the container compile(P(5), depth+1); // // %OR% alternative record format compile(P(6), depth+1); // compile(P(7), depth+1); // codegen("} "); PrintAtom(RTag); codegen(";\n"); RTag = -1; // Done compiling a recordformat // TO DO: The compiled %recordformat (currently in RecFmtTypeIDX) has to be added to scope "decl" by name // so that it can be found when a record of this format is declared. (or scope "recordformat" if they are // allowed a separate namespace - I think Imp80 and Imp77 differ in this respect). (Unless Declaration() // has already done this for us?) } } return -1; }; # record format reference - used in/like a eg when declaring # an instance of a record. P = , { // <-- looks like an RFSTMNT must be for the variation: %record (%integer A,B) fred if (alt == 0) { // TO DO: // We set RecordTypeIDX to the VarIDX corresponding to the previously declared %recordformat '' - // from that we get the 'recfm' field which is the index of the anonymous symbol table entry containing the fields. int VarnameTuple = compile(P(1), depth+1); RecordTypeIDX = P_P(VarnameTuple, 1); // (might have been preferable to invent a rather than use which returns an RVALUE tuple.) } else { fprintf(stderr, "* Inline record format references have not yet been implemented. You need to use a %%recordformat.\n"); RecordTypeIDX = -1; } }; # record format declaration P = ',', # added '' { if (debug_compiler) fprintf(stderr, "[more record format declarations]\n"); //compile(P(3), depth+1); // //compile(P(4), depth+1); // //return -1; }; P = , '('')' { // I think that the second form is for a nested record declaration if (debug_compiler) fprintf(stderr, "[record format declaration (RFDEC)]\n"); // <-------------------------------- DROP THROUGH AND COMPILE if (alt == 0) { compile(P(1), depth+1); Object = VAR; Area = FIELD; // whatever the parent record's area may be once used in a record declaration compile(P(2), depth+1); compile(P(3), depth+1); int FieldVarIDX = Declaration(depth+1, FALSE /* A, B, C form not allowed */); if (RecFmTableIDX != -1) { add_entry_to_anonymous_table(RecFmTableIDX, Atom2WStr(DTag), FieldVarIDX); } else { fprintf(stderr, "* no currently open %%recordformat into which field \"%ls\" can be added.\n", Atom2WStr(DTag)); } codegen(";\n"); return FieldVarIDX; // TO DO: both these returns may need to be wrapped in some sort of AST list. AST_RECORDFORMAT? AST_FIELD? } else { int FieldVarTuple = compile(P(2), depth+1); // TO DO: compile(P(3), depth+1); // TO DO: compile(P(4), depth+1); return FieldVarTuple; // TO DO } }; # record formal element P = , "array" { if (debug_compiler) fprintf(stderr, "[record format element]\n"); // a declaration of a record field - more restrictive in what it can be compared to regular declarations // <-------------------------------- DROP THROUGH AND COMPILE }; P = "or", '' { if (debug_compiler) fprintf(stderr, "[alternative record format declaration]\n"); // <-------------------------------- DROP THROUGH AND COMPILE }; # asm statements are rather freeform unless the compiler is written to # understand some particular architecture. But for an Imp to C translator # we just want to pass the asm code though to some procedure (where it # might be translated as a special case. Or ignored.) P = «.» , '' { return -1; }; # TO DO: semicolons in an asm statement eg *LD_1,';' P = , { // The other call to is as an operand in an EXPR where it is pushed // onto the TORP stack in order to have operator precedence applied to the // EXPR as a whole. The call to actually just puts it on the // stack and returns -1. So to get a here we don't look at the // *result* of calling but rather we pop the unprocessed item off // the stack that was placed there by . compile(P(1), depth+1); // Pushes VAR Op_or_Data TOSPop = DataStack(--DataStack_nextfree); ; // NOT ExPop(). We're not applying precedence to a tree. if (TOSPop.type != 'D') { fprintf(stderr, "** Internal error in %s at line %d\n", __FILE__, __LINE__); exit(1); } int Var = TOSPop.idx; //Diagnose("Assignment or call: ", Var,0, debug_declarations); if (alt == 0) { // - Assignment int Assop = compile(P(2), depth+1); int Expr = compile(P(3), depth+1); int Assign = Mk_AST_assignment(Var, Assop, Expr, depth); generate_c(Assign, depth+1); } else { // = Call generate_c(Var, depth+1); } return -1; }; P = '(' ')', '' { if (alt == 0) { return compile(P(2), depth+1); // AST_EXPR tuple. } else { return -1; } }; P = '->' { // CHANGED FROM TO NEW to restrict to *one* index. // changed to to inhibit NAME NOT SET errors if (debug_compiler) fprintf(stderr, "[go to]\n"); int SWIdx = compile(P(4), depth+1); if (SWIdx != -1) { // switch label destination codegen("goto_switch("); //generate_c(compile(P(3), depth+1), depth+1); int SwitchName = compile(P(3), depth+1); // VARNAME if (P_P(SwitchName,1) == -1) { codegen("* ERROR: %%switch %ls NOT DECLARED\n", wSVTag); // exit(1); } codegen("%ls, ", wSVTag); generate_c(SWIdx, depth+1); // If the constant is < 0 we want to output "M" instead of "-" codegen(")"); //Diagnose("Switch index: ", SWIdx, 0, TRUE); } else { // plain goto codegen("goto "); //generate_c(compile(P(3), depth+1), depth+1); int Label = compile(P(3), depth+1); if (P_P(Label,1) == -1) { // Simple Label has not yet been defined. Add it to the name table as a %spec so it can be faulted if missing at the end of the program. // Since I'm getting a NAME NOT SET start turn from lunarlander.imp, I suspect the check for a name // in the symtab stuff is too low-level as it should not be faulting unknown labels. } codegen("%ls", wSVTag); } return -1; }; P = "return" { if (debug_compiler) fprintf(stderr, "[return]\n"); codegen("return"); return -1; }; P = "result" { if (debug_compiler) fprintf(stderr, "[result]\n"); codegen("return "); // Assop will modify the result, possibly with a cast (for <-) or removing an indirection (for ==) // TO DO: possibly needs a new AST_RESULT tuple similar to AST_ASSIGN // We need to determine the result type of this function, and use it in the AST_RESULT/AST_ASSIGN tuple to // cause the appropriate cast if needed. int Expr = compile(P(3), depth+1); int Type = P_TYPEINFO(Expr); // P_TYPEINFO(X) is the hidden type field of any AST expression //Describe_Type(Type); generate_c(Expr, depth+1); return -1; }; P = "monitor" { if (debug_compiler) fprintf(stderr, "[print diagnostics]\n"); codegen("assert(_IMP_MONITOR_)"); // not ideal - assert will cause an exit from the program - %monitor should not. return -1; }; P = "stop" { if (debug_compiler) fprintf(stderr, "[stop program]\n"); codegen("exit(0)"); return -1; }; # Imp80 is %signal [%event]? const[, expr]? - unlike Imp77 which accepts %signal [%event]? const[, expr[, expr]?]? # or my previous imptoc which took %signal [%event]? const[, expr[, expr[, "string"]?]?]? for reasons that currently # evade my recollection but seemed quite valid at the time... P = "signal" { if (debug_compiler) fprintf(stderr, "[signal event]\n"); codegen("_imp_signal("); generate_c(compile(P(3), depth+1), depth+1); generate_c(compile(P(4), depth+1), depth+1); codegen(")"); return -1; }; P = "exit" { if (debug_compiler) fprintf(stderr, "[exit from cycle]\n"); codegen("break"); return -1; }; P = "continue" { if (debug_compiler) fprintf(stderr, "[continue at end of cycle]\n"); codegen("continue"); return -1; }; # We need to detect sequences in order to create C {...} blocks without unnecessary nesting. # This may necessitate a grammar change where is replaced by and nested 's by # with invoking rather than '"and" '. P = , , { if (alt == 0) { compile(P(1), depth+1); // must be encapsulated in a {...} block if more than 1 statement compile(P(2), depth+1); } else if (alt == 1) { // PC_IU is just "if" or "unless" compile(P(2), depth+1); compile(P(3), depth+1); compile(P(1), depth+1); compile(P(4), depth+1); } else if (alt == 2) { // but PC_WUF is a full while/until/for condition if (P_alt(P(2)) == 0) { compile(P(2), depth+1); // "while", compile(P(1), depth+1); // } else if (P_alt(P(2)) == 1) { codegen("do "); compile(P(1), depth+1); // compile(P(2), depth+1); // "until", } else { compile(P(2), depth+1); // "for" { compile(P(1), depth+1); // } compile(P(3), depth+1); } return -1; }; P = , , { if (alt == 0) { compile(P(1), depth+1); codegen("; "); compile(P(2), depth+1); return -1; } else { compile(P(1), depth+1); codegen("; "); } return -1; }; P = "and"; # <-------------------------------------------------------------------------- FIX! TO DO P = , , { if (alt == 0) { codegen("{ "); compile(P(1), depth+1); codegen("; "); compile(P(2), depth+1); codegen(" }"); return -1; } else { compile(P(1), depth+1); codegen("; "); } return -1; }; P = , ; # <-------------------------------------------------------------------------- FIX! TO DO P = , , , , , , ; # <-------------------------------------------------------------------------- FIX! TO DO P = , ''; # <-------------------------------------------------------------------------- FIX! TO DO P = ; # <-------------------------------------------------------------------------- FIX! TO DO # * The identifiers used for labels must be distinct from other local # identifiers. # {EMAS IMP80: labels do not have to be distinct.} # In other words, EMAS IMP80 has a separate namespace for labels but other # implementations might not. So probably safer if I do - i.e. add a label # to scope table "Lab" rather than "Decl". # Also, # * The use of both types of label is limited to the block in which # they are defined, excluding any blocks described therein. That is, # labels cannot be global to a block and therefore it is not possible # to jump into or out of a block. # note that 'blocks' do not include start/finishes, only begin/end and Rt/Fn/Maps. P = '(*):', # sets a flag that tells the line reconstruction to preserve '(''):', # spaces if immediately followed by a ! or %COMMENT statement. ':' { if (debug_compiler || debug_declarations) fprintf(stderr, "[label]\n"); // MAYBE USE RATHER THAN and handle declaration here rather than in Declaration(depth+1, ) since labels // need to be handled a bit differently from variables? Yeah, making that change... And indeed later changed to // because varname caused a NAME NOT SET error // Note that labels are currently *not* having a ';' added after them. This may be OK but my previous imptoc // inserted them and I suspect for a reason, though right now I can't remember what that reason might have been. // I'll add the semicolons back in later if some circumstance arises where the lack of them causes problems. int LabelTag; int SwitchName = compile(P(1), depth+1); LabelTag = VTag; if (alt < 2 && P_P(SwitchName, 1) == -1) { fprintf(stderr, "* ERROR: %%switch %ls NOT DECLARED\n", wSVTag); // exit(1); //} else if (alt < 2 && P_P(SwitchName, 1) != -1) { // fprintf(stderr, "* ERROR: SWITCH LABEL %d SET TWICE\n", TO DO!); // exit(1); } else if (alt == 2 && P_P(SwitchName, 1) != -1) { fprintf(stderr, "* ERROR: LABEL %ls SET TWICE\n", wSVTag); // exit(1); } if (alt == 0) { Object = VAR; // SWITCHDEFN; no, SWITCHDEFN is for %switch fred(1:10) Basetype = SWITCHLABEL; codegen("%s_default:", Atom2Str(LabelTag)); } else if (alt == 1) { Object = VAR; // SWITCHDEFN; Basetype = SWITCHLABEL; codegen("%s_", Atom2Str(LabelTag)); Cpool_nextfree = 0; // Reset Start_Divert_C_to_Cpool(); int len = generate_c(compile(P(3), depth+1), depth+1); // NO: needs to be a decimal constant with special handling for negative numbers. End_Divert_C_to_Cpool(); if (Cpool(0) == '-') { codegen("M%s", &Cpool(1)); } else { codegen("%s", &Cpool(0)); } Cpool_nextfree = 0; // return unused space. codegen(":"); } else { // TO DO: See lunarlander.imp - I'm getting a NAME NOT SET with "start turn" Object = VAR; // LABELDEFN; no, LABELDEFN is for %label Fred Basetype = JUMPLABEL; codegen("%s: ", Atom2Str(LabelTag)); // TO DO: This outputs a ' ' after the tag. We should suppress the space at the // point where it was added and add any needed spaces at the point of instantiation. } // PENDING! (TO DO) - the declaration code for actual labels (rather than declaration of labels) is currently // non-existent and calling Declaration(depth+1, ) below outputs random crap. I'm not 100% sure that I should even // be using the 'Declaration(depth+1, )' method to handle the labels themselves. Maybe it would be more sensible to // put that code here in the grammar where the labels are instanced. // TO DO: (void)Declaration(depth+1, FALSE /* A, B, C form not allowed */); return -1; }; # ERCC Imp80 has this appalling hack that in C would be handled by the preprocessor - # a statment such as %if CONST_CONDITION %then %start is to be treated like # #if CONST_CONDITION # which flies in the face of everything Imp ever did before and is again only possible # as a hack that came about because of the way the ERCC compilers were coded. # The conditions in an else-if chain all have to be constant and the compiler has # to ensure no jumping into the code withing a leg of the tests. A complete pig's ear # of language design. And it's not just at the top-level where it might have been # easily detected - it is allowed anywhere. # # Note that %if const-condition %then %start... %finish %else %if variable-condition %then %start ... %finish %else %start ... %finish # needs extreme rewriting: # # #if const-condition # ... # #else # if (variable-condition) { # ... # } else { # ... # } # #endif # # whereas #elif could have been used if the second condition were constant too. # P = { if (debug_compiler) fprintf(stderr, "[if or unless]\n"); // <-------------------------------------------------------------------------- FIX! TO DO }; P = , '' { if (alt == 0) { push_ctrl(REPEAT); // for loop compile(P(1), depth+1); } else if (alt == 1) { push_ctrl(REPEATUNTIL); // plain %cycle codegen("do "); } return -1; }; P = "cycle", "cycle" { if (debug_compiler) fprintf(stderr, "[cycle]\n"); if (alt == 0) { compile(P(2), depth+1); // if was empty, we should push_ctrl(REPEATUNTIL);, otherwise... } else { if (P_alt(P(1)) == 1) { // %until fprintf(stderr, "? Please recode the source file so that loops with \"%%until %%cycle ... %%repeat\"" "are replaced by \"%%cycle ... %%repeat %%until \""); // In the meantime we just lose the condition at the top and hopefully at some point in the near // future I'll have stacked it and will be able to pop it off the stack on the %repeat ... codegen("/* Deferred: "); // Because compile() currently outputs the condition as well as returning it as a tuple. int UntilCond = compile(P(1), depth+1); codegen(" */"); codegen("\ndo "); push_ctrl(UntilCond); // Save the condition for testing at the end of the loop push_ctrl(REPEATPUSHEDUNTIL); // %until not allowed after %repeat } else { push_ctrl(REPEAT); // %until not allowed after %repeat compile(P(1), depth+1); } } codegen("{\n"); return -1; }; P = "repeat" { if (debug_compiler) fprintf(stderr, "[repeat]\n"); // handles the control stack and whether a %until is allowed compile(P(2), depth+1); compile(P(3), depth+1); return -1; }; P = { if (debug_compiler) fprintf(stderr, "[auto declaration]\n"); compile(P(1), depth+1); Object = VAR; Linkage = AUTO; Proc = NONE; compile(P(2), depth+1); compile(P(3), depth+1); compile(P(4), depth+1); return -1; }; P = # pushes a scope level, but we want to # add to the name tables before pushing # the new scope. It's only *inside* the procedure # body that we want to restrict declarations to the # nested scope level. So should only set # a flag, and the flag should be tested (and cleared) # immediately after is handled at the previous # scope level. # Note that flags set at parse time have to be unset # at the start of the following alternative, otherwise # they will propogate beyond their intended lifetime - # which has not yet been done. # Of course this is assuming we need to handle name tables # on the fly while parsing. If all the work can be done # by walking the AST after parsing the entire program then # it is not an issue. But I don't think that will be the # case. For example we want to distinguish between # fred(3) at parse time depending on whether fred is a # function or an array element. We *could* put it off # until AST time but making these decisions while parsing # simplifies the code generation a lot. # In parameter lists and places where declaration type info # currently resets the glo { if (debug_compiler) fprintf(stderr, "[procedure declaration]\n"); compile(P(1), depth+1); Object = CODE; compile(P(2), depth+1); // SEX compile(P(3), depth+1); // RT compile(P(4), depth+1); // Set up Spec and flag for compile(P(6), depth+1); // ALIAS compile(P(5), depth+1); // DECLNAME reordered so name handled after params set up // (but work to be done to associate param list with Rt) ParentVarIDX = Declaration(depth+1, FALSE /* A, B, C form not allowed */); //codegen("/*P %s:%d*/",__FILE__,__LINE__); // BUT TO DO: NEEDS MORE WORK ParentTypeIDX = VarDecl[ParentVarIDX].type; codegen("("); push_scope_level(); // aka . popped at end of param list if a spec, otherwise on %end of routine. Ensures params are like locals of procedure. compile(P(7), depth+1); // FPP (or void if none) codegen(")"); if (Spec == SPEC) { pop_scope_level(); // aka codegen(";\n"); } else { // pop scope level on %end of procedure. // Parameters have been declared as if within procedure body. codegen(" {\n"); } return -1; }; P = { if (debug_compiler) fprintf(stderr, "[own/external/const declaration]\n"); compile(P(1), depth+1); Object = VAR; compile(P(2), depth+1); compile(P(3), depth+1); compile(P(4), depth+1); compile(P(5), depth+1); return -1; }; P = "record""format" { if (debug_compiler) fprintf(stderr, "[record format declaration]\n"); compile(P(1), depth+1); // IsFormat = IS_FORMAT; ... deprecated Object = RECORDFORMAT; compile(P(4), depth+1); return -1; }; P = "end" { if (debug_compiler) fprintf(stderr, "[end of something]\n"); // <-------------------------------------------------------------------------- FIX! TO DO }; P = "include" { if (debug_compiler) fprintf(stderr, "[push an include file]\n"); // TO DO: not handled in previous imptoc - it was assumed that incuded files could be translated separately // but that is not correct. Compilation of include files depends on what has gone before. So they have to // be translated in the context of the parent file, so as well as pushing the input file on the include stack, // we have to switch output to the .h.tmp version of the .inc file, and switch back to the current context on EOF // (the mechanics of handling include files was already worked out for takeon where .g files can contain I<...> ) // Note that include files will probably be handled like C and allow a search path via a -I option. codegen("\n#include \"...\"\n"); return -1; }; P = "begin" { if (debug_compiler) fprintf(stderr, "[begin block]\n"); if (ctrl_depth() <= 0) { push_ctrl(ENDOFPROGRAM); codegen("\nint _imp_mainep(int _imp_argc, char **_imp_argv) {\n"); } else { push_ctrl(END); // Also pushed at the start of a rt/fn/map codegen("\n{ /* %%begin block */\n"); } compile(P(2), depth+1); // ARGH! This was missing and nested scopes were not being created! return -1; }; P = '('':'')' { if (debug_declarations > 1) fprintf(stderr, "[one switch declaration]"); // Will require *either* exbedded code at end of scope block with actual switch statement and gotos, // *or* we have to backpatch this declaration with assignment of an array of labels (in the style &&labname) // *or* we generate *all* labels at the point of declaration but output the unused ones at the end of // the scope block with a jump to the default if given, or an error message if not. (i.e. the BADSWITCH() // code from before) // We do at a minimum require a hook between closing a scope and executing pending code. /* This is what the last iteration of imptoc did with switch declarations: static int Bip_sw; static void *Bip[40 / *999:1038* /] = { &&Bip_999, &&Bip_1000, &&Bip_1001, &&Bip_1002, &&Bip_1003, &&Bip_1004, &&Bip_1005, &&Bip_1006, &&Bip_1007, &&Bip_1008, &&Bip_1009, &&Bip_1010, &&Bip_1011, &&Bip_1012, &&Bip_1013, &&Bip_1014, &&Bip_1015, &&Bip_1016, &&Bip_1017, &&Bip_1018, &&Bip_1019, &&Bip_1020, &&Bip_1021, &&Bip_1022, &&Bip_1023, &&Bip_1024, &&Bip_1025, &&Bip_1026, &&Bip_1027, &&Bip_1028, &&Bip_1029, &&Bip_1030, &&Bip_1031, &&Bip_1032, &&Bip_1033, &&Bip_1034, &&Bip_1035, &&Bip_1036, &&Bip_1037, &&Bip_1038, }; goto *Bip[Bip_sw = (Item)-999]; or static int S_sw; static void *S[5 / *1:5* /] = { &&S_1, &&S_2, &&S_3, &&S_4, &&S_default, }; goto *S[S_sw = (A[P + 1]) - 1]; S_4:; // code for each switch entry Then if there is no S_default created due to a S(*): in Imp, add this at the foot of the procedure or block: goto S_skip; S_default: fprintf(stderr, "\nSwitch label 'S(%d):' not set in %s\n", S_sw + 1, __PRETTY_FUNCTION__); fflush(stderr); abort(); S_skip:; although the goto above would be better coded as: S_sw = (A[P + 1]) - 1; if (1 <= S_sw && S_sw <= 5) goto *S[S_sw]; else goto S_default; and of course there is the saving of __LINE__ and __FILE__ to be done, using the values from the Imp source where the ->S(A[P + 1]) occurred. Which could probably best be implemented using a macro? e.g. goto_switch(S, A[P + 1]); Perhaps: #define goto_switch(S, N) \ do { \ S ## _sw = N; \ _sw_line = __LINE__; \ _sw_file = __FILE__; \ if (S ## _low <= S ## _sw && S ## _sw <= S ## _high) \ goto *S[S ## _sw - S ## _low]; \ else \ goto S ## _default; \ } while (0) Initialising the switch without exbedding code is going to be tricky. For now the best I can think of is a nested procedure call: At the switch declaration: #define declare_switch(S, low, high) static int S ## _sw; const int S ## _low = low, S ## _high = high; static void **S; auto void init_ ## S(void); init_ ## S(); then at the end of the block: void init_S(void) { const static void *_S[5 / * 1:5 * /] = { &&S_1, &&S_2, &&S_3, &&S_4, &&S_default, }; S = _S; } So to do all this we need to keep track of the switch bounds (associated with the switch name), and an array of whether each label is present (low:high) (plus a separate one for default:) stored in the var information for that name. I still need a good mechanism to trigger code generation on popping the scope at the end of a rt/fn/map/begin block. */ // '('':'')' compile(P(1), depth+1); // Object = SWITCHDEFN; IsArray = ARRAY; Basetype = SWITCHLABEL; Area = CONSTANT; // NameInfo under consideration. May depend on whether the // switch label is implemented as a pointer to an array or // the array itself - exbedded array declaration or placed // at end of the block? // LIKE BPAIR BUT ONE-DIMENSION ONLY: int LowIDX = compile(P(5), depth+1); int HighIDX = compile(P(7), depth+1); AddDims(LowIDX, HighIDX); compile(P(2), depth+1); // NAME int SwitchDeclIDX = Declaration(depth+1, FALSE /* switch declarations cannot be A, B, C(L:H) - they must all be handled separately. */); int next_name = P(3); while (next_name != -1 && P_alt(next_name) == 0) { // replace the name but keep the other parameters the same: compile(P_P(next_name, 3), depth+1); // ',' , '' int MoreSwitchDeclIDX = Declaration(depth+1, FALSE /* switch declarations cannot be A, B, C(L:H) - they must all be handled separately. */); next_name = P_P(next_name, 4); } }; P = ',', ''; P = "switch" ; P = "list" { if (debug_compiler) fprintf(stderr, "[turn on listing]\n"); return -1; }; P = "else" { { if (debug_compiler) fprintf(stderr, "[(finish) else (start)]\n"); int EXPECTED = pop_ctrl(); if (EXPECTED == ELSE || EXPECTED == FINISHELSE) { push_ctrl(FINISH); codegen("} else {"); // NOTE: a plain "%else" represents "%finish %else %start" } else { // TO DO: // And today's exercise is... try to remember how to extract the line/column from the "else" symbol // to report the error better :-/ fprintf(stderr, "\n* Spurious %%ELSE\n"); exit(1); } } return -1; }; P = "finish" { if (debug_compiler) fprintf(stderr, "[finish_opt_else]\n"); { if (ctrl_depth() <= 1) { fprintf(stderr, "\n* Spurious %%FINISH\n"); exit(1); // at least until we get around to handling ERCC conditional compilation tests } int EXPECTED = pop_ctrl(); codegen("\n}"); if (EXPECTED == FINISH) { push_ctrl(ELSE); } else if (EXPECTED == FINISHELSE) { push_ctrl(ELSE); } else { fprintf(stderr, "\n* %%FINISH not expected (expecting %s)\n", ctrl_debug[EXPECTED]); exit(1); } // recurse on FINISHELSEQ } }; P = '*' { if (debug_compiler) fprintf(stderr, "[embedded assembly code]\n"); codegen("asm(/*to do*/)"); return -1; }; P = "trustedprogram" { if (debug_compiler) fprintf(stderr, "[trusted program pragma]\n"); return -1; }; P = "mainep" { if (debug_compiler) fprintf(stderr, "[use this routine for main()]\n"); // TO DO: Declaration() return -1; }; P = "control" { // pragma? if (debug_compiler) fprintf(stderr, "[%%control - fine control over compiler options]\n"); }; P = '' { // print a semicolon after certain statements. This is for C's benefit and only really an issue // when generating C on the fly from compile() statements. Once these have moved to AST_* tuples // and are generated *after* parsing, there will be no need for this here. codegen("/*5*/; "); return -1; }; # All there were factored out into separate phrases to make the C generation easier! # I'm considering just parsing these one at a time rather than all together # in , and outputting C code as each statement type is compiled, rather # than only once the entire source is parsed. # (pending a conversation re scope for statements like # %integer fred = fred + 1 # or # %integer jim = fred, fred # where there is a variable of the same name in an enclosing scope) # Alternatively I could arrange for each statement to return an AST_* tuple # and do *no* C output from the parser, separating it to a separate phase # which walks the true AST (of AST_* tuples which map closely to the C code) # REDESIGN: Only *these* statements will call generate_c()!: # And once that is done, I'll change it so that these return AST_* entries # and *those* will be called to generate_c(). # Finally we will have a tree/linked list of statements that can be output # from its root, by a simgle tree-walk call from P. P = , , , , , , , , , , , , , , , , , , , # maybe better as a UI? # Not sure why these have to stand alone - can't be used in UI_SEQUENCE , , , { switch (alt) { case 0: // case 1: // case 2: // case 3: // case 4: // case 5: // case 6: // case 7: // case 8: // case 9: // case 10: // case 11: // case 12: // case 13: // case 14: // case 15: // case 16: // case 17: // case 18: // case 19: // case 20: // case 21: // case 22: // break; default: fprintf(stderr, "** Error: STATEMENT.alt = %d\n", alt); break; } // drop through and return a to P }; E @ 1.28 log @removing unconditional diags @ text @d23 2 a24 2 # The P_* tuples reflect the structure of the Imp code - the AST_* ones are # more akin to the generated C code. d128 5 d312 1 a312 1 P = '@@', ''; # <------------------------------------------------------------------ FIX! TO DO d314 1 a314 1 # Not sure if ".123" is supposed to be acceptable or not. This grammar certainly doesn't d320 1 a320 1 P = '.', ''; # <------------------------------------------------------------------ FIX! TO DO d322 5 a326 1 P = , d525 5 a529 1 d551 5 d695 3 d726 5 d737 1 a737 1 "string", d822 1 d863 1 a863 1 # below show have done an so I have no idea why the parameter 'INT' in procedure 'ONE' d1189 4 d1200 3 d1223 54 d1279 1 a1279 1 codegen(" = "); if (P_alt(P(2)) == 0) codegen("\n"); // copy newline if present in original d1282 1 a1295 5 P = '('')', '' { // <---------------------------------------------------------------- FIX! TO DO // TO DO. }; d1850 1 a1850 1 Divert_C_to_Cpool = TRUE; d1852 1 a1852 1 Divert_C_to_Cpool = FALSE; @ 1.27 log @record fields appear! @ text @d1598 1 a1598 1 Describe_Type(Type); @ 1.26 log @record declarations improving, assignments started @ text @d398 12 a409 3 // NOW PUNTED TO Compile_Var() so should not get called here. fprintf(stderr, "** Error: P *was* executed\n"); // drop through and return P_Opt_Record_Field <----------------------------------------------------- FIX! TO DO d430 9 a438 6 int Var = Compile_Var(-1, P(1), P(2), P(3), depth+1); /*Diagnose("Result of int Var = Compile_Var(-1, P(1), P(2), P(3), depth+1): ", Var, 0, debug_declarations);*/ PushVar(Var); // is *only* used in context of an so needs TORP // - except when creating atoms // CALLERS OTHER THAN MUST PULL THIS OFF THE TOP OF THE STACK! // THIS IS NOT A VAR. DebugVarIDX(Var); a1406 1 {int tmp = debug_scope; debug_scope = 2; a1407 1 debug_scope = tmp;} a1408 2 fprintf(stderr, "Setting TypeDecl[RecFmtTypeIDX=%d].recfm = %d **********************************************************\n", RecFmtTypeIDX, TypeDecl[RecFmtTypeIDX].recfm); a1433 5 fprintf(stderr, "-------------------------------------------\n"); {int tmp = debug_declarations; debug_declarations = 1; DebugVarIDX(RecFmtVarIDX); debug_declarations = tmp;} fprintf(stderr, "-------------------------------------------\n"); a1477 2 fprintf(stderr, "? adding field \"%ls\" to %%recordformat.\n", Atom2WStr(DTag)); {int tmp = debug_scope; debug_scope = 2; a1478 1 debug_scope = tmp;} d1562 1 a1562 1 Diagnose("Switch index: ", SWIdx, 0, TRUE); a1925 1 {int tmp = debug_declarations; debug_declarations = 1; a1926 1 debug_declarations = tmp;} a2268 44 { /* #define BIP_TYPE (1U <<27U) #define PHRASE_TYPE (2U <<27U) #define SEMANTIC_TYPE (3U <<27U) #define KEYWORD_TYPE (4U <<27U) #define CHAR_TYPE (5U <<27U) #define UTF32CHAR_TYPE (6U <<27U) #define STRING_TYPE (7U <<27U) #define UTF32STRING_TYPE (8U <<27U) #define REGEXP_TYPE (9U <<27U) #define OPTION_TYPE (10U <<27U) #define COUNT_OF_ALTS (11U <<27U) #define COUNT_OF_PHRASES (12U <<27U) #define ALT_NUMBER (13U <<27U) #define AST_idx_mask 0x7FFFFFFU #define AST_type_shift 27U #define AST_type_mask 31U #define AST_BIP (16U << AST_type_shift) #define AST_PHRASE (17U << AST_type_shift) #define AST_ATOM_LIT (18U << AST_type_shift) #define AST_POOL_LIT (19U << AST_type_shift) // Up to 31U is free... */ #ifdef NEVER for (i = 1; i <= count; i++) { if (P_AST_type(P(i)) == PHRASE_TYPE) { t[i] = compile(P(i),depth+1); fprintf(stderr, "t[i] = %ls\n", Decode(t[i])); } else if (P_AST_type(P(i)) == AST_ATOM_LIT) { t[i] = P(i); // returns P(i) for now fprintf(stderr, "\nt[i] = %ls P_type=%d\n", Decode(t[i]), P_AST_type(P(i))); } else { fprintf(stderr, "* Error: unexpected type code %d in phrase handling procedure.\n", P_AST_type(P(i))>>AST_type_shift); } } t[0] = P_mktuple(op, alt, count, t); return t[0]; // FOR NOW #endif } @ 1.25 log @fixed a ';' @ text @d219 1 a219 1 return t[0] = P_mktuple(AST_ASSOP, alt, 1/*phrases*/, t); d399 1 a399 1 //fprintf(stderr, "** Error: P *was* executed\n"); d670 1 d712 5 a716 1 case 8: Signedness = SIGNED; Precision = COMPOUND; Basetype = RECORD; break; // record d754 1 a754 1 (void)Declaration(depth+1, FALSE /* A, B, C form not allowed */); //codegen("/*A %s:%d*/",__FILE__,__LINE__); d756 1 a756 1 int MoreDecls = P(4); // ',' , '' { d760 1 a760 1 (void)Declaration(depth+1, TRUE /* A, B, C format allowed */); //codegen("/*B %s:%d*/",__FILE__,__LINE__); d772 1 a772 1 (void)Declaration(depth+1, FALSE /* A, B, C form not allowed */); codegen("/*C %s:%d*/",__FILE__,__LINE__); d774 1 a774 1 int MoreDecls = P(4); // ',' , '' { d778 1 a778 1 (void)Declaration(depth+1, TRUE /* A, B, C format allowed */); codegen("/*D %s:%d*/",__FILE__,__LINE__); d795 1 a795 1 (void)Declaration(depth+1, FALSE /* A, B, C form not allowed */); //codegen("/*E %s:%d*/",__FILE__,__LINE__); d797 1 a797 1 int MoreDecls = P(4); // ',' , '' { d801 1 a801 1 (void)Declaration(depth+1, TRUE /* A, B, C format allowed */); codegen("/*F %s:%d*/",__FILE__,__LINE__); d888 1 a888 1 codegen("\n} /* End of Rt/Fn/Map/begin depth=%d */\n", ctrl_depth()); // no semicolon d1030 1 a1030 1 (void)Declaration(depth+1, FALSE /* A, B, C form not allowed */); // codegen("/*G %s:%d*/",__FILE__,__LINE__); d1037 1 a1037 1 (void)Declaration(depth+1, FALSE /* A, B, C form not allowed */); codegen("/*H %s:%d*/",__FILE__,__LINE__); // for now. Syntax of multiples is complex. d1052 1 a1052 1 (void)Declaration(depth+1, TRUE /* A, B, C form allowed */); //codegen("/*I %s:%d*/",__FILE__,__LINE__); d1054 1 a1054 1 int MoreDecls = P(3); // ',' , '' { d1058 1 a1058 1 (void)Declaration(depth+1, TRUE /* A, B, C format allowed */); //codegen("/*J %s:%d*/",__FILE__,__LINE__); d1081 1 a1081 1 (void)Declaration(depth+1, FALSE /* A, B, C form not allowed */); //codegen("/*K %s:%d*/",__FILE__,__LINE__); d1085 1 a1085 1 compile(P(6), depth+1); // d1091 4 a1094 3 ArrayDims = 0; compile(P(5), depth+1); // (void)Declaration(depth+1, FALSE /* A, B, C form not allowed */); //codegen("/*L %s:%d*/",__FILE__,__LINE__); compile(P(6), depth+1); // d1357 5 d1369 11 a1379 1 recfm = Declaration(depth+1, FALSE /* A, B, C form not allowed */); // RecordFormatIDX recfm d1386 21 a1406 1 RTag = DTag; recfm = Declaration(depth+1, FALSE /* A, B, C form not allowed */); // RecordFormatIDX recfm a1408 3 //ParentVarIDX = Declaration(depth+1, FALSE /* A, B, C form not allowed */); //codegen("/*P %s:%d*/",__FILE__,__LINE__); // BUT TO DO: NEEDS MORE WORK //ParentTypeIDX = VarDecl[ParentVarIDX].type; d1412 1 a1412 1 compile(P(4), depth+1); // d1420 11 a1430 1 RTag = -1; recfm = -1; // Done compiling a recordformat d1440 11 a1450 1 return -1; // remove the drop through once this is handled. <------------------------------- DROP THROUGH d1457 3 a1459 3 compile(P(3), depth+1); // compile(P(4), depth+1); // return -1; d1465 1 a1465 1 if (debug_compiler) fprintf(stderr, "[record format declaration]\n"); d1473 11 a1483 2 (void)Declaration(depth+1, FALSE /* A, B, C form not allowed */); codegen(";\n"); d1485 4 a1488 3 compile(P(2), depth+1); compile(P(3), depth+1); compile(P(4), depth+1); a1489 1 return -1; d1595 1 d1597 3 d1926 1 d1928 1 d1932 1 a1932 1 push_scope_level(); // popped at end of param list if a spec, otherwise on %end of routine d1936 1 a1936 1 pop_scope_level(); d1940 1 a1940 1 // Parameters are now declared as if within procedure body. a1959 1 //IsFormat = IS_FORMAT; // THIS APPEARS TO HAVE BE OVERRIDDEN BY THE TIME IT IS PRINTED. d1961 1 d2103 1 a2103 1 Declaration(depth+1, FALSE /* switch declarations cannot be A, B, C(L:H) - they must all be handled separately. */); d2109 1 a2109 1 Declaration(depth+1, FALSE /* switch declarations cannot be A, B, C(L:H) - they must all be handled separately. */); @ 1.24 log @StrPoolIDX @ text @d883 1 a883 1 codegen("\n} /* End of Rt/Fn/Map/begin depth=%d */;\n", ctrl_depth()); d1033 1 a1033 1 //codegen(";\n"); d1195 1 d1526 2 d1536 1 a1536 1 codegen("assert(_IMP_MONITOR_)"); @ 1.23 log @several features added. @ text @d225 1 a225 1 const char *C_Comp[] = { "== &", ">=", ">", "!= &", "!= &", "!=", "<=", "<", "*BADCOMP*", "==", "!=", "!=" }; d227 1 a227 1 fprintf(stdout, " %s ", C_Comp[alt]); d599 1 a599 1 fprintf(stdout, "if "); d601 1 a601 1 fprintf(stdout, "unless "); d607 1 a607 1 fprintf(stdout, " ("); d609 1 a609 1 fprintf(stdout, ") "); d618 1 a618 1 fprintf(stdout, "for ("); d631 1 a631 1 fprintf(stdout, " = "); d633 1 a633 1 fprintf(stdout, "; "); d635 1 a635 1 fprintf(stdout, " != "); d637 3 a639 1 fprintf(stdout, "; "); d641 1 a641 1 fprintf(stdout, " += "); d643 1 a643 1 fprintf(stdout, ") "); d651 1 a651 1 fprintf(stdout, "while "); compile(P(2), depth+1); d655 1 a655 1 fprintf(stdout, "until "); compile(P(2), depth+1); d669 1 a669 1 //fprintf(stdout, ", "); d734 3 d743 1 a743 1 if (alt == 0) { d745 1 a745 1 Object = VAR; d749 1 a749 1 (void)Declaration(depth+1, FALSE); //fprintf(stdout, "/*A %s:%d*/",__FILE__,__LINE__); d753 1 a753 1 fprintf(stdout, ", ");/*a*/ d755 1 a755 1 (void)Declaration(depth+1, TRUE); //fprintf(stdout, "/*B %s:%d*/",__FILE__,__LINE__); d760 1 a760 2 } else if (alt == 1) { // TO DO: Parameter list of a procedure parameter probably needs some more work d767 1 a767 1 (void)Declaration(depth+1, FALSE); fprintf(stdout, "/*C %s:%d*/",__FILE__,__LINE__); d771 1 a771 1 fprintf(stdout, ", /*b*/"); d773 1 a773 1 (void)Declaration(depth+1, TRUE); fprintf(stdout, "/*D %s:%d*/",__FILE__,__LINE__); d786 1 a786 1 //Proc = UNINIT_PROC; d790 1 a790 1 (void)Declaration(depth+1, FALSE); //fprintf(stdout, "/*E %s:%d*/",__FILE__,__LINE__); d794 1 a794 1 fprintf(stdout, ", /*c*/"); d796 1 a796 1 (void)Declaration(depth+1, TRUE); fprintf(stdout, "/*F %s:%d*/",__FILE__,__LINE__); d819 3 a821 1 P = '('')', d823 1 a823 1 if (alt == 1) fprintf(stdout, "void"); // proc with no parameters d837 2 a838 2 P = , '' { if (alt == 0) fprintf(stdout, ", "); d842 1 d861 1 d863 1 a863 1 fprintf(stdout, "\n exit(0);\n} /* End of program */\n"); d874 1 a874 1 if (debug_compiler) fprintf(stderr, "[end of 'begin' block or Rt/Fn/Map]\n"); d883 1 a883 1 fprintf(stdout, "\n} /* End of Begin/Rt/Fn/Map depth=%d */;\n", ctrl_depth()); d916 1 a916 1 fprintf(stdout, " && "); d927 2 a928 2 fprintf(stdout, "!"); // not sure about priorities for booleans fprintf(stdout, "("); d930 1 a930 1 fprintf(stdout, ")"); d941 1 a941 1 fprintf(stdout, " && "); d945 1 a945 1 fprintf(stdout, " || "); d955 1 a955 1 fprintf(stdout, " && "); d965 1 a965 1 fprintf(stdout, " || "); d1025 2 a1026 2 (void)Declaration(depth+1, FALSE); // fprintf(stdout, "/*G %s:%d*/",__FILE__,__LINE__); fprintf(stdout, ";\n"); d1030 1 a1030 1 //fprintf(stdout, ", /*e*/"); d1032 2 a1033 2 (void)Declaration(depth+1, FALSE); fprintf(stdout, "/*H %s:%d*/",__FILE__,__LINE__); // for now. Syntax of multiples is complex. //fprintf(stdout, ";\n"); d1047 1 a1047 1 (void)Declaration(depth+1, FALSE); //fprintf(stdout, "/*I %s:%d*/",__FILE__,__LINE__); d1051 1 a1051 1 fprintf(stdout, "; ");/*d*/ d1053 1 a1053 1 (void)Declaration(depth+1, TRUE); //fprintf(stdout, "/*J %s:%d*/",__FILE__,__LINE__); d1057 1 a1057 1 fprintf(stdout, ";\n"); d1076 1 a1076 1 (void)Declaration(depth+1, FALSE); //fprintf(stdout, "/*K %s:%d*/",__FILE__,__LINE__); d1078 1 a1078 1 fprintf(stdout, ";\n"); d1087 1 a1087 1 (void)Declaration(depth+1, FALSE); //fprintf(stdout, "/*L %s:%d*/",__FILE__,__LINE__); d1089 1 a1089 1 fprintf(stdout, ";\n"); d1100 1 a1100 1 (void)Declaration(depth+1, FALSE); //fprintf(stdout, "/*M %s:%d*/",__FILE__,__LINE__); d1102 1 a1102 1 fprintf(stdout, "; "); d1136 1 a1136 1 fprintf(stdout, " = "); d1156 1 a1156 1 fprintf(stdout, "[ LOW ... HIGH ] = "); generate_c(t[2], depth+1); d1173 1 a1173 1 fprintf(stdout, " = "); if (P_alt(P(2)) == 0) fprintf(stdout, "\n"); // copy newline if present in original d1182 1 a1182 1 fprintf(stdout, ", "); if (P_alt(P(2)) == 0) fprintf(stdout, "\n"); // copy newline if present in original d1196 1 a1196 1 fprintf(stdout, "if (_imp_onevent("); d1199 1 a1199 1 fprintf(stdout, ")) {\n"); d1208 1 a1208 1 fprintf(stdout, ", "); d1221 1 a1221 1 fprintf(stdout, ", "); a1226 23 P = ',''('':'')', '' { if (alt == 0) { compile(P(5), depth+1); // LOW compile(P(7), depth+1); // HIGH compile(P(2), depth+1); // (void)Declaration(depth+1, FALSE); fprintf(stdout, "/*N %s:%d*/",__FILE__,__LINE__); { int MoreDecls = P(3); // ',' , '' { while (P_alt(MoreDecls) == 0) { fprintf(stdout, ", /*f*/"); compile(P_P(MoreDecls, 3), depth+1); // (void)Declaration(depth+1, FALSE); fprintf(stdout, "/*O %s:%d*/",__FILE__,__LINE__); MoreDecls = P_P(MoreDecls, 4); } } compile(P(9), depth+1); // } else if (alt == 1) { } return -1; }; d1238 1 a1238 1 fprintf(stdout, "/*PC_WUF RESTOFREPEATUNTIL*/"); d1241 1 a1241 1 fprintf(stdout, "} until "); d1243 1 a1243 1 fprintf(stdout, ";"); d1245 1 a1245 1 fprintf(stdout, "} while (1);"); d1249 1 a1249 1 fprintf(stdout, "/*PC_WUF RESTOFREPEATPUSHEDUNTIL*/"); d1253 1 a1253 1 fprintf(stdout, "} while (!"); d1255 1 a1255 1 fprintf(stdout, " && !"); // or "||" ? d1257 1 a1257 1 fprintf(stdout, ");"); d1261 1 a1261 1 fprintf(stdout, "} while (!"); d1263 1 a1263 1 fprintf(stdout, ");"); d1266 1 a1266 1 fprintf(stdout, "/*PC_WUF RESTOFREPEAT*/"); d1270 1 a1270 1 fprintf(stdout, "}"); // no semicolon d1285 2 a1286 2 fprintf(stdout, " else "); } else fprintf(stdout, "\n"); // following a %finish d1297 1 a1297 1 fprintf(stdout, "{\n"); d1318 1 a1318 1 fprintf(stdout, " else "); d1329 1 a1329 1 fprintf(stdout, " {\n"); d1349 35 a1383 1 if (debug_compiler) fprintf(stderr, "[record format statement]\n"); d1387 2 a1388 1 # record format reference d1390 3 a1392 1 { return -1; }; d1398 2 d1402 2 a1403 1 P = , d1405 1 d1407 14 d1429 1 a1429 1 return -1; d1435 1 a1435 1 return -1; d1458 1 a1458 1 /*Diagnose("Assignment or call: ", Var,0, debug_declarations);*/ a1463 1 //fprintf(stderr, "Generating assignment:\n"); a1465 3 //if (P_op(Var) == AST_RVALUE) { // P_op(Var) = AST_LVALUE; //} a1481 1 fprintf(stdout, "goto "); d1485 1 d1489 1 a1489 1 fprintf(stdout, "* ERROR: %%switch %ls NOT DECLARED\n", wSVTag); // exit(1); d1491 4 a1494 4 fprintf(stdout, "%ls", wSVTag); fprintf(stdout, "["); generate_c(SWIdx, depth+1); fprintf(stdout, "]"); d1497 1 d1506 1 a1506 1 fprintf(stdout, "%ls", wSVTag); d1513 1 a1513 1 fprintf(stdout, "return"); d1520 1 a1520 1 fprintf(stdout, "return "); d1533 1 a1533 1 fprintf(stdout, "assert(_IMP_MONITOR_)"); d1539 1 a1539 1 fprintf(stdout, "exit(0)"); d1549 1 a1549 1 fprintf(stdout, "_imp_signal("); d1552 1 a1552 1 fprintf(stdout, ")"); d1558 1 a1558 1 fprintf(stdout, "break"); d1564 1 a1564 1 fprintf(stdout, "continue"); a1582 2 // TO DO: "until" condition has to be tested at the end of the loop... fprintf(stdout, "/*UI PC_WUF P_alt(P(2)=%d*/", P_alt(P(2))); d1587 1 a1587 1 fprintf(stdout, "do "); d1604 1 a1604 1 compile(P(1), depth+1); fprintf(stdout, "; "); d1609 1 a1609 1 fprintf(stdout, "; "); d1620 2 a1621 2 fprintf(stdout, "{ "); compile(P(1), depth+1); fprintf(stdout, "; "); d1623 1 a1623 1 fprintf(stdout, " }"); d1626 1 a1626 1 compile(P(1), depth+1); fprintf(stdout, "; "); d1686 1 a1686 1 fprintf(stdout, "%s_default:", Atom2Str(LabelTag)); d1690 12 a1701 3 fprintf(stdout, "%s_", Atom2Str(LabelTag)); generate_c(compile(P(3), depth+1), depth+1); // NO: needs to be a decimal constant with special handling for negative numbers. fprintf(stdout, ":"); d1708 1 a1708 1 fprintf(stdout, "%s: ", Atom2Str(LabelTag)); // TO DO: This outputs a ' ' after the tag. We should suppress the space at the d1716 1 a1716 1 // TO DO: (void)Declaration(depth+1, FALSE); d1759 1 a1759 1 fprintf(stdout, "do "); d1769 1 a1769 1 fprintf(stdout, "/*CYCLE MAYBE WITH I=...*/"); d1772 1 a1772 1 fprintf(stdout, "/*PC_WUF start_of_cycle*/"); d1778 1 a1778 1 fprintf(stdout, "/* Deferred: "); // Because compile() currently outputs the condition as well as returning it as a tuple. d1780 2 a1781 2 fprintf(stdout, " */"); fprintf(stdout, "\ndo "); d1789 1 a1789 1 fprintf(stdout, "{\n"); d1795 1 a1795 1 fprintf(stdout, "/*end_of_cycle*/"); d1805 1 a1805 1 compile(P(1), depth+1); Object = VAR; d1852 1 a1852 1 ParentVarIDX = Declaration(depth+1, FALSE); //fprintf(stdout, "/*P %s:%d*/",__FILE__,__LINE__); // BUT TO DO: NEEDS MORE WORK d1855 1 a1855 1 fprintf(stdout, "("); d1858 1 a1858 1 fprintf(stdout, ")"); d1861 1 a1861 1 fprintf(stdout, ";\n"); d1865 1 a1865 1 fprintf(stdout, " {\n"); d1885 1 d1887 1 d1904 1 a1904 1 fprintf(stdout, "\n#include \"...\"\n"); d1912 1 a1912 1 fprintf(stdout, "\nint _imp_mainep(int _imp_argc, char **_imp_argv) {\n"); d1914 2 a1915 2 push_ctrl(END); fprintf(stdout, "\n{ /* %%begin block */\n"); d1917 1 d1921 2 a1922 2 P = "switch"'('':'')' { if (debug_declarations > 1) fprintf(stderr, "[switch declaration]"); d1930 58 a1987 1 compile(P(1), depth+1); Object = SWITCHDEFN; d1989 47 a2035 11 compile(P(6), depth+1); // Bounds pair LOW compile(P(8), depth+1); // HIGH compile(P(3), depth+1); // NAME // TO DO: (void)Declaration(depth+1, FALSE); fprintf(stdout, "/*Q %s:%d*/",__FILE__,__LINE__); compile(P(4), depth+1); // NAMES compile(P(10), depth+1); // RESTOFSWLIST compile(P(11), depth+1); // S return -1; d2038 4 d2053 1 a2053 1 fprintf(stdout, "} else {"); // NOTE: a plain "%else" represents "%finish %else %start" d2073 1 a2073 1 fprintf(stdout, "\n}"); d2087 1 a2087 1 fprintf(stdout, "asm(/*to do*/)"); d2112 1 a2112 1 fprintf(stdout, "/*5*/; "); a2145 1 , d2150 1 @ 1.22 log @update check-in. @ text @d332 2 a333 2 Diagnose("NEW CONSTANT: ", t[1], 0, TRUE); {int child=P_P(t[1],1); Diagnose("CONSTANT CHILD: ", child, 0, TRUE);} d651 6 a656 1 fprintf(stdout, "/*TO DO*/until "); compile(P(2), depth+1); d744 1 a744 1 (void)Declaration(depth+1, FALSE); d750 1 a750 1 (void)Declaration(depth+1, TRUE); d763 1 a763 1 (void)Declaration(depth+1, FALSE); d769 1 a769 1 (void)Declaration(depth+1, TRUE); d786 1 a786 1 (void)Declaration(depth+1, FALSE); d792 1 a792 1 (void)Declaration(depth+1, TRUE); d875 1 a875 1 fprintf(stdout, "\n} /* End of Rt/Fn/Map */;\n"); d1017 2 a1018 2 (void)Declaration(depth+1, FALSE); //fprintf(stdout, ";\n"); d1024 1 a1024 1 (void)Declaration(depth+1, FALSE); // for now. Syntax of multiples is complex. d1039 1 a1039 1 (void)Declaration(depth+1, FALSE); d1045 1 a1045 1 (void)Declaration(depth+1, TRUE); d1067 2 d1070 2 a1071 3 (void)Declaration(depth+1, FALSE); fprintf(stdout, ";\n"); d1079 1 d1081 1 a1081 1 (void)Declaration(depth+1, FALSE); fprintf(stdout, ";\n"); d1092 1 a1093 1 (void)Declaration(depth+1, FALSE); d1118 53 a1170 2 # Initialised const and own arrays are currently leaving operands and operators on the TORP stacks. # Problem should go away once initialisations are handled. TO DO. d1172 7 a1178 4 P = '=', ''; # <----------------------- FIX! TO DO P = '' { (void)ExPop(); // quick hack because code was generating a *huge* stack, with each initialised array element being on the stack :-/ return -1; d1181 3 a1183 1 P = ',', ''; # <---------------- FIX! TO DO a1184 1 P = '('')', ''; # <---------------------------------------------------------------- FIX! TO DO d1186 9 d1196 1 d1203 1 d1225 1 a1225 1 (void)Declaration(depth+1, FALSE); d1231 1 a1231 1 (void)Declaration(depth+1, FALSE); d1246 1 a1246 1 if (EXPECTED != REPEAT && EXPECTED != REPEATUNTIL) { d1252 2 a1253 1 // %cycle could have been emitted as "do {" and this code could be either ") until (cond)" or "} while (1)" d1262 18 d1281 1 d1283 1 a1283 1 fprintf(stderr, "* %%UNTIL is not allowed at the end of this cycle.\n"); exit(1); a1351 2 P = '=', '' { return -1; }; d1538 1 a1538 1 compile(P(1), depth+1); // must be encapsulated in a {...} block if more that 1 statement d1546 13 a1558 2 compile(P(2), depth+1); compile(P(1), depth+1); d1725 1 d1728 16 a1743 2 push_ctrl(REPEAT); // %until not allowed after %repeat compile(P(1), depth+1); d1751 1 d1808 1 a1808 1 ParentVarIDX = Declaration(depth+1, FALSE); // BUT TO DO: NEEDS MORE WORK a1837 10 P = "on""start" { if (debug_compiler) fprintf(stderr, "[on event block]\n"); fprintf(stdout, "if (_imp_onevent("); generate_c(compile(P(3),depth+1), depth+1); generate_c(compile(P(4),depth+1), depth+1); fprintf(stdout, ")) {\n"); push_ctrl(FINISH); return -1; }; d1866 1 a1866 1 fprintf(stdout, "\n#include \n\nint _imp_mainep(int _imp_argc, char **_imp_argv) {\n"); d1869 1 a1869 1 fprintf(stdout, "\n{\n"); d1890 1 a1890 1 (void)Declaration(depth+1, FALSE); @ 1.21 log @restructuring ops and assignments and @ text @d12 1 a12 1 # done in the original ERCC compilers. There was no strong reason for doing d23 2 d27 1 a27 1 # then the control drops through to a default handler for every rule which recursively d31 7 a37 2 # those tuples in C form, but at this early stage of development that separation # had barely been adhered to and the majority of C output is done in compile(). d44 5 d58 7 a64 2 P = ; a65 3 # Built-in phrases are known to the parser and are introduced by B<...> # Note that the equivalent functionality in an ERCC parser is # implemented by parse-time rules in this parser. (See C<...> below). d78 4 a81 1 d180 1 a180 1 // return -1; // simpler but generated C would not include the unary + d222 1 a222 1 P = '==', '>=', '>', '##', '\\=', '<>', '<=', '<', '->', '=', '#', '\\==' { d225 2 a226 1 const char *C_Comp[] = { "== &", ">=", ">", "!= &", "!=", "!=", "<=", "<", "*BADCOMP*", "==", "!=", "!= &" }; d232 1 a232 1 P = ; d234 1 d239 12 a250 4 VTag = SubPhraseIdx(Ph, 2)&AST_idx_mask; // Use the scope tools to find it. if (SVTag != NULL) free(SVTag); // Sorry. Should use Stringpool. int VarIDX = lookup("decl", SVTag=Tag2Str(VTag)); d252 3 a254 2 //fprintf(stderr, "Debug VARNAME: VarIDX returned by looking up %s was %d\n", SVTag, VarIDX); return t[0] = P_mktuple(AST_RVALUE, 0/*alt no*/, 1/*phrases*/, t); d258 16 a273 3 P = «[A-Z][A-Z0-9]*» { Tag = SubPhraseIdx(Ph, 2)&AST_idx_mask; // Set a global for use by return -1; d281 31 a311 1 P = «'([^\']|'')*'»; # , , ; d313 1 a313 1 P = «''''», «''», «'.'»; # , , ; d315 1 a315 2 P = '@@', ''; P = '.', ''; d322 1 d326 1 a326 2 P = { // Pushes an AST_LITERAL: // PushConst(compile(P(1), depth+1)); // more complex than that but this will do for basic testing during development. a327 2 // alt=0: symbolic constant // alt=1: literal constant d329 5 a333 1 PushConst(P_mktuple(AST_RVALUE, 1/*alt no*/, 1/*phrases*/, t)); d335 3 a337 1 return -1; d340 1 a340 1 P = «[0-9][0-9]*»; d342 6 a347 4 P = '"' ; P = «.» ; P = , , ''; P = ; d349 1 a349 1 P = ';'; d352 31 a382 1 return -1; // suppressing extraneous data during development. Will want it back later. d389 1 a389 1 P = «.» , ''; # comment text up to a statement separator. (Differs from Imp77 comments) d396 1 a396 1 P = '_', # TO DO!!!!! d398 3 a400 5 // If APP is present then VARNAME must be an array. // If it is a record array the Opt_Record_Field may be present, // otherwise Opt_Record_Field must not be present. // If APP is not present, then VARNAME can be any field in the parent record. return -1; a420 17 #ifdef NEVER // int RValue = compile(P(1), depth+1); // A P_mktuple(AST_RVALUE, {VarIDX}) if (P_P(RValue,1) == -1) { fprintf(stderr, "* NAME NOT SET: %s\n", SVTag); // exit(1); } //int VarIDX = P_P(RValue, 1); //int VTag = VarDecl[VarIDX].varname; PushVar(RValue); //DebugVarIDX(VarIDX); return -1; #else d422 1 d424 3 d428 1 a428 1 #endif d465 1 a465 1 // otherwise, compile or , which will push them on the TORP stack d502 1 a502 1 fprintf(stderr, "* NAME NOT SET: %s\n", SVTag); // exit(1); d510 1 a510 1 P = , , '('')' ; d526 1 a526 1 d529 1 d531 1 a531 2 }; P = ; d534 1 a534 1 '*'; d536 1 a536 1 P = ; d559 1 d565 1 a565 1 ''; d568 1 a568 1 ''; d580 2 a581 3 // drop the '()' which is put back in by AST_PROC_CALL generate_c(Expr, depth+1); compile(P(3), depth+1); a588 1 fprintf(stdout, ", "); d590 2 a591 2 generate_c(Expr, depth+1); compile(P(4), depth+1); d603 1 d621 3 a623 2 if (P_P(Control,1) == -1) { fprintf(stderr, "* NAME NOT SET: %s\n", SVTag); // exit(1); d658 1 a658 5 P = "alias", ''; #P = ',', # added to enable implicit continuation in record formats (TO DO now needs to be moved) # ''; d669 7 a675 1 P = ; d677 1 a677 2 P = "integer", ''; d682 1 d706 2 a707 1 if (debug_declarations > 1) fprintf(stderr, "[ Sign=%d Prec=%d Type=%d ]", Signedness, Precision, Basetype); d716 1 d724 1 d743 1 a743 1 fprintf(stdout, ", /*a*/"); d796 1 a796 1 ''; d807 1 d814 1 d824 1 a824 1 P = ',', '' ; d828 1 d841 1 a841 1 // end of program a849 1 if (debug_compiler) fprintf(stderr, "[end of program]"); d852 1 a852 2 // end of perm if (debug_compiler) fprintf(stderr, "[end of perm]"); d854 1 a854 1 // end of file a857 1 if (debug_compiler) fprintf(stderr, "[end of file]"); d859 1 a859 2 // end of list if (debug_compiler) fprintf(stderr, "[end of list]"); d861 3 a863 3 // %end (begin or Rt/Fn/Map) if (ctrl_depth() == 0) { fprintf(stderr, "\n* Spurious %%END\n"); exit(1); d870 1 a870 2 if (debug_compiler) fprintf(stderr, "[end of 'begin' block]"); fprintf(stdout, "\n}; /* End of Rt/Fn/Map */\n"); d872 1 d880 1 d886 1 a886 1 return -1; // lt the parent level handle it. d930 1 d934 1 a958 4 P = , '' { if (debug_compiler) fprintf(stderr, (alt == 1 ? "[procedure call]" : "[assignment]")); }; d969 1 d977 1 d1038 1 a1038 1 fprintf(stdout, "; /*d*/"); d1076 1 d1108 1 d1114 1 a1114 2 P = '=', ''; d1117 1 d1120 1 a1120 2 P = ',', ''; d1122 1 a1122 2 P = '('')', ''; d1126 8 a1133 1 ''; d1135 1 a1135 2 P = "event", ''; d1181 1 a1184 1 if (debug_compiler) fprintf(stderr, "[until condition]"); d1206 1 a1207 1 if (debug_compiler) fprintf(stderr, "[FINISHELSEQ]"); d1210 1 a1210 1 // recurse on AFTERELSE d1217 1 a1219 1 if (debug_compiler) fprintf(stderr, "[afterelse: start]"); d1222 1 a1223 1 if (debug_compiler) fprintf(stderr, "[afterelse: if or unless start]"); d1229 1 a1229 1 if (debug_compiler) fprintf(stderr, "[afterelse: ]"); d1232 1 a1232 1 // recursively handle alt 2 d1239 1 d1241 1 a1241 2 if (debug_compiler) fprintf(stderr, "[ELSEQ]"); // recurse d1249 1 a1251 1 if (debug_compiler) fprintf(stderr, "[then start]"); d1254 2 a1255 2 if (debug_compiler) fprintf(stderr, "[then UI]"); // recursively handle the rest d1259 1 a1259 2 P = '=', ''; d1266 1 d1273 2 a1274 1 if (debug_compiler) fprintf(stderr, "[record format statement]"); d1279 1 a1279 1 ; d1284 2 a1285 1 if (debug_compiler) fprintf(stderr, "[more record format declarations]"); d1289 2 a1290 1 if (debug_compiler) fprintf(stderr, "[record format declaration]"); d1296 1 d1298 1 a1298 1 if (debug_compiler) fprintf(stderr, "[record format element]"); d1303 2 a1304 1 if (debug_compiler) fprintf(stderr, "[alternative record format declaration]"); d1311 1 a1311 1 P = «.» , ''; # TO DO: semicolons in an asm statement eg *LD_1,';' d1314 6 a1319 7 // All other occurances of are in the context of an which // goes through the TORP process to handle operator priority, so the // result of compiling a is -1 with the return of the real result // being postponed until an returns. // This phrase here where is used as an LHS or as a procedure call // needs special handling. This is why the two calls below take P(1) // rather than compile(P(1)) a1320 2 //fprintf(stderr, "P(1): "); Diagnose(P(1)); //fprintf(stderr, "Before compile(P(1)), DataStack_nextfree=%d\n", DataStack_nextfree); d1322 8 a1329 8 //fprintf(stderr, "After compile(P(1)), DataStack_nextfree=%d\n", DataStack_nextfree); Op_or_Data TOSPop = DataStack(--DataStack_nextfree); ; // NOT ExPop() if (TOSPop.type != 'D') { fprintf(stderr, "** Internal error in %s at line %d\n", __FILE__, __LINE__); exit(1); } int Var = TOSPop.idx; // GETTING -1 //fprintf(stderr, "Popped "); Diagnose(Var); if (alt == 0) { d1333 1 d1335 5 a1339 3 } else { int Call = Mk_AST_procedure_call(Var, depth); generate_c(Call, depth+1); d1352 3 a1354 2 P = '->' { // CHANGED FROM TO NEW to restrict to *one* index. if (debug_compiler) fprintf(stderr, "[go to]"); d1362 1 a1362 1 fprintf(stdout, "* ERROR: %%switch %s NOT DECLARED\n", SVTag); // exit(1); d1364 1 a1364 1 fprintf(stdout, "%s", SVTag); d1374 3 d1378 1 a1378 1 fprintf(stdout, "%s", SVTag); d1384 1 d1386 1 a1386 1 if (debug_compiler) fprintf(stderr, "[return]"); d1390 1 a1390 1 if (debug_compiler) fprintf(stderr, "[result]"); d1404 3 a1406 3 if (debug_compiler) fprintf(stderr, "[print diagnostics]"); fprintf(stdout, "assert(FALSE)"); // drop through to handle for now? NO! TO DO: Add return -1; d1410 1 a1410 1 if (debug_compiler) fprintf(stderr, "[stop program]"); d1415 4 d1420 1 a1420 1 if (debug_compiler) fprintf(stderr, "[signal event]"); d1422 3 a1424 8 compile(P(2), depth+1); fprintf(stdout, ", "); compile(P(3), depth+1); if (0) { // if is not the null alt fprintf(stdout, ", "); compile(P(4), depth+1); fprintf(stdout, ")"); } d1429 1 d1431 1 a1431 1 if (debug_compiler) fprintf(stderr, "[exit from cycle]"); d1435 1 d1437 1 a1437 1 if (debug_compiler) fprintf(stderr, "[continue at end of cycle]"); d1477 1 a1477 2 P = "and" { }; d1495 1 a1495 1 ; d1503 1 a1503 1 ; d1505 2 a1506 2 P = , ''; P = ; d1508 20 a1527 3 P = '(*):', # sets a flag that tells the line reconstruction to preserve '(''):', # spaces if immediately followed by a ! or %COMMENT statement. ':' { d1530 7 a1536 3 // need to be handled a bit differently from variables? Yeah, making that change... if (debug_compiler || debug_declarations) fprintf(stderr, "[label]"); d1540 1 a1540 1 fprintf(stderr, "* ERROR: %%switch %s NOT DECLARED\n", SVTag); // exit(1); d1544 1 a1544 1 fprintf(stderr, "* ERROR: LABEL %s SET TWICE\n", SVTag); // exit(1); d1549 1 a1549 1 fprintf(stdout, "%s_default:", Tag2Str(LabelTag)); d1553 1 a1553 1 fprintf(stdout, "%s_", Tag2Str(LabelTag)); d1557 3 d1562 1 a1562 1 fprintf(stdout, "%s: ", Tag2Str(LabelTag)); // TO DO: This outputs a ' ' after the tag. We should suppress the space at the d1603 2 a1604 1 if (debug_compiler) fprintf(stderr, "[if or unless]"); d1620 1 a1620 1 if (debug_compiler) fprintf(stderr, "[cycle]"); d1633 1 a1636 1 if (debug_compiler) fprintf(stderr, "[repeat]"); d1642 1 a1642 1 if (debug_compiler) fprintf(stderr, "[auto declaration]"); d1652 1 a1652 1 # add to the name tables before pushing d1657 1 a1657 1 # immediately after is handled at the previous d1679 1 a1679 1 if (debug_compiler) fprintf(stderr, "[procedure declaration]"); d1709 1 a1709 1 // if (debug_compiler) fprintf(stderr, "[own/external/const declaration]"); d1721 5 a1725 2 if (debug_compiler) fprintf(stderr, "[on event block]"); fprintf(stdout, "if (_imp_onevent(/*to do*/)) {\n"); d1727 1 d1731 1 a1731 1 if (debug_compiler) fprintf(stderr, "[record format declaration]"); d1734 1 d1738 2 a1739 1 if (debug_compiler) fprintf(stderr, "[end of something]"); d1743 1 a1749 1 if (debug_compiler) fprintf(stderr, "[push an include file]"); d1751 1 d1755 2 a1756 2 if (debug_compiler) fprintf(stderr, "[begin block]"); if (ctrl_depth() == 0) { d1758 1 a1758 1 fprintf(stdout, "\n#include \nint main(int argc, char **argv) {\n"); d1763 1 d1791 2 a1792 1 if (debug_compiler) fprintf(stderr, "[turn on listing]"); d1797 1 a1797 1 if (debug_compiler) fprintf(stderr, "[(finish) else (start)]"); d1811 1 d1815 1 d1817 1 a1817 1 if (ctrl_depth() == 0) { a1830 1 if (debug_compiler) fprintf(stderr, "[finish]"); d1834 1 d1836 1 a1836 1 if (debug_compiler) fprintf(stderr, "[embedded assembly code]"); d1840 2 a1841 1 if (debug_compiler) fprintf(stderr, "[trusted program pragma]"); d1845 1 a1845 1 if (debug_compiler) fprintf(stderr, "[use this routine for main()]"); d1848 1 a1848 1 d1853 1 a1853 1 if (debug_compiler) fprintf(stderr, "[%%control - fine control over compiler options]"); d1856 4 a1859 1 P = '' { // print a semicolon after certain statements. This is for C's benefit. d1861 1 d1864 1 a1864 1 # All there were subdivided in order to make interfacing to C easier! d1869 7 d1880 8 d1894 1 a1898 1 , d1911 76 a1986 1 ; @ 1.20 log @*** empty log message *** @ text @d67 2 a68 1 return t[0] = P_mktuple(P_BINOP, alt, 1/*phrases*/, t); d72 2 a73 1 return t[0] = P_mktuple(P_BINOP, alt, 1/*phrases*/, t); d77 2 a78 1 return t[0] = P_mktuple(P_BINOP, alt, 1/*phrases*/, t); d82 2 a83 1 return t[0] = P_mktuple(P_BINOP, alt, 1/*phrases*/, t); d87 2 a88 1 return t[0] = P_mktuple(P_BINOP, alt, 1/*phrases*/, t); d92 2 a93 1 return t[0] = P_mktuple(P_BINOP, alt, 1/*phrases*/, t); d97 2 a98 1 return t[0] = P_mktuple(P_BINOP, alt, 1/*phrases*/, t); d102 2 a103 1 return t[0] = P_mktuple(P_BINOP, alt, 1/*phrases*/, t); d107 2 a108 1 return t[0] = P_mktuple(P_BINOP, alt, 1/*phrases*/, t); d112 2 a113 1 return t[0] = P_mktuple(P_BINOP, alt, 1/*phrases*/, t); d117 2 a118 1 return t[0] = P_mktuple(P_BINOP, alt, 1/*phrases*/, t); d122 2 a123 1 return t[0] = P_mktuple(P_BINOP, alt, 1/*phrases*/, t); d127 2 a128 1 return t[0] = P_mktuple(P_BINOP, alt, 1/*phrases*/, t); d132 2 a133 1 return t[0] = P_mktuple(P_BINOP, alt, 1/*phrases*/, t); d137 2 a138 1 return t[0] = P_mktuple(P_BINOP, alt, 1/*phrases*/, t); d142 2 a143 1 return t[0] = P_mktuple(P_BINOP, alt, 1/*phrases*/, t); d147 2 a148 1 return t[0] = P_mktuple(P_BINOP, alt, 1/*phrases*/, t); d163 1 d165 1 d168 1 d171 1 d174 1 d176 1 a176 1 return t[0] = P_mktuple(P_MONOP, alt, 1/*phrases*/, t); d191 1 d194 1 d197 1 d200 1 d202 1 a202 1 return t[0] = P_mktuple(P_ASSOP, alt, 1/*phrases*/, t); d217 1 d227 1 d296 1 d317 1 a317 1 d333 5 d356 1 a356 1 if (alt == 2) { d363 4 a366 8 // a + (b + c) is coming out as (a + b + c) - it looks like the TORP // is picking up the "a +" that is already on the stack and adding it to // the Expr. I suspect what is needed is to seal a false bottom on the // stack so that Expr is constructed independent of anything preceding it. // *OR* maybe all the problem is that at this point, the operator is on the // operator stack and not the output stack. Actually that suggests that we // need false bottoms on *both* stacks. Or maybe just to stop evaluating // when we get to a "(" on the stack. [*** FIXED ***] d368 1 a368 7 int SaveOperBottom = OperStack_bottom; OperStack_bottom = OperStack_nextfree; int SaveDataBottom = DataStack_bottom; DataStack_bottom = DataStack_nextfree; int Expr = compile(P(2), depth+1); t[1] = Expr; DataStack_bottom = SaveDataBottom; OperStack_bottom = SaveOperBottom; d422 17 a438 3 t[1] = ExPop(); t[2] = P(1); return P_mktuple(AST_EXPRESSION, 0, 2, t); d448 1 d450 7 d459 4 a462 1 Expr = P_mktuple(AST_EXPRESSION, 0, 2, t); d478 6 d487 1 d489 1 a489 1 generate_c(compile(P(2), depth+1), depth+1); d499 2 a500 1 generate_c(compile(P(3), depth+1), depth+1); d534 5 d540 1 a540 1 generate_c(compile(P(3), depth+1), depth+1); // Actually I'm not sure why just works here. I thought I needded to do the false bottom thing? (TO DO) d544 1 a544 1 generate_c(compile(P(7), depth+1), depth+1); d548 1 a548 1 generate_c(compile(P(5), depth+1), depth+1); d792 1 d797 2 a798 1 generate_c(compile(P(1), depth+1), depth+1); // TO DO: s need false bottom! d800 2 a801 1 generate_c(compile(P(3), depth+1), depth+1); d807 1 a807 1 generate_c(compile(P(3), depth+1), depth+1); d809 2 a810 1 generate_c(compile(Expr3, depth+1), depth+1); d812 1 a812 1 compile(P(4), depth+1); d814 3 a816 3 compile(P(2), depth+1); compile(P(3), depth+1); } else { d1205 27 a1231 86 P = { { int Ass = compile(P(4), depth+1); // , '' if (P_alt(Ass) == 0) { // 0 assignment 1 procedure call // TO DO: under development int LHS = compile(P(1), depth+1); if (P_P(LHS,1) == -1) { fprintf(stderr, "* NAME NOT SET: %s\n", SVTag); // exit(1); } // If the LHS is a %map, // If is present, // access the field via LHS(APP)->field // otherwise // Indirect through the LHS to assign the value, i.e. *LHS(APP) = RHS // Similarly, if the LHS is an array: // If the LHS is a recordarray, then may be applied. // (with "." vs "->" depending on %name) // otherwise must be empty, and VARNAME(APP) is evaluated // simply, as an LHS destination. // Some of this code might want to be factored out to support P which will // duplicate much of what is being described here. // Tests for invalid type combinations will be needed as well as type casts // where allowed (and esp. for "<-" jam transfers) // LHS = ; // Using the struct here would reduce the risk of putting a value in the wrong field! t[1] = P_P(Ass,1); // ASSOP t[2] = LHS; // LHS t[3] = P_P(Ass,2); // RHS generate_c(P_mktuple(AST_ASSIGN, 0, 3, t), depth+1); } else { // simple procedure call. This is probably the easiest one to implement so I'll // tackle it first. The only 'fancy' feature needed here is pairing off the // actual parameters with the formal parameters, type checking them, casting // if necessary, and causing an error if there are too many or too few actual // parameters. // extern int handle_ast_phrase(int Ph, int depth); // (void)handle_ast_phrase(compile(P(1), depth+1), depth+1); // returns AST_RVALUE int RValue = compile(P(1), depth+1); // A P_mktuple(AST_RVALUE, {VarIDX}) int VarIDX = P_P(RValue, 1); int VTag = VarDecl[VarIDX].varname; char *s = Tag2Str(VTag); //fprintf(stderr, "Debug VAR: VarIDX returned was %d for %s\n", VarIDX, s); PushVar(RValue); int TypeIDX = VarDecl[VarIDX].type; //DebugVarIDX(VarIDX); int param = TypeDecl[TypeIDX].parms; if (param == -1) { // fprintf(stderr, "** NO PARAMETERS\n"); } else { for (;;) { // fprintf(stderr, "** PARAMETER is %d\n", param); param = TypeDecl[param].parms; if (param == -1) break; } } // Now we have a linked list of parameters and P(2) which is P = '('')', '' // with P = ',', '' // so (TO DO) evaluate each parameter in turn, with in the context of the parameter type. #ifdef NEVER fprintf(stdout, "%s(", s); generate_c(compile(P(2), depth+1), depth+1); fprintf(stdout, ")"); #else t[1] = VarIDX; t[2] = param; t[3] = P(2); int ProcCall = P_mktuple(AST_PROC_CALL, 0, 3, t); // This could be preferred if it worked. Which at the moment it seems not to. generate_c(ProcCall, depth+1); #endif } return -1; d1233 1 d1277 2 a1283 4 int SaveOperBottom = OperStack_bottom; OperStack_bottom = OperStack_nextfree; int SaveDataBottom = DataStack_bottom; DataStack_bottom = DataStack_nextfree; a1284 3 DataStack_bottom = SaveDataBottom; OperStack_bottom = SaveOperBottom; a1286 1 if (debug_compiler) fprintf(stderr, "[result]"); d1291 1 d1293 1 a1293 2 if (debug_compiler) fprintf(stderr, "[print diagnostics]"); // go on to handle for now. d1297 1 a1298 1 if (debug_compiler) fprintf(stderr, "[stop program]"); d1303 1 a1312 1 if (debug_compiler) fprintf(stderr, "[signal event]"); @ 1.19 log @*** empty log message *** @ text @d24 14 d40 3 a42 1 # (Include files are denoted using I<...>; d54 5 a58 4 '**', '*', '!!', '!', '//', '/', '>>', '<<', '.', '\\\\', '\\', '^^', '^' { d60 1 a60 1 // (the default for char syms is separate items for each char which is unweildy) d62 2 a63 2 // It would be cleaner to return the OP_xxx enum rather than the string. Will do so later. // (The string is decoded in PushBinOp()) d260 26 d288 3 d336 1 a336 1 // when we get to a "(" on the stack. d580 1 a580 1 (void)Declaration(FALSE); d586 1 a586 1 (void)Declaration(TRUE); d599 1 a599 1 (void)Declaration(FALSE); d605 1 a605 1 (void)Declaration(TRUE); d622 1 a622 1 (void)Declaration(FALSE); d628 1 a628 1 (void)Declaration(TRUE); d818 47 a864 1 ''; d871 1 a871 1 (void)Declaration(FALSE); d877 1 a877 1 (void)Declaration(TRUE); a890 24 P = '('':'')' { // TO DO: list of upper/lower bounds }; P = ',', ''; P = { compile(P(3), depth+1); // compile(P(1), depth+1); // (void)Declaration(FALSE); { int MoreDecls = P(2); // ',' , '' { while (P_alt(MoreDecls) == 0) { fprintf(stdout, ", /*e*/"); compile(P_P(MoreDecls, 3), depth+1); // (void)Declaration(TRUE); MoreDecls = P_P(MoreDecls, 4); } } fprintf(stdout, ";\n"); return -1; }; d901 1 a901 1 (void)Declaration(FALSE); fprintf(stdout, ";\n"); d909 1 a909 1 compile(P(5), depth+1); // d911 1 a911 1 (void)Declaration(FALSE); fprintf(stdout, ";\n"); d922 1 a922 1 (void)Declaration(FALSE); d983 1 a983 1 (void)Declaration(FALSE); d989 1 a989 1 (void)Declaration(FALSE); a1089 3 P = '_', # TO DO!!!!! ''; d1141 1 d1143 1 d1145 1 d1151 18 d1171 3 a1173 3 t[1] = P_P(Ass,1); t[2] = LHS; t[3] = P_P(Ass,2); d1177 5 a1181 1 // procedure call d1183 1 a1183 2 extern int handle_ast_phrase(int Ph, int depth); a1206 1 fprintf(stdout, "%s(", s); d1210 4 a1213 2 // so evaluate each parameter in turn, with in the context of the parameter type. d1216 7 a1222 1 //generate_c(P_mktuple(AST_PROC_CALL, 0, 2, t), depth+1); // This could be preferred if it worked. Which at the moment it seems not to. d1397 1 a1397 1 // MAYBE USE RATHER THAN and handle declaration here rather than in Declaration() since labels d1427 2 a1428 2 // non-existent and calling Declaration() below outputs random crap. I'm not 100% sure that I should even // be using the 'Declaration()' method to handle the labels themselves. Maybe it would be more sensible to d1431 1 a1431 1 // TO DO: (void)Declaration(FALSE); d1550 1 a1550 1 ParentVarIDX = Declaration(FALSE); // BUT TO DO: NEEDS MORE WORK d1634 1 a1634 1 (void)Declaration(FALSE); d1708 9 a1716 1 # All there were subdivided in order to make interfacing to C easier!: @ 1.18 log @*** empty log message *** @ text @d121 1 a121 1 PushBinOp(compile(P(1),depth+1)); // TORP d142 1 a142 1 int Expr = compile(P(1),depth+1); d174 9 a182 16 { // We want to return an AST record with the VarDecl contents // (which includes the TypeDecl information) char *s; VTag = SubPhraseIdx(Ph, 2)&AST_idx_mask; // Use the scope tools to find it. int VarIDX = lookup("decl", s=Tag2Str(VTag)); if (VarIDX == -1) { fprintf(stderr, "* NAME NOT SET: %s\n", s); // exit(1); // return a literal string somehow. } t[1] = VarIDX; fprintf(stderr, "Debug VARNAME: VarIDX returned by looking up %s was %d\n", s, VarIDX); free(s); // Sorry. Should use Stringpool. return t[0] = P_mktuple(AST_RVALUE, 0/*alt no*/, 1/*phrases*/, t); } a184 1 d211 1 a211 1 // PushConst(compile(P(1),depth+1)); // more complex than that but this will do for basic testing during development. d215 1 a215 1 t[1] = compile(P(1),depth+1); d245 5 a249 1 int RValue = compile(P(1),depth+1); // A P_mktuple(AST_RVALUE, {VarIDX}) d294 1 a294 1 int Expr = compile(P(2),depth+1); d333 1 a333 1 d336 1 a336 1 // t[1] = compile(P(1),depth+1); d339 4 d344 1 a344 1 PushConst(compile(P(1),depth+1)); d346 1 d365 1 a365 1 t[2] = compile(P(1),depth+1); // walk the CST expression tree to process each atom sequentially left to right, as input to the shunting yard algorithm d397 1 a397 1 generate_c(compile(P(3),depth+1), depth+1); d425 6 a430 1 generate_c(compile(P(1), depth+1), depth+1); d698 1 a698 1 generate_c(compile(Expr3,depth+1), depth+1); d720 1 a720 1 compile(P(2),depth+1); d723 1 a723 1 compile(P(2),depth+1); d732 2 a733 2 compile(P(2),depth+1); compile(P(3),depth+1); d742 2 a743 2 compile(P(2),depth+1); // NEED () AROUND THE IF IT CONTAINS AN "&&" compile(P(3),depth+1); d904 1 a904 1 generate_c(compile(P(2),depth+1), depth+1); d921 1 a921 1 (void)Declaration(TRUE); d987 3 a989 3 compile(P(1),depth+1); compile(P(2),depth+1); compile(P(3),depth+1); d1076 1 a1076 1 int Ass = compile(P(4),depth+1); // , '' d1079 4 a1082 1 int LHS = compile(P(1),depth+1); d1088 1 a1088 1 generate_c(P_mktuple(AST_ASSIGN, 0, 3, t),depth+1); d1095 1 a1095 1 // (void)handle_ast_phrase(compile(P(1),depth+1), depth+1); // returns AST_RVALUE d1098 1 a1098 1 int RValue = compile(P(1),depth+1); // A P_mktuple(AST_RVALUE, {VarIDX}) d1124 1 a1124 1 generate_c(compile(P(2),depth+1), depth+1); d1126 1 a1126 1 //generate_c(P_mktuple(AST_PROC_CALL, 0, 2, t),depth+1); // This could be preferred if it worked. Which at the moment it seems not to. d1132 10 a1141 1 P = '->' { d1143 20 a1162 5 generate_c(compile(P(3),depth+1), depth+1); if (0) { // if is not empty, generate code for a jump to a switch statement, otherwise a plain 'goto'... fprintf(stdout, "["); compile(P(4),depth+1); // NO: needs to be a decimal constant with special handling for negative numbers. fprintf(stdout, "/*to do*/]"); a1163 1 if (debug_compiler) fprintf(stderr, "[go to]"); d1182 1 a1182 1 int Expr = compile(P(3),depth+1); d1206 1 a1206 1 compile(P(2),depth+1); d1208 1 a1208 1 compile(P(3),depth+1); d1211 1 a1211 1 compile(P(4),depth+1); d1235 2 a1236 2 compile(P(1),depth+1); // must be encapsulated in a {...} block if more that 1 statement compile(P(2),depth+1); d1238 4 a1241 4 compile(P(2),depth+1); compile(P(3),depth+1); compile(P(1),depth+1); compile(P(4),depth+1); d1243 3 a1245 3 compile(P(2),depth+1); compile(P(1),depth+1); compile(P(3),depth+1); d1255 2 a1256 2 compile(P(1),depth+1); fprintf(stdout, "; "); compile(P(2),depth+1); d1259 1 a1259 1 compile(P(1),depth+1); d1273 2 a1274 2 compile(P(1),depth+1); fprintf(stdout, "; "); compile(P(2),depth+1); d1278 1 a1278 1 compile(P(1),depth+1); fprintf(stdout, "; "); d1297 7 a1303 3 P = '(*):', # sets a flag that tells the line reconstruction to preserve '(''):', # spaces if immediately followed by a ! or %COMMENT statement. ':' { d1306 8 a1313 1 compile(P(1),depth+1); LabelTag = Tag; d1315 1 a1315 1 Object = SWITCHDEFN; d1319 1 a1319 1 Object = SWITCHDEFN; d1322 1 a1322 1 generate_c(compile(P(3),depth+1),depth+1); // NO: needs to be a decimal constant with special handling for negative numbers. d1325 1 a1325 1 Object = LABELDEFN; d1330 7 a1336 1 (void)Declaration(FALSE); d1531 1 a1531 1 compile(P(1), depth+1); Object = VAR; // INITDECS d1537 2 a1538 1 // TO DO: Declaration() @ 1.17 log @more working @ text @d186 1 a186 1 //fprintf(stderr, "Debug: VarIDX returned by looking up %s was %d\n", s, VarIDX); d252 11 a262 2 PushVar(compile(P(1),depth+1)); // more complex than that but this will do for basic testing during development. return -1; d313 23 a335 1 # but it will break older programs! d429 1 a429 1 fprintf(stdout, " < "); d528 1 a528 1 Declaration(); d534 1 a534 1 Declaration(); d547 1 a547 1 Declaration(); d553 1 a553 1 Declaration(); d570 1 a570 1 Declaration(); d576 1 a576 1 Declaration(); d773 1 a773 1 Declaration(); d779 1 a779 1 Declaration(); d803 1 a803 1 Declaration(); d809 1 a809 1 Declaration(); d827 1 a827 1 Declaration(); fprintf(stdout, ";\n"); d837 1 a837 1 Declaration(); fprintf(stdout, ";\n"); d848 1 a848 1 Declaration(); d909 1 a909 1 Declaration(); d915 1 a915 1 Declaration(); d1085 29 a1113 1 (void)handle_ast_phrase(compile(P(1),depth+1), depth+1); // returns AST_RVALUE a1114 1 fprintf(stdout, "("); d1287 1 a1287 1 Declaration(); d1292 26 d1405 3 a1407 2 Declaration(); // BUT TO DO: NEEDS MORE WORK d1410 1 a1410 1 /* Parent = ; */ compile(P(7), depth+1); // FPP (or void if none) @ 1.16 log @*** empty log message *** @ text @d165 1 a165 1 const char *C_Comp[] = { "== &", ">=", ">", "!= &", "!=", "!=", "<=", "<", "*ERROR*", "=", "!=", "!= &" }; d173 1 a173 5 { int Tag, VTag; // Tag for declarations, VTag for use. } P = «[A-Z][A-Z0-9]*» { a174 6 // TO DO: If the name was a perm/prim then it disappears in the output! We need to handle that by // 1) actually declaring the built-in names, and // 2) returning something that outputs the name if it wasn't found... (at least during development - // once we're live, the compiler will exit after a "NAME NOT SET" error.) d182 1 a182 1 fprintf(stderr, "* NAME NOT SET: %s\n", Tag2Str(VTag)); // exit(1); d194 2 a195 2 Tag = SubPhraseIdx(Ph, 2)&AST_idx_mask; // NO, not yet. return -1; d424 2 a425 2 P = ',', # added to enable implicit continuation in record formats (TO DO now needs to be moved) ''; d427 1 a427 1 P = ',' , '' { d429 1 a429 2 fprintf(stdout, ", "); compile(P(2), depth+1); d431 1 a431 1 return -1; d433 1 d436 1 a436 1 P = ; d485 1 d494 14 d509 1 d511 1 a511 1 Object = CODE; d515 11 a525 2 compile(P(3), depth+1); // compile(P(4), depth+1); // d527 1 a527 1 } else if (alt == 2) { d529 21 a549 1 // "name" d568 4 a571 1 ''; d630 1 a630 1 fprintf(stdout, "\n} /* End of Rt/Fn/Map */\n"); d716 1 a716 1 { d739 21 a759 1 if (alt == 1) IsArray = ARRAY; d768 17 a784 1 P = ; a787 1 if (alt == 1) IsArray = ARRAY; // Must be set to SCALAR somewhere for default. d789 19 d811 14 a824 1 ''; d846 1 a846 1 P = { d873 21 a893 1 ''; d985 1 a985 1 P = '_', d1001 3 a1003 1 '('')'; d1011 3 a1013 1 ''; d1015 3 a1017 1 '('')'; d1023 1 d1027 3 a1029 1 ''; d1146 1 d1151 1 d1155 1 d1221 1 a1221 1 generate_c(compile(P(4),depth+1),depth+1); // NO: needs to be a decimal constant with special handling for negative numbers. d1229 1 d1266 1 d1277 1 d1309 1 a1309 1 { d1314 13 a1326 6 compile(P(2), depth+1); compile(P(3), depth+1); compile(P(4), depth+1); // Set up Spec compile(P(6), depth+1); compile(P(9), depth+1); compile(P(5), depth+1); // DECLNAME reordered so name handled after params compiled d1328 1 a1328 4 fprintf(stdout, "("); compile(P(8), depth+1); // FPP (or void if none). Not added to scope but attached to procedure definition to match with body later. // ERCC Imp (inc Imp80?) actually doesn't care even if same name is used for multiple parameters fprintf(stdout, ")"); d1331 2 a1332 4 compile(P(7), depth+1); // DOWNQ (assumes param list is in same scope as body of procedure) fprintf(stdout, "("); compile(P(8), depth+1); // FPP (or void if none) fprintf(stdout, ")"); d1340 8 a1347 1 Object = VAR; d1402 3 d1407 2 a1408 1 d1463 3 d1473 1 a1473 1 P = { // print a semicolon after certain statements. This is for C's benefit. @ 1.15 log @before new INITDECS coed @ text @d162 1 a162 1 P = '=', '>=', '>', '#', '\\=', '<>', '<=', '<', '->', '==', '##', '\\==' { d165 1 a165 1 const char *C_Comp[] = { "==", ">=", ">", "!=", "!=", "!=", "<=", "<", "*ERROR*", "== &", "!= &", "!= &" }; d203 1 a203 1 P = «[A-Z][A-Z0-9]*» { d434 1 a434 1 P = ',', # added to enable implicit continuation in record formats d437 11 d495 21 a515 3 P = , , "name" { d532 1 a532 1 P = '('')', d544 3 a546 2 P = , ''; d679 3 a681 1 if (alt == 1) { d684 2 a685 2 // Spec = NO_SPEC; } else Spec = SPEC; d699 1 a699 1 P = , d710 1 a710 1 P = ; d712 2 a713 2 P = , "array" { d718 1 a718 1 P = ',', d767 1 a767 1 P = ',''('':'')', d875 2 a876 2 P = "spec", '('')'; d889 1 a889 1 P = , d1070 3 a1072 3 P = '(*):', # sets a flag that tells the line reconstruction to preserve '(''):', # spaces if immediately followed by a ! or %COMMENT statement. ':' { d1135 4 a1138 1 Object = VAR; d1169 1 a1169 1 { d1171 23 a1193 1 Object = CODE; d1239 1 a1239 1 P = "switch"'('':'')' { a1240 1 Object = VAR; d1247 9 d1308 1 a1308 1 P = "mainep" { @ 1.14 log @*** empty log message *** @ text @d1106 1 a1106 1 P = d1134 1 a1134 1 { d1161 6 d1182 1 a1182 1 P = "switch"'('':'')' { d1188 3 a1190 1 // the scope block with a jump to the default if given, or an error message if not. d1243 1 a1243 1 P = "mainep" { @ 1.13 log @*** empty log message *** @ text @d17 7 d399 3 a401 1 // for (J = I; J < 1; J += -1) ... needs '!=' rather than '<' esp when step is known to be < 0. d405 1 a405 1 generate_c(compile(P(3), depth+1), depth+1); d1130 4 a1133 1 @ 1.12 log @*** empty log message *** @ text @a16 176 # TO DO: a line reconstruction that just works from the current location to or ';' # which can be called after if a guard tells us a '!' or "comment" follows. # One solution I only just now thought of... make line reconstruction an explicit # parse-time grammar call (C<...>) but instead of processing up to the end of line, # stop on a ':' as well. And then insert another call in the grammar after every # colon *except* the one at the RHS of a label, where an alternative line reconstruction # could be applied that preserves the spaces and case within comments for re-outputting # in C ... # The majority of phrases cause code execution only after the entire program # has been parsed and built up into a concrete syntax tree or an abstract # syntax tree. (We support both). These rules are introduced by P<...> # NOTE that the returned value from a child phrase is a tuple returned by P_mktuple(). # By default, these will be P_* phrases that exactly mirror the phrase structure, # however a programmer can construct any kind of tuple they like and return a # private phrase number which merely has to be outside the range used by the # parser generator. This enables the creation of 'proper' AST objects that # reflect the actual structure of the program being compiled. To avoid clashing # tag numbers, I recommend naming tags as AST_* with the index of first tag # set to NUM_GRAMMAR+1, as in: { typedef enum AST_CODE { AST_LVALUE = NUM_GRAMMAR+1, // VarDeclIDX - variable, name, map call AST_RVALUE, // VarDeclIDX - variable, fn call, etc AST_EXPRESSION, // EXPR - replacement for P after applying precedence AST_TYPE_INFORMATION, // TypeDeclIDX inferred type of expressions, constructed bottom-up by inference (eg int * real => real) AST_USER_PARENS, // EXPR -- for generating C source with included user brackets. AST_BINARY_ARITHMETIC_OPERATION, // OP, LEFT, RIGHT AST_UNARY_ARITHMETIC_OPERATION, // OP, ARG AST_ASSIGN, // OP, LEFT, RIGHT AST_PROC_CALL, // PROC, ARGLIST AST_LAST_OP // Do NOT add any after this item. } AST_CODE; // /home/gtoal/src/compilers101/new-parser/imps/tests/progs/imp80pass2.i //%const %byte %integer %array PRECEDENCE(0:20)=0,3,3,4,5,5,4,3,3,4,4,5,5,3,5,5, // 0(3),3,5 //%const %byte %integer %array OPVAL(0:20)=0,ADD,SUB,ANDL,IEXP,REXP,MULT,NONEQ, // ORL,INTDIV,REALDIV,RSHIFT,LSHIFT,ADD,IEXP,REXP,0(3),LNEG,NOTL typedef enum Oper { OP_NONE=0, OP_ADD, OP_SUB, OP_AND, OP_IEXP, OP_REXP, OP_MULT, OP_EOR, OP_OR, OP_INTDIV, OP_REALDIV, OP_RSHIFT, OP_LSHIFT, OP_CONCAT, OP_NEG, OP_NOT, OP_LPAREN, OP_RPAREN } Oper; typedef enum Assoc { Left, Right } Assoc; typedef struct Torpedo { Oper Op; int Prec; int Assoc; int Arity; char *C_left, *C_mid, *C_right; wchar_t *Sym; // Debugging info only } Torpedo; const Torpedo OpDetails [] = { { OP_NONE, 0, Left, 0, "", "*ERROR*", "", L"*ERROR*" }, // (I always prefer to keep 0 free to catch unassigned errors) { OP_ADD, 3, Left, 2, "", " + ", "", L"OP_ADD" }, // "+" { OP_SUB, 3, Left, 2, "", " - ", "", L"OP_SUB" }, // "-" { OP_AND, 4, Left, 2, "", " & ", "", L"OP_AND" }, // "&" { OP_IEXP, 5, Right, 2, "IEXP(", ", ", ")", L"OP_IEXP" }, // "****" or "\\". \\ is the Imp77 preferred operator. Right associative. { OP_REXP, 5, Right, 2, "REXP(", ", ", ")", L"OP_REXP" }, // "**" or "\" \ in Imp77 { OP_MULT, 4, Left, 2, "", " * ", "", L"OP_MULT" }, // "*" { OP_EOR, 3, Left, 2, "", " ^ ", "", L"OP_EOR" }, // "!!" { OP_OR, 3, Left, 2, "", " | ", "", L"OP_OR" }, // "!" or "|" (suprisingly '|' is also allowed for comments! )) { OP_INTDIV, 4, Left, 2, "", " / ", "", L"OP_INTDIV" }, // "//" { OP_REALDIV, 4, Left, 2, "", " / ", "", L"OP_REALDIV" }, // "/" { OP_RSHIFT, 5, Left, 2, "(unsigned)", " >> ", "", L"OP_RSHIFT" }, // ">>" // UNSIGNED needs a size - char/short/int/long/long long { OP_LSHIFT, 5, Left, 2, "", " << ", "", L"OP_LSHIFT" }, // "<<" { OP_CONCAT, 5, Left, 2, "", ".", "", L"OP_CONCAT" }, // "." // The only string operator. All others are invalid. So prio is irrelevant. { OP_NEG, 3, Left, 1, "", "-", "", L"OP_NEG" }, // "-" // Precedence of '-' is surprisingly low, and responsible for the -1>>1 inanity. { OP_NOT, 5, Left, 1, "", "~", "", L"OP_NOT" }, // "\" or "~" { OP_LPAREN, 0, Left, 0, "(", "", "", L"OP_LPAREN" }, // "(" // '(' and ')' do not survive as far as an ExPop() call. { OP_RPAREN, 0, Right, 0, ")", "", "", L"OP_RPAREN" }, // ")" // Note: Imp Modulus (really Absolute value) (originally !expr! and later |expr| ) is not supported in Imp80, and replaced by MOD() or IMOD(). // The C operator '%' (Modulo) is a perm call in Imp77: REM(a,b). }; typedef struct Op_or_Data { char type; // 'O' or 'D' (we'll distinguish between 'C' or 'V' later - constant or variable) int idx; // Torpedo or AST index depending on type above. } Op_or_Data; typedef struct ast_lvalue { int atom; // variable, map, name } ast_lvalue; typedef struct ast_rvalue { int atom; // variable, fn, proc, const etc - effectively an operand. but not an expr. } ast_rvalue; typedef struct ast_expression { int whatever; } ast_expression; typedef struct ast_type_information { int blah; } ast_type_information; typedef struct ast_user_parens { int Expr; } ast_user_parens; typedef struct ast_binary_arithmetic_operation { Oper binop; int left, right; } ast_binary_arithmetic_operation; typedef struct ast_unary_arithmetic_operation { Oper monop; int arg; } ast_unary_arithmetic_operation; typedef struct ast_assign { Oper assop; int LHS, RHS; } ast_assign; typedef struct ast_proc_call { int Proc, Args; } ast_proc_call; // Rough draft // Although the AST is a simple array of ints, it's cleaner if we overlay a struct over the ints // when accessing tuples, so that we don't have to remember offsets into the AST fields, which // from previous experience with the last imptoc was the cause of several coding errors. typedef struct AST_struct { AST_CODE ast_code; // reserved field int op; int alt; int phrases; int Hidden_fields[TUPLE_RESULT_FIELDS]; union { // (anonymous unions are a gcc extension) ast_lvalue lvalue; ast_rvalue rvalue; ast_expression expression; ast_type_information type_information; ast_user_parens user_parens; ast_binary_arithmetic_operation binary_arithmetic_operation; ast_unary_arithmetic_operation unary_arithmetic_operation; ast_assign assign; ast_proc_call proc_call; }; } AST_struct; } a29 89 B=0; B=1; B=2; # Parse-time code is introduced by C<...>: { //#endif int ColonFlag = 0; //#ifdef INITCODE } C = { // Noting that we have just seen a label at parse time allows us // to handle an immediately-following comment differently, i.e. // by avoiding the line reconstruction (if doing so on the fly) // and thus preserving the text of the comment for output by a // source-to-source translator. //ColonFlag = 1; // Note that occurs last in its list of alternatives so if it is // executed at all, the parse is going to be successful. // We should then reset ColonFlag at the point where it is tested (for the comments) return TRUE; }; # BIP(1019) COLON (for label) C = { // We almost certainly need to handle include files at parse time (with a stack of // input files, as I'm doing in takeon.c) return TRUE; }; # BIP(1038) INCLUDE (include file) C = { // listing control has to be done at parse time // (unless there is a *major* restructuring) return TRUE; }; # BIP(1017) LISTON (turn on listing) C = { return TRUE; }; # BIP(1018) LISTOFF (turn off listing) P = , ''; # multiple blank lines allowed, not just one. P = , ''; # Used when there is an optional line break, eg after '=' or ',' in array initialisations. P = , ';' { // fprintf(stdout, "\n"); // obviously this will need some work re {} blocks... }; # required statement separator # Note that the code associated with a grammar rule will be called in three # contexts: 1) when parsing the concrete syntax # 2) when converting the concrete syntax tree to an abstract syntax tree # 3) when walking the abstract syntax tree to perform the main function # of the program, whether actually compiling, or converting source to # source, or just re-outputting the source (possibly reindented). # to support all these options without having to invent a lot of new syntax, the # code part of each rule can restrict when it is executed by testing various # internally-predefined ifdef's. X_CST is the test for use in CST to AST conversion, # and X_AST is the test for the actual application. Currently the C code is *not* # being written to the *-ast.h file by regen.c so that's one less test to clutter up # this file with. # Scope rules: P = { // Set when testing if it *wasn't* a spec. DOWN_flag = 1; }; P = { // Act on the flag set earlier once we get to the body of the procedure. if (DOWN_flag == 1) push_scope_level(); DOWN_flag = 0; }; P = { // unconditionally set in a %begin. Could just put the 'push_scope_level()' in // the code for but for now I'll follow the structure of the // ERCC compilers and grammar. push_scope_level(); }; P = { // called on %end and %endofprogram pop_scope_level(); }; d155 8 a162 1 P = '=', '>=', '>', '#', '\\=', '<>', '<=', '<', '->', '==', '##', '\\=='; d164 1 a164 1 P = '=', '>=', '>', '#', '\\=', '<>', '<=', '<', '->', '==', '##', '\\=='; d172 6 d186 1 d221 5 a225 3 P = { // alt0 var alt1 const //PushConst(compile(P(1),depth+1)); // more complex than that but this will do for basic testing during development. d228 1 d266 8 a273 1 d275 18 a292 3 // I believe I need to ExPop() here, to get the tree representing expr off the stack. // See the equivalent code in ERCC imp80... meanwhile until I test that part, just do... int Expr = ExPop(); d294 3 d302 1 a302 1 // otherwise, compile or d309 7 a338 1 // generate_c(Expr, depth+1); // since these are AST_* phrases we shouldn't hit any undue recursion here... I HOPE! d340 4 d355 8 a362 1 ''; d365 8 a372 1 ''; d375 33 a407 1 "unless"; d409 12 a420 3 P = "while", "until", "for"'='','','; a563 3 P = , '('')', "not"; # %not is '!' in C d565 4 d570 28 a597 2 P = , ''; d601 12 a612 1 ''; d615 8 a622 1 ''; d625 8 a632 1 ''; d695 3 d717 8 a724 2 P = ',', ''; d729 1 a729 1 P = "until", d731 25 a755 1 if (alt == 0) if (debug_compiler) fprintf(stderr, "[until condition]"); d774 1 a774 1 , d784 4 a787 4 fprintf(stdout, "if (1"); // POSSIBLY "if (!())" - eventually will need full DeMorgan code here. Or #define unless(cond) if (!(cond)) compile(P(2),depth+1); compile(P(3),depth+1); fprintf(stdout, ") "); // recurse d792 1 a792 1 // recursively handle alts 1 and 2 a831 2 P = '='','',', ''; d864 1 a864 1 int Ass = compile(P(4),depth+1); d875 1 d877 8 a884 3 t[1] = compile(P(1),depth+1); t[2] = compile(P(2),depth+1); generate_c(P_mktuple(AST_PROC_CALL, 0, 2, t),depth+1); a885 2 int AUI = P(5); compile(AUI,depth+1); d892 1 a892 1 compile(P(2),depth+1); d894 3 a896 4 fprintf(stdout, "_"); compile(P(3),depth+1); // NO: needs to be a decimal constant with special handling for negative numbers. fprintf(stdout, ":"); } else { d910 13 a922 1 compile(P(3),depth+1); d939 1 a939 1 P = "signal" { d946 1 a946 1 compile(P(2),depth+1); a954 1 fprintf(stdout, "break"); d967 1 a967 1 , a972 1 fprintf(stdout, " ("); a973 2 compile(P(4),depth+1); fprintf(stdout, ") "); d985 1 a985 3 ; P = "and" { d987 5 d994 4 d1005 1 a1005 1 compile(P(1),depth+1); d1009 2 d1012 1 a1012 1 // recurse d1035 1 a1035 1 if (alt <= 1) { d1038 7 a1044 7 if (alt == 0) { fprintf(stdout, "%s_default:", Tag2Str(LabelTag)); } else { fprintf(stdout, "%s_", Tag2Str(LabelTag)); compile(P(4),depth+1); // NO: needs to be a decimal constant with special handling for negative numbers. fprintf(stdout, ":"); } d1048 2 a1049 1 fprintf(stdout, "%s: ", Tag2Str(LabelTag)); d1055 1 a1055 4 P = { fprintf(stdout, "if (1"); // POSSIBLY "if (!())" - eventually will need full DeMorgan code here. Or #define unless(cond) if (!(cond)) compile(P(2),depth+1); compile(P(3),depth+1); fprintf(stdout, ") "); a1056 1 // recurse d1059 12 a1070 1 P = "cycle", d1073 1 a1073 1 push_ctrl(REPEAT); d1075 1 a1075 1 fprintf(stdout, "for (;;) /* to do */"); d1077 2 a1078 1 fprintf(stdout, "while/until () /* to do */"); d1081 1 a1081 3 // if it turns out that 'repeat until' is only allowed after a plain 'cycle', then // we might add a REPEATUNTIL enum token to the stack. // recurse d1085 2 a1086 7 { int EXPECTED = pop_ctrl(); if (EXPECTED != REPEAT) { fprintf(stderr, "\n* %%REPEAT not expected here (expecting %s)\n", ctrl_debug[EXPECTED]); exit(1); } fprintf(stdout, "\n} /* possible until */\n"); } d1088 1 d1134 1 a1134 1 fprintf(stdout, "if (_imp_onevent(...)) {\n"); d1215 1 a1215 1 fprintf(stdout, "asm(...)"); d1233 1 a1233 1 fprintf(stdout, "; "); d1238 1 a1238 1 , @ 1.11 log @*** empty log message *** @ text @d78 1 d103 2 a104 2 { OP_IEXP, 5, Right, 2, "IEXP(", ", ", ")", L"OP_IEXP" }, // "****" or "\\". { OP_REXP, 5, Right, 2, "REXP(", ", ", ")", L"OP_REXP" }, // "**" or "\" d107 1 a107 1 { OP_OR, 3, Left, 2, "", " | ", "", L"OP_OR" }, // "!" or "|" d112 2 a113 1 { OP_NEG, 3, Left, 1, "", "-", "", L"OP_NEG" }, // "-" d115 2 a116 1 { OP_LPAREN, 0, Left, 0, "(", "", "", L"OP_LPAREN" }, // "(" d118 2 d251 3 a253 1 P = , ';'; # required statement separator d367 1 a367 1 t[1] = wstrtopool(L"^^"); d375 1 d380 1 d402 1 d440 1 a440 1 //fprintf(stderr, "Debug: VarIDX returned by looking up %s was %d\n", s, VarIDX); d449 1 d477 1 d504 1 d533 1 d548 1 a559 43 P = { { #ifdef NEVER // Although a straight walk of the tree would re-output the expression in a way that // looks very similar to the original, we must not do that because C's operator priorities // are quite different from Imp's. So we need to build up a proper AST tree for the expression // (probably by converting to reverse polish same as the ERCC compilers did) and the // re-outputting that tree with any brackets added that might be required to have C match // the original operator priorities. // Note that temporarily, variables are only the top-level tag, I haven't yet added // parameters, indexes or record fields. // Also to make this work I need to build VarDecl descriptors for consts as well. // Maybe that will have to be ConstDecl descriptors. // currently compiling also has the side-effect of printing the source back out, // so you can spot missing calls to compile by looking at the output to see what's // missing! Note that the P() macro has a hidden parameter 'Ph' which is the // phrase passed in to compile() int monop = t[1] = compile(P_P(Ph, 1), depth+1); int left = t[2] = compile(P_P(Ph, 2), depth+1); int restofexpr = t[3] = compile(P_P(Ph, 3), depth+1); PushMonOp(monop); while (restofexpr != -1) { //if (P_alt(restofexpr) == 0) { int sub_monop = P_P(restofexpr, 1); int sub_left = P_P(restofexpr, 2); restofexpr = P_P(restofexpr, 3); fprintf(stderr, "sub_monop: %08x\n", sub_monop); fprintf(stderr, " sub_left: %08x\n", sub_left); fprintf(stderr, "sub_rexpr: %08x\n", restofexpr); //} } return t[0] = P_mktuple(P_RESTOFEXPR, alt, 3, t); #endif } // let the normal code handle it: ... }; d576 1 a576 1 P = "while", d609 1 a609 1 case 4: Signedness = SIGNED; Precision = BYTE; Basetype = INTEGER; break; // byte (integer)? d654 9 d664 2 a665 5 P = ',', # due to an uncharacteristic example of laziness in the original imp80 grammar, the ''; # comma in a formal parameter list was actually optional, and surprisingly several # ERCC Imp programs actually took advantage of this, eg: # %externalintegerfnspec jim(%real a,b %integername c) # So I should probably introduce an = ',', ''; phrase. d681 1 a681 1 int EXPECTED = pop_ctrl_inner(262,"imp80.g"); d687 1 d705 1 a705 1 int EXPECTED = pop_ctrl_inner(280,"imp80.g"); d711 1 a830 37 P = , , ; P = "start", "then""start", "then" { if (alt < 2) { /* NOTE: quotes in comments like this are causing problems with takeon... This is what we are trying to do with the control stack. Currently its just a note of whats allowed - theres no associated data such as an AST block. {} begin END ifI=2start FINISHELSE END finishelseifJ=3start FINISHELSE END finishelsestart FINISH END finish END endofprogram {} */ push_ctrl(FINISHELSE); if (debug_compiler) fprintf(stderr, "[then start]"); } else { if (debug_compiler) fprintf(stderr, "[then UI]"); } }; P = "and", ''; P = "else", '' { // if (alt == 0) if (debug_compiler) fprintf(stderr, "[IRRELEVANT ELSE]"); }; d836 1 a836 1 int EXPECTED = pop_ctrl_inner(400,"imp80.g"); d838 4 a841 1 if (alt == 0) if (debug_compiler) fprintf(stderr, "[else]"); d843 1 d848 1 a848 1 { d851 3 a853 1 if (debug_compiler) fprintf(stderr, "[else start]"); d856 5 a860 1 if (debug_compiler) fprintf(stderr, "[else if start]"); d862 2 a863 1 if (debug_compiler) fprintf(stderr, "[else ]"); d865 1 d868 25 a892 1 P = '_', d937 1 a937 1 P = { d962 8 d971 1 d975 1 d980 3 d984 1 d987 2 a988 1 P = "monitor" { d990 1 d994 1 d996 1 d1000 9 d1010 1 d1014 2 d1020 1 d1024 50 a1073 2 P = , , a1075 1 , d1088 2 d1093 7 d1103 1 d1105 2 a1108 2 P = ; d1110 3 d1114 1 d1118 1 a1118 1 "cycle" { d1121 6 d1129 1 d1134 1 a1134 1 int EXPECTED = pop_ctrl_inner(530,"imp80.g"); d1138 1 d1186 1 d1202 1 d1207 7 a1213 1 if (ctrl_depth() == 0) push_ctrl(ENDOFPROGRAM); else push_ctrl(END); d1219 4 d1232 1 a1232 1 int EXPECTED = pop_ctrl_inner(607,"imp80.g"); d1235 1 d1250 1 a1250 1 fprintf(stderr, "\n* Spurious %%FINISH\n"); exit(1); d1252 2 a1253 1 int EXPECTED = pop_ctrl_inner(626,"imp80.g"); d1261 1 d1267 1 d1280 1 d1284 4 a1288 1 d1290 1 a1290 1 , d1307 5 a1311 4 , , , , @ 1.10 log @*** empty log message *** @ text @d109 1 a109 1 { OP_RSHIFT, 5, Left, 2, "(unsigned)", " >> ", "", L"OP_RSHIFT" }, // ">>" d634 1 a634 1 case 1: Signedness = SIGNED; Precision = LONGWORD; Basetype = FLOAT; break; // real d744 1 a744 1 "not"; d750 2 a751 2 P = "and", "or", @ 1.9 log @*** empty log message *** @ text @d8 11 d22 8 d34 1 a34 1 # NOTE that the returned value from a child phrase is a tuple returned by mktuple(). d430 1 a430 1 fprintf(stderr, "Debug: VarIDX returned by looking up %s was %d\n", s, VarIDX); d453 1 a453 1 P = , a457 1 PushConst(P(1)); d460 8 d512 1 a512 1 // otherwise, compile: d519 1 a519 1 PushConst(P(1)); @ 1.8 log @*** empty log message *** @ text @d11 158 a172 1 a223 21 # The majority of phrases cause code execution only after the entire program # has been parsed and built up into a concrete syntax tree or an abstract # syntax tree. (We support both). These rules are introduced by P<...> # NOTE that the returned value from a child phrase is a tuple returned by mktuple(). # By default, these will be P_* phrases that exactly mirror the phrase structure, # however a programmer can construct any kind of tuple they like and return a # private phrase number which merely has to be outside the range used by the # parser generator. This enables the creation of 'proper' AST objects that # reflect the actual structure of the program being compiled. I recommend # using tags AST_* with the first tag starting at NUM_GRAMMAR+1, as in: { typedef enum AST_CODE { AST_VARIABLE = NUM_GRAMMAR+1, AST_TYPE_INFORMATION, AST_BINARY_ARITHMETIC_OPERATION, AST_UNARY_ARITHMETIC_OPERATION } AST_CODE; } d269 85 a353 2 P = '+', '-', '&', '****', '**', '*', '!!', '!', '//', '/', '>>', '<<', '.', '\\\\', '\\', '^^', '^'; d355 14 a368 1 P = '+', '-', '\\', '~', ''; # BIP(1028) phrase +' (unary operator): d370 20 a389 1 P = '==', '=', '<-', '->'; a396 20 void PrintTag(int Literal) { int i; for (i = atom(Literal).start; i < atom(Literal).end; i++) fprintf(stdout, "%lc", source(i).ch); } char *Tag2Str(int Literal) { int i; char Str[atom(Literal).end-atom(Literal).start+1]; for (i = atom(Literal).start; i < atom(Literal).end; i++) Str[i-atom(Literal).start] = (char)source(i).ch&255; // *UNICODE* not relevant to Imp identifiers. Yet. Str[atom(Literal).end-atom(Literal).start] = '\0'; return strdup(Str); // :-@@ } STRING Str2Pool(char *s) { int p = Stringpool_nextfree; for (;;) { char c = *s++; _Stringpool(Stringpool_nextfree++) = c; if (c == '\0') break; } return p; } d410 2 d413 1 a413 2 t[1] = VarIDX; return t[0] = P_mktuple(AST_VARIABLE, 1/*alt no*/, 1/*phrases*/, t); d438 3 a440 1 «[MBKXRH]»; # Multi Binary oKtal heX Realhex (K'7777' not O'7777' - earlier error) d452 1 a452 1 return -1; d457 1 d461 8 a468 1 P = , d470 18 a487 1 '('')'; d492 3 d496 1 a496 1 P = , , '('')'; d498 6 a503 1 P = ; d508 1 a508 1 P = { d510 12 d544 1 a544 4 fprintf(stderr, "\n"); fprintf(stderr, "monop: %08x\n", monop); fprintf(stderr, " left: %08x\n", left); fprintf(stderr, "rexpr: %08x\n", restofexpr); d546 1 a546 1 if (restofexpr != -1) { d550 1 a550 1 int sub_restofexpr = P_P(restofexpr, 3); d553 1 a553 1 fprintf(stderr, "sub_rexpr: %08x\n", sub_restofexpr); d557 2 d560 1 d564 1 a564 8 '' { if (alt == 0) { int monop = t[1] = compile(P_P(Ph, 1), depth+1); int left = t[2] = compile(P_P(Ph, 2), depth+1); int restofexpr = t[3] = compile(P_P(Ph, 3), depth+1); return t[0] = P_mktuple(P_RESTOFEXPR, alt, 3, t); } else return -1; }; d734 1 a734 3 P = , # ':' was here. Moved to . Maybe try putting back now everything else is fixed... '' { d797 3 d801 1 a801 1 P = ',', d932 22 a953 1 P = { d988 1 a988 1 P = , @ 1.7 log @Adding name tables @ text @d32 5 a36 6 # Currently most of these are holdovers from the ERCC compiler and # are unlikely to be used, at least in the same style. They'll # most likely all be removed, though parse-time rules with some # similarity to these may return in a different form later, when # I get round to implementing the compiler and realise why they # were here in the first place :-) a37 2 C = { return TRUE; }; # MARK (set marker for linkage) C = { return TRUE; }; d44 3 d104 2 a105 1 # being written to the *-ast.h file (regen.c) d110 1 d115 2 a116 3 if (DOWN_flag == 1) { push_scope_level(); } d121 3 d128 1 d166 1 a166 3 //$define _TypeDecl[x] WRITE(x, TypeDeclRA, TypeDecl) //$define TypeDecl[x] READ(x, TypeDeclRA, TypeDecl) a167 1 } d170 12 a181 11 // We want to return an AST record with the VarDecl contents // (which includes the TypeDecl information) char *s; VTag = P(2)&AST_idx_mask; // Use the scope tools to find it. int VarIDX = lookup("decl", s=Tag2Str(VTag)); if (VarIDX == -1) { fprintf(stderr, "* NAME NOT SET: %s\n", Tag2Str(VTag)); // exit(1); } free(s); // Sorry. Should use Stringpool. t[1] = VarIDX; a182 1 return t[0] = mktuple(AST_VARIABLE, 1/*alt no*/, 1/*phrases*/, t); d187 1 a187 1 Tag = P(2)&AST_idx_mask; d236 1 a236 1 P = ; d238 1 a238 1 P = , d241 42 a282 1 P = ; d285 8 a292 1 ''; d306 3 a308 3 P = "while", "until", "for"'='','','; d310 1 a310 1 P = "alias", d382 1 a382 1 P = '('')', d385 1 a385 1 P = ',', # due to an uncharacteristic example of laziness in the original imp80 grammar, the d501 2 a502 2 P = , "array" { d507 1 a507 1 P = ',', d558 1 a558 1 "then" { d601 1 a601 1 , d659 1 a659 1 P = { d720 1 a720 1 P = ; d722 1 a722 1 P = { a743 166 P = { Basetype = UNINIT_BASETYPE; Signedness = UNINIT_SIGNEDNESS; Precision = UNINIT_PRECISION; Proc = UNINIT_PROC; NameInfo = NO_NAME; // UNINIT_AN_N //IsFormat = NO_FORMAT; // UNINIT_ISFORMAT Spec = NO_SPEC; // UNINIT_SPEC IsArray = SCALAR; // UNINIT_ISARRAY // NOTE: A data declaration (e.g. %integer I) at the top level of a file of externals // or before the begin/endofprogram block is an error in Imp77 and I'm guessing in Imp80. Area = (ctrl_depth() == 0 ? EXTDATA : STACK); // UNINIT_AREA // However a %routine at the same level (as opposed to an %externalroutine) appears to be accepted. Linkage = UNINIT_LINKAGE; // Ditto? Check with ERCC compiler. Or ask Bob. }; P = { }; { static int quiet=0; } P = { quiet=1; }; P = { quiet=0; }; P = { // NOT ONLY OUTPUT THE DECLARATION IN C CODE, BUT ALSO ADD DECLARATION TO SCOPED NAME TABLES. if (quiet) return -1; fprintf(stdout, "\n /* "); PrintTag(Tag); fprintf(stdout, ": Object=%s ", Object_name[Object]); if (Object == CODE) { fprintf(stdout, "Proc=%s ", Proc_name[Proc]); fprintf(stdout, "Linkage=%s ", Linkage_name[Linkage]); } else if (Object == VAR) { fprintf(stdout, "Area=%s ", Area_name[Area]); } if (Object == VAR || (Object == CODE && Proc != ROUTINE) || Object == ARRAYFORMAT) { if (Basetype != RECORD && Precision != COMPOUND) fprintf(stdout, "Sign=%s ", Signedness_name[Signedness]); fprintf(stdout, "Prec=%s ", Precision_name[Precision]); fprintf(stdout, "Type=%s ", Basetype_name[Basetype]); fprintf(stdout, "NameInfo=%s ", NameInfo_name[NameInfo]); } if (Object == VAR) { if (Proc != UNINIT_PROC) fprintf(stdout, "Proc=%s ", Proc_name[Proc]); //fprintf(stdout, "format=%s ", IsFormat_name[IsFormat]); fprintf(stdout, "IsArray=%s ", IsArray_name[IsArray]); } fprintf(stdout, "Spec=%s ", Spec_name[Spec]); fprintf(stdout, "*/\n"); switch (Object) { case VAR: if (Area == OWN) fprintf(stdout, "static "); else if (Area == EXTDATA) fprintf(stdout, "extern "); else if (Area == CONSTANT) fprintf(stdout, "const "); if (Basetype == RECORD) { // If the type was given as a format name then let's use the typedef otherwise // we'll need to put the struct definition inline. And remember when defining // a record format, if it refers to itself as a %name field eg to link to a next // item, you need to predefine it as "typedef recfm recfm" before you actually // typedef the contents of recfm! fprintf(stdout, "/*pending name*/recfm "); } else if (Basetype == STRINGTYPE) { fprintf(stdout, "Imp_String "); } else { if (Basetype == INTEGER) { if (Signedness == UNSIGNED) fprintf(stdout, "unsigned "); if (Precision == BYTE) fprintf(stdout, "char "); else if (Precision == SHORT) fprintf(stdout, "short int "); else if (Precision == WORD) fprintf(stdout, "int "); else if (Precision == LONGWORD) fprintf(stdout, "long int "); else if (Precision == QUADWORD) fprintf(stdout, "long long int "); else if (Precision == ADDR) fprintf(stdout, "void *"); } else if (Basetype == FLOAT) { if (Precision == WORD) fprintf(stdout, "float "); else if (Precision == LONGWORD) fprintf(stdout, "double "); else if (Precision == QUADWORD) fprintf(stdout, "long double "); else if (Precision == ADDR) fprintf(stdout, "void *"); } } if (NameInfo == OBJECTNAME) fprintf(stdout, "*"); else if (NameInfo == ARRAYNAME) fprintf(stdout, "**"); PrintTag(Tag); if (IsArray == ARRAY) fprintf(stdout, "[]"); fprintf(stdout, ";\n"); break; case CODE: if (Linkage == EXTPROC) fprintf(stdout, "extern "); else if (ctrl_depth() == 0) { fprintf(stdout, "static "); } else { if (Spec == SPEC) fprintf(stdout, "auto "); // GCC requires "auto" for forward references to nested procedures, but not for the procedures themselves. } if (Proc == ROUTINE) { fprintf(stdout, "void "); } else if (Basetype == RECORD) { fprintf(stdout, "/*pending name*/recfm "); } else if (Basetype == STRINGTYPE) { fprintf(stdout, "Imp_String "); } else { if (Basetype == INTEGER) { if (Signedness == UNSIGNED) fprintf(stdout, "unsigned "); if (Precision == BYTE) fprintf(stdout, "char "); else if (Precision == SHORT) fprintf(stdout, "short int "); else if (Precision == WORD) fprintf(stdout, "int "); else if (Precision == LONGWORD) fprintf(stdout, "long int "); else if (Precision == QUADWORD) fprintf(stdout, "long long int "); else if (Precision == ADDR) fprintf(stdout, "void *"); } else if (Basetype == FLOAT) { if (Precision == WORD) fprintf(stdout, "float "); else if (Precision == LONGWORD) fprintf(stdout, "double "); else if (Precision == QUADWORD) fprintf(stdout, "long double "); else if (Precision == ADDR) fprintf(stdout, "void *"); } } if (NameInfo == OBJECTNAME) fprintf(stdout, "*"); if (Proc == MAP) fprintf(stdout, "*"); PrintTag(Tag); fprintf(stdout, "(...)"); if (Spec == SPEC) fprintf(stdout, ";"); else fprintf(stdout, " {};"); break; case RECORDFORMAT: break; case ARRAYFORMAT: break; case SWITCHDEFN: break; case LABELDEFN: break; case UNINIT_OBJECT: break; } _TypeDecl[TypeDecl_nextfree].Object = Object ; _TypeDecl[TypeDecl_nextfree].Basetype = Basetype ; _TypeDecl[TypeDecl_nextfree].Signedness = Signedness ; _TypeDecl[TypeDecl_nextfree].Precision = Precision ; _TypeDecl[TypeDecl_nextfree].Proc = Proc ; _TypeDecl[TypeDecl_nextfree].NameInfo = NameInfo ; _TypeDecl[TypeDecl_nextfree].Spec = Spec ; _TypeDecl[TypeDecl_nextfree].IsArray = IsArray ; _TypeDecl[TypeDecl_nextfree].Area = Area ; _TypeDecl[TypeDecl_nextfree].Linkage = Linkage ; // TO DO: _TypeDecl[TypeDecl_nextfree].arrfm = -1 ; _TypeDecl[TypeDecl_nextfree].parms = -1 ; _TypeDecl[TypeDecl_nextfree].recfm = -1 ; _TypeDecl[TypeDecl_nextfree].strfm = -1 ; TypeDecl_nextfree += 1; _VarDecl[VarDecl_nextfree].varname = _VarDecl[VarDecl_nextfree].aliasname = _VarDecl[VarDecl_nextfree].c_name = _VarDecl[VarDecl_nextfree].type = TypeDecl_nextfree-1; _VarDecl[VarDecl_nextfree].aform = VarDecl_nextfree += 1; add_entry("decl", Tag2Str(Tag), VarDecl_nextfree-1); }; d745 1 a745 1 P = { d775 1 a775 1 { @ 1.6 log @*** empty log message *** @ text @d81 4 a84 3 AST_BINARY_ARITHMETIC_OPERATION = NUM_GRAMMAR+1, AST_UNARY_ARITHMETIC_OPERATION, AST_IMP_MODULUS d140 1 a140 1 int Tag; d143 22 a164 2 for (i = atom(Literal).start; i < atom(Literal).end; i++) fprintf(stdout, "%lc", source(i).ch); } d166 18 a183 1 P = «[A-Z][A-Z0-9]*» ; # BIP(1001) pname(TRUE) # USED ONLY TO ACCESS A VARIABLE d186 1 a186 1 }; # BIP(1001) pname(TRUE) # USED ONLY TO DECLARE A VARIABLE d720 1 a720 1 P = { d818 1 a818 1 if (Spec == SPEC) fprintf(stdout, ";"); else fprintf(stdout, " {};"); a820 1 break; d832 26 @ 1.5 log @*** empty log message *** @ text @d71 16 d138 7 d146 3 a148 1 P = «[A-Z][A-Z0-9]*» ; # BIP(1001) pname(TRUE) # USED ONLY TO DECLARE A VARIABLE d212 1 a212 1 P = ',' , d297 1 a297 1 P = ',', # due to an uncharacteristic example of laziness in the original imp80 grammar, the d411 1 a411 1 P = ; d414 1 a414 1 "array" { d437 1 a437 2 P = '=' , d542 1 a542 3 P = ',', # added ''; d546 1 d550 3 d556 1 d684 55 a738 8 if (debug_declarations) { fprintf(stderr, "\n[ "); fprintf(stderr, "Object=%s ", Object_name[Object]); if (Object == CODE) { fprintf(stderr, "Proc=%s ", Proc_name[Proc]); fprintf(stderr, "Linkage=%s ", Linkage_name[Linkage]); } else if (Object == VAR) { fprintf(stderr, "Area=%s ", Area_name[Area]); d740 13 a752 5 if (Object == VAR || (Object == CODE && Proc != ROUTINE) || Object == ARRAYFORMAT) { if (Basetype != RECORD) fprintf(stderr, "Sign=%s ", Signedness_name[Signedness]); fprintf(stderr, "Prec=%s ", Precision_name[Precision]); fprintf(stderr, "Type=%s ", Basetype_name[Basetype]); fprintf(stderr, "NameInfo=%s ", NameInfo_name[NameInfo]); d754 21 a774 4 if (Object == VAR) { if (Proc != UNINIT_PROC) fprintf(stderr, "Proc=%s ", Proc_name[Proc]); //fprintf(stderr, "format=%s ", IsFormat_name[IsFormat]); fprintf(stderr, "IsArray=%s ", IsArray_name[IsArray]); d776 18 a793 2 fprintf(stderr, "Spec=%s ", Spec_name[Spec]); fprintf(stderr, "]\n"); d796 1 @ 1.4 log @*** empty log message *** @ text @d14 2 a49 7 # Used to keep track of nested control statements. C = { return TRUE; }; # BIP(1034) NOTESTART (note %START) C = { return TRUE; }; # BIP(1039) DUMMYSTART: give same 'ar' as %ELSE %START C = { return TRUE; }; # BIP(1035) NOTEFINISH (note %FINISH) C = { return TRUE; }; # BIP(1029) NOTECYCLE (note start of %CYCLE) C = { return TRUE; }; # BIP(1036) NOTEREPEAT (note %REPEAT) d200 1 a200 1 P = ',', # added to enable implicit continuation in record formats a205 8 { typedef enum BASETYPE { UNINIT_BASETYPE=0, INTEGER, FLOAT, STRINGTYPE, RECORD } BASETYPE; BASETYPE Basetype; } a211 13 { typedef enum SIGNEDNESS { UNINIT_SIGNEDNESS=0, SIGNED, UNSIGNED } SIGNEDNESS; SIGNEDNESS Signedness; typedef enum PRECISON { UNINIT_PRECISION=0, COMPOUND, BYTE, SHORT, WORD, LONGWORD, QUADWORD } PRECISON; PRECISON Precision; } d225 1 a225 1 case 3: Signedness = SIGNED; Precision = WORD; break; // long (integer | real) a234 8 { typedef enum PROC { UNINIT_PROC=0, NONE, ROUTINE, FN, MAP } PROC; PROC Proc; } d250 3 a252 3 P = , , "name" { a256 7 { typedef enum AN_N { UNINIT_AN_N=0, OBJECT, ARRAYNAME, OBJECTNAME } AN_N; AN_N NameInfo; } d264 1 a264 1 case 2: NameInfo = OBJECT; break; d324 1 a324 8 { typedef enum ISFORMAT { UNINIT_ISFORMAT=0, NOT_FORMAT, IS_FORMAT } ISFORMAT; ISFORMAT IsFormat; } P = "format", d326 3 a328 2 IsFormat = (alt == 0 ? IS_FORMAT : NOT_FORMAT ); if (debug_declarations > 1) fprintf(stderr, "[format=%d]", IsFormat); a354 9 { typedef enum SPECQ { UNINIT_SPEC=0, ACTUAL, SPEC } SPECQ; SPECQ Spec; } d360 1 a360 1 Spec = ACTUAL; d375 2 a376 2 P = , "array" { d386 1 a386 10 P = ; { typedef enum ISARRAY { UNINIT_ISARRAY=0, SINGLE, ARRAY } ISARRAY; ISARRAY IsArray; } d389 2 a390 2 "array" { if (alt == 1) IsArray = ARRAY; // Must be set to SINGLE somewhere for default. a396 7 { typedef enum AREA { UNINIT_AREA=0, OWN, EXTDATA, CONSTANT // extrinsic is an implied %spec } AREA; AREA Area; } d444 2 a445 2 P = "start", "then""start", d488 1 a488 1 P = "start", a507 6 { typedef enum LINKAGE { UNINIT_LINKAGE=0, EXTPROC, SYSTEM, DYNAMIC, PRIM, PERM, AUTO } LINKAGE; LINKAGE Linkage; } d521 2 a522 2 P = "spec", '('')'; d524 1 a524 1 P = , d530 1 a530 1 P = , d592 3 a594 3 P = '(*):', # sets a flag that tells the line reconstruction to preserve '(''):', # spaces if immediately followed by a ! or %COMMENT statement. ':' { d596 7 d611 2 a612 2 P = "cycle", "cycle" { d619 1 a619 1 P = "repeat" { d634 8 a641 5 NameInfo = UNINIT_AN_N; IsFormat = UNINIT_ISFORMAT; Spec = UNINIT_SPEC; IsArray = SINGLE; // UNINIT_ISARRAY; Area = UNINIT_AREA; // or (ctrl_depth() == 0 ? EXTPROC : AUTO) ??? d648 7 d656 1 d659 19 a677 12 fprintf(stderr, "Linkage=%d ", Linkage); fprintf(stderr, "Area=%d ", Area); fprintf(stderr, "Sign=%d ", Signedness); fprintf(stderr, "Prec=%d ", Precision); fprintf(stderr, "Type=%d ", Basetype); fprintf(stderr, "NameInfo=%d ", NameInfo); fprintf(stderr, "Proc=%d ", Proc); fprintf(stderr, "format=%d ", IsFormat); fprintf(stderr, "IsArray=%d ", IsArray); fprintf(stderr, "Spec=%d ", Spec); //fprintf(stderr, "label "); //fprintf(stderr, "switch "); d684 1 d712 1 a712 1 { d714 1 d719 1 d722 1 a722 1 P = "on""start" { d729 2 d746 1 a746 1 P = "switch"'('':'')' { d748 1 d755 1 a755 1 P = "else" { d772 1 a772 1 P = "finish" { d797 1 a797 1 P = "mainep" { d830 1 @ 1.3 log @Declarations well underway, control structures working - looking very promising @ text @d127 2 a128 1 P = «[A-Z][A-Z0-9]*» ; # BIP(1001) pname(TRUE) d166 1 a166 1 P = , d174 1 a174 1 P = , , '('')'; d200 1 a200 1 "for"'='','','; d205 1 a205 1 P = ',', # added to enable implicit continuation in record formats d213 1 a213 1 INTEGER=0, FLOAT, STRINGTYPE, RECORD d227 1 a227 1 IRRELEVANT=0, SIGNED, UNSIGNED d232 1 a232 1 COMPOUND=0, BYTE, SHORT, WORD, LONGWORD, QUADWORD d249 1 a249 1 case 1: Signedness = SIGNED; Precision = LONGWORD; Basetype = INTEGER; break; // real d258 1 a258 1 if (debug_declarations) fprintf(stderr, "[ Sign=%d Prec=%d Type=%d ]", Signedness, Precision, Basetype); d264 1 a264 1 NONE=0, ROUTINE, FN, MAP d273 1 a273 2 } else { Proc = NONE; d281 1 d286 2 a287 1 "name"; d293 1 a293 1 OBJECT = 0, ARRAYNAME, OBJECTNAME d307 1 a307 1 if (debug_declarations) fprintf(stderr, "[NameInfo=%d]", NameInfo); d367 1 a367 1 NOT_FORMAT = 0, IS_FORMAT d375 1 a375 1 if (debug_declarations) fprintf(stderr, "[format=%d]", IsFormat); d405 1 a405 1 ACTUAL = 0, SPEC d418 1 d425 1 d447 1 a447 1 SINGLE = 0, ARRAY d456 1 a456 1 if (debug_declarations) fprintf(stderr, "[IsArray=%d]", IsArray); d464 1 a464 1 OWN=0, EXTDATA, CONSTANT // extrinsic is an implied %spec d481 1 a481 1 if (debug_declarations) fprintf(stderr, "[Area=%d]", Area ); d574 1 a574 1 P = '_', d582 1 a582 1 EXTPROC=0, SYSTEM, DYNAMIC, PRIM, PERM, AUTO d588 3 a590 2 Linkage = alt; if (debug_declarations) fprintf(stderr, "[Proc Linkage=%d]", Linkage); d593 1 a593 1 P = '='','',', d622 1 a622 1 P = { d625 1 a625 1 P = '->' { d701 32 a732 10 Basetype = INTEGER; Signedness = IRRELEVANT; Precision = WORD; Proc = NONE; NameInfo = OBJECT; IsFormat = NOT_FORMAT; Spec = ACTUAL; IsArray = SINGLE; Area = EXTDATA; Linkage = (ctrl_depth() == 0 ? EXTPROC : AUTO); d764 1 a764 1 { d795 1 a795 1 if (debug_declarations) fprintf(stderr, "[switch declaration]"); @ 1.2 log @adding scope handling etc @ text @d1 1 a1 1 # (v.P<.t.>.l20\t.>.i. .20l0r20(v. .e)0 (v/=/r(v/ /e)0i/ /)?pm,m)0 m-0 ( (v/P/\m)0 m (v/ / (v/ /e)0i/ /22 m )0 , m )0 d3 14 a16 1 ## TO DO: fix double-quotes within parsed comments a17 1 # derived from imp80-rde-2022/genps/ps86.dat d19 3 a21 2 # Get the bip code from ~/src/compilers101/imp80-rde-2022/compiler/pass1.c # One or two examples already inserted below. d23 3 a25 3 { #include "symtab.h" } a26 5 C = { #ifdef IN_PARSER // regen etc use this too but don't need a lot of what uparse.c needs. // perform any initialisation required by the parse-time semantic routines. // Note that for now, we have no way of declaring data outside of // those procedures. Obviously this will have to change. d28 1 a28 16 // LINE RECONSTRUCTION *might* GO HERE. But probably not. // initially default scope is at the level appropriate for perms //debug_scope = 1; //push_scope_level(); // Declare perms, prims here //add_entry("decls", "NEWLINE", 42); // param can be anything. Usually index into an array of records of the appropriate type. During initial code creation, we'll just use random tags. // initial top-level file-level scope, eg for %externalroutines //push_scope_level(); // ready for externals #endif return TRUE; }; d30 6 a35 9 C = { #ifdef IN_PARSER // perform any final tidy-up required by the parse-time semantic routines. //pop_scope_level(); // from externals back to perms //pop_scope_level(); // from perms to none. #endif return TRUE; }; d37 8 a44 6 C = { #ifdef IN_PARSER while (source(TP).ch==' ' || source(TP).ch=='\t' || source(TP).ch=='\f') { TP += 1; } #endif d46 1 a46 1 }; d48 6 a53 3 C = { #ifdef IN_PARSER int debug_stropping = 0; d55 5 a59 69 // The source file has already been read trivially into source(). // We will copy from source() into temp(), then perform line reconstruction // on temp(), writing back to source(). The parser will then parse source() // into atoms according to the grammar. Initially it will only store the // reconstructed characters into the atoms, but once it is working, I will // modify it to also store the unreconsructed source for use in source-to-source // translations, where whitespace, embedded comments, and indentation is // desired in the translation, in order to mirror the original file. // Because unfortunately underlining in Unicode is done by a *following* // underline joiner character (818) rather than being a single unicode // code point, it is difficult to use a single-character encoding of a // stropped keyword letter - what the old Imp compilers would represent // by adding 128 to the character. However there *is* an alternive // source of upper case and lower case letters in the mathematics area! // A:Z could be encoded as 1D400:1D419 and a:z as 1D41A:1D433 :-) // but for now I'm encoding keywords in lower case and variables in // upper case. // The 1D400+ encoding looks more or less like ordinary text if it happens // to be displayed (e.g. during debugging) although there should never be // any need to display internally-coded keywords to users of the // compilers built with this parser. // All arrays are flex and the upper bound is a limit, not a minimum. DECLARE(SYM, reconstructed, 128000000/*600000*/); #define _SYM(x) WRITE(x,SYM,reconstructed) #define SYM(x) READ(x,SYM,reconstructed) int LASTP, P = 0; while (source(P).ch != 0 /* WEOF */) { _SYM(P).ch = source(P).ch; _SYM(P).start = P; _SYM(P).end = P+1; P += 1; } _SYM(P).ch = 0 /* WEOF */; _SYM(P).start = P; _SYM(P).end = P; // no chars for EOF LASTP = P; if (debug_stropping) { int I; fprintf(stderr, "source() moved to SYM(0:%d) = \"", LASTP); for (I = 0; I < LASTP; I++) { fprintf(stderr, "%lc", SYM(I).ch); } if (SYM(LASTP).ch != 0) fprintf(stderr, "[%d]", SYM(LASTP).ch); fprintf(stderr, "\";\n"); }; int FP = 0, PP = 0; // Fetch Pointer, Put Pointer. #define DONE() \ do { \ FP -= 1; /* the terminating 0*/ \ _source(PP).ch = 0; \ _source(PP).end = SYM(FP).end; \ if (debug_stropping) { \ int I; \ fprintf(stderr, "SYM(0:%d) moved back to source(0:%d) = \"", FP, PP); \ for (I = 0; I < PP; I++) { \ fprintf(stderr, "%lc", source(I).ch); \ } \ if (source(PP).ch != 0) fprintf(stderr, "[%d]", source(PP).ch); \ fprintf(stderr, "\";\n"); \ } \ return TRUE; \ } while (0) d61 5 a65 1 wint_t WC; d67 1 a67 173 // NOTE THAT WITH THIS IMP77 GRAMMAR, '\n' IS NOT WHITESPACE. LINE ENDINGS ARE EXPLICITLY // ENTERED IN THE GRAMMAR. (See the phrases , and . // uparse.c has been modified so that its implicit whitespace skipping no longer skips '\n'. // (The algol60 parser in contrast treats all \n's the same as spaces) // HOW TO HANDLE ' IN A PARSED COMMENT? // // %COMMENT A ' MESSES UP! // // because it keeps scanning until a closing quote. However if you don't scan between quotes, // line reconstruction will lose spaces within strings! // // You can't just end a quoted string at a newline because embedded newlines are allowed. // And I checked Imp77 - it allows a single quote ch in a comment. // If line reconstruction were being done on the fly then it could be modified if we knew we were // in a comment, but since we're doing it all in advance, the only option to handle this appears // to be that whenever we're in a comment, we throw away all the following line reconstruction and // re-do it, with that comment handled differently. // Or bite the bullet and work out how to do line reconstruction on the fly (which my previous // imptoc did eventually manage using the 'demandload' call. So *every* fetch via TP would have // to be recoded as a procedure call, with on-the-fly line reconstruction, and either a way to // undo it if backtracking or simply never doing it any farther past TP and undoing it on backtracking. // What a can of worms just to handle badly designed comments. TO DO. #define CHECK_EOF(x) do if ((x) == 0) DONE(); else { _source(PP).end = SYM(FP-1).end; } while (0) // PP is the 'current' slot we are writing into. _source(PP).start = SYM(FP).start; for (;;) { _source(PP).end = SYM(FP).end; // Keep updated. WC = SYM(FP++).ch; CHECK_EOF(WC); if (WC == '%') { // We found a keyword. It will always be read up to the last character of the keyword. for (;;) { WC = SYM(FP++).ch; CHECK_EOF(WC); if (WC == '%') { } else if (!isalpha(WC)) { // It's possible to have a bunch of '%' signs and *no* keyword characters. --FP; // point FP back to the non-keyword character, not as currently, the one past that. break; } else { // isalpha(WC) if (isupper(WC)) WC = tolower(WC); _source(PP).end = SYM(FP-1).end; // | 128 _source(PP++).ch = WC; // | 128 _source(PP).start = SYM(FP).start; // | 128 } } continue; } else if (WC == '{') { for (;;) { WC = SYM(FP++).ch; CHECK_EOF(WC); if (WC == '\n') { --FP; /* re-read the \n as a significant character */ // _source(PP).end = SYM(FP-1).end; // point FP back to the newline break; } if (WC == '}') { // Not sure if \n should be gobbled for {this style break; // but still looking. } } continue; } else if (WC == '\'') { _source(PP++).ch = WC; for (;;) { WC = SYM(FP++).ch; CHECK_EOF(WC); if (WC == '\'') { // peek ahead: int Peek = SYM(FP).ch; CHECK_EOF(Peek); if (Peek == '\'') { // doubled 's _source(PP++).ch = WC; _source(PP++).ch = Peek; FP++; } else { _source(PP).ch = WC; _source(PP).end = SYM(FP-1).end; // Leave Peek for later. PP++; break; } } else { _source(PP++).ch = WC; } } continue; } else if (WC == '"') { // TO DO: Update ' and " items in imp77 as well _source(PP++).ch = WC; for (;;) { WC = SYM(FP++).ch; CHECK_EOF(WC); if (WC == '"') { // peek ahead: int Peek = SYM(FP).ch; CHECK_EOF(Peek); if (Peek == '"') { // doubled "s _source(PP++).ch = WC; _source(PP++).ch = Peek; FP++; } else { _source(PP).ch = WC; _source(PP).end = SYM(FP-1).end; // Leave Peek for later. PP++; break; } } else { _source(PP++).ch = WC; } } continue; } else if (WC == ' ' || WC == '\t' || WC == '\f') { // use iswblank(WC) instead? continue; } else { // everything else just returns one significant non-space character. This includes '\n'. if ((WC == '\n') && (source(PP-1).ch == 'c')) { // BEWARE WHEN CHANGING STROPPING ENCODING: Looking for a preceding '%C' ... if (PP>0) _source(PP-1).ch = ' '; // remove the '%c' _source(PP++).ch = ' '; // remove the newline // This is the only place where we gobble spaces *after* a token rather than before. // It may be cleaner to set a 'continuation' flag and gobble them before the next // symbol fetch rather than do it here in a lookahead. Esp. wrt to reconstituting source // from the array for the listing file etc etc. // BUT FOR NOW, %C IS HANDLED BY THS HACK: int Lookahead = FP; while (SYM(Lookahead).ch == '\n' || SYM(Lookahead).ch == ' ' || SYM(Lookahead).ch == '\t' || SYM(Lookahead).ch == '\f') { // Use iswblank()? // No worries about '{...}' - this behaviour seems to be identical to Imp77's _SYM(Lookahead).ch = ' '; // gobble following newlines and whitespace before next significant character. Lookahead++; } continue; } if (iswalpha(WC) && iswlower(WC)) { WC = towupper(WC); // ALSO TEMPORARY } _source(PP++).ch = WC; continue; } // Still skipping whitespace ... } DONE(); P = 0; while (source(P).ch != 0) { if (debug_stropping) fprintf(stderr, "%d: ch='%lc' start=%d:end=%d\n", P, source(P).ch, source(P).start, source(P).end); P++; } #undef DONE #endif d69 1 a69 1 }; a70 1 # regular rules. d72 3 a74 1 # P IS THE MAIN ENTRY POINT!: d76 2 a77 9 B=0; B=1; B=2; B=3; P = , ''; P = , ''; C = { return TRUE; }; # BIP(1014) DUMMYAPP d79 1 a79 5 # C and C are called at the wrong time, so had to be # converted to P and P. However there's no mechanism # in the grammar file for tacc-style code embedding and because # of the new features that have been added, working from imp80-comp.c # as the master isn't going to work either, I suspect. d81 6 a86 2 # Perhaps the thing to do might be to ensure that all the other stuff # in the .g file gets copied to imp80-comp.h as well? d88 5 a92 2 # But more likely it's time to bite the bullet and add AST actions # in takeon... d94 1 a94 3 # Note maybe I want to allow both C and P so that # one gets called during parse time and the other at runtime? # Hmmm. Maybe not. d96 2 a97 2 C = { // BIP(1015) DOWN (new textual level) return TRUE; d100 5 a104 2 C = { // BIP(1016) UP (previous textual level) return TRUE; d107 1 a107 3 P = { #ifdef X_APP debug_scope = 1; a108 1 #endif d111 1 a111 2 P = { #ifdef X_APP a112 1 #endif a114 153 C<_UOP> = { return TRUE; }; # BIP(1028) phrase +' (unary operator): C = { return TRUE; }; # MARK (set marker for linkage) C<_ALIASTEXT> = { return TRUE; }; # BIP(1013) TEXTTEXT (alias text) # BIP(1004) CHXTYPE (check extended type) # first letter is (B,H,I,L,R,S) # third letter (A,L,N,R,T,C) C = { return TRUE; }; # BIP(1001) NAME C = { return TRUE; }; # BIP(1030) phrase ,' (parameter list): # ',', none # Aha! This is why a comma was missing # between two parameters in one program. C<_COMP1> = { return TRUE; }; # BIP(1032) COMP1 (comparison operator) C<_COMP2> = { return TRUE; }; # BIP(1037) COMP2 (is 2nd half of dsided) C<_ASSOP> = { return TRUE; }; # BIP(1033) ASSOP (assignment operators): ==,=,<-,-> C = { return TRUE; }; C = { return TRUE; }; C = { return TRUE; }; C<_READLINEP> = { return TRUE; }; # BIP(1012) READLINE? (skip to non-empty line) C = { return TRUE; }; # BIP(1019) COLON (for label) C<_S> = { return TRUE; }; # S (separator) C = { return TRUE; }; C = { return TRUE; }; # BIP(1022) SETNEM (set mnemonic) C = { return TRUE; }; # BIP(1002) xconst(TRUE, &strpos) C = { return TRUE; }; # BIP(1031) UCWRONG; machine code error C = { return TRUE; }; C = { return TRUE; }; C = { return TRUE; }; # BIP(1023) PRIMFORM; primary format mnemonic C = { return TRUE; }; # BIP(1024) SECNFORM; secondary format mnemonic C = { return TRUE; }; # BIP(1025) TERTFORM; tertiary format mnemonic C = { return TRUE; }; # (machine code jump format?) C = { return TRUE; }; # (machine code floating point register?) C = { return TRUE; }; # integer # BIP(1009) N255 (number 0-255 only) # BIP(1005) phrase N (16 bit decimal number) C<_N> = { return TRUE; #ifdef NEVER /* phrase N (16 bit decimal number) */ i = cc[q]; /* obtain current character */ if(!(isdigit(i))) goto fail; s = 0; /* initialise for accumulation */ while(isdigit(i)) { s = 10*s + (i - '0'); q++; i = cc[q]; } toar2(r, (INT) s); r += 2; goto succ; #endif }; C = { return TRUE; }; # BIP(1026) phrase OP - various operators: C<_OP> = { return TRUE; #ifdef NEVER /* phrase OP - various operators: */ /* +,-,&,****,**,*,!!,!, */ /* //,/,>>,<<,.,\\,\,^^,^ */ i = cc[q]; /* obtain current character */ if((i <= 32) || (i >= 127) || ((0x80000000 >> ((i-32) & 0x1f) & 0x4237000a) == 0)) goto fail; q++; if(i == '+') { a[r] = 1; goto upr; } if(i == '-') { a[r] = 2; goto upr; } if(i == '&') { a[r] = 3; goto upr; } j = cc[q]; if(i == '*') { if(j != '*') { a[r] = 6; /* op * */ goto upr; } if((cc[q+1] == '*') && (cc[q+2] == '*')) { a[r] = 4; /* op **** */ q += 3; goto upr; } a[r] = 5; /* op ** */ q++; goto upr; } if(i == '/') { if(j != '/') { a[r] = 10; /* op / */ goto upr; } a[r] = 9; /* op // */ q++; goto upr; } if(i == '!') { if(j != '!') { a[r] = 8; /* op ! */ goto upr; } a[r] = 7; /* op !! */ q++; goto upr; } if(i == '.') { a[r] = CONCOP; /* op . */ goto upr; } if((i == j) && (j == '<')) { a[r] = 12; /* op << */ q++; goto upr; } if((i == j) && (j == '>')) { a[r] = 11; /* op >> */ q++; goto upr; } if((i == '\\') || (i == '^')) { if(j != i) { a[r] = 15; /* op \ or op ^ */ goto upr; } q++; a[r] = 14; /* op \\ or op ^^ */ goto upr; } goto fail; #endif }; C = { return TRUE; }; # BIP(1027) CHUI - check that statement # starts with valid character: # letter,'-',%C,%E,%M,%R,%S C<_TEXT> = { return TRUE; }; # BIP(1007) TEXT (comment text) (! or |) C = { return TRUE; }; # BIP(1034) NOTESTART (note %START) C = { return TRUE; }; # BIP(1035) NOTEFINISH (note %FINISH) C = { return TRUE; }; # BIP(1029) NOTECYCLE (note start of %CYCLE) C = { return TRUE; }; # BIP(1036) NOTEREPEAT (note %REPEAT) C = { return TRUE; }; # BIP(1038) INCLUDE (include file) C = { return TRUE; }; # BIP(1021) TRACE (for %ON conditions) C = { return TRUE; }; # BIP(1017) LISTON (turn on listing) C = { return TRUE; }; # BIP(1018) LISTOFF (turn off listing) C = { return TRUE; }; # BIP(1039) DUMMYSTART: give same 'ar' as %ELSE %START P = , ';'; # statements separator d117 1 a117 1 '//', '/', '>>', '<<', '.', '\\\\', '\\', '^^', '^'; d127 1 d129 4 a132 5 # and as regexps rather than BIPs... (demo, incomplete) ##P = «[A-Za-z][A-Za-z0-9]*»; # BIP(1001) pname(TRUE) P = «[A-Z][A-Z0-9]*» { fprintf(stderr, " [Name] "); }; # BIP(1001) pname(TRUE) a136 1 # TO DO: CONST is all types of const, not just plain numbers (see for 16 bit numbers for example) d140 1 a140 1 # BIP(1003) xconst(FALSE, &strpos) d145 2 a146 2 «[MBKXR]»; # Multi Binary oKtal heX Realhex (K'7777' not O'7777' - earlier error) a147 1 ##P = «\".*\"»; # BIP(1013) TEXTTEXT (alias text) [I'll handle "" later.] d150 1 a150 1 P = «.» ; a154 3 ##P = ; P = '!' , '|' , "comment" ; P = «.» , ''; # comment text d156 3 d160 6 a165 1 P = , d167 5 a171 6 '('')' { #ifdef X_APP // Execute application code here. fprintf(stdout, " ****** "); #endif }; d173 1 a173 3 P = , , '('')'; d204 1 a204 1 P = ',', # added to enable implicit continuation in record formats d210 8 d219 17 a235 1 "integer"; d239 1 a239 1 "longlong", d245 22 a266 1 "record"'('')'; d269 7 a275 1 ; d279 3 a281 1 "function"; d283 3 a285 3 P = , , "name"; d287 1 a287 1 P = "name", d289 7 d297 1 a297 1 P = "array""name", d299 8 a306 1 ''; d311 5 a315 2 P = ',', ''; d318 1 d321 48 a368 1 ; # of Rt/Fn/Map or begin block d371 4 a374 1 ''; d396 3 a398 1 ''; a399 2 P = "spec", ; d401 18 d420 3 a422 1 ''; d427 4 a430 2 P = , "array"; d432 3 a434 1 P = ; d438 1 d441 13 a453 2 P = , "array" ; d455 1 a455 1 P = ',', d458 7 d469 10 a478 1 "const"; d500 1 a500 1 P = ',''('':'')', d504 3 a506 1 ''; d514 23 a536 1 "then"; d542 13 a554 1 ''; d558 11 a568 1 ; d570 1 a570 1 P = '_', d573 1 a573 3 P = '('':'')'; P = '=', d576 11 a586 4 P = "system", "external", "dynamic", ''; d603 4 a606 2 P = , "array"; d611 4 d617 34 a650 11 P = , '->', "return", "result", "monitor", "stop", "signal", "exit", "continue"; P = ; d652 9 a660 1 ##P=''; d665 46 a710 13 P = '(*):', '(''):', ':', , , , "finish", "cycle", "repeat", "cycle", , "end", "record""format", d712 2 a713 1 # pushes a scope level, but we want to d717 1 a717 1 # nested scope level. So should only set d721 128 a848 12 , , "include", "begin", "on""start", "switch"'('':'')', "list", "else", '*', "trustedprogram", "mainep", "control", @ 1.1 log @Initial revision @ text @d10 4 d15 2 a16 1 // peform any initialisation required by the parse-time semantic routines. d20 14 a33 2 // LINE RECONSTRUCTION GOES HERE! d38 1 d40 4 d87 1 a87 1 DECLARE(SYM, reconstructed, 600000); d317 2 a318 2 P = , ''; P = , ''; d322 36 a357 2 C = { return TRUE; }; # BIP(1015) DOWN (new textual level) C = { return TRUE; }; # BIP(1016) UP (previous textual level) d511 1 a511 1 P = , ';'; # statements separator d513 1 a513 1 P = '+', '-', '&', '****', '**', '*', '!!', '!', d516 1 a516 1 P = '+', '-', '\\', '~', ''; # BIP(1028) phrase +' (unary operator): d518 1 a518 1 P = '==', '=', '<-', '->'; d520 1 a520 1 P = '=', '>=', '>', '#', '\\=', '<>', '<=', '<', '->', '==', '##', '\\=='; d522 1 a522 1 P = '=', '>=', '>', '#', '\\=', '<>', '<=', '<', '->', '==', '##', '\\=='; d527 3 a529 1 P = «[A-Z][A-Z0-9]*»; # BIP(1001) pname(TRUE) d531 1 a531 1 P = «'([^\']|'')*'»; # , , ; d533 1 a533 1 P = «''''», «''», «'.'»; # , , ; d536 2 a537 2 P = '@@', ''; P = '.', ''; d539 13 a551 5 P = , , , 'E', # EBCDIC «[MBKXR]»; # Multi Binary oKtal heX Realhex (K'7777' not O'7777' - earlier error) d553 1 a553 5 P = «[0-9][0-9]*»; P = «\".*\"»; # BIP(1013) TEXTTEXT (alias text) [I'll handle "" later.] P = ';'; ##P = «.» ; d555 2 a556 2 P = '!' , '|' , "comment" ; P = «.» , ''; # comment text d559 8 a566 3 P = , , '('')'; d568 3 a570 3 P = , , '('')'; d572 1 a572 1 P = ; d574 2 a575 2 P = , '*'; d577 1 a577 1 P = ; d579 2 a580 2 P = , ''; d582 2 a583 2 P = , ''; d585 2 a586 2 P = '('')', ''; d588 2 a589 2 P = ',' , ''; d591 2 a592 2 P = "if", "unless"; d594 3 a596 3 P = "while", "until", "for"'='','','; d598 2 a599 2 P = "alias", ''; d601 2 a602 2 P = ',', ''; d604 2 a605 2 P = "integer", ''; d607 2 a608 2 P = "real", "integer"; d610 9 a618 9 P = "integer", "real", "longlong", "long", "byte", "string", "half", "short", "record"'('')'; d620 2 a621 2 P = "routine", ; d623 3 a625 3 P = "fn", "map", "function"; d627 3 a629 3 P = , , "name"; d631 2 a632 2 P = "name", ''; d634 3 a636 3 P = "array""name", "name", ''; d638 2 a639 2 P = '('')', ''; d641 2 a642 2 P = ',', ''; d644 4 a647 4 P = "ofprogram", "offile", "oflist", ; # of Rt/Fn/Map d649 2 a650 2 P = "format", ''; d652 3 a654 3 P = , '('')', "not"; d657 2 a658 2 P = , ''; d660 3 a662 3 P = "and", "or", ''; d664 2 a665 2 P = "and", ''; d667 2 a668 2 P = "or", ''; d670 3 a672 3 P = , # ':' was here. Moved to . Maybe try putting back now everything else is fixed... ''; d674 2 a675 2 P = "spec", ; d677 2 a678 2 P = "spec", ''; d680 2 a681 2 P = ','':', ''; d683 2 a684 2 P = , "array"; d686 1 a686 1 P = ; d688 2 a689 2 P = ',', ''; d692 2 a693 2 P = , "array" ; d695 2 a696 2 P = ',', ''; d698 5 a702 5 P = "own", "external", "extrinsic", "constant", "const"; d704 3 a706 3 P = '=' , ''; d708 2 a709 2 P = ',', ''; d711 2 a712 2 P = '('')', ''; d715 2 a716 2 P = ',', ''; d718 2 a719 2 P = "event", ''; d721 2 a722 2 P = ',', ''; d724 2 a725 2 P = ',''('':'')', ''; d727 2 a728 2 P = "until", ''; d730 3 a732 3 P = , , ; d734 3 a736 3 P = "start", "then""start", "then"; d738 2 a739 2 P = "and", ''; d741 2 a742 2 P = "else", ''; d744 3 a746 3 P = "start", , ; d748 2 a749 2 P = '_', ''; d751 1 a751 1 P = '('':'')'; d753 2 a754 2 P = '=', ''; d756 4 a759 4 P = "system", "external", "dynamic", ''; d761 2 a762 2 P = '='','',', ''; d764 2 a765 2 P = ',', ''; d767 2 a768 2 P = "spec", '('')'; d770 2 a771 2 P = , ; d773 2 a774 2 P = , '('')'; d776 2 a777 2 P = , "array"; d779 2 a780 2 P = "or", ''; d782 1 a782 1 P = «.» , ''; # TO DO: semicolons in an asm statement eg *LD_1,';' d784 11 a794 11 P = , '->', "return", "result", "monitor", "stop", "signal", "exit", "continue"; P = ; d798 2 a799 2 P = , ''; P = ; d801 35 a835 26 P = '(*):', '(''):', ':', , , , "finish", "cycle", "repeat", "cycle", , "end", "record""format", , , "include", "begin", "on""start", "switch"'('':'')', "list", "else", '*', "trustedprogram", "mainep", "control", ; @