--- Graham Toal wrote: > If not, then it would make an interesting exercise for our > group to spend an hour looking at the code and > reverse-engineering a grammar from it. I gave it a start: factor = '(' boolexpr ')' | id [ '[' expr ']' ] | num | hexnum | '*' ptr | '&' adr [ '[' expr ']' ] boolexpr = boolterm { ('|' boolor) | ('~' boolxor) } boolxor = boolterm boolor = boolterm boolterm = notfactor { '&' notfactor } notfactor = ['!'] relation relation = expression [ '=' equal | '<' less | '>' greater | '!' nequal ] nequal = '=' notequal greater = '=' nextexpression | '>' expression | compareexpression less = '=' lessorequal | '>' notequal | '<' expression | compareexpression notequal = nextexpression lessorequal = nextexpression equal = nextexpression nextexpression = compareexpression compareexpression = expression expression = ['+'|'-'] | term {'+' add | '-' subtract} term = factor { '*' multiply | '/' divide } if = 'if' '(' boolexpr ')' block ['else' block ] 'endif' while = 'while' '(' boolexpr ')' block 'endwhile' for = 'for' '(' assign ';' boolexpr ';' assign ';' ')' block 'endfor' block = { if | while | for | ident | '*' assign | inline | define | ifdef | endifdef } programblock = 'begin' locals block "end" procedureblock = 'procedure' 'id' '(' formallist ')' 'begin' locals block 'end' topdecl = { define | unsigned | chardef | intdef | ifdef | endifdef | include } procedureblock programblock But I ran out of steam - and, I'm sure I got lots of it wrong. But see below. > In fact, it might be a fun idea to have a compiler-writing > competition... anyone up for a weekend session where we all try > to implement this language using our favourite tools and each > in his own special way :-) It would be a really instructive > exercise to have several versions of a compiler written in > completely different styles and languages. > The only competition rules of significance would be that it > must > * be elegant and well documented, as a teaching example > * be able to run the resulting program (whether through > direct execution, interpretation, emulation - doesn't > matter) > In fact, rather than making this a weekend compo, it might be a > neat idea to make it an ongoing rite of passage for all > compilers101 members :-) Once you think you know enough to > write a compiler, you do a 'tiny' implementation as your > graduation exercise and we enshrine it for all time on our web > site :-) I think this is a great idea. Just in case you don't get a Tiny grammar, here is something that might suffice: tiny = { decls } { funcdecls } stat { stat }. decls = constdecl | vardecl . constdecl = "const" ident '=' number|string { ',' ident '=' number|string } ';' . vardecl = type ident ['[' number|ident ']'] { ',' ident ['[' number|ident ']'] } ';' . type = "integer" | "char" . funcdecls = forward|function . forward = "forward" ("func"|"proc") ident '(' [formal] ')' ';'. function = ("func"|"proc") ident '(' [formal] ')' '{' { decls } { stat } '}' . formal = ["var"] type ident ["[]"] {',' ["var"] type ident ["[]"]} . actual = expr {',' expr} . stat = ident ( '=' expr ';' | '(' [actual] ')' ';' | '[' expr ']' '=' expr ';' ) | "if" '(' expr ')' stat [ "else" stat ] | "while" '(' expr ')' stat | "return" [ expr ] ';' | "write" '(' expr {',' expr } ')' ';' | '{' { stat } '}' | ';' . expr = andexpr { "or" andexpr } . andexpr = equalexpr { "and" equalexpr } . equalexpr = relexpr [ ('='|'!=') relexpr ] . relexpr = addexpr [ ('<'|'<='|'>'|'>=') addexpr ] . addexpr = term { ('+'|'-') term } . term = factor { ('*'|'/'|'mod') factor }. factor = ident [ '(' [actual] ')' | '[' expr ']' ] | number | ('+'|'-') factor | "not" factor | "true" | "false" | '(' expr ')' | string | "getchar" "()" . ident = letter {letter | digit | '_' }. number = digit {digit}. string = "'" { no_single_quote } "'" | '"' { no_double_quote } '"' . Comments are "//" until eol, and "/*" to "*/" Semantic info: - The ident specified in the array size must be a constant integer. - Arrays start at 0, and go to n-1. - Forward declarations must agree with the actual definition. - Only "func" can return a value, which must be an integer. - "var" means the parameter is passed by reference, and changes will be reflected in the caller. Non-var parameters cannot be changed, and are treated as constants. - Strings can only be assigned to char's and char arrays. - Arrays can be assigned, appropriate truncation can occur if they are not the same size. - getchar() returns the next character from stdin as an integer. -1 signals eof. - Single character strings and single character string constants, int's and char's are type compatible, and may be freely mixed in expressions. Note so with multi-char strings and multi-char string constants. - The only thing that can be done with multi-char strings or multi-char string constants are: assign them to character arrays and pass them to routines. If passed as a parameter, the parameter must be a (non-var) char array. - Yes, I think it should be able to compile itself - with some fancy footwork with the symbol table. --