
Help with unambiguous grammar

Name: long long 2012-01-13 22:59

I'm creating yet another fail C-family compiler.
Objectives: simple language, generics, type inference, and unambiguous grammar.
I would like to hear your opinions; any help is much appreciated.

bison parser:
%union {
    char *str;
}

%right '=' "+=" "-=" "*=" "/=" "%=" "~=" "&=" "|=" "^=" "<<=" ">>="
%left ':'
%right '?'
%left '|'
%left '^'
%left '&'
%left "or"
%left "and"
%left "==" "!="
%left '<' "<=" '>' ">="
%left "<<" ">>"
%left '+' '-'
%left '*' '/' '%'
%right "not" '~' PREFIX
%left "++" "--"
%left '['
%left '.'
%right '$'

%token <str> ID INT HEX BIN DOUBLE FLOAT COMPLEX STRING CHAR
%token-table

%%

program:
       | program import
       | program decl
       ;

import: "import" module
      | "import" module "as" identifier
      ;

module: identifier
      | module '.' identifier
      ;

decl: declvar
    | declfunc
    | declstruct
    ;

declvar: "var" identifier
       | "var" identifier '=' expr
       | "def" identifier '=' expr
       ;

declstruct: "def" identifier '{' declattrs '}'
          ;

declattrs: declattr
         | declattrs declattr
         ;

declattr: "var" identifier
        | declfunc
        ;

declfunc: "def" identifier '(' declargs ')' block
        ;

declargs: declarg
        | declargs ',' declarg
        ;

declarg: identifier
       | "var" identifier
       | identifier identifier
       ;

block: '{' '}'
     | '{' stmts '}'
     ;

stmts: stmt
     | block
     | stmts stmt
     | stmts block
     ;

stmt: declvar
    | purestmt
    | identifier ':' purestmt
    ;

purestmt: assign
        | call
        | if
        | for
        | while
        | switch
        | "goto" identifier
        | "return" expr
        ;

assign: lvalue '=' expr
      | lvalue "+=" expr
      | lvalue "-=" expr
      | lvalue "*=" expr
      | lvalue "/=" expr
      | lvalue "%=" expr
      | lvalue "~=" expr
      | lvalue "&=" expr
      | lvalue "|=" expr
      | lvalue "^=" expr
      | lvalue "<<=" expr
      | lvalue ">>=" expr
      | lvalue "++"
      | lvalue "--"
//      | "++" lvalue %prec PREFIX
//      | "--" lvalue %prec PREFIX
      ;

lvalue: identifier
      | lvalue '.' identifier
      | lvalue '[' expr ']'
      | '$' lvalue
      ;

identifier: ID
          ;

expr: literal
    | lvalue
    | assign
    | call
    | '~' expr
    | '@' lvalue
    | '-' expr %prec PREFIX
    | expr '+' expr
    | expr '-' expr
    | expr '*' expr
    | expr '/' expr
    | expr '%' expr
    | expr '&' expr
    | expr '|' expr
    | expr '^' expr
    | "not" expr
    | expr "==" expr
    | expr "!=" expr
    | expr '<' expr
    | expr "<=" expr
    | expr '>' expr
    | expr ">=" expr
    | expr "<<" expr
    | expr ">>" expr
    | expr "and" expr
    | expr "or" expr
    | expr '?' expr ':' expr
    | "proc" '(' declargs ')' block
    | '(' expr ')'
    ;

literal: INT
       | HEX
       | BIN
       | FLOAT
       | DOUBLE
       | COMPLEX
       | STRING
       | CHAR
       ;

call: onecall
    | call '.' onecall
    ;

onecall: lvalue '(' args ')'
       ;

args:
    | expr
    | args ',' expr
    ;

if: "if" '(' expr ')' block
  | "if" '(' expr ')' block "else" block
  ;

for: "for" '(' forarg "in" foriter ')' loop
   | "for" '(' forinit ';' expr ';' forincr ')' loop
   ;

forarg: identifier
      | "var" identifier
      ;

forinit: forinitarg
       | forinit ',' forinitarg
       ;

forinitarg: assign
          | "var" identifier '=' expr
          ;

forincr: assign
       | forincr ',' assign
       ;

foriter: lvalue
       | call
       ;

loop: ';'
    | loopblock
    ;

loopblock: '{' '}'
         | '{' loopstmts '}'
         ;

loopstmts: loopstmt
         | loopstmts loopstmt
         ;

loopstmt: "break"
        | "continue"
        | loopblock
        | stmt
        ;

while: "while" '(' expr ')' loop
     | "do" loopblock "while" '(' expr ')'
     ;

switch: "switch" '(' expr ')' '{' cases '}'
      ;

cases: allcases "default" ':' stmts
     ;

allcases: "case" expr ':' casestmts
        | allcases "case" expr ':' casestmts
        ;

casestmts: casestmt
         | casestmts casestmt
         ;

casestmt: "break"
        | caseblock
        | stmt
        ;

caseblock: '{' '}'
         | '{' casestmts '}'
         ;

Name: Anonymous 2012-01-15 11:26

>>52
You can write a function and pass pointers or references to it:
def getm(obj) { return obj.m }
var s = {1,2,3}
getm(s)  // getm(s :     array[int] ) → int
getm(@s) // getm(s : ptr[array[int]]) → int


>>51
I was thinking of compiling to some sort of intermediate representation [just one step back from llvm-as] that could be executed portably in a VM [for easy distribution] and also compiled natively [for -march=native -mtune=native speed].
This representation will also help with distributing libraries that contain generic functions/structs, whose types are inferred at compile time to generate instances. Just include a simple “make” in the compiler: you pass source and output directories, and the source tree is searched recursively for source files. A file is recompiled if its timestamp, or the timestamp of any of its imports [checked recursively and cached], is newer than the matching object file in the output directory. Then we can remember the template instantiations that share the same type parameters across the source tree and compile each just once.
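
A rough sketch of that staleness check, in Python since that is what my make.py is written in. This is not the real script: the ".src" extension, the regex for "import a.b" lines, and all the function names are made up for illustration. Only the bar/b -> bar_b.o object naming follows what my tool actually prints. It also assumes imports are acyclic; the cycle guard only stops infinite recursion.

```python
import os
import re

# Assumptions for illustration: sources end in ".src" and "import a.b"
# refers to <src_root>/a/b.src. Neither is fixed by the language yet.
SRC_EXT = ".src"
IMPORT_RE = re.compile(r"^\s*import\s+([\w.]+)", re.MULTILINE)


def direct_imports(path, src_root):
    """Return the source files that `path` imports directly."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    deps = []
    for module in IMPORT_RE.findall(text):
        dep = os.path.join(src_root, *module.split(".")) + SRC_EXT
        if os.path.exists(dep):  # skip modules that live outside the tree
            deps.append(dep)
    return deps


def newest_mtime(path, src_root, cache, visiting=None):
    """Newest timestamp among `path` and everything it imports, recursively."""
    if path in cache:
        return cache[path]
    visiting = set() if visiting is None else visiting
    if path in visiting:  # guard against import cycles (assumed not to occur)
        return os.path.getmtime(path)
    visiting.add(path)
    newest = os.path.getmtime(path)
    for dep in direct_imports(path, src_root):
        newest = max(newest, newest_mtime(dep, src_root, cache, visiting))
    visiting.discard(path)
    cache[path] = newest
    return newest


def object_path(src, src_root, out_dir):
    """Map src/bar/b.src to <out_dir>/bar_b.o."""
    rel = os.path.relpath(src, src_root)
    stem = os.path.splitext(rel)[0].replace(os.sep, "_")
    return os.path.join(out_dir, stem + ".o")


def stale_sources(src_root, out_dir):
    """List every source whose object file is missing or out of date."""
    cache = {}
    stale = []
    for dirpath, _dirs, files in os.walk(src_root):
        for name in files:
            if not name.endswith(SRC_EXT):
                continue
            src = os.path.join(dirpath, name)
            obj = object_path(src, src_root, out_dir)
            if (not os.path.exists(obj)
                    or newest_mtime(src, src_root, cache) > os.path.getmtime(obj)):
                stale.append(src)
    return sorted(stale)
```

stale_sources("src", "build/gcc-debug") then gives the list of files to hand to the compiler; the cache dict is what keeps a shared import from being stat'ed and parsed once per importer.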

I did this for my C++ projects in 200 lines of Python [except the template caching part] and it runs very nicely; I can even compile all the sources in parallel until the compiler consumes all the memory.

$ cd ~/foo
$ make
./make.py -ssrc -obuild -j3 gcc debug
compiling src/a.cc     to build/gcc-debug/a.o
compiling src/b.cc     to build/gcc-debug/b.o
compiling src/bar/b.cc to build/gcc-debug/bar_b.o
(waiting sources compilation)
linking build/*.o      to build/gcc-debug/foo.exe


Without breaking a sweat. Compiler flags [platform-dependent CFLAGS/LDFLAGS] go in a separate make.ini file, under different sections ([default] [win32] [linux64]).
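
For the flags file, something along these lines would do. Again only a sketch: the section names are the ones I listed, but the "cflags"/"ldflags" option names, the example values, and load_flags itself are invented here, and it assumes [default] acts as the fallback for any option a platform section leaves out.

```python
import configparser
import shlex

# Example contents for make.ini; the option names and values are assumptions.
EXAMPLE_INI = """
[default]
cflags = -O2 -Wall
ldflags =

[win32]
ldflags = -lws2_32

[linux64]
cflags = -O2 -Wall -m64
ldflags = -lpthread
"""


def load_flags(ini_text, platform):
    """Return the flags for `platform`, falling back to [default]."""
    # default_section makes [default] supply any option a platform omits;
    # interpolation=None keeps a literal '%' in a flag from being expanded.
    parser = configparser.ConfigParser(default_section="default",
                                       interpolation=None)
    parser.read_string(ini_text)
    name = platform if parser.has_section(platform) else "default"
    section = parser[name]
    return {
        "cflags": shlex.split(section.get("cflags", "")),
        "ldflags": shlex.split(section.get("ldflags", "")),
    }


if __name__ == "__main__":
    print(load_flags(EXAMPLE_INI, "win32"))
```

With the example file above, win32 inherits cflags from [default] and only overrides ldflags, so the common flags are written once.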
