This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 98142

Summary: LALR parser for easy Erlang support
Product: obsolete
Reporter: _ dcaoyuan <dcaoyuan>
Component: languages
Assignee: issues@obsolete <issues>
Status: NEW ---
Severity: blocker
Priority: P3
Version: 5.x
Hardware: All
OS: All
Issue Type: ENHANCEMENT
Exception Reporter:
Attachments: Erlang nbs file

Description _ dcaoyuan 2007-03-17 20:06:46 UTC
I'm trying to write Erlang (http://www.erlang.org) editor support based on this
language module. The grammar definition is copied, with light modifications,
from Erlang's own yacc definition. Parsing throws many "ParseException: cycle
detected!" errors such as:
----------------------------------------
org.netbeans.api.languages.ParseException: cycle detected! expr_400 [attribute#1, typed_attr_val, expr, expr_100, expr_150, expr_160, expr_200, expr_300, expr_400]
	at org.netbeans.modules.languages.parser.Petra.f2(Petra.java:401)
	at org.netbeans.modules.languages.parser.Petra.f2(Petra.java:413)
	at org.netbeans.modules.languages.parser.Petra.f2(Petra.java:413)
	at org.netbeans.modules.languages.parser.Petra.f2(Petra.java:413)
	at org.netbeans.modules.languages.parser.Petra.f2(Petra.java:413)
	at org.netbeans.modules.languages.parser.Petra.f2(Petra.java:413)
	at org.netbeans.modules.languages.parser.Petra.f2(Petra.java:413)
	at org.netbeans.modules.languages.parser.Petra.f2(Petra.java:413)
	at org.netbeans.modules.languages.parser.Petra.f2(Petra.java:413)
	at org.netbeans.modules.languages.parser.Petra.first2(Petra.java:332)
	at org.netbeans.modules.languages.parser.LLSyntaxAnalyser.read(LLSyntaxAnalyser.java:90)
	at org.netbeans.modules.languages.ParserManagerImpl.parse(ParserManagerImpl.java:260)
	at org.netbeans.modules.languages.ParserManagerImpl.parseAST(ParserManagerImpl.java:211)
	at org.netbeans.modules.languages.ParserManagerImpl.access$000(ParserManagerImpl.java:60)
	at org.netbeans.modules.languages.ParserManagerImpl$1.run(ParserManagerImpl.java:123)
	at org.openide.util.RequestProcessor$Task.run(RequestProcessor.java:541)
[catch] at org.openide.util.RequestProcessor$Processor.run(RequestProcessor.java:963)
----------------------------
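
The path printed in the exception suggests that the analyser follows the first symbol of each rule and stops when a nonterminal shows up a second time. A minimal Python sketch of that kind of check follows. It assumes (this is not confirmed from the Petra source) that only leftmost symbols are followed; find_left_recursion and GRAMMAR are hypothetical names, and GRAMMAR is only an excerpt of the expr rules from the Erlang.nbs in this report.

```python
def find_left_recursion(grammar, start):
    """Return (nonterminal, path) for the first leftmost cycle, or None.

    grammar maps a nonterminal to a list of alternatives, each a list of
    symbols. Only the first symbol of every alternative is followed; this
    is an assumption about the check, not a copy of the real analyser.
    """
    path = []

    def visit(nt):
        if nt in path:
            # nt is reachable from itself without consuming a token
            return nt, list(path)
        path.append(nt)
        for alt in grammar.get(nt, []):
            first = alt[0] if alt else None
            if first in grammar:  # symbols without rules count as terminals
                found = visit(first)
                if found:
                    return found
        path.pop()
        return None

    return visit(start)


# Excerpt of the reported grammar; rules below expr_500 are omitted,
# so expr_600 is treated as a terminal here.
GRAMMAR = {
    "expr": [["catch", "expr"], ["expr_100"]],
    "expr_100": [["expr_150", "=", "expr_100"], ["expr_150"]],
    "expr_150": [["expr_160", "orelse", "expr_150"], ["expr_160"]],
    "expr_160": [["expr_200", "andalso", "expr_160"], ["expr_200"]],
    "expr_200": [["expr_300", "comp_op", "expr_300"], ["expr_300"]],
    "expr_300": [["expr_400", "list_op", "expr_300"], ["expr_400"]],
    "expr_400": [["expr_400", "add_op", "expr_500"], ["expr_500"]],
    "expr_500": [["expr_500", "mult_op", "expr_600"], ["expr_600"]],
}

result = find_left_recursion(GRAMMAR, "expr")
print(result)
```

On this excerpt the sketch reports expr_400 with the path expr, expr_100, ..., expr_400, which matches the tail of the path in the exception message.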

I can modify the definition until the parser accepts it, but the grammar
definition should be legal as it stands (correct me if it contains an error),
so I hope it can be accepted without too many modifications.

Following is the Erlang.nbs I'm using:

-------------------------------------
########### tokens #############################################################
TOKEN:comment: ("/*" - "*/")
TOKEN:line_comment: ("%"[^"\n""\r"]*)

TOKEN:keyword: (
    "after" | "begin" | "case" | "try" | "catch" | "end" | "fun" | "if" | "of" |
    "receive" | "when" |
    "andalso" | "orelse" | "query" |
    "cond" |
    "bnot" | "not" |
    "div" | "rem" | "band" | "and" |
    "bor" | "bxor" | "bsl" | "bsr" | "or" | "xor" |
    "!"
)

TOKEN:operator: (
    "(" | ")" | "->" | ":-" | "{" | "}" | "[" | "]" | "|" | "||" | "<-" | ":" |
    "#" | "." |
    "*" | "/" |  
    "+" | "-" | 
    "++" | "--" | 
    "==" | "/=" | "=<" | "<" | ">=" | ">" | "=:=" | "=/=" | 
    "<<" | ">>" | 
    "=" | "::" |
    "?"
)

TOKEN:separator: ( 
    ";" | ","
)

TOKEN:whitespace: (
    [" " "\t" "\n" "\r"]+
)

### atomic terminals ###
#TOKEN:char: (
#    "\'" "\'"
#)
TOKEN:integer: (
    ["0"-"9"]+ 
)
TOKEN:float: (
    ["0"-"9"]+ "." ["0"-"9"]* (["e" "E"] ["+" "-"]? ["0"-"9"]+)? |
    "." ["0"-"9"]* (["e" "E"] ["+" "-"]? ["0"-"9"]+)? |
    ["0"-"9"]+ ["e" "E"] ["+" "-"]? ["0"-"9"]+ |
    ["0"-"9"]+ (["e" "E"] ["+" "-"]? ["0"-"9"]+)?
)
TOKEN:atom: (
    ["a"-"z"] ["a"-"z" "A"-"Z" "0"-"9" "_" "@"]* |
    "\'" ( 
        [^ "\\" "\'" "\n" "\r"] |
        ("\\" (
            . |
            (["0"-"7"] ["0"-"7"] ["0"-"7"]) |
            ("x" ["0"-"9" "a"-"f" "A"-"F"] ["0"-"9" "a"-"f" "A"-"F"]) |
            ("u" ["0"-"9" "a"-"f" "A"-"F"] ["0"-"9" "a"-"f" "A"-"F"] ["0"-"9"
"a"-"f" "A"-"F"] ["0"-"9" "a"-"f" "A"-"F"])
        ))
    )* 
    "\'"
    
)
TOKEN:string: (
    "\"" ( 
        [^ "\\" "\"" "\n" "\r"] |
        ("\\" (
            . |
            (["0"-"7"] ["0"-"7"] ["0"-"7"]) |
            ("x" ["0"-"9" "a"-"f" "A"-"F"] ["0"-"9" "a"-"f" "A"-"F"]) |
            ("u" ["0"-"9" "a"-"f" "A"-"F"] ["0"-"9" "a"-"f" "A"-"F"] ["0"-"9"
"a"-"f" "A"-"F"] ["0"-"9" "a"-"f" "A"-"F"])
        ))
    )* 
    "\""
)
TOKEN:var: (
    ["A"-"Z" "_"] ["a"-"z" "A"-"Z" "0"-"9" "_" "@"]*
)


########### grammar ############################################################

SKIP:whitespace
SKIP:comment
SKIP:line_comment

S = (form)+;

### grammar definition from erl_parse.yrl ###

form = attribute "."; # : '$1'.
form = function "."; # : '$1'.
form = rule "."; # : '$1'.

attribute = "-" <attribute_name> "(" attr_val ")"; # : build_attribute('$2', '$4').
attribute = "-" <attribute_name> "(" typed_attr_val ")"; # : build_t_attribute('$2', '$4').

attribute_name = <atom>;

typed_attr_val = expr "," typed_tuple; # : ['$1' , '$3'].

typed_tuple = "{" typed_exprs "}"; # : {tuple,line('$1'),'$2'}.

typed_exprs = typed_expr; # : ['$1'].
typed_exprs = typed_expr "," typed_exprs; # : ['$1'|'$3'].
typed_exprs = typed_expr "," exprs; # : ['$1'|add_type_info('$3')].
typed_exprs = expr "," typed_exprs; # : [add_type_info('$1')|'$3'].

typed_expr = expr "::" function_call; # : {typed,'$1','$3'}.

attr_val = exprs; # : '$1'.

function = function_clauses; # : build_function('$1').

function_clauses = function_clause; # : ['$1'].
function_clauses = function_clause ";" function_clauses; # : ['$1'|'$3'].

function_clause = function_name clause_args clause_guard clause_body; # : {clause,line('$1'),element(3, '$1'),'$2','$3','$4'}.

function_name = <atom>;

clause_args = argument_list; # : element(1, '$1').

clause_guard = "when" guard; # : '$2'.
clause_guard = "$empty"; # : [].

clause_body = "->" exprs; # : '$2'.

expr = "catch" expr; # : {'catch',line('$1'),'$2'}.
expr = expr_100; # : '$1'.

expr_100 = expr_150 "=" expr_100; # : {match,line('$2'),'$1','$3'}.
expr_100 = expr_150 "!" expr_100; # : mkop('$1', '$2', '$3').
expr_100 = expr_150; # : '$1'.

expr_150 = expr_160 "orelse" expr_150; # : mkop('$1', '$2', '$3').
expr_150 = expr_160; # : '$1'.

expr_160 = expr_200 "andalso" expr_160; # : mkop('$1', '$2', '$3').
expr_160 = expr_200; # : '$1'.

expr_200 = expr_300 comp_op expr_300; # : mkop('$1', '$2', '$3').
expr_200 = expr_300; # : '$1'.

expr_300 = expr_400 list_op expr_300; # : mkop('$1', '$2', '$3').
expr_300 = expr_400; # : '$1'.

expr_400 = expr_400 add_op expr_500; # : mkop('$1', '$2', '$3').
expr_400 = expr_500; # : '$1'.

expr_500 = expr_500 mult_op expr_600; # : mkop('$1', '$2', '$3').
expr_500 = expr_600; # : '$1'.

expr_600 = prefix_op expr_700; # : mkop('$1', '$2').
expr_600 = expr_700; # : '$1'.

expr_700 = function_call; # : '$1'.
expr_700 = record_expr; # : '$1'.
expr_700 = expr_800; # : '$1'.

expr_800 = expr_900 ":" expr_max; # : {remote,line('$2'),'$1','$3'}.
expr_800 = expr_900; # : '$1'.

expr_900 = "." <atom>; # : {record_field,line('$1'),{atom,line('$1'),''},'$2'}.
expr_900 = expr_900 "." <atom>; # : {record_field,line('$2'),'$1','$3'}.
expr_900 = expr_max; # : '$1'.

expr_max = <var>; # : '$1'.
expr_max = atomic; # : '$1'.
expr_max = list; # : '$1'.
expr_max = binary; # : '$1'.
expr_max = list_comprehension; # : '$1'.
expr_max = tuple; # : '$1'.
expr_max = "(" expr ")"; # : '$2'.
expr_max = "begin" exprs "end"; # : {block,line('$1'),'$2'}.
expr_max = if_expr; # : '$1'.
expr_max = case_expr; # : '$1'.
expr_max = receive_expr; # : '$1'.
expr_max = fun_expr; # : '$1'.
expr_max = try_expr; # : '$1'.
expr_max = query_expr; # : '$1'.

list = "[" "]"; # : {nil,line('$1')}.
list = "[" expr tail; # : {cons,line('$1'),'$2','$3'}.

tail = "]"; # : {nil,line('$1')}.
tail = "|" expr "]"; # : '$2'.
tail = "," expr tail; # : {cons,line('$2'),'$2','$3'}.

binary = "<<" ">>"; # : {bin,line('$1'),[]}.
binary = "<<" bin_elements ">>"; # : {bin,line('$1'),'$2'}.

bin_elements = bin_element; # : ['$1'].
bin_elements = bin_element "," bin_elements; # : ['$1'|'$3'].

bin_element = bit_expr opt_bit_size_expr opt_bit_type_list; # : {bin_element,line('$1'),'$1','$2','$3'}.

bit_expr = prefix_op expr_max; # : mkop('$1', '$2').
bit_expr = expr_max; # : '$1'.

opt_bit_size_expr = ":" bit_size_expr; # : '$2'.
opt_bit_size_expr = "$empty"; # : default.

opt_bit_type_list = "/" bit_type_list; # : '$2'.
opt_bit_type_list = "$empty"; # : default.

bit_type_list = bit_type "-" bit_type_list; # : ['$1' | '$3'].
bit_type_list = bit_type; # : ['$1'].

bit_type = <atom>; # : element(3,'$1').
bit_type = <atom> ":" <integer>; # : { element(3,'$1'), element(3,'$3') }.

bit_size_expr = expr_max; # : '$1'.

list_comprehension = "[" expr "||" lc_exprs "]"; # : {lc,line('$1'),'$2','$4'}.

lc_exprs = lc_expr; # : ['$1'].
lc_exprs = lc_expr "," lc_exprs; # : ['$1'|'$3'].

lc_expr = expr; # : '$1'.
lc_expr = expr "<-" expr; # : {generate,line('$2'),'$1','$3'}.

tuple = "{" "}"; # : {tuple,line('$1'),[]}.
tuple = "{" exprs "}"; # : {tuple,line('$1'),'$2'}.

# N.B. This is called from expr_700.
# N.B. Field names are returned as the complete object, even if they are
# always atoms for the moment, this might change in the future.
record_expr = "#" <atom> "." <atom>; # : {record_index,line('$1'),element(3, '$2'),'$4'}.
record_expr = "#" <atom> record_tuple; # : {record,line('$1'),element(3, '$2'),'$3'}.
record_expr = expr_max "#" <atom> "." <atom>; # : {record_field,line('$2'),'$1',element(3, '$3'),'$5'}.
record_expr = expr_max "#" <atom> record_tuple; # : {record,line('$2'),'$1',element(3, '$3'),'$4'}.

record_tuple = "{" "}"; # : [].
record_tuple = "{" record_fields "}"; # : '$2'.

record_fields = record_field; # : ['$1'].
record_fields = record_field "," record_fields; # : ['$1' | '$3'].

record_field = <var> "=" expr; # : {record_field,line('$1'),'$1','$3'}.
record_field = <atom> "=" expr; # : {record_field,line('$1'),'$1','$3'}.

# N.B. This is called from expr_700.
function_call = expr_800 argument_list; # : {call,line('$1'),'$1',element(1, '$2')}.

if_expr = "if" if_clauses "end"; # : {'if',line('$1'),'$2'}.

if_clauses = if_clause; # : ['$1'].
if_clauses = if_clause ";" if_clauses; # : ['$1' | '$3'].

if_clause = guard clause_body; # : {clause,line(hd(hd('$1'))),[],'$1','$2'}.

case_expr = "case" expr "of" cr_clauses "end"; # : {'case',line('$1'),'$2','$4'}.

cr_clauses = cr_clause; # : ['$1'].
cr_clauses = cr_clause ";" cr_clauses; # : ['$1' | '$3'].

cr_clause = expr clause_guard clause_body; # : {clause,line('$1'),['$1'],'$2','$3'}.

receive_expr = "receive" cr_clauses "end"; # : {'receive',line('$1'),'$2'}.
receive_expr = "receive" "after" expr clause_body "end"; # : {'receive',line('$1'),[],'$3','$4'}.
receive_expr = "receive" cr_clauses "after" expr clause_body "end"; # : {'receive',line('$1'),'$2','$4','$5'}.

fun_expr = "fun" <atom> "/" <integer>; # : {'fun',line('$1'),{function,element(3, '$2'),element(3, '$4')}}.
fun_expr = "fun" <atom> ":" <atom> "/" <integer>; # : {'fun',line('$1'),{function,element(3, '$2'),element(3, '$4'),element(3,'$6')}}.
fun_expr = "fun" fun_clauses "end"; # : build_fun(line('$1'), '$2').

fun_clauses = fun_clause; # : ['$1'].
fun_clauses = fun_clause ";" fun_clauses; # : ['$1' | '$3'].

fun_clause = argument_list clause_guard clause_body; # : {Args,Pos} = '$1', {clause,Pos,'fun',Args,'$2','$3'}.

try_expr = "try" exprs "of" cr_clauses try_catch; # : build_try(line('$1'),'$2','$4','$5').
try_expr = "try" exprs try_catch; # :build_try(line('$1'),'$2',[],'$3').

try_catch = "catch" try_clauses "end"; # : {'$2',[]}.
try_catch = "catch" try_clauses "after" exprs "end"; # : {'$2','$4'}.
try_catch = "after" exprs "end"; # : {[],'$2'}.

try_clauses = try_clause; # : ['$1'].
try_clauses = try_clause ";" try_clauses; # : ['$1' | '$3'].

try_clause = expr clause_guard clause_body; # : L = line('$1'), {clause,L,[{tuple,L,[{atom,L,throw},'$1',{var,L,'_'}]}],'$2','$3'}.
try_clause = <atom> ":" expr clause_guard clause_body; # : L = line('$1'), {clause,L,[{tuple,L,['$1','$3',{var,L,'_'}]}],'$4','$5'}.
try_clause = <var> ":" expr clause_guard clause_body; # : L = line('$1'), {clause,L,[{tuple,L,['$1','$3',{var,L,'_'}]}],'$4','$5'}.

query_expr = "query" list_comprehension "end"; # : {'query',line('$1'),'$2'}.

argument_list = "(" ")"; # : {[],line('$1')}.
argument_list = "(" exprs ")"; # : {'$2',line('$1')}.

exprs = expr; # : ['$1'].
exprs = expr "," exprs; # : ['$1' | '$3'].

guard = exprs; # : ['$1'].
guard = exprs ";" guard; # : ['$1'|'$3'].

#atomic = <char>; # : '$1'.
atomic = <integer>; # : '$1'.
atomic = <float>; # : '$1'.
atomic = <atom>; # : '$1'.
atomic = strings; # : '$1'.

strings = <string>; # : '$1'.
strings = <string> strings; # : {string,line('$1'),element(3, '$1') ++ element(3, '$2')}.

prefix_op = "+"; # : '$1'.
prefix_op = "-"; # : '$1'.
prefix_op = "bnot"; # : '$1'.
prefix_op = "not"; # : '$1'.

mult_op = "/"; # : '$1'.
mult_op = "*"; # : '$1'.
mult_op = "div"; # : '$1'.
mult_op = "rem"; # : '$1'.
mult_op = "band"; # : '$1'.
mult_op = "and"; # : '$1'.

add_op = "+"; # : '$1'.
add_op = "-"; # : '$1'.
add_op = "bor"; # : '$1'.
add_op = "bxor"; # : '$1'.
add_op = "bsl"; # : '$1'.
add_op = "bsr"; # : '$1'.
add_op = "or"; # : '$1'.
add_op = "xor"; # : '$1'.

list_op = "++"; # : '$1'.
list_op = "--"; # : '$1'.

comp_op = "=="; # : '$1'.
comp_op = "/="; # : '$1'.
comp_op = "=<"; # : '$1'.
comp_op = "<"; # : '$1'.
comp_op = ">="; # : '$1'.
comp_op = ">"; # : '$1'.
comp_op = "=:="; # : '$1'.
comp_op = "=/="; # : '$1'.

rule = rule_clauses; # : build_rule('$1').

rule_clauses = rule_clause; # : ['$1'].
rule_clauses = rule_clause ";" rule_clauses; # : ['$1'|'$3'].

rule_clause = rule_name clause_args clause_guard rule_body; # : {clause,line('$1'),element(3, '$1'),'$2','$3','$4'}.

rule_name = <atom>;

rule_body = ":-" lc_exprs; # : '$2'.
---------------------------------------------------------
Comment 1 _ dcaoyuan 2007-03-17 20:12:05 UTC
Created attachment 39611
Erlang nbs file
Comment 2 _ dcaoyuan 2007-03-18 08:53:17 UTC
It seems the parser does not support left-recursion.
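
For illustration, the textbook rewrite that removes direct left recursion (A = A x | y becomes A = y A_tail with A_tail = x A_tail | empty) can be sketched in Python. The function name and the _tail suffix are hypothetical, and this is not necessarily how Erlang.nbs was changed later.

```python
def eliminate_direct_left_recursion(name, alternatives):
    """Rewrite  A = A x | y  into  A = y A_tail;  A_tail = x A_tail | empty.

    alternatives is a list of symbol lists. The "_tail" helper nonterminal
    is a hypothetical naming choice made for this sketch.
    """
    # Alternatives that start with the rule's own name, minus that name
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == name]
    # Everything else stays as the non-recursive base cases
    others = [alt for alt in alternatives if not alt or alt[0] != name]
    if not recursive:
        return {name: alternatives}
    tail = name + "_tail"
    return {
        name: [alt + [tail] for alt in others],
        # The empty list stands for the empty alternative
        tail: [alt + [tail] for alt in recursive] + [[]],
    }


rewritten = eliminate_direct_left_recursion(
    "expr_400",
    [["expr_400", "add_op", "expr_500"], ["expr_500"]],
)
for rule, alts in rewritten.items():
    print(rule, "=", alts)
```

The rewritten rules accept the same token sequences, but the parse tree becomes right-nested, so left associativity of operators such as "-" has to be restored when the AST is built.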
Comment 3 _ dcaoyuan 2007-03-19 08:07:39 UTC
OK, I found the statement "Grammar have to fulfill some rules LL(k)." at:
http://wiki.netbeans.org/wiki/view/SchliemannPlannedFeatures

So I'll fix Erlang.nbs. Currently about 80% of the definitions are parsed correctly.
Comment 4 _ dcaoyuan 2007-03-19 19:23:10 UTC
I have changed this issue to a feature request.

To make it easy to support Erlang and many other languages, please also add an
LALR parser to the language module.
Comment 5 _ dcaoyuan 2007-03-21 18:46:15 UTC
About 90% of the Erlang grammar definition now passes. But because the current
version of GLF contains only a simple LL syntax analyzer, it does not seem
possible to parse constructs such as function declarations and function calls
in Erlang. Here is an example:

function declaration:

testfunc(A,B) ->
    A + B.

function call:

callfunc(A,B) ->
    testfunc(A,B).

I define them as:
function_declaration = declaration "->" exprs ".";
declaration = declaration_head [clause_guard];
declaration_head = declaration_name "(" [argument_list] ")";
declaration_name = <atom>;

function_call_expr = declaration_name "(" [argument_list] ")"; # which is exactly declaration_head here.

Can this be parsed under the current version of GLF?
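
The two definitions share the prefix name "(" [args] ")", so a predictive parser cannot choose between them from the first token. One common workaround is left factoring: parse the shared prefix once, then decide from the token that follows. Below is a minimal Python sketch over a flat token list. parse_head and classify are hypothetical helpers, arguments are limited to single tokens, and whether GLF accepts an equivalently factored grammar is not confirmed here.

```python
def parse_head(tokens, pos=0):
    """Parse  name "(" [args] ")"  and return (name, args, next_pos).

    Arguments are kept as flat tokens; nested expressions are out of
    scope for this sketch.
    """
    if pos + 1 >= len(tokens) or tokens[pos + 1] != "(":
        raise SyntaxError("expected a name followed by '('")
    name = tokens[pos]
    pos += 2
    args = []
    while pos < len(tokens) and tokens[pos] != ")":
        if tokens[pos] != ",":
            args.append(tokens[pos])
        pos += 1
    if pos >= len(tokens):
        raise SyntaxError("missing ')'")
    return name, args, pos + 1


def classify(tokens):
    """Parse the shared prefix once, then branch on the following token."""
    name, args, pos = parse_head(tokens)
    following = tokens[pos] if pos < len(tokens) else None
    # "->" or a "when" guard can only follow a declaration head
    kind = "declaration" if following in ("->", "when") else "call"
    return kind, name, args


DECL = ["testfunc", "(", "A", ",", "B", ")", "->", "A", "+", "B", "."]
CALL = ["testfunc", "(", "A", ",", "B", ")", "."]
print(classify(DECL))
print(classify(CALL))
```

The decision is postponed until after the closing parenthesis, which is what a grammar with a single shared head rule followed by two distinct continuations expresses.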

 
Comment 6 _ dcaoyuan 2007-03-22 22:19:27 UTC
OK, I now have an LL(k) definition of the Erlang grammar, which is only slightly
looser than the original yacc definition.

Here's a snapshot of it:
http://blogtrader.net/page/dcaoyuan?entry=erlang_editor_support_based_on

I'd like to contribute it to the NetBeans community.
Comment 7 Jiri Prox 2007-09-17 20:48:13 UTC
Obsolete milestone, please reevaluate.
Comment 8 Jan Jancura 2007-09-18 17:21:24 UTC
We do not have time to implement this for nb6.0.