performance - ANTLR 4 - How to reduce prediction time for IF statements with optional END-IF (COBOL grammar) -
i've been working on cobol grammar , found generated parser runs slow programs having nested if statements.
i think it's better @ sample code first, explain 1 of cobol's "annoying" feature - 1 dot (full stop) can close level of unclosed nested if statements. here is:
if condition-1 s1 if condition-2 s2 (more nested if conditions) if condition-n sn (0~n "end-if"s ) .
the text between parenthesis pseudo code. key point "end-if"s optional before dot closes so-called "sentence" in cobol. allow "end-if"s optional, i've written grammar way:
sentence: statement_list dot ; statement_list: statement+ ; statement: if_statement | ... //other statements ; if_statement: if condition then? statement_list (else statement_list)? end_if? ;
this grammar turn out slow style of code sample above. when debugging found parser busy predicting inside if_statement. expected there ambiguities in grammar. sample code, "if condition-2 ..." can recognized either child or brother of "if condition-1 ...", because there can omitted "end-if" right after s1. inner "if condition-n" can have n choices. similarly, every end-if before dot have alternatives on "if" pair if total number not n.
i tried cut off alternatives adding semantic predicates, below:
statement_list: statement+ {_input.la(1)==end_if || _input.la(1)==else || _input.la(1)==dot}? ; if_statement: if condition then? statement_list ( {_input.la(1)!=else}? | else statement_list) ( {_input.la(1)!=end-if}? | end_if) ;
this reduced parse time of large programs 2/3, not reduce linear time, deep prediction on if_statement , statement_list still observed. closer antlr 4 runtime seems show semantic predicates not used @ during prediction.
does know of trick make "optional end-if" unambiguous, or make predict fast?
actually there ambiguity statements before "else", before end-if.
note: paring sample 30 levels of nested else-if uses 3 seconds on machine. , i'm facing large programs have many occurrence of such deep nesting , uses more 10 minutes parse. 1 of them goes stack overflow when predicting.
another note: i'm using antlr 4.2.
Comments
Post a Comment