ANTLR basics: wildcards and nongreedy subrules

Wildcards and nongreedy subrules

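(...)?, (...)* and (...)+ are greedy: they consume as much input as possible.
They can be used in both parser and lexer rules. Appending ? to the operator (??, *?, +?) makes the subrule nongreedy.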
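
The difference between the two modes can be sketched with java.util.regex, whose * and *? quantifiers are also greedy and nongreedy. This is only an analogy under that assumption, not ANTLR itself; the class name GreedyDemo and the helper firstMatch are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {
    // Returns the first substring of input matched by regex, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "<a><b>";
        // Greedy: .* runs to the last '>' it can reach.
        System.out.println(firstMatch("<.*>", input));   // <a><b>
        // Nongreedy: .*? stops at the first '>' that lets the pattern finish.
        System.out.println(firstMatch("<.*?>", input));  // <a>
    }
}
```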

Nongreedy Lexer Subrules

A lexer rule for C-style comments: it consumes every character until the first */ ends the comment.

COMMENT : '/*' .*? '*/' -> skip ; // .*? matches anything until the first */
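
To see why the nongreedy form matters here, the same idea can be tried as a Java regex. This is a rough analogy rather than the ANTLR lexer; CommentDemo and stripComments are hypothetical names, and DOTALL is set so that . also matches newlines, as the lexer wildcard does.

```java
import java.util.regex.Pattern;

public class CommentDemo {
    // Regex counterpart of '/*' .*? '*/'; DOTALL makes '.' match line breaks too.
    static final Pattern COMMENT = Pattern.compile("/\\*.*?\\*/", Pattern.DOTALL);

    // Removes every comment, similar in spirit to "-> skip".
    static String stripComments(String src) {
        return COMMENT.matcher(src).replaceAll("");
    }

    public static void main(String[] args) {
        // Each comment ends at its own first "*/", so "b" survives.
        System.out.println(stripComments("a /* one */ b /* two */ c"));
    }
}
```

With a greedy .* the single match would run from the first /* to the last */ and swallow the b in between.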

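Another example: strings with escaped quotes.
In the literal '\\"' the first backslash is an escape character, so the alternative matches a backslash followed by a double quote.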

grammar Nongreedy;
s : STRING+ ;
STRING : '"' ( '\\"' | . )*? '"' ; // match "foo", "\"", "x\"\"y", ...
WS : [ \r\t\n]+ -> skip ;
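
A minimal sketch of the STRING rule as a Java regex, assuming regex alternation order behaves closely enough to the lexer for these inputs (the escaped-quote alternative is tried before the wildcard). StringDemo and findAll are illustrative names only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StringDemo {
    // Regex form of '"' ( '\\"' | . )*? '"': try the escaped quote before any char.
    static final Pattern STRING = Pattern.compile("\"(?:\\\\\"|.)*?\"");

    // Collects every string literal found in the input, in order.
    static List<String> findAll(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = STRING.matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // The input text is: "x\"\"y" "foo"
        for (String s : findAll("\"x\\\"\\\"y\" \"foo\"")) {
            System.out.println(s);
        }
    }
}
```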

Nongreedy subrules complicate recognition and can make it harder to tell how the lexer will match text:

ACTION1 : '{' ( STRING | . )*? '}' ; // Allows {"foo}
ACTION2 : '[' ( STRING | ~'"' )*? ']' ; // Doesn't allow ["foo]; nongreedy *?
ACTION3 : '<' ( STRING | ~[">] )* '>' ; // Doesn't allow <"foo>; greedy *
STRING : '"' ( '\\"' | . )*? '"' ;

Nongreedy Parser Subrules

Nongreedy subrules and the wildcard in parser rules

grammar FuzzyJava;

/** Match anything in between constant rule matches */
file : .*? (constant .*?)+ ;

/** Faster alternate version (Gets an ANTLR tool warning about
 * a subrule like .* in parser that you can ignore.)
 */
altfile : (constant | .)* ; // match a constant or any token, 0-or-more times

/** Match things like "public static final SIZE" followed by anything */
constant
    :   'public' 'static' 'final' 'int' Identifier
        {System.out.println("constant: "+$Identifier.text);}
    ;

Identifier : [a-zA-Z_$] [a-zA-Z_$0-9]* ; // simplified
OTHER : . -> skip ;
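
The "match a constant or any token" idea behind altfile can be sketched by hand over a list of token strings. This is a simplified illustration, not the code ANTLR generates; FuzzyScan, constants and matchesPrefixAt are hypothetical names, and the input is assumed to be tokenized already.

```java
import java.util.ArrayList;
import java.util.List;

public class FuzzyScan {
    static final String[] PREFIX = {"public", "static", "final", "int"};

    // Returns the identifiers that follow "public static final int";
    // every other token is skipped, like the '.' alternative.
    static List<String> constants(List<String> tokens) {
        List<String> names = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            int idPos = i + PREFIX.length;
            if (matchesPrefixAt(tokens, i) && idPos < tokens.size()
                    && tokens.get(idPos).matches("[a-zA-Z_$][a-zA-Z_$0-9]*")) {
                names.add(tokens.get(idPos));
                i = idPos + 1;   // consume the whole constant
            } else {
                i++;             // consume one arbitrary token
            }
        }
        return names;
    }

    // True if the four keyword tokens appear starting at index i.
    static boolean matchesPrefixAt(List<String> tokens, int i) {
        if (i + PREFIX.length > tokens.size()) return false;
        for (int k = 0; k < PREFIX.length; k++) {
            if (!PREFIX[k].equals(tokens.get(i + k))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("class", "T", "public", "static", "final",
                "int", "SIZE", "x", "public", "static", "final", "int", "MAX");
        System.out.println(constants(tokens)); // [SIZE, MAX]
    }
}
```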

Nested classes

class A {
        String name = "parrt";
        class Nested {
            any filthy shite we want in here { }}}}}}
        }
}

class B {
        int x;   
        int getDubX() {
                return 2*x;
        }
}

grammar Island;
file : clazz* ;
clazz : 'class' ID '{' ignore '}' ;
ignore : (method|clazz|.)*? ; // <- only change is to add clazz alt here
method : type ID '()' block ;
type : 'int' | 'void' ;
block : '{' (block | .)*? '}' ;
ID : [a-zA-Z] [a-zA-Z0-9]* ;
WS : [ \r\t\n]+ -> skip ;
ANY : . ;

posted @ 2016-11-29 20:23  zhangshihai1232