ANTLR Basics: Listeners and Parse Tree Matching

Listeners

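ANTLR parsers build a tree that records how the parser recognized the input sentence.

The interior nodes of the tree are phrase names that group and identify their children. The root is the most abstract phrase name, in this case stat.
The leaves of a parse tree are the input tokens. Parse trees sit between a language recognizer and an interpreter or translator.
They are very effective data structures because they contain all of the input together with complete knowledge of how the parser grouped the symbols into phrases.
They are easy to understand, and the parser generates them automatically.
Because we specify phrase structure with a set of rules, the roots of parse tree subtrees correspond to grammar rule names.
ANTLR has a ParseTreeWalker that knows how to walk these parse trees and trigger events in the listener objects you create.
ANTLR generates a listener interface and can also generate visitors. For example, from a Java.g4 grammar it generates: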

public interface JavaListener extends ParseTreeListener {
	void enterClassDeclaration(JavaParser.ClassDeclarationContext ctx);
	void exitClassDeclaration(JavaParser.ClassDeclarationContext ctx);
	void enterMethodDeclaration(JavaParser.MethodDeclarationContext ctx);
	...
}

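Every rule in the grammar file gets an enter method and an exit method.
ANTLR also generates a base listener class, JavaBaseListener, with empty implementations of all the methods.
Once you have created a listener, say MyListener, here is how to invoke the Java parser and walk the parse tree: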

JavaLexer lexer = new JavaLexer(input);
CommonTokenStream tokens = new CommonTokenStream(lexer);
JavaParser parser = new JavaParser(tokens);
JavaParser.CompilationUnitContext tree = parser.compilationUnit(); // parse a compilationUnit

MyListener extractor = new MyListener(parser);
ParseTreeWalker.DEFAULT.walk(extractor, tree); // initiate walk of tree with listener in use of default walker

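Listeners and visitors are useful because they keep application-specific code out of the grammar.
Listener methods are called independently by a walker object that ANTLR provides, whereas visitor methods must walk their children with explicit visit calls. That call is easy to forget, and a forgotten call means the subtree is never visited.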
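
To make the difference concrete, here is a minimal sketch in plain Java that does not use the ANTLR runtime. Node, Listener, walk and visit are hypothetical stand-ins for ANTLR's parse tree, generated listener, ParseTreeWalker and visitor; the real classes carry much more. It only illustrates that the walker drives the traversal for a listener, while a visit method that does not recurse never reaches the children.

```java
import java.util.ArrayList;
import java.util.List;

final class WalkSketch {
    /** Hypothetical minimal tree node: a rule or token name plus children. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name, Node... kids) {
            this.name = name;
            for (Node k : kids) children.add(k);
        }
    }

    /** Listener-style callbacks: the walker decides when they fire. */
    interface Listener {
        void enter(Node n);
        void exit(Node n);
    }

    /** The walker owns the depth-first traversal, so no child can be skipped by accident. */
    static void walk(Listener l, Node n) {
        l.enter(n);
        for (Node c : n.children) walk(l, c);
        l.exit(n);
    }

    /** Visitor-style: the visit method itself has to recurse into the children. */
    static void visit(Node n, List<String> seen, boolean visitChildren) {
        seen.add(n.name);
        if (visitChildren) {
            for (Node c : n.children) visit(c, seen, true);
        }
    }

    /** Records every enter/exit event fired while walking the tree. */
    static List<String> listenerEvents(Node root) {
        List<String> events = new ArrayList<>();
        walk(new Listener() {
            public void enter(Node n) { events.add("enter " + n.name); }
            public void exit(Node n) { events.add("exit " + n.name); }
        }, root);
        return events;
    }

    /** Sample tree: stat with two children, ID and expr. */
    static Node sampleTree() {
        return new Node("stat", new Node("ID"), new Node("expr"));
    }
}
```

For the sample tree, listenerEvents records enter stat, enter ID, exit ID, enter expr, exit expr, exit stat, while visit called with visitChildren set to false records only stat.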

Parse Tree Matching and XPath

Parse tree patterns
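To test whether a subtree has a particular structure, we use a tree pattern.
For example, to test whether a subtree is an assignment of an expression to an identifier, the pattern might look like this:

<ID> = <expr>;

Compile the pattern against a rule and match it against the subtree:

ParseTree t = ...; // assume t is a statement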
ParseTreePattern p = parser.compileParseTreePattern("<ID> = <expr>;", MyParser.RULE_statement);
ParseTreeMatch m = p.match(t);
if ( m.succeeded() ) {...}

We can also test for specific expressions and token values. For example, to test whether t is an expression of the form ID+0:

ParseTree t = ...; // assume t is an expression
ParseTreePattern p = parser.compileParseTreePattern("<ID>+0", MyParser.RULE_expr);
ParseTreeMatch m = p.match(t);

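The ParseTreeMatch result also gives access to the nodes that matched each tag:

String id = m.get("ID").getText(); // get() returns a ParseTree, so ask it for its text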

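The tag delimiters and the escape string can be changed on the pattern matcher:

ParseTreePatternMatcher m = new ParseTreePatternMatcher(lexer, parser); // the constructor takes the lexer and the parser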
m.setDelimiters("<<", ">>", "$"); // $ is the escape character

With these delimiters, the pattern <<ID>> = <<expr>> ;$<< ick $>> is interpreted as the elements ID, ' = ', expr, and ' ;<< ick >>'.
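
A rough sketch of how such a pattern string could be split into tags and literal text, written in plain Java. This is not ANTLR's implementation; the split method and the TAG:/TEXT: prefixes are hypothetical names chosen for illustration. It only shows the effect of the start delimiter, the stop delimiter and the escape string.

```java
import java.util.ArrayList;
import java.util.List;

final class PatternSplitSketch {
    /**
     * Splits a pattern into chunks. A tag between start and stop becomes "TAG:name",
     * everything else becomes "TEXT:...". An escaped delimiter is kept as literal text.
     */
    static List<String> split(String pattern, String start, String stop, String escape) {
        List<String> chunks = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        int i = 0;
        while (i < pattern.length()) {
            if (pattern.startsWith(escape + start, i)) {
                // Escaped start delimiter: literal text, not the beginning of a tag.
                text.append(start);
                i += escape.length() + start.length();
            } else if (pattern.startsWith(escape + stop, i)) {
                // Escaped stop delimiter: literal text as well.
                text.append(stop);
                i += escape.length() + stop.length();
            } else if (pattern.startsWith(start, i)) {
                int close = pattern.indexOf(stop, i + start.length());
                if (close < 0) {
                    throw new IllegalArgumentException("unterminated tag at index " + i);
                }
                // Flush the literal text collected so far, then emit the tag.
                if (text.length() > 0) {
                    chunks.add("TEXT:" + text);
                    text.setLength(0);
                }
                chunks.add("TAG:" + pattern.substring(i + start.length(), close));
                i = close + stop.length();
            } else {
                text.append(pattern.charAt(i));
                i++;
            }
        }
        if (text.length() > 0) chunks.add("TEXT:" + text);
        return chunks;
    }
}
```

Calling split("<<ID>> = <<expr>> ;$<< ick $>>", "<<", ">>", "$") yields the four chunks TAG:ID, TEXT: = , TAG:expr and TEXT: ;<< ick >>, which lines up with the interpretation given above.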

A compiled pattern can also be matched against every node selected by an XPath expression:

String xpath = "//blockStatement/*";
String treePattern = "int <Identifier> = <expression>;";
ParseTreePattern p =
    parser.compileParseTreePattern(treePattern,
        JavaParser.RULE_localVariableDeclarationStatement);
List<ParseTreeMatch> matches = p.findAll(tree, xpath);

Pattern labels
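The matched subtrees can be retrieved with get() and getAll(). If several nodes matched the same tag or label, get() returns only one of them (the last one, according to the ParseTreeMatch Javadoc), while getAll() returns all of them.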

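Using XPath to identify parse tree node sets
An XPath expression is a string that identifies a set of nodes (subtrees) in a parse tree. Paths are built from the following elements:

Expression   Description
nodename     Nodes with that token or rule name
/            The root node
//           All matching nodes anywhere below the current position, e.g. //ID
!            Any node except the one named, e.g. /classdef/!field
Examples: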
/prog/func             -> all funcs under prog at root
/prog/*                -> all children of prog at root
/*/func                -> all func kids of any root node
prog                   -> prog must be root node
/prog                  -> prog must be root node
/*                     -> any root
*                      -> any root
//ID                   -> any ID in tree
//expr/primary/ID      -> any ID child of a primary under any expr
//body//ID             -> any ID under a body
//'return'             -> any 'return' literal in tree
//primary/*            -> all kids of any primary
//func/*/stat          -> all stat nodes grandkids of any func node
/prog/func/'def'       -> all def literal kids of func kid of prog
//stat/';'             -> all ';' under any stat node
//expr/primary/!ID     -> anything but ID under primary under any expr node
//expr/!primary        -> anything but primary under any expr node
//!*                   -> nothing anywhere
/!*                    -> nothing at root
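
How the path elements combine can be sketched with a toy evaluator in plain Java. This is a simplified model under stated assumptions and not ANTLR's XPath class: Node and findAll are hypothetical, token literals are simply stored as node names that include the quotes, and a literal containing a slash is not supported.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class MiniXPath {
    /** Hypothetical minimal tree node: a rule or token name plus children. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name, Node... kids) {
            this.name = name;
            for (Node k : kids) children.add(k);
        }
    }

    /** Collects all descendants of n in depth-first order. */
    static void descendants(Node n, List<Node> out) {
        for (Node c : n.children) {
            out.add(c);
            descendants(c, out);
        }
    }

    /** Evaluates a path of name, "*", "!", "/" and "//" elements against the tree. */
    static List<Node> findAll(Node root, String path) {
        // A dummy parent lets the first element be matched against the root itself.
        Set<Node> work = new LinkedHashSet<>();
        work.add(new Node("<dummy>", root));
        int i = 0;
        while (i < path.length()) {
            boolean anywhere = false;
            if (path.startsWith("//", i)) { anywhere = true; i += 2; }
            else if (path.charAt(i) == '/') { i += 1; }
            boolean invert = i < path.length() && path.charAt(i) == '!';
            if (invert) i++;
            int end = path.indexOf('/', i);
            if (end < 0) end = path.length();
            String name = path.substring(i, end);
            i = end;
            Set<Node> next = new LinkedHashSet<>();
            for (Node n : work) {
                List<Node> candidates = new ArrayList<>();
                if (anywhere) descendants(n, candidates);
                else candidates.addAll(n.children);
                for (Node c : candidates) {
                    boolean hit = name.equals("*") || name.equals(c.name);
                    if (hit != invert) next.add(c); // "!" keeps the nodes that do not match
                }
            }
            work = next;
        }
        return new ArrayList<>(work);
    }

    /** Names of the given nodes, in order. */
    static List<String> names(List<Node> nodes) {
        List<String> out = new ArrayList<>();
        for (Node n : nodes) out.add(n.name);
        return out;
    }

    /** Sample tree: prog(func('def', ID, body(stat(expr(primary(ID)), ';')))). */
    static Node sampleTree() {
        Node stat = new Node("stat",
            new Node("expr", new Node("primary", new Node("ID"))), new Node("';'"));
        Node func = new Node("func", new Node("'def'"), new Node("ID"), new Node("body", stat));
        return new Node("prog", func);
    }
}
```

On the sample tree, //ID selects both ID nodes, //body//ID selects only the one inside body, /prog/func/!ID selects 'def' and body, and //!* selects nothing.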

Given a parse tree, the typical way to visit the matched nodes is a loop:

for (ParseTree t : XPath.findAll(tree, xpath, parser) ) {
    ... process t ...
}

A general way to collect the names of the nodes identified by a path into a list:

List<String> nodes = new ArrayList<String>();
for (ParseTree t : XPath.findAll(tree, xpath, parser) ) {
    if ( t instanceof RuleContext) {
        RuleContext r = (RuleContext)t;
        nodes.add(parser.getRuleNames()[r.getRuleIndex()]);
    }
    else {
        TerminalNode token = (TerminalNode)t;
        nodes.add(token.getText());
    }
}

Combining XPath and tree pattern matching

// assume we are parsing Java
ParserRuleContext tree = parser.compilationUnit();
String xpath = "//blockStatement/*"; // get children of blockStatement
String treePattern = "int <Identifier> = <expression>;";
ParseTreePattern p =
    parser.compileParseTreePattern(treePattern,
        JavaParser.RULE_localVariableDeclarationStatement);
List<ParseTreeMatch> matches = p.findAll(tree, xpath);
System.out.println(matches);

https://github.com/antlr/antlr4/blob/master/doc/tree-matching.md

posted @ 2016-12-01 10:22  zhangshihai1232