ANTLR4 lexer and parser in practice: embedding processing logic directly in the G4 file with grammar actions (XML parsing)

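Lexers and parsers are applications of compiler theory. They are well suited to parsing text and rewriting it.

This post uses a simple XML parsing example to show how they are used.

Note: this post is for readers who have at least a basic understanding of lexers and parsers and are reasonably familiar with Java.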

1. Get the original G4 files by downloading them directly from GitHub.

2. Modify the G4 file, starting from the original version.

/*
 [The "BSD licence"]
 Copyright (c) 2013 Terence Parr
 All rights reserved.

 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions
 are met:
 1. Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
 2. Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in the
    documentation and/or other materials provided with the distribution.
 3. The name of the author may not be used to endorse or promote products
    derived from this software without specific prior written permission.

 THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
 IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
 IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
 INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
 NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
 THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
 THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/

/** XML parser derived from ANTLR v4 ref guide book example */
parser grammar XMLParser;

options { tokenVocab=XMLLexer; }

@parser::header{
import java.util.HashMap;
import java.util.Map;
import java.util.Stack;
import org.apache.commons.lang3.StringUtils;
}

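@parser::members{
// Names of the elements and attributes currently open, outermost first.
Stack<String> nodeStack = new Stack<>();

// Lower-cased dotted path (for example root.child.attr) -> every value seen at that path.
private Map<String, List<String>> nodeValueMap = new HashMap<>();

// Builds the dotted path for the current position from the stack.
private String createKey() {
	return StringUtils.join(nodeStack, ".");
}

// Records a value under the current path; values at a repeated path accumulate in one list.
private void add(String value){
	String key = createKey().toLowerCase();
	if(!nodeValueMap.containsKey(key)) {
		nodeValueMap.put(key, new ArrayList<>());
	}
	nodeValueMap.get(key).add(value);
}
}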

document returns[Map<String, List<String>> resultMap]    :   prolog? misc* element misc* {$resultMap = nodeValueMap;};

prolog      :   XMLDeclOpen attribute* SPECIAL_CLOSE ;

content     :   (chardata {add($chardata.text);})?
                ((element | reference {add($reference.text);} | CDATA {add($CDATA.text);} | PI {add($PI.text);} | COMMENT {add($COMMENT.text);}) (chardata {add($chardata.text);})?)* ;

element     :   '<' Name {nodeStack.push($Name.text);} attribute* '>' content '<' '/' Name '>' {nodeStack.pop();}
            |   '<' Name {nodeStack.push($Name.text);} attribute* '/>' {nodeStack.pop();}
            ;

reference   :   EntityRef | CharRef ;

attribute   :  Name{nodeStack.push($Name.text);}   '=' STRING {add($STRING.text == null ? $STRING.text : $STRING.text.replaceAll("^\"|\"$", "").replaceAll("^'|'$", ""));nodeStack.pop();}; // Our STRING is AttValue in spec

/** ``All text that is not markup constitutes the character data of
 *  the document.''
 */
chardata    :   TEXT | SEA_WS ;

misc        :   COMMENT | PI | SEA_WS ;
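
The `attribute` rule strips the surrounding quotes from `$STRING.text` before storing the value. The two `replaceAll` calls can be tried outside ANTLR. The following is only a minimal sketch: the class name `QuoteStripDemo` and the method `stripQuotes` are hypothetical and not part of the grammar.

```java
public class QuoteStripDemo {
    // Same two regex replacements as the attribute action in the grammar:
    // drop one leading/trailing double quote, then one leading/trailing single quote.
    static String stripQuotes(String raw) {
        if (raw == null) {
            return null; // the grammar action also passes null through unchanged
        }
        return raw.replaceAll("^\"|\"$", "").replaceAll("^'|'$", "");
    }

    public static void main(String[] args) {
        System.out.println(stripQuotes("\"demo\"")); // double-quoted attribute value
        System.out.println(stripQuotes("'demo'"));   // single-quoted attribute value
    }
}
```

Because both replacements always run, a double-quoted value whose content itself starts or ends with a single quote loses that inner quote as well.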

3. Compile the G4 files to generate the Java classes. The code below calls the generated XML lexer and parser.

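	// Parses the given XML file and returns the map built by the grammar actions
	// (lower-cased dotted node path -> values). FileUtil and KnwhwCfgUtils are classes
	// from the author's project. Besides the ANTLR runtime, this needs
	// java.io.FileNotFoundException and java.io.InputStream.
	public static Map<String, List<String>> getXmlInfo(String xmlFile, String encoding) throws IOException {
		CharStream stream;
		if (!FileUtil.fileExist(xmlFile)) {
			// Not found on the file system, so load it from the classpath instead.
			try (InputStream in = KnwhwCfgUtils.class.getClassLoader().getResourceAsStream(xmlFile)) {
				if (in == null) {
					throw new FileNotFoundException("XML file not found: " + xmlFile);
				}
				stream = CharStreams.fromStream(in, Charset.forName(encoding));
			}
		} else {
			stream = CharStreams.fromFileName(xmlFile, Charset.forName(encoding));
		}
		Lexer lexer = new XMLLexer(stream);
		CommonTokenStream commonTokenStream = new CommonTokenStream(lexer);
		XMLParser parser = new XMLParser(commonTokenStream);
		// document() runs the embedded actions while parsing and exposes the map as resultMap.
		return parser.document().resultMap;
	}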
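
The returned map is keyed by lower-cased dotted paths, and the `content` rule also records whitespace-only character data, so callers will probably want a small lookup helper. The following is a sketch under assumptions: `XmlInfoLookup` and `firstValue` are hypothetical names, and the map in `main` is a hand-built stand-in for what `getXmlInfo` might return.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class XmlInfoLookup {
    // Returns the first non-blank value stored under a dotted path (trimmed),
    // or defaultValue when the path is missing or holds only whitespace.
    static String firstValue(Map<String, List<String>> info, String path, String defaultValue) {
        List<String> values = info.getOrDefault(path.toLowerCase(), Collections.emptyList());
        for (String v : values) {
            if (v != null && !v.trim().isEmpty()) {
                return v.trim();
            }
        }
        return defaultValue;
    }

    public static void main(String[] args) {
        // Hand-built stand-in for the result of getXmlInfo(...); not real parser output.
        Map<String, List<String>> info = new HashMap<>();
        info.put("config", new ArrayList<>(Arrays.asList("\n  ", "\n")));
        info.put("config.name", new ArrayList<>(Arrays.asList("demo")));
        info.put("config.item", new ArrayList<>(Arrays.asList("a", "b")));

        System.out.println(firstValue(info, "Config.Name", "none"));
        System.out.println(firstValue(info, "config", "none"));
    }
}
```

Lower-casing the path in the helper mirrors the `toLowerCase()` call in the grammar's `add` method, so lookups do not depend on the case used in the XML.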

4. For a brief explanation of the G4 file, see the diagram (it is not very detailed; leave a comment if you have questions).
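
To make the actions in the G4 file easier to follow, their effect can be replayed by hand in plain Java. The following is only a sketch and does not use ANTLR: the class `ActionWalkthrough` and the method `replay` are hypothetical, `String.join` stands in for `StringUtils.join`, and the sequence of calls assumes the input `<Config name="demo"><item>a</item><item>b</item></Config>`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Stack;

public class ActionWalkthrough {
    private final Stack<String> nodeStack = new Stack<>();
    private final Map<String, List<String>> nodeValueMap = new HashMap<>();

    // Mirrors add(...) in @parser::members: key = lower-cased dotted path of the stack.
    private void add(String value) {
        String key = String.join(".", nodeStack).toLowerCase();
        nodeValueMap.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // Replays the actions in the order the rules would run them for the assumed input.
    static Map<String, List<String>> replay() {
        ActionWalkthrough w = new ActionWalkthrough();
        w.nodeStack.push("Config");   // element:   '<' Name {push}
        w.nodeStack.push("name");     // attribute: Name {push}
        w.add("demo");                // attribute: STRING, quotes already stripped
        w.nodeStack.pop();            // attribute: {pop}
        for (String text : new String[] {"a", "b"}) {
            w.nodeStack.push("item"); // nested element: '<' Name {push}
            w.add(text);              // content: chardata {add}
            w.nodeStack.pop();        // nested element closed: {pop}
        }
        w.nodeStack.pop();            // outer element closed: {pop}
        return w.nodeValueMap;
    }

    public static void main(String[] args) {
        System.out.println(replay());
    }
}
```

The resulting map has two entries: `config.name` holds `demo`, and `config.item` holds `a` and `b`. Attributes and child elements share one dotted namespace, and repeated elements accumulate in the same list.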
