How do I find all comments in Java source code?

小编典典

To reliably find all comments in a Java source file, I would not use a regex but a real lexer (a.k.a. tokenizer).
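To see why a plain regex is fragile here, consider the minimal sketch below. The class `NaiveCommentRegex` and its `findNaive` method are hypothetical names made up for this illustration. The pattern knows nothing about string literals, so a `/*` inside a string is treated as the start of a comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NaiveCommentRegex {

    // Naive pattern: a "//..." run or a "/* ... */" run, with no notion of string literals.
    private static final Pattern NAIVE =
            Pattern.compile("//[^\\r\\n]*|/\\*[\\s\\S]*?\\*/");

    /** Returns every substring the naive pattern reports as a comment. */
    public static List<String> findNaive(String source) {
        List<String> hits = new ArrayList<>();
        Matcher m = NAIVE.matcher(source);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Only the middle line is a real comment.
        String source =
                "String t = \"/*\";\n" +
                "// a real comment\n" +
                "String u = \"*/\";\n";
        for (String hit : findNaive(source)) {
            System.out.println("[" + hit.replace("\n", "\\n") + "]");
        }
    }
}
```

Running `main` reports a single bogus hit that starts inside the first string literal and ends inside the second one, swallowing the real comment on the way. A lexer avoids this because it consumes string and char literals as tokens of their own.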

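Lexer generators for Java are readily available; the demo below uses ANTLR.

Contrary to popular belief, ANTLR can also be used to create only a lexer, without a parser.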

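Here is a quick ANTLR demo. You need the following files in the same directory:

antlr-3.2.jar (the ANTLR jar referenced by the commands further down)
JavaCommentLexer.g (the grammar)
Main.java
Test.java (a valid (!) Java source file with exotic comments)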

JavaCommentLexer.g

lexer grammar JavaCommentLexer;

// Filter mode: input that matches none of the rules below is skipped.
options {
  filter=true;
}

SingleLineComment
  :  FSlash FSlash ~('\r' | '\n')*
  ;

MultiLineComment
  :  FSlash Star .* Star FSlash
  ;

// String and char literals are matched only so that comment markers inside
// them are not reported as comments; skip() discards the token.
StringLiteral
  :  DQuote
     ( (EscapedDQuote)=> EscapedDQuote
     | (EscapedBSlash)=> EscapedBSlash
     | Octal
     | Unicode
     | ~('\\' | '"' | '\r' | '\n')
     )*
     DQuote {skip();}
  ;

CharLiteral
  :  SQuote
     ( (EscapedSQuote)=> EscapedSQuote
     | (EscapedBSlash)=> EscapedBSlash
     | Octal
     | Unicode
     | ~('\\' | '\'' | '\r' | '\n')
     )
     SQuote {skip();}
  ;

fragment EscapedDQuote
  :  BSlash DQuote
  ;

fragment EscapedSQuote
  :  BSlash SQuote
  ;

fragment EscapedBSlash
  :  BSlash BSlash
  ;

// Each delimiter below may be written literally or as its Unicode escape.
fragment FSlash
  :  '/' | '\\' ('u002f' | 'u002F')
  ;

fragment Star
  :  '*' | '\\' ('u002a' | 'u002A')
  ;

fragment BSlash
  :  '\\' ('u005c' | 'u005C')?
  ;

fragment DQuote
  :  '"'
  |  '\\u0022'
  ;

fragment SQuote
  :  '\''
  |  '\\u0027'
  ;

fragment Unicode
  :  '\\u' Hex Hex Hex Hex
  ;

fragment Octal
  :  '\\' ('0'..'3' Oct Oct | Oct Oct | Oct)
  ;

fragment Hex
  :  '0'..'9' | 'a'..'f' | 'A'..'F'
  ;

fragment Oct
  :  '0'..'7'
  ;

Main.java

import org.antlr.runtime.*;

public class Main {

  public static void main(String[] args) throws Exception {
    // Tokenize Test.java with the lexer generated from JavaCommentLexer.g.
    JavaCommentLexer lexer = new JavaCommentLexer(new ANTLRFileStream("Test.java"));
    CommonTokenStream tokens = new CommonTokenStream(lexer);

    // Only comment tokens are printed; line breaks are shown as \n so that
    // every comment fits on one output line.
    for(Object o : tokens.getTokens()) {
      CommonToken t = (CommonToken)o;
      if(t.getType() == JavaCommentLexer.SingleLineComment) {
        System.out.println("SingleLineComment :: " + t.getText().replace("\n", "\\n"));
      }
      if(t.getType() == JavaCommentLexer.MultiLineComment) {
        System.out.println("MultiLineComment :: " + t.getText().replace("\n", "\\n"));
      }
    }
  }
}

Test.java

\u002f\u002a
multi
line
comment // not a single line comment
\u002A/
public class Test {

  // single line "not a string"

  String s = "\u005C" \242 not // a comment \\\" \u002f \u005C\u005C \u0022;

  /*
  regular multi line comment
  */

  char c = \u0027"'; // the " is not the start of a string

  char q1 = '\u005c''; // == '\''
  char q2 = '\u005c\u0027'; // == '\''
  char q3 = \u0027\u005c\u0027\u0027; // == '\''
  char c4 = '\047';

  String t = "/*";
  \u002f\u002f another single line comment
  String u = "*/";
}

Now, to run the demo, do the following (on Windows, use ; instead of : as the classpath separator):

bart@hades:~/Programming/ANTLR/Demos/JavaComment$ java -cp antlr-3.2.jar org.antlr.Tool JavaCommentLexer.g
bart@hades:~/Programming/ANTLR/Demos/JavaComment$ javac -cp antlr-3.2.jar *.java
bart@hades:~/Programming/ANTLR/Demos/JavaComment$ java -cp .:antlr-3.2.jar Main

You will then see the following printed to the console:

MultiLineComment :: \u002f\u002a\nmulti\nline\ncomment // not a single line comment\n\u002A/
SingleLineComment :: // single line "not a string"
SingleLineComment :: // a comment \\\" \u002f \u005C\u005C \u0022;
MultiLineComment :: /*\n  regular multi line comment\n  */
SingleLineComment :: // the " is not the start of a string
SingleLineComment :: // == '\''
SingleLineComment :: // == '\''
SingleLineComment :: // == '\''
SingleLineComment :: \u002f\u002f another single line comment
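If adding ANTLR as a dependency is not an option, the same token-by-token idea can be hand-rolled. The sketch below is an illustration only and is not part of the demo above: `HandRolledCommentScanner`, `findComments` and `skipLiteral` are hypothetical names. It assumes the source contains no Unicode escapes and no text blocks, both of which it would mishandle.

```java
import java.util.ArrayList;
import java.util.List;

public class HandRolledCommentScanner {

    /** Collects comments from Java source text. Unicode escapes are NOT translated. */
    public static List<String> findComments(String src) {
        List<String> comments = new ArrayList<>();
        int n = src.length();
        int i = 0;
        while (i < n) {
            char c = src.charAt(i);
            if (c == '"' || c == '\'') {
                // Consume the whole literal so comment markers inside it are ignored.
                i = skipLiteral(src, i);
            } else if (src.startsWith("//", i)) {
                int end = i;
                while (end < n && src.charAt(end) != '\n' && src.charAt(end) != '\r') {
                    end++;
                }
                comments.add(src.substring(i, end));
                i = end;
            } else if (src.startsWith("/*", i)) {
                int close = src.indexOf("*/", i + 2);
                int end = (close < 0) ? n : close + 2; // unterminated comment: take the rest
                comments.add(src.substring(i, end));
                i = end;
            } else {
                i++;
            }
        }
        return comments;
    }

    /** Returns the index just past the string or char literal that opens at 'start'. */
    private static int skipLiteral(String src, int start) {
        char quote = src.charAt(start);
        int i = start + 1;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (c == '\\') {
                i += 2; // skip the backslash and the character it escapes
            } else if (c == quote || c == '\n' || c == '\r') {
                return i + 1; // closing quote, or give up at the end of the line
            } else {
                i++;
            }
        }
        return src.length();
    }

    public static void main(String[] args) {
        String src =
                "String t = \"/*\";\n" +
                "// real comment\n" +
                "char c = '\"'; /* block */\n" +
                "String u = \"*/\";\n";
        for (String comment : findComments(src)) {
            System.out.println(comment);
        }
    }
}
```

For the sample in `main`, only `// real comment` and `/* block */` are reported; the `/*` and `*/` inside the string literals are skipped together with the literals that contain them.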

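Edit

Of course, you can create a sort of lexer yourself using regex. Note, however, that the following demo does not handle Unicode escapes (such as \u002f) inside source files: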

Test2.java

/*
multi
line
comment // not a single line comment
*/
public class Test2 {

  // single line "not a string"

  String s = "\" \242 not // a comment \\\" ";

  /*
  regular multi line comment
  */

  char c = '"'; // the " is not the start of a string

  char q1 = '\''; // == '\''
  char c4 = '\047';

  String t = "/*";
  // another single line comment
  String u = "*/";
}

Main2.java

import java.util.*;
import java.io.*;
import java.util.regex.*;

public class Main2 {

  // Reads the whole file into a String, ending every line with '\n'.
  private static String read(File file) throws IOException {
    StringBuilder b = new StringBuilder();
    // try-with-resources closes the Scanner (and the underlying file) when done.
    try (Scanner scan = new Scanner(file)) {
      while(scan.hasNextLine()) {
        String line = scan.nextLine();
        b.append(line).append('\n');
      }
    }
    return b.toString();
  }

  public static void main(String[] args) throws Exception {
    String contents = read(new File("Test2.java"));

    String slComment = "//[^\r\n]*";
    String mlComment = "/\\*[\\s\\S]*?\\*/";
    String strLit = "\"(?:\\\\.|[^\\\\\"\r\n])*\"";
    String chLit = "'(?:\\\\.|[^\\\\'\r\n])+'";
    String any = "[\\s\\S]";

    // Comments are captured in groups 1 and 2. String and char literals are
    // matched without capturing, so comment markers inside them are consumed
    // as part of the literal. 'any' advances over everything else one
    // character at a time.
    Pattern p = Pattern.compile(
        String.format("(%s)|(%s)|%s|%s|%s", slComment, mlComment, strLit, chLit, any)
    );

    Matcher m = p.matcher(contents);

    while(m.find()) {
      String hit = m.group();
      if(m.group(1) != null) {
        System.out.println("SingleLine :: " + hit.replace("\n", "\\n"));
      }
      if(m.group(2) != null) {
        System.out.println("MultiLine :: " + hit.replace("\n", "\\n"));
      }
    }
  }
}

If you run Main2, the following is printed to the console:

MultiLine :: /*\nmulti\nline\ncomment // not a single line comment\n*/
SingleLine :: // single line "not a string"
MultiLine :: /*\n  regular multi line comment\n  */
SingleLine :: // the " is not the start of a string
SingleLine :: // == '\''
SingleLine :: // another single line comment
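One possible way to close the Unicode gap of the regex demo is to translate the escapes in a pre-pass and run the pattern on the result, since the Java Language Specification describes Unicode escape translation as the first step of lexical processing. The sketch below is an assumption-laden illustration: `UnicodeEscapeDecoder` and `decode` are hypothetical names, malformed escapes are left untouched instead of being rejected, and offsets in the decoded text no longer match offsets in the original file.

```java
public class UnicodeEscapeDecoder {

    /**
     * Replaces each Unicode escape (backslash, one or more 'u', four hex digits)
     * with the character it denotes. A backslash starts an escape only when it is
     * preceded by an even number of contiguous backslashes in the raw input.
     */
    public static String decode(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        int n = raw.length();
        int i = 0;
        int precedingBackslashes = 0; // contiguous raw backslashes right before index i
        while (i < n) {
            char c = raw.charAt(i);
            if (c == '\\' && precedingBackslashes % 2 == 0) {
                int j = i + 1;
                while (j < n && raw.charAt(j) == 'u') {
                    j++; // the escape may contain several 'u' characters
                }
                if (j > i + 1 && j + 4 <= n && isFourHexDigits(raw, j)) {
                    out.append((char) Integer.parseInt(raw.substring(j, j + 4), 16));
                    i = j + 4;
                    precedingBackslashes = 0;
                    continue;
                }
            }
            precedingBackslashes = (c == '\\') ? precedingBackslashes + 1 : 0;
            out.append(c);
            i++;
        }
        return out.toString();
    }

    // True when the four characters starting at 'from' are ASCII hex digits.
    private static boolean isFourHexDigits(String s, int from) {
        for (int k = from; k < from + 4; k++) {
            char c = s.charAt(k);
            boolean hex = (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
            if (!hex) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String raw = "\\u002f\\u002f another single line comment";
        System.out.println(decode(raw)); // prints: // another single line comment
    }
}
```

Feeding `decode(contents)` instead of `contents` to the matcher in Main2 would let the regex see escaped comment delimiters as ordinary characters, at the cost of reporting the decoded rather than the original comment text.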

2020-11-19
