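Purpose of this article:
Take a quick look at what a source file is made of, so you get a rough sense of its size and basic facts.
File I/O (briefly) + regular expressions (the focus)
I. Reading and preparing the source text
Let's pick an easy target first: Bundle. At roughly 1270 lines the class is medium-sized and manageable, so Bundle it is.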
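1. Reading
The title bar of Android Studio shows the source file's path on disk. Create a new class, JavaSourceParser.java.
The source is just text, so a FileReader will do; to read it line by line, wrap it in a BufferedReader.
To keep the code short, exceptions are simply rethrown.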
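import org.junit.Test;//JUnit 4 is assumed here

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

public class JavaSourceParser {
    @Test
    public void parse() throws IOException {
        read("H:\\sdk\\sources\\android-27\\android\\os\\Bundle.java");
    }

    private void read(String name) throws IOException {
        File file = new File(name);
        //try-with-resources closes the reader even if readLine() throws
        try (BufferedReader br = new BufferedReader(new FileReader(file))) {
            String line;
            while ((line = br.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}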
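One thing the reading code leaves implicit is the character encoding: FileReader(File) decodes with the platform default charset, which is only UTF-8 by default from JDK 18 onward. Below is a minimal sketch that pins the charset, assuming the source files are UTF-8. The class Utf8LineReader and its method names are made up for illustration and are not part of the parser in this article.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the article's parser.
class Utf8LineReader {
    /** Opens the file as UTF-8 (an assumption about the source files) and returns its lines. */
    static List<String> readLines(String fileName) {
        try (BufferedReader br = Files.newBufferedReader(Paths.get(fileName), StandardCharsets.UTF_8)) {
            return readLines(br);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads every line from an already opened reader. */
    static List<String> readLines(BufferedReader br) {
        List<String> lines = new ArrayList<>();
        try {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }
}
```

Wrapping IOException in UncheckedIOException is just one way to keep call sites short; rethrowing the checked exception works equally well.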
2. The source entity class: SourceBean.java
Start with the fields below. Again for ease of reading and use, the member variables are public.
/**
* Author: 张风捷特烈<br/>
* Date: 2019/1/18/018:8:30<br/>
* Email: 1981462002@qq.com<br/>
* Description: source-code object
*/
public class SourceBean {
public String name;//class name
public String pkgName;//package name
public String fatherName;//superclass name
public List<String> itfName;//names of implemented interfaces
public String fullName;//fully qualified name: package name + class name
public List<String> importClass;//imported classes
public int lineCount;//number of source lines
public int realLineCount;//real source lines, excluding comments and blank lines
public List<String> attrs;//member variables
public List<String> methods;//method names
}
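II. Parsing the data with regular expressions
1. Capturing the file's own package name
As a warm-up to get comfortable with regular expressions, let's see how to match this line precisely: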
package android.os;
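You might say: "Are you kidding? A single contains() would do."
1.1: A small test
It shows that contains() is not precise enough: both strings below report true.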
public void match() {
String str1 = "package android.os;";
String str2 = "int countOfpackage = 1;";
System.out.println("str1:"+str1.contains("package"));//str1:true
System.out.println("str2:"+str2.contains("package"));//str2:true
}
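1.2: Matching with a regular expression
\\bpackage\\b.*
What does this mean? \b asserts a word boundary. With a boundary on each side, package has to appear as a whole word. Since package is a keyword and cannot be used as a variable name, the match is precise. Note also that String.matches() has to match the entire string, so the line must begin with package.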
public void match() {
String str1 = "package android.os;";
String str2 = "int countOfpackage = 1;";
String regex = "\\bpackage\\b.*";
System.out.println("str1:" + str1.matches(regex));//str1:true
System.out.println("str2:" + str2.matches(regex));//str2:false
}
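A side note on the test above: matches() must consume the whole string, so on its own it already rejects any line that does not start with package. To see \b doing the work by itself, find() can search anywhere in the line. A small sketch follows; the class and method names are illustrative only.

```java
import java.util.regex.Pattern;

// Illustrative demo class, not part of the article's parser.
class WordBoundaryDemo {
    // "package" as a whole word, anywhere in the line
    private static final Pattern PACKAGE_WORD = Pattern.compile("\\bpackage\\b");

    /** True if the line contains "package" as a standalone word. */
    static boolean hasPackageWord(String line) {
        return PACKAGE_WORD.matcher(line).find();
    }

    public static void main(String[] args) {
        System.out.println(hasPackageWord("package android.os;"));      // whole word
        System.out.println(hasPackageWord("int countOfpackage = 1;"));  // part of an identifier
        System.out.println(hasPackageWord("// the package name"));      // whole word inside a comment
    }
}
```

The third call is the reason comments have to be filtered out before keyword matching: a word boundary cannot tell code from comment text.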
1.3: Using it for real
OK, now capture the package name of the class.
private void read(String name) throws IOException {
SourceBean sourceBean = new SourceBean();
File file = new File(name);
BufferedReader br = new BufferedReader(new FileReader(file));
String line;
String packageRegx = "\\bpackage\\b.*";
while ((line = br.readLine()) != null) {
if (line.matches(packageRegx)) {
sourceBean.pkgName = line.split("package ")[1].replace(";","");
}
}
br.close();
System.out.println(sourceBean.pkgName);// android.os
}
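As an alternative to split() plus replace(), a capture group can pull the package name out in one step and tolerate extra whitespace. This is only a sketch; PackageLine and parse are hypothetical names.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the article's parser.
class PackageLine {
    // group(1) is the dotted package name between "package" and ";"
    private static final Pattern PACKAGE_LINE =
            Pattern.compile("\\s*package\\s+([\\w.]+)\\s*;\\s*");

    /** Returns the package name if the whole line is a package declaration. */
    static Optional<String> parse(String line) {
        Matcher m = PACKAGE_LINE.matcher(line);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(parse("package android.os;").orElse("<none>"));
        System.out.println(parse("int countOfpackage = 1;").orElse("<none>"));
    }
}
```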
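2. Capturing imported classes
A quick analysis: importClasses is a list of strings, and there are usually many entries. The approach is the same as above.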
import android.annotation.Nullable;
import android.util.ArrayMap;
import android.util.Size;
import android.util.SizeF;
import android.util.SparseArray;
import com.android.internal.annotations.VisibleForTesting;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
//list of imported class names
ArrayList<String> importClasses = new ArrayList<>();
String importRegx = "\\bimport\\b.*";
---->[inside the loop]-------------
if (line.matches(importRegx)) {
String importClass = line.split("import ")[1].replace(";", "");
importClasses.add(importClass);
}
--------------------------
sourceBean.importClass = importClasses;
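The same idea extends to import lines. The sketch below also accepts static and wildcard imports, which the split-based version does not distinguish. ImportLines and collect are names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the article's parser.
class ImportLines {
    // group(1) is the imported name; "static" is skipped, a trailing ".*" is kept
    private static final Pattern IMPORT_LINE =
            Pattern.compile("\\s*import\\s+(?:static\\s+)?([\\w.]+(?:\\.\\*)?)\\s*;\\s*");

    /** Collects the imported names from the given source lines, in order. */
    static List<String> collect(List<String> lines) {
        List<String> imports = new ArrayList<>();
        for (String line : lines) {
            Matcher m = IMPORT_LINE.matcher(line);
            if (m.matches()) {
                imports.add(m.group(1));
            }
        }
        return imports;
    }
}
```

Because matches() anchors at both ends, a comment such as "// import nothing" or an identifier such as important is not picked up.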
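3. Capturing some counts
For convenience, the counts are wrapped in a class of their own.
/**
* Author: 张风捷特烈<br/>
* Date: 2019/1/18/018:10:16<br/>
* Email: 1981462002@qq.com<br/>
* Description: bean that holds the counts
*/
public class CountBean {
public int lineCount;//number of source lines
public int annoLineCount;//number of comment lines
public int blankLineCount;//number of blank lines
public int realLineCount;//real source lines, excluding comments and blank lines
public int methodCount;//number of methods
public int attrCount;//number of member fields
}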
---->[SourceBean.java]----------
public CountBean countBean;//counts object
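[Aside] I finally see why doc comments carry a column of asterisks
I used to wonder about this because I often copy source comments to translate them and have to delete the * characters one by one. Still, what exists has its reasons.
It turns out to be very convenient for parsing: comments may contain keywords, which would make the parsing imprecise.
Every line of a doc comment starts with *, so when reading line by line, a line with * can be skipped with continue. That filters out comments and makes counting comment lines easy.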
/**
* Constructs a new, empty Bundle sized to hold the given number of
* elements. The Bundle will grow as needed.
*
* @param capacity the initial capacity of the Bundle
*/
public Bundle(int capacity) {
super(capacity);
mFlags = FLAG_HAS_FDS_KNOWN | FLAG_ALLOW_FDS;
}
Get the total line count, the blank lines, the comment lines, and the real source lines
CountBean countBean = new CountBean();
int annoLineCount = 0;//comment lines
int blankLineCount = 0;//blank lines
int allCount = 0;//total lines
while ((line = br.readLine()) != null) {
allCount++;
if (line.contains("*")) {
annoLineCount++;
continue;
}
if (line.equals("")) {
blankLineCount++;
continue;
}
//...same as above
-------------------------------------
countBean.annoLineCount = annoLineCount;
countBean.blankLineCount = blankLineCount;
countBean.lineCount = allCount;
countBean.realLineCount = allCount - blankLineCount - annoLineCount;
sourceBean.countBean = countBean;
System.out.println(sourceBean.countBean.annoLineCount);//560
System.out.println(sourceBean.countBean.blankLineCount);//96
System.out.println(sourceBean.countBean.lineCount);//1275
System.out.println(sourceBean.countBean.realLineCount);//619
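The contains("*") test is a heuristic: it also counts code lines that happen to contain an asterisk (a multiplication, or a wildcard import) and misses // comments. A slightly stricter classifier can track whether the reader is inside a block comment. The following is a sketch under simplifying assumptions: a line with code followed by a trailing comment counts as code, and a line that begins with /* counts as a comment even if code follows the closing */. LineCounter is a hypothetical name.

```java
import java.util.List;

// Hypothetical helper, not part of the article's parser.
class LineCounter {
    /** Returns {commentLines, blankLines, codeLines} for the given source lines. */
    static int[] count(List<String> lines) {
        int comment = 0, blank = 0, code = 0;
        boolean inBlock = false; // true while inside an unterminated /* ... */ comment
        for (String raw : lines) {
            String t = raw.trim();
            if (inBlock) {
                comment++; // everything up to the closing */ is comment text
                if (t.contains("*/")) {
                    inBlock = false;
                }
            } else if (t.isEmpty()) {
                blank++;
            } else if (t.startsWith("//")) {
                comment++;
            } else if (t.startsWith("/*")) {
                comment++;
                // search from index 2 so that the opening "/*" itself cannot supply the "*/"
                inBlock = t.indexOf("*/", 2) < 0;
            } else {
                code++;
            }
        }
        return new int[] {comment, blank, code};
    }
}
```

The numbers this produces for Bundle.java may differ from the ones printed above, since the two rules classify some lines differently.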
4. Capturing the class name, superclass name, and implemented interfaces
4.1: Wrap them in ClassBean.java
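/**
* Author: 张风捷特烈<br/>
* Date: 2019/1/18/018:10:36<br/>
* Email: 1981462002@qq.com<br/>
* Description: basic information about a class
*/
public class ClassBean {
public String perFix;//prefix modifiers
public String name;//class name
public String fatherName;//superclass name
public List<String> itfNames;//names of implemented interfaces
public String fullName;//fully qualified name: package name + class name
}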
---->[SourceBean.java]----------
public ClassBean classBean;//basic class information
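4.2: Parsing the basic class information
A helper that returns the word following a target word. It requires words to be separated by exactly one space, which holds in this source file.
/**
* Get the next word (//TODO only works when words are separated by a single space)
* @param line the string to search
* @param target the target word
* @return the word that follows the target
*/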
private String getNextWordBy(String line, String target) {
if (!line.contains(target+" ") || line.endsWith(target)) {
return "NO FOUND";
}
return line.split(target + " ")[1].split(" ")[0];
}
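The single-space restriction noted in the TODO can be lifted with a small regex. This is a sketch rather than a drop-in replacement: it returns an Optional instead of the "NO FOUND" marker string, and NextWord is a made-up class name.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the article's parser.
class NextWord {
    /** Returns the identifier that follows target, allowing any run of whitespace in between. */
    static Optional<String> nextWord(String line, String target) {
        // \b keeps "class" from matching inside a longer word such as "subclass";
        // Pattern.quote treats the target as literal text
        Pattern p = Pattern.compile("\\b" + Pattern.quote(target) + "\\s+(\\w+)");
        Matcher m = p.matcher(line);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String line = "public final class Bundle extends BaseBundle implements Cloneable, Parcelable {";
        System.out.println(nextWord(line, "class").orElse("<none>"));
        System.out.println(nextWord(line, "extends").orElse("<none>"));
    }
}
```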
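4.3: Parsing the class name, superclass name, and interface names
ClassBean classBean = new ClassBean();
String classRegx = ".*\\bclass\\b.*";
String className = "";//class name
String fatherName = "";//superclass name
String perFix = "";//prefix modifiers
ArrayList<String> itfNames = new ArrayList<>();//interface names
---->[inside the loop]-------------
//handle the class name, superclass name, and interface names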
if (line.matches(classRegx)) {
perFix = line.split(" class ")[0];
className = getNextWordBy(line, "class");//class name
if (line.contains("extends")) {//superclass name
fatherName = getNextWordBy(line, "extends");
} else {
fatherName = "Object";
}
if (line.contains("implements")) {
String implementsStr = line.split("implements ")[1].split(" \\{")[0];
String[] split = implementsStr.replaceAll(" ","").split(",");
itfNames.addAll(Arrays.asList(split));
}
}
----------------------------
classBean.name = className;
classBean.fatherName = fatherName;
classBean.fullName = pkgName + "." + className;
classBean.itfNames = itfNames;
classBean.perFix = perFix;
sourceBean.classBean = classBean;
System.out.println(sourceBean.classBean.name);//Bundle
System.out.println(sourceBean.classBean.fatherName);//BaseBundle
System.out.println(sourceBean.classBean.fullName);//android.os.Bundle
System.out.println(sourceBean.classBean.perFix);//public final
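The separate contains/split steps can also be folded into one pattern with named groups. The sketch below assumes the whole declaration sits on a single line and has no generic type parameters; ClassHeader and its fields are hypothetical names that mirror ClassBean.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the article's parser.
class ClassHeader {
    private static final Pattern HEADER = Pattern.compile(
            "\\s*(?<prefix>(?:\\w+\\s+)*?)class\\s+(?<name>\\w+)"   // modifiers, then the class name
            + "(?:\\s+extends\\s+(?<parent>[\\w.]+))?"              // optional superclass
            + "(?:\\s+implements\\s+(?<itfs>[\\w.,\\s]+?))?"        // optional interface list
            + "\\s*\\{\\s*");                                       // the line must end with {

    final String prefix;
    final String name;
    final String parent;
    final List<String> interfaces;

    private ClassHeader(String prefix, String name, String parent, List<String> interfaces) {
        this.prefix = prefix;
        this.name = name;
        this.parent = parent;
        this.interfaces = interfaces;
    }

    /** Returns the parsed header, or null if the line is not a class declaration. */
    static ClassHeader parse(String line) {
        Matcher m = HEADER.matcher(line);
        if (!m.matches()) {
            return null;
        }
        // an optional group that did not take part in the match yields null
        String parent = m.group("parent") != null ? m.group("parent") : "Object";
        List<String> itfs = new ArrayList<>();
        if (m.group("itfs") != null) {
            itfs.addAll(Arrays.asList(m.group("itfs").trim().split("\\s*,\\s*")));
        }
        return new ClassHeader(m.group("prefix").trim(), m.group("name"), parent, itfs);
    }
}
```

Calling ClassHeader.parse on the Bundle declaration line yields the same four values as the split-based code: prefix, name, parent, and the interface list.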
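5. Getting the fields
For now, just collect each field as a string:
public List<String> attrs;//collection of member variables
5.1: Matching member variables
Looking at real code, a member variable is declared as:
(access modifier) (other modifiers) type name (= default value);
The parts in parentheses are optional. After much thought, I found no way to tell a local variable inside a method from a member variable when looking at a single line.
So take a broader view: join the code into one string and split it, relying on the fact that member variables sit at the top of the class.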
StringBuffer pureCodeSb = new StringBuffer();//code without comments
---->[inside the loop, after skipping blank lines and comments]-------------
pureCodeSb.append(line + "\n");
---------------------------------------------
String pureCode = pureCodeSb.toString();//pure code without comments
String attrDirty = pureCode.split("\\{")[1];//raw, messy field text
System.out.println(attrDirty);
5.3: Parsing the member variables
Split the string obtained above
private void handleAttr(String code) {
String attrDirty = code.split("\\{")[1];//raw, messy field text
String[] split = attrDirty.split(";");
for (int i = 0; i < split.length-1; i++) {
System.out.println(split[i]);
}
}
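5.4: Tidying up the member variables
Newlines and runs of extra spaces are unwanted. The regular expression:
\n|( {2,})
//collection of member variables
attrs = new ArrayList<>();
private void handleAttr(String code) {
String attrDirty = code.split("\\{")[1];//raw, messy field text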
String[] split = attrDirty.split(";");
for (int i = 0; i < split.length - 1; i++) {
String result = split[i].replaceAll("\n|( {2,})", "-");
attrs.add(result);
}
}
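Splitting on the first { only works while every field is declared before the first method and no field uses a brace initializer. One way around the "local variable versus member variable" problem is to track brace depth: statements that end with ; at depth 1 belong to the class body itself. The sketch below assumes comment-free input and ignores braces or semicolons inside string literals; abstract method declarations would be reported as well. FieldScanner is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the article's parser.
class FieldScanner {
    /** Collects the statements ending in ';' that sit directly inside the outermost type body. */
    static List<String> topLevelStatements(String code) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder(); // text seen at depth 1 since the last boundary
        boolean inInitializer = false;               // inside "= { ... }" of a field
        int depth = 0;
        for (int i = 0; i < code.length(); i++) {
            char c = code.charAt(i);
            if (c == '{') {
                if (depth == 1 && current.indexOf("=") >= 0) {
                    // a brace after '=' is an initializer, not a method or inner-type body
                    inInitializer = true;
                    current.append("{...}");
                }
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth == 1) {
                    if (inInitializer) {
                        inInitializer = false;  // keep the field text, the ';' is still to come
                    } else {
                        current.setLength(0);   // a method or inner type just ended
                    }
                }
            } else if (depth == 1) {
                if (c == ';') {
                    String stmt = current.toString().trim().replaceAll("\\s+", " ");
                    if (!stmt.isEmpty()) {
                        result.add(stmt);
                    }
                    current.setLength(0);
                } else {
                    current.append(c);
                }
            }
        }
        return result;
    }
}
```

Text at depth 0 (the package and import lines) and at depth 2 or deeper (method bodies) is never appended, so local variables cannot leak into the result.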
6. Matching methods
Regex for methods that carry an access modifier:
(.*(private|public|protected).*\(.*)\{
String methodRegex = "(.*(private|public|protected).*\\(.*)\\{";
ArrayList<String> methods = new ArrayList<>();
//parse the method signature
if (line.matches(methodRegex)) {
String result = line.replaceAll("\\{|( {2,})", "");
methods.add(result);
}
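If only the method name is wanted rather than the whole signature line, a capture group can take the identifier right before the opening parenthesis. A sketch, with the same restriction as above that the method carries an access modifier and the line ends with {. MethodName is a made-up class name, and the sample lines are written in the style of Bundle.java.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the article's parser.
class MethodName {
    // [^=;]*? refuses to cross '=' or ';', which keeps field initializers such as
    // "private Foo x = new Foo() {" from being mistaken for methods
    private static final Pattern METHOD_LINE = Pattern.compile(
            "\\s*(?:public|protected|private)\\b[^=;]*?\\b(\\w+)\\s*\\(.*\\{\\s*");

    /** Returns the method (or constructor) name declared on this line, if any. */
    static Optional<String> of(String line) {
        Matcher m = METHOD_LINE.matcher(line);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(of("    public Bundle(int capacity) {").orElse("<none>"));
        System.out.println(of("        if (ready) {").orElse("<none>"));
    }
}
```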
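With the data in hand, all that remains is to display it.
III. Optimization and adaptation
There are limitations, though: inner classes get in the way, and reading line by line no longer suffices, so let's swallow the whole text at once.
1. A small adaptation
There is a case I had not thought of earlier. The fix is simple: add a space before class and require the line to end with {
(.*\b class\b.*)\{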
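2. Getting the names of inner classes, interfaces, and enums
Match with the regular expression
(.*\\b (class|interface|enum)\\b.*)\\{
Collect the results
ArrayList<String> sonClasses = new ArrayList<>();//names of inner classes, interfaces, and enums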
String code = pureCodeSb.toString();
String classRegx = "(.*\\b (class|interface|enum)\\b.*)\\{";
Pattern pattern = Pattern.compile(classRegx);
Matcher matcher = pattern.matcher(code);
while (matcher.find()) {
String aClass = matcher.group(0);
System.out.println(aClass.replaceAll("\\{|( {2,})",""));
sonClasses.add(aClass.replaceAll("\\{|( {2,})",""));
}
mMainSource.innerClassName = sonClasses;
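When scanning the whole text with find(), Pattern.MULTILINE lets ^ anchor at the start of every line, and capture groups return the kind and the name directly instead of the full declaration line. A sketch under the assumption that the input is already free of comments; TypeDeclarations is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the article's parser.
class TypeDeclarations {
    // optional modifiers, then the kind (group 1) and the type name (group 2), then the opening {
    private static final Pattern TYPE_DECL = Pattern.compile(
            "^[ \\t]*(?:\\w+[ \\t]+)*(class|interface|enum)\\s+(\\w+)[^{;]*\\{",
            Pattern.MULTILINE);

    /** Returns entries such as "class Bundle" for every type declared in the code. */
    static List<String> find(String code) {
        List<String> found = new ArrayList<>();
        Matcher m = TYPE_DECL.matcher(code);
        while (m.find()) {
            found.add(m.group(1) + " " + m.group(2));
        }
        return found;
    }
}
```

The first entry is the outer type itself, so the inner types are everything after index 0.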
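That's V0.01. There is of course plenty left to improve,
such as parsing the inner classes further,
parsing the field and method strings further,
and building a custom view on top of the parsed data to present the source information nicely,
for example a different color per modifier, or a chart of the ratio of private to public methods.
Presenting the comments is another thing worth doing.
Finally, here is the complete source code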
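import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
/**
* Author: 张风捷特烈<br/>
* Date: 2019/1/18/018:8:33<br/>
* Email: 1981462002@qq.com<br/>
* Description: source-code analyzer
*/
public class JavaSourceParser {
private List<String> attrs;
private int annoLineCount;//comment lines
private int blankLineCount;//blank lines
private int allCount;//total lines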
private StringBuffer pureCodeSb;
// private StringBuffer codeSb;
private boolean mainOk;
private final SourceBean mMainSource;
public JavaSourceParser() {
mMainSource = new SourceBean();
}
public SourceBean parse(String name) {
File file = new File(name);
BufferedReader br = null;
// codeSb = new StringBuffer();
//code without comments
pureCodeSb = new StringBuffer();
//collection of member variables
attrs = new ArrayList<>();
String aLine;
String packageRegx = "\\bpackage\\b.*";
String importRegx = "\\bimport\\b.*";
String pkgName = "";
ArrayList<String> importClasses = new ArrayList<>();//list of imported class names
ArrayList<String> sonClasses = new ArrayList<>();
try {
br = new BufferedReader(new FileReader(file));
while ((aLine = br.readLine()) != null) {
// codeSb.append(aLine + "\n");
//update the counts
allCount++;
if (aLine.contains("*")) {
annoLineCount++;
continue;
}
if (aLine.equals("")) {
blankLineCount++;
continue;
}
pureCodeSb.append(aLine + "\n");
//handle the package name
if (aLine.matches(packageRegx)) {
pkgName = aLine.split("package ")[1].replace(";", "");
}
//handle the imported classes
if (aLine.matches(importRegx)) {
String importClass = aLine.split("import ")[1].replace(";", "");
importClasses.add(importClass);
}
}
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (br != null) {
br.close();
}
} catch (IOException e) {
e.printStackTrace();
}
}
String code = pureCodeSb.toString();
String classRegx = "(.*\\b (class|interface|enum)\\b.*)\\{";
Pattern pattern = Pattern.compile(classRegx);
Matcher matcher = pattern.matcher(code);
while (matcher.find()) {
String aClass = matcher.group(0);
System.out.println(aClass.replaceAll("\\{|( {2,})", ""));
sonClasses.add(aClass.replaceAll("\\{|( {2,})", ""));
}
SourceBean sourceBean = parseCode(pureCodeSb.toString(), mMainSource);
mMainSource.pkgName = pkgName;
mMainSource.importClass = importClasses;
mMainSource.innerClassName = sonClasses;
mMainSource.classBean.fullName = mMainSource.pkgName + "." + mMainSource.classBean.name;
return sourceBean;
}
private SourceBean parseCode(String code, SourceBean sourceBean) {
CountBean countBean = new CountBean();
ClassBean classBean = new ClassBean();
String classRegx = "(.*\\b class\\b.*)\\{";
String methodRegex = "(.*(private|public|protected).*\\(.*)\\{";
ArrayList<String> methods = new ArrayList<>();
String className = "";//class name
String fatherName = "";//superclass name
String perFix = "";//prefix modifiers
ArrayList<String> itfNames = new ArrayList<>();//interface names
String[] lines = code.split("\n");
for (String line : lines) {
//parse the method signature
if (line.matches(methodRegex)) {
String result = line.replaceAll("\\{|( {2,})", "");
methods.add(result);
}
//handle the class name, superclass name, and interface names
if (line.matches(classRegx) && !mainOk) {
perFix = line.split(" class ")[0];
className = getNextWordBy(line, "class");//class name
if (line.contains("extends")) {//superclass name
fatherName = getNextWordBy(line, "extends");
} else {
fatherName = "Object";
}
if (line.contains("implements")) {
String implementsStr = line.split("implements ")[1].split(" \\{")[0];
String[] split = implementsStr.replaceAll(" ", "").split(",");
itfNames.addAll(Arrays.asList(split));
}
mainOk = true;
}
}
handleAttr(pureCodeSb.toString());//pure code without comments
countBean.annoLineCount = annoLineCount;
countBean.blankLineCount = blankLineCount;
countBean.lineCount = allCount;
countBean.realLineCount = allCount - blankLineCount - annoLineCount;
countBean.attrCount = attrs.size();
countBean.methodCount = methods.size();
sourceBean.countBean = countBean;
classBean.name = className;
classBean.fatherName = fatherName;
classBean.itfNames = itfNames;
classBean.perFix = perFix;
sourceBean.classBean = classBean;
sourceBean.attrs = attrs;
sourceBean.methods = methods;
return sourceBean;
}
private void handleAttr(String code) {
String attrDirty = code.split("\\{")[1];//raw, messy field text
String[] split = attrDirty.split(";");
for (int i = 0; i < split.length - 1; i++) {
String result = split[i].replaceAll("\n|( {2,})", "-");
attrs.add(result);
}
}
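/**
* Get the next word (//TODO only works when words are separated by a single space)
*
* @param line the string to search
* @param target the target word
* @return the word that follows the target
*/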
private String getNextWordBy(String line, String target) {
if (!line.contains(target + " ") || line.endsWith(target)) {
return "NO FOUND";
}
return line.split(target + " ")[1].split(" ")[0];
}
}
Afterword: conventions for my articles
1. Revision history and errata
Project source | Date | Notes |
---|---|---|
V0.1-github | 2018-1-18 | Forging a regex weapon: a Java source analyzer, V0.01 |
2. More about me
Pen name | QQ | WeChat | Hobby |
---|---|---|---|
张风捷特烈 | 1981462002 | zdl1994328 | Languages |
My GitHub | My Jianshu | My Juejin | Personal site |
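3. Statement
1----This article is an original work by 张风捷特烈; please credit the source when reposting
2----Fellow programming enthusiasts are welcome to get in touch
3----My abilities are limited; if anything is wrong, corrections are welcome and I will gladly fix it
4----If you have read this far, thank you for your interest and support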