javac Source Code Explained (OpenJDK SE 8), Part 4: DataFlow Source Code Explained
DataFlow overview
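1. Flow: the main class for data flow analysis; it contains essentially all of the data flow analysis code.
2. TreeScanner: traverses the abstract syntax tree (AST) and runs a different data flow analysis for each kind of tree node.
3. BaseAnalyzer: provides the base methods shared by the four data flow analyses; it is also where jumps (break, continue, return, throw) are mainly handled.
4. AliveAnalyzer: liveness analysis; checks that every statement has a chance of being executed. Its key field alive records whether the current tree node can be reached.
5. AssignAnalyzer: assignment analysis (variable initialization analysis); checks that a variable has been assigned a value before it is used, and that a final variable is not assigned a second time.
6. AbstractAssignAnalyzer: the abstract base of the assignment analysis, with the same two checks as AssignAnalyzer. Commonly used fields:
- inits: the set of variables that have been assigned.
- uninits: the set of variables that have not been assigned.
- uninitsTry: the set of variables not assigned inside a try block.
- initsWhenTrue / initsWhenFalse: the set of assigned variables when the controlling condition is true / false.
- uninitsWhenTrue / uninitsWhenFalse: the set of unassigned variables when the controlling condition is true / false.
- vardecls: the array of tracked variable declarations.
- classDef: the class currently being analyzed.
- firstadr: the index in vardecls of the first variable of the current class.
- nextadr: the next free index in vardecls.
- returnadr: the index in vardecls of the first variable in a block that can return.
- syms: the symbol table.
- names: the name table.
7. FlowAnalyzer: exception analysis; mainly checks that every checked exception is handled. Main fields: preciseRethrowTypes (the source comment describes it as a flag for whether the last statement can complete normally, although the field is declared as a map from a symbol to the exception types it can precisely rethrow), thrown (the checked exceptions that may be thrown), caught (the checked exceptions that are caught or declared).
8. CaptureAnalyzer: checks that local variables captured by a lambda or an inner class are final or effectively final.
9. PendingExit: records a pending jump, that is, one of the four kinds of statement that can transfer control: continue, break, return, or a thrown exception.
10. AbstractAssignPendingExit: records the initialized and uninitialized variables at the point where a block is exited, together with the initialized and uninitialized variables on entry. Main fields: inits, uninits, exit_inits, exit_uninits.
11. Bits: the bit set used to change variable state and to operate on the inits and uninits sets.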
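12. Plan for this walkthrough: AliveAnalyzer is covered briefly (it is very simple); AssignAnalyzer is covered in depth (compiler textbooks use it as an example again and again, so this is a good chance to check theory against practice); FlowAnalyzer is not covered (interested readers can read the source), and neither is CaptureAnalyzer (which behaves like a special case of AssignAnalyzer).
Entry point
Start at line 863 of com.sun.tools.javac.main.JavaCompiler: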
delegateCompiler.compile2();
delegateCompiler.close();
elapsed_msec = delegateCompiler.elapsed_msec;
Follow the calls down until you reach line 208 of com.sun.tools.javac.comp.Flow:
public void analyzeTree(Env<AttrContext> env, TreeMaker make) {
new AliveAnalyzer().analyzeTree(env, make);
new AssignAnalyzer(log, syms, lint, names).analyzeTree(env);
new FlowAnalyzer().analyzeTree(env, make);
new CaptureAnalyzer().analyzeTree(env, make);
}
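AliveAnalyzer overview
The checks here are relatively simple. They mainly target cases such as a while condition that is always false, statements after a return, and statements after a break. Note that an if statement whose condition is the constant false is not reported: the Java Language Specification exempts it so that it can be used for conditional compilation.
AliveAnalyzer entry point
As before, the AST is traversed using the visitor pattern.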
/**************************************************************************
* main method
*************************************************************************/
/** Perform definite assignment/unassignment analysis on a tree.
*/
public void analyzeTree(Env<AttrContext> env, TreeMaker make) {
analyzeTree(env, env.tree, make);
}
public void analyzeTree(Env<AttrContext> env, JCTree tree, TreeMaker make) {
try {
attrEnv = env;
Flow.this.make = make;
pendingExits = new ListBuffer<PendingExit>();
alive = true;
scan(tree);
} finally {
pendingExits = null;
Flow.this.make = null;
}
}
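AliveAnalyzer: handling a class
After the attribution phase, only three things need processing: the static initializers, the instance initializers, and all the methods.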
/* ------------ Visitor methods for various sorts of trees -------------*/
public void visitClassDef(JCClassDecl tree) {
if (tree.sym == null) return;
boolean alivePrev = alive;
ListBuffer<PendingExit> pendingExitsPrev = pendingExits;
Lint lintPrev = lint;
pendingExits = new ListBuffer<PendingExit>();
lint = lint.augment(tree.sym);
try {
// process all the static initializers
for (List<JCTree> l = tree.defs; l.nonEmpty(); l = l.tail) {
if (!l.head.hasTag(METHODDEF) &&
(TreeInfo.flags(l.head) & STATIC) != 0) {
scanDef(l.head);
}
}
// process all the instance initializers
for (List<JCTree> l = tree.defs; l.nonEmpty(); l = l.tail) {
if (!l.head.hasTag(METHODDEF) &&
(TreeInfo.flags(l.head) & STATIC) == 0) {
scanDef(l.head);
}
}
// process all the methods
for (List<JCTree> l = tree.defs; l.nonEmpty(); l = l.tail) {
if (l.head.hasTag(METHODDEF)) {
scan(l.head);
}
}
} finally {
pendingExits = pendingExitsPrev;
alive = alivePrev;
lint = lintPrev;
}
}
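AliveAnalyzer: handling a method
The statements in the method body are scanned; any unreachable statement is reported through the log. A non-void method whose body can complete normally is reported as missing a return statement.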
public void visitMethodDef(JCMethodDecl tree) {
if (tree.body == null) return;
Lint lintPrev = lint;
lint = lint.augment(tree.sym);
Assert.check(pendingExits.isEmpty());
try {
alive = true;
scanStat(tree.body);
if (alive && !tree.sym.type.getReturnType().hasTag(VOID))
log.error(TreeInfo.diagEndPos(tree.body), "missing.ret.stmt");
List<PendingExit> exits = pendingExits.toList();
pendingExits = new ListBuffer<PendingExit>();
while (exits.nonEmpty()) {
PendingExit exit = exits.head;
exits = exits.tail;
Assert.check(exit.tree.hasTag(RETURN));
}
} finally {
lint = lintPrev;
}
}
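AliveAnalyzer: worked example, while
The while statement is used here to illustrate how the check proceeds.
prevPendingExits saves the jumps that were already pending before the loop.
The condition is scanned first. If it is the constant false, alive is set to false, which marks the loop body as unreachable; the error itself is reported when the body is scanned.
Next the body is scanned. Afterwards alive tells whether the end of the body can be reached (for example, it is false if the body ends with a return or a break).
Then all continue statements that target this loop are resolved; a reachable continue makes the end of the body count as reachable.
Finally all break statements that target this loop are resolved. The statement after the loop is reachable if some break leaves the loop or if the condition is not the constant true.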
public void visitWhileLoop(JCWhileLoop tree) {
ListBuffer<PendingExit> prevPendingExits = pendingExits;
pendingExits = new ListBuffer<PendingExit>();
scan(tree.cond);
alive = !tree.cond.type.isFalse();
scanStat(tree.body);
alive |= resolveContinues(tree);
alive = resolveBreaks(tree, prevPendingExits) ||
!tree.cond.type.isTrue();
}
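The two assignments to alive in the method above can be restated as small pure functions. This is a minimal sketch, not javac code: the class WhileLiveness, the enum Cond and both method names are made up for illustration, and the pending-exit bookkeeping is reduced to a single boolean.

```java
// Hypothetical model of the `alive` assignments in visitWhileLoop; not javac code.
final class WhileLiveness {
    /** What the compiler knows about the loop condition at compile time. */
    enum Cond { CONSTANT_TRUE, CONSTANT_FALSE, NOT_CONSTANT }

    /** Mirrors `alive = !tree.cond.type.isFalse()`: is the loop body reachable? */
    static boolean bodyReachable(Cond cond) {
        return cond != Cond.CONSTANT_FALSE;
    }

    /** Mirrors the last assignment: is the statement after the loop reachable? */
    static boolean aliveAfterLoop(Cond cond, boolean reachableBreak) {
        return reachableBreak || cond != Cond.CONSTANT_TRUE;
    }

    public static void main(String[] args) {
        // while (true) { }          : code after the loop is dead
        System.out.println(aliveAfterLoop(Cond.CONSTANT_TRUE, false));  // false
        // while (true) { break; }   : the break makes it reachable
        System.out.println(aliveAfterLoop(Cond.CONSTANT_TRUE, true));   // true
        // while (n > 0) { n--; }    : the condition may become false
        System.out.println(aliveAfterLoop(Cond.NOT_CONSTANT, false));   // true
        // while (false) { ... }     : the body itself is dead
        System.out.println(bodyReachable(Cond.CONSTANT_FALSE));         // false
    }
}
```

Note that the result of resolving continue statements does not appear in aliveAfterLoop: in the source, the final assignment overwrites alive rather than combining with it.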
Reporting errors in the body
/** Analyze a statement. Check that statement is reachable.
*/
void scanStat(JCTree tree) {
if (!alive && tree != null) {
log.error(tree.pos(), "unreachable.stmt");
if (!tree.hasTag(SKIP)) alive = true;
}
scan(tree);
}
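To see why scanStat sets alive back to true after reporting, here is a minimal sketch that applies the same rule to a flat list of statements. The class ReachabilitySketch, the enum Kind and the method unreachable are hypothetical; real javac works on JCTree nodes and writes to a log instead of returning indices.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical flat model of scanStat; not javac code.
final class ReachabilitySketch {
    enum Kind { PLAIN, RETURN, BREAK, EMPTY }

    /** Returns the indices of the statements that would be reported as unreachable. */
    static List<Integer> unreachable(List<Kind> stats) {
        List<Integer> errors = new ArrayList<>();
        boolean alive = true;
        for (int i = 0; i < stats.size(); i++) {
            Kind kind = stats.get(i);
            if (!alive) {
                errors.add(i);                            // "unreachable statement"
                if (kind != Kind.EMPTY) alive = true;     // recover, so later statements are not re-reported
            }
            if (kind == Kind.RETURN || kind == Kind.BREAK) {
                alive = false;                            // nothing directly after a jump can run
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        // x = 1; return; y = 2; z = 3;
        List<Kind> body = Arrays.asList(Kind.PLAIN, Kind.RETURN, Kind.PLAIN, Kind.PLAIN);
        System.out.println(unreachable(body)); // prints [2]
    }
}
```

Only the first dead statement is reported; resetting alive keeps one mistake from producing a cascade of errors.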
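AssignAnalyzer overview
The analyzer traverses the AST. It uses inits and uninits to track which variables are assigned and which are unassigned along code without branches, and it uses initsWhenTrue, initsWhenFalse, uninitsWhenTrue and uninitsWhenFalse to track the same information across the two outcomes of a condition. All of these are represented as bit sets, and the operations on them are bitwise operations.
AssignAnalyzer entry point
As before, the AST is traversed using the visitor pattern. The fields that need it are given their initial values first.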
/**************************************************************************
* main method
*************************************************************************/
/** Perform definite assignment/unassignment analysis on a tree.
*/
public void analyzeTree(Env<?> env) {
analyzeTree(env, env.tree);
}
public void analyzeTree(Env<?> env, JCTree tree) {
try {
startPos = tree.pos().getStartPosition();
if (vardecls == null)
vardecls = new JCVariableDecl[32];
else
for (int i=0; i<vardecls.length; i++)
vardecls[i] = null;
firstadr = 0;
nextadr = 0;
pendingExits = new ListBuffer<>();
this.classDef = null;
unrefdResources = new Scope(env.enclClass.sym);
scan(tree);
} finally {
// note that recursive invocations of this method fail hard
startPos = -1;
resetBits(inits, uninits, uninitsTry, initsWhenTrue,
initsWhenFalse, uninitsWhenTrue, uninitsWhenFalse);
if (vardecls != null) {
for (int i=0; i<vardecls.length; i++)
vardecls[i] = null;
}
firstadr = 0;
nextadr = 0;
pendingExits = null;
this.classDef = null;
unrefdResources = null;
}
}
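AssignAnalyzer: handling a class
After the attribution phase, the order here is: static fields first, then static initializers, then instance fields, then instance initializers, and finally all methods.
Static fields and instance fields are handled in much the same way as local variables inside a method.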
/* ------------ Visitor methods for various sorts of trees -------------*/
@Override
public void visitClassDef(JCClassDecl tree) {
if (tree.sym == null) {
return;
}
JCClassDecl classDefPrev = classDef;
int firstadrPrev = firstadr;
int nextadrPrev = nextadr;
ListBuffer<P> pendingExitsPrev = pendingExits;
pendingExits = new ListBuffer<P>();
if (tree.name != names.empty) {
firstadr = nextadr;
}
classDef = tree;
try {
// define all the static fields
for (List<JCTree> l = tree.defs; l.nonEmpty(); l = l.tail) {
if (l.head.hasTag(VARDEF)) {
JCVariableDecl def = (JCVariableDecl)l.head;
if ((def.mods.flags & STATIC) != 0) {
VarSymbol sym = def.sym;
if (trackable(sym)) {
newVar(def);
}
}
}
}
// process all the static initializers
for (List<JCTree> l = tree.defs; l.nonEmpty(); l = l.tail) {
if (!l.head.hasTag(METHODDEF) &&
(TreeInfo.flags(l.head) & STATIC) != 0) {
scan(l.head);
}
}
// define all the instance fields
for (List<JCTree> l = tree.defs; l.nonEmpty(); l = l.tail) {
if (l.head.hasTag(VARDEF)) {
JCVariableDecl def = (JCVariableDecl)l.head;
if ((def.mods.flags & STATIC) == 0) {
VarSymbol sym = def.sym;
if (trackable(sym)) {
newVar(def);
}
}
}
}
// process all the instance initializers
for (List<JCTree> l = tree.defs; l.nonEmpty(); l = l.tail) {
if (!l.head.hasTag(METHODDEF) &&
(TreeInfo.flags(l.head) & STATIC) == 0) {
scan(l.head);
}
}
// process all the methods
for (List<JCTree> l = tree.defs; l.nonEmpty(); l = l.tail) {
if (l.head.hasTag(METHODDEF)) {
scan(l.head);
}
}
} finally {
pendingExits = pendingExitsPrev;
nextadr = nextadrPrev;
firstadr = firstadrPrev;
classDef = classDefPrev;
}
}
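1. Static fields
Among the members of the class, find every field marked static and decide whether it needs to be tracked. If it does, create a tracked variable for it.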
// define all the static fields
for (List<JCTree> l = tree.defs; l.nonEmpty(); l = l.tail) {
if (l.head.hasTag(VARDEF)) {
JCVariableDecl def = (JCVariableDecl)l.head;
if ((def.mods.flags & STATIC) != 0) {
VarSymbol sym = def.sym;
if (trackable(sym)) {
newVar(def);
}
}
}
}
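A symbol is tracked when both of the following hold:
1. The symbol's position is at or after the start position of the tree being analyzed (startPos).
2. The symbol is owned by a method (a local variable), or it is final, is not a parameter, was declared without an initializer, and the class currently being analyzed is enclosed by the class that owns the symbol.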
/** Do we need to track init/uninit state of this symbol?
* I.e. is symbol either a local or a blank final variable?
*/
protected boolean trackable(VarSymbol sym) {
return
sym.pos >= startPos &&
((sym.owner.kind == MTH ||
((sym.flags() & (FINAL | HASINIT | PARAMETER)) == FINAL &&
classDef.sym.isEnclosedBy((ClassSymbol)sym.owner))));
}
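The expression `(sym.flags() & (FINAL | HASINIT | PARAMETER)) == FINAL` packs three tests into one comparison: FINAL must be set, and HASINIT and PARAMETER must both be clear. The sketch below isolates that idiom. The class BlankFinalMask is hypothetical and the bit values are picked for illustration only; they are not copied from javac's Flags class.

```java
// Hypothetical illustration of the mask-and-compare idiom; not javac code.
final class BlankFinalMask {
    // Illustrative bit positions, assumed for this sketch only.
    static final long FINAL     = 1L << 0;
    static final long HASINIT   = 1L << 1;
    static final long PARAMETER = 1L << 2;
    static final long STATIC    = 1L << 3;   // an unrelated flag, ignored by the mask

    /** True for a blank final: final, no initializer, not a parameter. */
    static boolean isBlankFinal(long flags) {
        return (flags & (FINAL | HASINIT | PARAMETER)) == FINAL;
    }

    public static void main(String[] args) {
        System.out.println(isBlankFinal(FINAL));              // true:  final int a;
        System.out.println(isBlankFinal(FINAL | HASINIT));    // false: final int a = 1;
        System.out.println(isBlankFinal(FINAL | PARAMETER));  // false: void f(final int a)
        System.out.println(isBlankFinal(0L));                 // false: int a;
        System.out.println(isBlankFinal(FINAL | STATIC));     // true:  static final int a;
    }
}
```

Masking first means unrelated flags such as STATIC cannot affect the result, which is why the same test serves static and instance fields alike.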
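Creating a tracked variable: when a variable needs to be tracked, its declaration is stored in the vardecls array at index nextadr, that index is excluded from inits and included in uninits, and nextadr is advanced so that it points to the next free slot.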
/** Initialize new trackable variable by setting its address field
* to the next available sequence number and entering it under that
* index into the vars array.
*/
void newVar(JCVariableDecl varDecl) {
VarSymbol sym = varDecl.sym;
vardecls = ArrayUtils.ensureCapacity(vardecls, nextadr);
if ((sym.flags() & FINAL) == 0) {
sym.flags_field |= EFFECTIVELY_FINAL;
}
sym.adr = nextadr;
vardecls[nextadr] = varDecl;
exclVarFromInits(varDecl, nextadr);
uninits.incl(nextadr);
nextadr++;
}
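The slot bookkeeping in newVar can be tried out in isolation. The following is a minimal sketch under several assumptions: java.util.BitSet stands in for javac's own Bits class, variable names stand in for JCVariableDecl nodes, and letInit here only flips the two bits (the real method also performs the final-variable checks). The class name TrackedVars is hypothetical.

```java
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical stand-in for the vardecls/inits/uninits bookkeeping; not javac code.
final class TrackedVars {
    private String[] vardecls = new String[2];   // slot -> variable name
    private int nextadr = 0;                     // next free slot
    final BitSet inits = new BitSet();           // bit set: definitely assigned
    final BitSet uninits = new BitSet();         // bit set: definitely unassigned

    /** Allocates the next slot for a new variable and marks it unassigned. */
    int newVar(String name) {
        if (nextadr == vardecls.length) {
            vardecls = Arrays.copyOf(vardecls, nextadr * 2);  // grow on demand
        }
        int adr = nextadr++;
        vardecls[adr] = name;
        inits.clear(adr);
        uninits.set(adr);
        return adr;
    }

    /** Simplified assignment: the slot moves from uninits to inits. */
    void letInit(int adr) {
        inits.set(adr);
        uninits.clear(adr);
    }

    boolean isDefinitelyAssigned(int adr) {
        return inits.get(adr);
    }

    public static void main(String[] args) {
        TrackedVars vars = new TrackedVars();
        int x = vars.newVar("x");                          // int x;
        System.out.println(vars.isDefinitelyAssigned(x));  // false: reading x here is an error
        vars.letInit(x);                                   // x = 1;
        System.out.println(vars.isDefinitelyAssigned(x));  // true
    }
}
```

Because each variable is just an index, copying or intersecting the state of every variable at once costs only a few word-sized bit operations.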
2. Static initializer code
Code marked with the static keyword is processed through the scan method; scan is explained later, together with method processing.
// process all the static initializers
for (List<JCTree> l = tree.defs; l.nonEmpty(); l = l.tail) {
if (!l.head.hasTag(METHODDEF) &&
(TreeInfo.flags(l.head) & STATIC) != 0) {
scan(l.head);
}
}
3. Instance fields
Handled in almost the same way as static fields.
// define all the instance fields
for (List<JCTree> l = tree.defs; l.nonEmpty(); l = l.tail) {
if (l.head.hasTag(VARDEF)) {
JCVariableDecl def = (JCVariableDecl)l.head;
if ((def.mods.flags & STATIC) == 0) {
VarSymbol sym = def.sym;
if (trackable(sym)) {
newVar(def);
}
}
}
}
4. Instance initializer code
Handled in almost the same way as static initializer code.
// process all the instance initializers
for (List<JCTree> l = tree.defs; l.nonEmpty(); l = l.tail) {
if (!l.head.hasTag(METHODDEF) &&
(TreeInfo.flags(l.head) & STATIC) == 0) {
scan(l.head);
}
}
5. All methods
Again very similar to the initializer handling: each method definition is passed to scan.
// process all the methods
for (List<JCTree> l = tree.defs; l.nonEmpty(); l = l.tail) {
if (l.head.hasTag(METHODDEF)) {
scan(l.head);
}
}
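AssignAnalyzer: handling a method
After a few simple checks, AssignAnalyzer calls the parent class's method to process the method. The parent method does the following:
1. It saves the state from before the method was entered (initsPrev, uninitsPrev, nextadrPrev, firstadrPrev, returnadrPrev) and restores that state once the method has been processed.
2. It creates the declarations for the parameters and treats every parameter as already initialized.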
@Override
public void visitMethodDef(JCMethodDecl tree) {
if (tree.body == null) {
return;
}
/* MemberEnter can generate synthetic methods ignore them
*/
if ((tree.sym.flags() & SYNTHETIC) != 0) {
return;
}
Lint lintPrev = lint;
lint = lint.augment(tree.sym);
try {
super.visitMethodDef(tree);
} finally {
lint = lintPrev;
}
}
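The parent-class method (AbstractAssignAnalyzer.visitMethodDef) that the override above delegates to:
@Override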
public void visitMethodDef(JCMethodDecl tree) {
if (tree.body == null) {
return;
}
/* Ignore synthetic methods, except for translated lambda methods.
*/
if ((tree.sym.flags() & (SYNTHETIC | LAMBDA_METHOD)) == SYNTHETIC) {
return;
}
final Bits initsPrev = new Bits(inits);
final Bits uninitsPrev = new Bits(uninits);
int nextadrPrev = nextadr;
int firstadrPrev = firstadr;
int returnadrPrev = returnadr;
Assert.check(pendingExits.isEmpty());
try {
boolean isInitialConstructor =
TreeInfo.isInitialConstructor(tree);
if (!isInitialConstructor) {
firstadr = nextadr;
}
for (List<JCVariableDecl> l = tree.params; l.nonEmpty(); l = l.tail) {
JCVariableDecl def = l.head;
scan(def);
Assert.check((def.sym.flags() & PARAMETER) != 0, "Method parameter without PARAMETER flag");
/* If we are executing the code from Gen, then there can be
* synthetic or mandated variables, ignore them.
*/
initParam(def);
}
// else we are in an instance initializer block;
// leave caught unchanged.
scan(tree.body);
if (isInitialConstructor) {
boolean isSynthesized = (tree.sym.flags() &
GENERATEDCONSTR) != 0;
for (int i = firstadr; i < nextadr; i++) {
JCVariableDecl vardecl = vardecls[i];
VarSymbol var = vardecl.sym;
if (var.owner == classDef.sym) {
// choose the diagnostic position based on whether
// the ctor is default(synthesized) or not
if (isSynthesized) {
checkInit(TreeInfo.diagnosticPositionFor(var, vardecl),
var, "var.not.initialized.in.default.constructor");
} else {
checkInit(TreeInfo.diagEndPos(tree.body), var);
}
}
}
}
List<P> exits = pendingExits.toList();
pendingExits = new ListBuffer<>();
while (exits.nonEmpty()) {
P exit = exits.head;
exits = exits.tail;
Assert.check(exit.tree.hasTag(RETURN), exit.tree);
if (isInitialConstructor) {
assignToInits(exit.tree, exit.exit_inits);
for (int i = firstadr; i < nextadr; i++) {
checkInit(exit.tree.pos(), vardecls[i].sym);
}
}
}
} finally {
assignToInits(tree, initsPrev);
uninits.assign(uninitsPrev);
nextadr = nextadrPrev;
firstadr = firstadrPrev;
returnadr = returnadrPrev;
}
}
Scanning the formal parameters
public void visitVarDef(JCVariableDecl tree) {
boolean track = trackable(tree.sym);
if (track && tree.sym.owner.kind == MTH) {
newVar(tree);
}
if (tree.init != null) {
scanExpr(tree.init);
if (track) {
letInit(tree.pos(), tree.sym);
}
}
}
Scanning the method body
// else we are in an instance initializer block;
// leave caught unchanged.
scan(tree.body);
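Processing the pending exits
pendingExits is very important: for each statement that leaves the method body, it records the initialized and uninitialized variables at the point of exit, along with the initialized and uninitialized variables on entry.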
List<P> exits = pendingExits.toList();
pendingExits = new ListBuffer<>();
while (exits.nonEmpty()) {
P exit = exits.head;
exits = exits.tail;
Assert.check(exit.tree.hasTag(RETURN), exit.tree);
if (isInitialConstructor) {
assignToInits(exit.tree, exit.exit_inits);
for (int i = firstadr; i < nextadr; i++) {
checkInit(exit.tree.pos(), vardecls[i].sym);
}
}
}
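The loop above re-checks every tracked field at each return of an initial constructor. Here is a minimal sketch of that idea, assuming java.util.BitSet in place of javac's Bits and line numbers in place of tree positions; the classes PendingExitSketch and Exit and the method checkExits are hypothetical.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical model of checking pending `return` exits in a constructor; not javac code.
final class PendingExitSketch {
    /** Snapshot taken at one `return`: where it is and which slots were assigned by then. */
    static final class Exit {
        final int line;
        final BitSet exitInits;

        Exit(int line, BitSet inits) {
            this.line = line;
            this.exitInits = (BitSet) inits.clone();  // copy, because the live set keeps changing
        }
    }

    /** Every slot in [firstadr, nextadr) must be assigned at every exit. */
    static List<String> checkExits(List<Exit> exits, int firstadr, int nextadr) {
        List<String> errors = new ArrayList<>();
        for (Exit exit : exits) {
            for (int adr = firstadr; adr < nextadr; adr++) {
                if (!exit.exitInits.get(adr)) {
                    errors.add("line " + exit.line + ": slot " + adr
                            + " might not have been initialized");
                }
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        BitSet inits = new BitSet();
        List<Exit> exits = new ArrayList<>();
        inits.set(0);                    // this.a = ...;
        exits.add(new Exit(5, inits));   // early return: slot 1 (field b) is still unset
        inits.set(1);                    // this.b = ...;
        exits.add(new Exit(8, inits));   // second return: both fields are set
        System.out.println(checkExits(exits, 0, 2));
    }
}
```

Taking a snapshot at the jump is the key design point: by the time the end of the method is reached, the live inits set no longer describes what was true at an earlier return.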
Restoring the previous state
} finally {
assignToInits(tree, initsPrev);
uninits.assign(uninitsPrev);
nextadr = nextadrPrev;
firstadr = firstadrPrev;
returnadr = returnadrPrev;
}
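AssignAnalyzer: handling a block
Block handling is especially simple: it only saves and restores nextadr, even though one might expect pendingExits to be processed here as well.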
public void visitBlock(JCBlock tree) {
int nextadrPrev = nextadr;
scan(tree.stats);
nextadr = nextadrPrev;
}
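Saving and restoring nextadr is what gives block-local variables their scope: slots allocated inside the block are released when the block ends and can be reused. A minimal sketch, with the hypothetical class BlockScopeSketch and a Runnable standing in for scanning the block's statements:

```java
// Hypothetical model of the nextadr save/restore in visitBlock; not javac code.
final class BlockScopeSketch {
    private int nextadr = 0;   // next free variable slot

    /** Allocates a slot for a newly declared variable. */
    int declare() {
        return nextadr++;
    }

    /** Runs `body` as a nested block; slots allocated inside are released afterwards. */
    void block(Runnable body) {
        int nextadrPrev = nextadr;
        body.run();
        nextadr = nextadrPrev;
    }

    int nextadr() {
        return nextadr;
    }

    public static void main(String[] args) {
        BlockScopeSketch scope = new BlockScopeSketch();
        scope.declare();                                         // int a;        -> slot 0
        scope.block(() -> { scope.declare(); scope.declare(); }); // { int b, c; } -> slots 1, 2
        System.out.println(scope.declare());                     // int d;        -> prints 1
    }
}
```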
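AssignAnalyzer: worked example, if
While the condition is processed, inits and uninits are each split into a true part and a false part.
After the branches are processed, the true and false parts are merged back into inits and uninits.
1. Process the condition first.
If the condition is the constant true: when the bits are in the reset state, initsWhenFalse and initsWhenTrue are first merged into inits, and uninitsWhenFalse and uninitsWhenTrue into uninits. Then the true sets are plain copies of inits and uninits, while the false sets are copies with every variable in the range from firstadr to nextadr included (the false path can never run, so it places no constraint on the result).
If the condition is the constant false: the same merge happens first when the bits are in the reset state. Then the true sets are the full ones and the false sets are the plain copies.
If the condition cannot be decided at compile time: the condition is scanned, and if inits is not in the reset state afterwards, inits and uninits are copied into initsWhenFalse, initsWhenTrue and uninitsWhenFalse, uninitsWhenTrue.
Finally inits and uninits are reset.
2. Process the then part: copy initsWhenTrue into inits and uninitsWhenTrue into uninits, then scan the then part. If there is no else part, merge (intersect) the saved initsBeforeElse into inits and uninitsBeforeElse into uninits.
3. The else part is handled like the then part, except that the condition is shared: scanning starts from initsBeforeElse and uninitsBeforeElse, and the result is intersected with the state saved after the then part.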
public void visitIf(JCIf tree) {
scanCond(tree.cond);
final Bits initsBeforeElse = new Bits(initsWhenFalse);
final Bits uninitsBeforeElse = new Bits(uninitsWhenFalse);
assignToInits(tree.cond, initsWhenTrue);
uninits.assign(uninitsWhenTrue);
scan(tree.thenpart);
if (tree.elsepart != null) {
final Bits initsAfterThen = new Bits(inits);
final Bits uninitsAfterThen = new Bits(uninits);
assignToInits(tree.thenpart, initsBeforeElse);
uninits.assign(uninitsBeforeElse);
scan(tree.elsepart);
andSetInits(tree.elsepart, initsAfterThen);
uninits.andSet(uninitsAfterThen);
} else {
andSetInits(tree.thenpart, initsBeforeElse);
uninits.andSet(uninitsBeforeElse);
}
}
/** Analyze a condition. Make sure to set (un)initsWhenTrue(WhenFalse)
* rather than (un)inits on exit.
*/
void scanCond(JCTree tree) {
if (tree.type.isFalse()) {
if (inits.isReset()) merge(tree);
initsWhenTrue.assign(inits);
initsWhenTrue.inclRange(firstadr, nextadr);
uninitsWhenTrue.assign(uninits);
uninitsWhenTrue.inclRange(firstadr, nextadr);
initsWhenFalse.assign(inits);
uninitsWhenFalse.assign(uninits);
} else if (tree.type.isTrue()) {
if (inits.isReset()) merge(tree);
initsWhenFalse.assign(inits);
initsWhenFalse.inclRange(firstadr, nextadr);
uninitsWhenFalse.assign(uninits);
uninitsWhenFalse.inclRange(firstadr, nextadr);
initsWhenTrue.assign(inits);
uninitsWhenTrue.assign(uninits);
} else {
scan(tree);
if (!inits.isReset())
split(tree.type != syms.unknownType);
}
if (tree.type != syms.unknownType) {
resetBits(inits, uninits);
}
}
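The merge at the end of visitIf is a set intersection: a variable is definitely assigned after the if only when it is assigned on both paths. The sketch below shows that, and also why scanCond fills the whole range for a branch that can never run. It assumes java.util.BitSet in place of javac's Bits; the class IfMergeSketch and its methods are hypothetical.

```java
import java.util.BitSet;

// Hypothetical model of the branch merge in visitIf/scanCond; not javac code.
final class IfMergeSketch {
    /** Assigned after the `if` = assigned after then-part AND assigned after else-part. */
    static BitSet join(BitSet afterThen, BitSet afterElse) {
        BitSet result = (BitSet) afterThen.clone();
        result.and(afterElse);
        return result;
    }

    /** State for a branch that can never run: every slot in [firstadr, nextadr) is set. */
    static BitSet deadBranch(int firstadr, int nextadr) {
        BitSet all = new BitSet();
        all.set(firstadr, nextadr);
        return all;
    }

    public static void main(String[] args) {
        // int x, y;  if (c) { x = 1; y = 1; } else { x = 2; }
        BitSet afterThen = new BitSet();
        afterThen.set(0);
        afterThen.set(1);
        BitSet afterElse = new BitSet();
        afterElse.set(0);
        System.out.println(join(afterThen, afterElse));         // prints {0}: only x is assigned

        // Constant condition: the dead branch is "full", so the live branch decides alone.
        System.out.println(join(afterThen, deadBranch(0, 2)));  // prints {0, 1}
    }
}
```

Using the full set as the identity element of intersection lets the constant-condition cases reuse the ordinary merge code without a special case.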
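AssignAnalyzer: worked example, while loop
1. Save the state from before the while statement is entered.
2. Process the condition, exactly as for the condition of an if.
3. Copy initsWhenFalse into initsSkip and uninitsWhenFalse into uninitsSkip; these are merged in after the loop has been processed.
4. Copy initsWhenTrue into inits and uninitsWhenTrue into uninits as the starting state for the body.
5. Process the body of the while.
6. Resolve the continue statements; a continue jumps back to the loop itself.
7. Merge: copy initsSkip into inits and uninitsSkip into uninits.
8. Resolve the break statements; a break leaves the current loop body.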
public void visitWhileLoop(JCWhileLoop tree) {
ListBuffer<P> prevPendingExits = pendingExits;
FlowKind prevFlowKind = flowKind;
flowKind = FlowKind.NORMAL;
final Bits initsSkip = new Bits(true);
final Bits uninitsSkip = new Bits(true);
pendingExits = new ListBuffer<>();
int prevErrors = getLogNumberOfErrors();
final Bits uninitsEntry = new Bits(uninits);
uninitsEntry.excludeFrom(nextadr);
do {
scanCond(tree.cond);
if (!flowKind.isFinal()) {
initsSkip.assign(initsWhenFalse) ;
uninitsSkip.assign(uninitsWhenFalse);
}
assignToInits(tree, initsWhenTrue);
uninits.assign(uninitsWhenTrue);
scan(tree.body);
resolveContinues(tree);
if (getLogNumberOfErrors() != prevErrors ||
flowKind.isFinal() ||
new Bits(uninitsEntry).diffSet(uninits).nextBit(firstadr) == -1) {
break;
}
uninits.assign(uninitsEntry.andSet(uninits));
flowKind = FlowKind.SPECULATIVE_LOOP;
} while (true);
flowKind = prevFlowKind;
//a variable is DA/DU after the while statement, if it's DA/DU assuming the
//branch is not taken AND if it's DA/DU before any break statement
assignToInits(tree.body, initsSkip);
uninits.assign(uninitsSkip);
resolveBreaks(tree, prevPendingExits);
}
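The comment near the end of the method states the rule for the state after the loop: a variable is definitely assigned there if it is assigned when the condition is false and also before every break. A minimal sketch of that rule for inits only, assuming java.util.BitSet in place of Bits; the class WhileAssignSketch and the method initsAfterWhile are hypothetical, and the repeated pass over the body for uninits is left out.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collections;
import java.util.List;

// Hypothetical model of the state after a while loop; not javac code.
final class WhileAssignSketch {
    /** Intersects the "condition is false" state with the state at every break. */
    static BitSet initsAfterWhile(BitSet initsWhenFalse, List<BitSet> initsAtBreaks) {
        BitSet result = (BitSet) initsWhenFalse.clone();
        for (BitSet atBreak : initsAtBreaks) {
            result.and(atBreak);
        }
        return result;
    }

    public static void main(String[] args) {
        // int x;  while (c) { x = 1; }
        // The body may never run, so x (slot 0) is not assigned after the loop.
        BitSet whenFalse = new BitSet();
        System.out.println(initsAfterWhile(whenFalse, Collections.<BitSet>emptyList()).get(0)); // false

        // int x;  while (true) { x = 1; break; }
        // The false path is dead (all slots set); the only way out is the break.
        BitSet deadFalsePath = new BitSet();
        deadFalsePath.set(0, 1);
        BitSet atBreak = new BitSet();
        atBreak.set(0);
        List<BitSet> breaks = new ArrayList<>();
        breaks.add(atBreak);
        System.out.println(initsAfterWhile(deadFalsePath, breaks).get(0));                      // true
    }
}
```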
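Afterthoughts on DataFlow
- How much of the source is covered: I picked the most typical tree nodes from the liveness analysis and from the assignment analysis as examples. Other tree nodes are not covered for now. These examples already gave me what I was looking for. The bit manipulation itself has not been explained in detail.
- Compiler theory: the definite assignment analysis here is essentially the same as the analysis described in compiler textbooks, and it follows the idea of basic blocks.
- Introduction to Algorithms: no common textbook algorithm shows up here so far; or perhaps compilation as a whole is one enormous algorithm.
- This implementation served my goal of checking theory against practice very well; the variable initialization analysis in particular matches compiler theory closely.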