Points-to Tutorial ①: Datalog Rules and VarPointsTo

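This post covers core concepts of program analysis: points-to analysis, alias analysis, and pointer analysis. It goes through the input relations, the computed relations, and the analysis logic (VarPointsTo, field sensitivity, Reachable, and so on). It also covers exception analysis (how the throw and catch relations are handled and how exceptions propagate along the call graph) and, finally, reflection analysis, including input relations such as ConstantForClass and ReifiedMethod and how they are used to infer method calls.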


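Points-to analysis: concerned with which objects a variable may point to.
Alias analysis: concerned with whether two variables or expressions are aliases, i.e. whether they may point to the same object.
Pointer analysis is the combination of the two.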
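
To make the distinction concrete, here is a minimal Python sketch. It assumes a points-to result is already available as a dictionary (the variable and heap names are made up for illustration) and derives a may-alias answer from it: two variables may alias when their points-to sets overlap.

```python
# Hypothetical points-to result: variable -> set of abstract heap objects.
var_points_to = {
    "a": {"h1"},
    "b": {"h1", "h2"},
    "c": {"h3"},
}


def may_alias(x, y, pts):
    """Two variables may alias if their points-to sets share an object."""
    return bool(pts.get(x, set()) & pts.get(y, set()))


print(may_alias("a", "b", var_points_to))  # True: both may point to h1
print(may_alias("a", "c", var_points_to))  # False: no common object
```

The points-to query answers "which objects?", while the alias query only answers "same object or not?", which is why the second can be computed from the first.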
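Input relations: these correspond to the intermediate language being analyzed. Logically they split into relations that represent instructions and relations that represent name and type information. For example, the Alloc relation represents every instruction that allocates a new heap object and assigns it to a local variable of a method. All other instruction kinds (Move, Load, Store, and VCall) have analogous input relations.
There are also relations that encode the type system, the symbol table, and program environment information. For example: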
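Input relations
FormalArg states which variable is the n-th formal parameter of a given method.
LookUp matches a method signature to the actual method definition inside a type.
HeapType maps a heap object to its type.
ActualReturn is also a function of its first argument (a method invocation site) and returns the local variable at the call site that receives the method's return value.
VarType maps a variable to its type.
Subtype links a type to its supertypes.
InMethod is a function from instructions to their containing methods.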

V is a set of program variables
H is a set of heap abstractions (i.e., allocation sites)
M is a set of method identifiers
S is a set of method signatures (including name, type signature)
F is a set of fields
I is a set of instructions
T is a set of class types
N is the set of natural numbers
Alloc(var : V, heap : H, inMeth : M)		# var = new ...
Move(to : V, from : V)		# to = from
Load(to : V, base : V, fld : F)		# to = base.fld
Store(base : V, fld : F, from : V)		# base.fld = from
VCall(base : V, sig : S, invo : I, inMeth : M)		# base.sig(..)
FormalArg(meth : M, n : N, arg : V)
ActualArg(invo : I, n : N, arg : V)
FormalReturn(meth : M, ret : V)
ActualReturn(invo : I, var : V)
ThisVar(meth : M, this : V)
HeapType(heap : H, type : T)
LookUp(type : T, sig : S, meth : M)
VarType(var : V, type : T)
InMethod(instr : I, meth : M)
Subtype(type : T, superT : T)
VarPointsTo(var : V, heap : H)
CallGraph(invo : I, meth : M)
FldPointsTo(baseH : H, fld : F, heap : H)
InterProcAssign(to : V, from : V)
Reachable(meth : M)

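Computed relations
VarPointsTo: links a variable to the heap objects it may point to.
CallGraph: the call graph, linking invocation sites to the methods they may call.
Analysis logic
Field sensitivity: the analysis distinguishes the different fields of the same abstract object. For example, a Store input fact and VarPointsTo inferences together lead to a FldPointsTo fact for a given heap object and field.
Reachable: holds the methods that the analysis has found to be reachable.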

Datalog rules for an Andersen-style pointer analysis with call-graph construction

VarPointsTo(var, heap)←
Reachable(meth),Alloc(var, heap, meth).

VarPointsTo(to, heap)←
Move(to, from), VarPointsTo(from, heap).

FldPointsTo(baseH, fld, heap)←
Store(base, fld, from), VarPointsTo(from, heap),
VarPointsTo(base, baseH).

VarPointsTo(to, heap)←
Load(to, base, fld), VarPointsTo(base, baseH),
FldPointsTo(baseH, fld, heap).

Reachable(toMeth),
VarPointsTo(this, heap),
CallGraph(invo, toMeth)←
VCall(base, sig, invo, inMeth),Reachable(inMeth),
VarPointsTo(base, heap),
HeapType(heap, heapT), LookUp(heapT, sig, toMeth),
ThisVar(toMeth, this).

InterProcAssign(to, from)←
CallGraph(invo, meth),
FormalArg(meth, n, to),ActualArg(invo, n, from).

InterProcAssign(to, from)←
CallGraph(invo, meth),
FormalReturn(meth, from),ActualReturn(invo, to).

VarPointsTo(to, heap)←
InterProcAssign(to, from),
VarPointsTo(from, heap).
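
As an illustration of how these rules reach a fixpoint, the following Python sketch evaluates them naively on a tiny, made-up program. Everything in it (the program, the variable and heap names, storing functional relations such as HeapType as dictionaries) is an assumption made for the example; a real Datalog engine would evaluate the same rules far more efficiently.

```python
# Input facts for a tiny hypothetical program (all names are made up):
#   class A { Object f; Object id(Object p) { return p; } }
#   main() { x = new A(); y = new Object(); w = y;
#            x.f = y; z = x.f; r = x.id(y); }
ALLOC = {("x", "h1", "main"), ("y", "h2", "main")}
MOVE = {("w", "y")}
STORE = {("x", "f", "y")}
LOAD = {("z", "x", "f")}
VCALL = {("x", "id", "i1", "main")}
FORMAL_ARG = {("A.id", 0, "p")}
ACTUAL_ARG = {("i1", 0, "y")}
FORMAL_RETURN = {("A.id", "p")}
ACTUAL_RETURN = {("i1", "r")}
THIS_VAR = {"A.id": "this"}                # functional, so a dict
HEAP_TYPE = {"h1": "A", "h2": "Object"}    # functional, so a dict
LOOKUP = {("A", "id"): "A.id"}             # (type, sig) -> method


def analyze(entry="main"):
    """Apply every rule repeatedly until no relation grows."""
    vpt, fpt, cg, ipa, reach = set(), set(), set(), set(), {entry}
    while True:
        before = len(vpt) + len(fpt) + len(cg) + len(ipa) + len(reach)
        # VarPointsTo <- Reachable, Alloc
        vpt |= {(v, h) for (v, h, m) in ALLOC if m in reach}
        # VarPointsTo <- Move, VarPointsTo
        vpt |= {(to, h) for (to, frm) in MOVE for (v, h) in vpt if v == frm}
        # FldPointsTo <- Store, VarPointsTo, VarPointsTo
        fpt |= {(bh, f, h) for (b, f, frm) in STORE
                for (v1, h) in vpt if v1 == frm
                for (v2, bh) in vpt if v2 == b}
        # VarPointsTo <- Load, VarPointsTo, FldPointsTo
        vpt |= {(to, h) for (to, b, f) in LOAD
                for (v, bh) in vpt if v == b
                for (bh2, f2, h) in fpt if (bh2, f2) == (bh, f)}
        # Reachable, VarPointsTo(this), CallGraph <- VCall, ...
        for (base, sig, invo, in_meth) in VCALL:
            if in_meth not in reach:
                continue
            for (v, h) in list(vpt):
                meth = LOOKUP.get((HEAP_TYPE[h], sig))
                if v == base and meth is not None:
                    reach.add(meth)
                    cg.add((invo, meth))
                    vpt.add((THIS_VAR[meth], h))
        # InterProcAssign for arguments, then for return values
        ipa |= {(to, frm) for (invo, m) in cg
                for (m2, n, to) in FORMAL_ARG if m2 == m
                for (i2, n2, frm) in ACTUAL_ARG if (i2, n2) == (invo, n)}
        ipa |= {(to, frm) for (invo, m) in cg
                for (m2, frm) in FORMAL_RETURN if m2 == m
                for (i2, to) in ACTUAL_RETURN if i2 == invo}
        # VarPointsTo <- InterProcAssign, VarPointsTo
        vpt |= {(to, h) for (to, frm) in ipa for (v, h) in vpt if v == frm}
        if len(vpt) + len(fpt) + len(cg) + len(ipa) + len(reach) == before:
            return vpt, fpt, cg, reach


var_points_to, fld_points_to, call_graph, reachable = analyze()
print(sorted(var_points_to))
print(sorted(call_graph))
```

On this input the call to `x.id(y)` is resolved through the type of `h1`, which makes `A.id` reachable; only then do the parameter `p` and the result `r` receive `h2`. That ordering is what it means for the call graph and the points-to sets to be computed together.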

New computed relations
An extension of the rules above, used to analyze arrays.

Input and computed relations
ArrayLoad(to : V, base : V)		# to = base[...]
ArrayStore(base : V, from : V)		# base[...] = from

ComponentType(type : T, compT : T)
ArrayContentsPointTo(baseH : H, heap : H)

ArrayContentsPointTo
is built on top of VarPointsTo facts and vice versa (that is, the two relations are mutually recursive).

Datalog Rules
ArrayContentsPointTo(baseH, heap)←
ArrayStore(base, from),
VarPointsTo(base, baseH), VarPointsTo(from, heap),
HeapType(heap, hType), HeapType(baseH, baseHType),
ComponentType(baseHType, componentType),
Subtype(hType, componentType).

VarPointsTo(to, heap)←
ArrayLoad(to, base),
VarPointsTo(base, baseH),ArrayContentsPointTo(baseH, heap).
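
The two array rules can be sketched in Python as follows. The facts are hypothetical, and the VarPointsTo seed stands in for whatever the base analysis would already have derived. The example is chosen so that the Subtype check matters: storing an Object into something that may be a String[] is filtered out.

```python
# Hypothetical facts (all names are made up for illustration):
#   arr = new Object[..]; s = new String(); arr[..] = s; t = arr[..];
#   q may point to a String[]; o = new Object(); q[..] = o; u = q[..];
HEAP_TYPE = {"hArr": "Object[]", "hStrArr": "String[]",
             "hS": "String", "hObj": "Object"}
COMPONENT_TYPE = {"Object[]": "Object", "String[]": "String"}
# Subtype is assumed to be reflexive and transitive already.
SUBTYPE = {("String", "String"), ("Object", "Object"), ("String", "Object")}
ARRAY_STORE = {("arr", "s"), ("q", "o")}   # base[...] = from
ARRAY_LOAD = {("t", "arr"), ("u", "q")}    # to = base[...]
SEED_VPT = {("arr", "hArr"), ("q", "hStrArr"), ("s", "hS"), ("o", "hObj")}


def analyze_arrays():
    """Iterate both array rules until neither relation grows."""
    vpt, acp = set(SEED_VPT), set()
    while True:
        before = len(vpt) + len(acp)
        # ArrayContentsPointTo <- ArrayStore, VarPointsTo x2, type check
        acp |= {(bh, h) for (b, frm) in ARRAY_STORE
                for (v1, bh) in vpt if v1 == b
                for (v2, h) in vpt
                if v2 == frm and
                (HEAP_TYPE[h], COMPONENT_TYPE.get(HEAP_TYPE[bh])) in SUBTYPE}
        # VarPointsTo <- ArrayLoad, VarPointsTo, ArrayContentsPointTo
        vpt |= {(to, h) for (to, b) in ARRAY_LOAD
                for (v, bh) in vpt if v == b
                for (bh2, h) in acp if bh2 == bh}
        if len(vpt) + len(acp) == before:
            return vpt, acp


var_points_to, array_contents = analyze_arrays()
print(sorted(array_contents))
```

Note that there is one ArrayContentsPointTo set per array object rather than per index, so the analysis does not distinguish array elements from each other.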

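Exception analysis
Exception analysis deals with throwing and catching.
The Throw relation captures a throw instruction whose thrown exception object is referenced by a local variable.
The Catch relation links an instruction i to a variable a: i may throw (directly or through a call) an exception of dynamic type T, and the local variable a is assigned the exception object at the matching catch site. Catch does not map directly to a single intermediate-language instruction, but it is easy to compute from such low-level input.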

Input and computed relations
Throw(instr : I, e : V)		# throw(e)

Catch(heapT : T, instr : I, arg : V)		# catch(arg)

ThrowPointsTo(meth : M, heap : H)

The computed output relation is ThrowPointsTo, which captures the exceptions a method may throw to its callers.

Datalog rules
ThrowPointsTo(meth, heap)←
InMethod(instr, meth),Throw(instr, e),
VarPointsTo(e, heap),HeapType(heap, heapT),
!Catch(heapT, instr, _).
# Covers an exception thrown by an instruction that has no matching catch in the same method.

ThrowPointsTo(meth, heap)←
InMethod(invo, meth),CallGraph(invo, toMeth),
ThrowPointsTo(toMeth, heap),HeapType(heap, heapT),
!Catch(heapT, invo, _).
# Propagates exceptions thrown by a called method when the caller has no matching handler.

VarPointsTo(arg, heap)←
Throw(instr, e), VarPointsTo(e, heap),
HeapType(heap, heapT),Catch(heapT, instr, arg).
VarPointsTo(arg, heap)←
CallGraph(invo, toMeth),
ThrowPointsTo(toMeth, heap),
HeapType(heap, heapT),Catch(heapT, invo, arg).
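# Rules 3 and 4 model the complementary cases, in which an exception thrown locally or by a called method is caught. The result is a VarPointsTo inference for the variable that the catch clause binds to the exception object.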
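
The four exception rules can be sketched in Python as below. The program, the instruction labels, and the pre-computed VarPointsTo and CallGraph facts are all assumptions made for the example. Negation is applied only to Catch, which is an input relation, so iterating to a fixpoint is safe.

```python
# Hypothetical program (all names are made up for illustration):
#   helper() { throw e1; }                                      // t1
#   foo()    { helper(); }                                      // i3
#   bar()    { try { throw e2; } catch (RuntimeException c) {} } // t2
#   main()   { try { foo(); } catch (IOException ex) {}         // i1
#              bar(); }                                         // i2
THROW = {("t1", "e1"), ("t2", "e2")}
IN_METHOD = {"t1": "helper", "t2": "bar", "i3": "foo",
             "i1": "main", "i2": "main"}
CALL_GRAPH = {("i3", "helper"), ("i1", "foo"), ("i2", "bar")}
CATCH = {("IOException", "i1", "ex"), ("RuntimeException", "t2", "c")}
HEAP_TYPE = {"hIO": "IOException", "hRT": "RuntimeException"}
SEED_VPT = {("e1", "hIO"), ("e2", "hRT")}


def handlers(heap, instr):
    """Catch variables at instr that match the dynamic type of heap."""
    return {arg for (t, i, arg) in CATCH
            if t == HEAP_TYPE[heap] and i == instr}


def analyze_exceptions():
    vpt, tpt = set(SEED_VPT), set()
    while True:
        before = len(vpt) + len(tpt)
        # Rules 1 and 3: exceptions thrown directly by an instruction.
        for (instr, e) in THROW:
            for (v, h) in list(vpt):
                if v != e:
                    continue
                caught = handlers(h, instr)
                if caught:
                    vpt |= {(arg, h) for arg in caught}
                else:
                    tpt.add((IN_METHOD[instr], h))
        # Rules 2 and 4: exceptions escaping from a called method.
        for (invo, to_meth) in CALL_GRAPH:
            for (m, h) in list(tpt):
                if m != to_meth:
                    continue
                caught = handlers(h, invo)
                if caught:
                    vpt |= {(arg, h) for arg in caught}
                else:
                    tpt.add((IN_METHOD[invo], h))
        if len(vpt) + len(tpt) == before:
            return vpt, tpt


var_points_to, throw_points_to = analyze_exceptions()
print(sorted(throw_points_to))
```

Here the IOException escapes `helper` and `foo` but is caught in `main`, so `main` gets no ThrowPointsTo fact and `ex` points to the exception object instead.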

Reflection analysis
A few auxiliary input relations are added on top of the base pointer analysis.

Input relations
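ConstantForClass(h: H, t: T) # h is a string constant in the program whose value names the class/type t
ConstantForMethod(h: H, sig: S) # likewise, for a method signature
ReifiedClass(t: T, h: H) # links a class type to the abstract object that represents it
ReifiedMethod(sig: S, h: H) # links a method signature to the abstract method object that represents it
ReifiedHeapAllocation(i: I, t: T, h: H) # gives the abstract object h that stands for all dynamic objects of type t allocated by a newInstance call at invocation site i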

Datalog rules
VarPointsTo(r, h)←
SCall("Class.forName", i, _), ActualReturn(i, r),
ActualArg(i, 0, p),VarPointsTo(p, c),
ConstantForClass(c, t),ReifiedClass(t, h).
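1. If the first argument of a forName call points to a string-constant object that matches a class name, then the local variable receiving the call's return value points to the abstract object h for that class.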
VarPointsTo(r, h)←
VCall(v,"Class.newInstance", i, _),
VarPointsTo(v,hc),ReifiedClass(t,hc),
ActualReturn(i, r),ReifiedHeapAllocation(i, t, h).
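2. Handles newInstance calls. If the receiver of the call points to hc, the class object for class t, and the result of the call is assigned to variable r, then r may point to the special allocation site h, which stands for objects of type t allocated at that newInstance call site.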
VarPointsTo(r,hm)←
VCall(b,"Class.getMethod", i, _),ActualReturn(i, r),
VarPointsTo(b,hc),ReifiedClass(t,hc),
ActualArg(i, 1, p),
VarPointsTo(p, c),ConstantForMethod(c, s),
LookUp(t, s, _),ReifiedMethod(s,hm).
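3. Handles getMethod calls with receiver b and first argument p (a string encoding the desired method signature). Once the analysis has determined what b and p may point to, and provided p points to a string constant encoding a signature s that exists in the type b points to, the variable r holding the result of the getMethod call points to hm, the reflection object for that signature.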
CallGraph(i, m)←
VCall(b,"Method.invoke", i, _),
VarPointsTo(b,hm),ReifiedMethod(s,hm),
ActualArg(i, 1, p), VarPointsTo(p, h),
HeapType(h, t),LookUp(t, s, m).
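4. Uses the reflection information to infer more call-graph edges. If the receiver b of an invoke call points to a reflection object encoding a method signature, and the first argument p of the invoke call (the intended receiver of the reflective call) points to an object h, then a new edge can be inferred from the invocation site of the reflective invoke call to the method m, which is found by looking up the signature in the class of h.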
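
The four reflection rules can be sketched in Python as below. The facts describe a made-up program; the argument indices follow the rules as written above (0 for forName, 1 for getMethod and invoke), and the functional relations are stored as dictionaries, which is a simplification for this example.

```python
# Hypothetical program (all names are made up for illustration):
#   c = Class.forName("A");   // i1, argument p1
#   o = c.newInstance();      // i2
#   m = c.getMethod("foo");   // i3, argument p2
#   m.invoke(o);              // i4
S_CALL = {("Class.forName", "i1", "main")}
V_CALL = {("c", "Class.newInstance", "i2", "main"),
          ("c", "Class.getMethod", "i3", "main"),
          ("m", "Method.invoke", "i4", "main")}
ACTUAL_ARG = {("i1", 0): "p1", ("i3", 1): "p2", ("i4", 1): "o"}
ACTUAL_RETURN = {"i1": "c", "i2": "o", "i3": "m"}
CONST_FOR_CLASS = {"hStrA": "A"}
CONST_FOR_METHOD = {"hStrFoo": "foo"}
REIFIED_CLASS = {"A": "hClassA"}
REIFIED_METHOD = {"foo": "hMethFoo"}
REIFIED_ALLOC = {("i2", "A"): "hNewA"}
HEAP_TYPE = {"hNewA": "A"}
LOOKUP = {("A", "foo"): "A.foo"}
CLASS_OF = {h: t for t, h in REIFIED_CLASS.items()}   # inverse lookups
SIG_OF = {h: s for s, h in REIFIED_METHOD.items()}
SEED_VPT = {("p1", "hStrA"), ("p2", "hStrFoo")}


def pts(vpt, var):
    """Heap objects that var may point to (returned as a fresh set)."""
    return {h for (v, h) in vpt if v == var}


def analyze_reflection():
    vpt, cg = set(SEED_VPT), set()
    while True:
        before = len(vpt) + len(cg)
        for (sig, i, _) in S_CALL:                      # rule 1
            if sig == "Class.forName" and i in ACTUAL_RETURN:
                for c in pts(vpt, ACTUAL_ARG.get((i, 0))):
                    t = CONST_FOR_CLASS.get(c)
                    if t in REIFIED_CLASS:
                        vpt.add((ACTUAL_RETURN[i], REIFIED_CLASS[t]))
        for (b, sig, i, _) in V_CALL:
            arg = ACTUAL_ARG.get((i, 1))
            for hb in pts(vpt, b):
                t = CLASS_OF.get(hb)
                if sig == "Class.newInstance":          # rule 2
                    if i in ACTUAL_RETURN and (i, t) in REIFIED_ALLOC:
                        vpt.add((ACTUAL_RETURN[i], REIFIED_ALLOC[(i, t)]))
                elif sig == "Class.getMethod":          # rule 3
                    for c in pts(vpt, arg):
                        s = CONST_FOR_METHOD.get(c)
                        if (t, s) in LOOKUP and s in REIFIED_METHOD \
                                and i in ACTUAL_RETURN:
                            vpt.add((ACTUAL_RETURN[i], REIFIED_METHOD[s]))
                elif sig == "Method.invoke":            # rule 4
                    s = SIG_OF.get(hb)
                    for h in pts(vpt, arg):
                        key = (HEAP_TYPE.get(h), s)
                        if key in LOOKUP:
                            cg.add((i, LOOKUP[key]))
        if len(vpt) + len(cg) == before:
            return vpt, cg


var_points_to, call_graph = analyze_reflection()
print(sorted(call_graph))
```

Each rule only fires once the previous one has produced its fact, so the call-graph edge for `m.invoke(o)` appears only after `c`, `o`, and `m` have been resolved.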

For personal notes only.