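The previous post got as far as the perform logic.
The perform logic in detail
Decomposing the condition
Once the large filter has been split into smaller filters (two in this example), the resulting RexCall nodes are added to a list, where they wait to be classified.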
public static void decomposeConjunction(
    RexNode rexPredicate,
    List<RexNode> rexList) {
  if (rexPredicate == null || rexPredicate.isAlwaysTrue()) {
    return;
  }
  if (rexPredicate.isA(SqlKind.AND)) {
    for (RexNode operand : ((RexCall) rexPredicate).getOperands()) {
      decomposeConjunction(operand, rexList);
    }
  } else {
    rexList.add(rexPredicate);
  }
}
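To make the recursion easier to follow without a Calcite dependency, here is a minimal sketch of the same flattening on a toy expression tree. The Expr class and the ConjunctionDemo name are hypothetical stand-ins for RexNode and RexCall. Only the shape of the recursion mirrors decomposeConjunction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy stand-in for Calcite's RexNode tree; all names here are hypothetical.
class ConjunctionDemo {
  /** A node is either a leaf predicate (text != null) or an AND over operands. */
  static final class Expr {
    final String text;          // set for a leaf such as "$0 > 5"
    final List<Expr> operands;  // non-empty only for an AND node

    private Expr(String text, List<Expr> operands) {
      this.text = text;
      this.operands = operands;
    }

    static Expr leaf(String text) {
      return new Expr(text, new ArrayList<Expr>());
    }

    static Expr and(Expr... operands) {
      return new Expr(null, Arrays.asList(operands));
    }

    boolean isAnd() {
      return text == null;
    }
  }

  /** Recursively flattens nested ANDs into a flat list of conjuncts. */
  static void decompose(Expr predicate, List<String> out) {
    if (predicate == null) {
      return;
    }
    if (predicate.isAnd()) {
      for (Expr operand : predicate.operands) {
        decompose(operand, out);
      }
    } else {
      out.add(predicate.text);
    }
  }

  public static void main(String[] args) {
    Expr condition = Expr.and(
        Expr.leaf("$0 > 5"),
        Expr.and(Expr.leaf("$3 = 'zhang'"), Expr.leaf("$1 = $4")));
    List<String> conjuncts = new ArrayList<>();
    decompose(condition, conjuncts);
    System.out.println(conjuncts);
  }
}
```

Running main prints [$0 > 5, $3 = 'zhang', $1 = $4]: the nested ANDs are flattened in left-to-right order.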
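classifyFilters: the classification logic
Classify the filters into:
- those that should be pushed to the left side of the join
- those that should be pushed to the right side of the join
Filters that fit neither side can, for an inner join, be moved into the join condition (joinFilters in the code that follows).
Why do the filters need to be classified?
Suppose the left side of the join is table1, which has a column a, and the right side is table2, which has a column b.
There are two filters: a > 5 and b = 'zhang'.
Clearly a > 5 only needs to be pushed to the left side,
and b = 'zhang' only needs to be pushed to the right side.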
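The decision can be sketched with plain java.util.BitSet. This is an illustration only: ClassifyDemo, Side and classify are hypothetical names, the join is assumed to have 3 left columns and 3 right columns, and the pushLeft/pushRight/pushInto flags of the real method are left out.

```java
import java.util.BitSet;

// Toy classification by referenced columns; not Calcite's API.
class ClassifyDemo {
  enum Side { LEFT, RIGHT, JOIN }

  /** Returns true if every bit set in refs is also set in allowed. */
  static boolean containsAll(BitSet allowed, BitSet refs) {
    BitSet copy = (BitSet) refs.clone();
    copy.andNot(allowed);
    return copy.isEmpty();
  }

  /** Classifies a filter by the join-row columns it references. */
  static Side classify(int nFieldsLeft, int nTotalFields, int... referencedColumns) {
    if (referencedColumns.length == 0) {
      // The real method skips filters that reference no input column.
      throw new IllegalArgumentException("filter references no column");
    }
    BitSet left = new BitSet();
    left.set(0, nFieldsLeft);              // e.g. {0, 1, 2}
    BitSet right = new BitSet();
    right.set(nFieldsLeft, nTotalFields);  // e.g. {3, 4, 5}
    BitSet refs = new BitSet();
    for (int column : referencedColumns) {
      refs.set(column);
    }
    if (containsAll(left, refs)) {
      return Side.LEFT;
    }
    if (containsAll(right, refs)) {
      return Side.RIGHT;
    }
    return Side.JOIN;
  }

  public static void main(String[] args) {
    System.out.println(classify(3, 6, 0));     // a > 5, with a at $0
    System.out.println(classify(3, 6, 3));     // b = 'zhang', with b at $3
    System.out.println(classify(3, 6, 1, 4));  // $1 = $4 touches both sides
  }
}
```

With these inputs the three calls yield LEFT, RIGHT and JOIN respectively.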
The code:
public static boolean classifyFilters(
    RelNode joinRel,
    List<RexNode> filters,
    JoinRelType joinType,
    boolean pushInto,
    boolean pushLeft,
    boolean pushRight,
    List<RexNode> joinFilters,
    List<RexNode> leftFilters,
    List<RexNode> rightFilters) {
  RexBuilder rexBuilder = joinRel.getCluster().getRexBuilder();
  List<RelDataTypeField> joinFields = joinRel.getRowType().getFieldList();
  final int nTotalFields = joinFields.size();
  final int nSysFields = 0; // joinRel.getSystemFieldList().size();
  final List<RelDataTypeField> leftFields =
      joinRel.getInputs().get(0).getRowType().getFieldList();
  final int nFieldsLeft = leftFields.size(); // number of columns in the left table
  final List<RelDataTypeField> rightFields =
      joinRel.getInputs().get(1).getRowType().getFieldList();
  final int nFieldsRight = rightFields.size(); // number of columns in the right table
  // SemiJoin, CorrelateSemiJoin, CorrelateAntiJoin: right fields are not returned
  assert nTotalFields == (!joinType.projectsRight()
      ? nSysFields + nFieldsLeft
      : nSysFields + nFieldsLeft + nFieldsRight);
  // set the reference bitmaps for the left and right children
  ImmutableBitSet leftBitmap = // one bit per left column, e.g. {0, 1, 2}
      ImmutableBitSet.range(nSysFields, nSysFields + nFieldsLeft);
  ImmutableBitSet rightBitmap = // e.g. {3, 4, 5}
      ImmutableBitSet.range(nSysFields + nFieldsLeft, nTotalFields);
  // range(fromIndex, toIndex): fromIndex inclusive, toIndex exclusive
  // all columns of the join row: {0, 1, 2, 3, 4, 5}
  // so the left side occupies 0, 1, 2 and the right side occupies 3, 4, 5
  final List<RexNode> filtersToRemove = new ArrayList<>();
  for (RexNode filter : filters) {
    final InputFinder inputFinder = InputFinder.analyze(filter);
    final ImmutableBitSet inputBits = inputFinder.inputBitSet.build();
    // if current filter isn't dependent on any refs of inputs,
    // we should abort current filter and continue to visit other.
    if (inputBits.isEmpty()) {
      if (filter.isAlwaysTrue()) {
        filtersToRemove.add(filter);
      }
      continue;
    }
    // REVIEW - are there any expressions that need special handling
    // and therefore cannot be pushed?
    // filters can be pushed to the left child if the left child
    // does not generate NULLs and the only columns referenced in
    // the filter originate from the left child
    if (pushLeft && leftBitmap.contains(inputBits)) {
      // ignore filters that always evaluate to true
      if (!filter.isAlwaysTrue()) {
        // adjust the field references in the filter to reflect
        // that fields in the left now shift over by the number
        // of system fields
        final RexNode shiftedFilter =
            shiftFilter(
                nSysFields,
                nSysFields + nFieldsLeft,
                -nSysFields,
                rexBuilder,
                joinFields,
                nTotalFields,
                leftFields,
                filter);
        leftFilters.add(shiftedFilter);
      }
      filtersToRemove.add(filter);
      // filters can be pushed to the right child if the right child
      // does not generate NULLs and the only columns referenced in
      // the filter originate from the right child
    } else if (pushRight && rightBitmap.contains(inputBits)) {
      if (!filter.isAlwaysTrue()) {
        // adjust the field references in the filter to reflect
        // that fields in the right now shift over to the left;
        // since we never push filters to a NULL generating
        // child, the types of the source should match the dest
        // so we don't need to explicitly pass the destination
        // fields to RexInputConverter
        final RexNode shiftedFilter =
            shiftFilter(
                nSysFields + nFieldsLeft,
                nTotalFields,
                -(nSysFields + nFieldsLeft),
                rexBuilder,
                joinFields,
                nTotalFields,
                rightFields,
                filter);
        rightFilters.add(shiftedFilter);
      }
      filtersToRemove.add(filter);
    } else {
      // If the filter can't be pushed to either child and the join
      // is an inner join, push them to the join if they originated
      // from above the join
      if (!joinType.isOuterJoin() && pushInto) {
        if (!joinFilters.contains(filter)) {
          joinFilters.add(filter);
        }
        filtersToRemove.add(filter);
      }
    }
  }
  // Remove filters after the loop, to prevent concurrent modification.
  if (!filtersToRemove.isEmpty()) {
    filters.removeAll(filtersToRemove);
  }
  // Did anything change?
  return !filtersToRemove.isEmpty();
}
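After the large filter has been split into two smaller filters, each one that gets classified is added to the filtersToRemove list. These are the filters that will be pushed, and they are removed from the original list once the loop finishes.
When reading the code, pay particular attention to
InputFinder extends RexVisitorImpl<Void>
This is a visitor, so the key question is what it actually visits. In the example above, the argument is the RexCall of a filter, e.g. $2 > 30, and we want to know which $N of the join row it refers to. The answer comes from visiting the call's left operand. What makes reading the Calcite source hard is that many kinds of nodes accept many kinds of visitors. You need to be clear about which node type is being visited at a given moment and which visitor implementation is doing the visiting, and then find the argument that is actually passed into that implementation. Here, what ends up being set in inputBitSet is the index of an input reference. The visited node is a RexNode of type RexInputRef, the visitor is InputFinder, which extends RexVisitorImpl<Void>, and InputFinder is a nested class of RelOptUtil.
See this method: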
public Void visitInputRef(RexInputRef inputRef) {
  inputBitSet.set(inputRef.getIndex());
  return null;
}
// Void is the placeholder class for the keyword void, so the only value the method can return is null
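The double dispatch is easier to see in a stripped-down form. The sketch below is a hypothetical miniature of the pattern, not Calcite code: Node, Visitor, InputRef, Literal, Call and Finder are invented names that play the roles of RexNode, RexVisitor, RexInputRef, RexLiteral, RexCall and InputFinder.

```java
import java.util.BitSet;

// Miniature visitor that collects referenced column indexes; names are hypothetical.
class InputFinderDemo {
  interface Node {
    void accept(Visitor visitor);
  }

  interface Visitor {
    void visitInputRef(InputRef ref);
    void visitLiteral(Literal literal);
    void visitCall(Call call);
  }

  static final class InputRef implements Node {
    final int index;
    InputRef(int index) { this.index = index; }
    public void accept(Visitor visitor) { visitor.visitInputRef(this); }
  }

  static final class Literal implements Node {
    final Object value;
    Literal(Object value) { this.value = value; }
    public void accept(Visitor visitor) { visitor.visitLiteral(this); }
  }

  static final class Call implements Node {
    final String operator;
    final Node[] operands;
    Call(String operator, Node... operands) {
      this.operator = operator;
      this.operands = operands;
    }
    public void accept(Visitor visitor) { visitor.visitCall(this); }
  }

  /** Records the index of every InputRef reachable from the visited node. */
  static final class Finder implements Visitor {
    final BitSet bits = new BitSet();

    public void visitInputRef(InputRef ref) {
      bits.set(ref.index);  // same idea as inputBitSet.set(inputRef.getIndex())
    }

    public void visitLiteral(Literal literal) {
      // literals reference no column
    }

    public void visitCall(Call call) {
      for (Node operand : call.operands) {
        operand.accept(this);  // recurse into the operands
      }
    }
  }

  static BitSet analyze(Node node) {
    Finder finder = new Finder();
    node.accept(finder);
    return finder.bits;
  }

  public static void main(String[] args) {
    Node filter = new Call(">", new InputRef(2), new Literal(30));  // $2 > 30
    System.out.println(analyze(filter));
  }
}
```

For $2 > 30 the call node forwards the visitor to both operands, only the InputRef sets a bit, and main prints {2}.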
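shiftFilter
- shifting the left side
- shifting the right side
It is worth noting why the bitmaps recording the left and right columns are needed: they decide which side a filter goes to, while the shift itself works with an offset, which is visible among the method's parameters. Inside the method, this offset fills an array: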
int[] adjustments = new int[nTotalFields];
for (int i = start; i < end; i++) {
  adjustments[i] = offset;
}
// after the for loop, for the right-side shift with 3 left and 3 right columns:
// [0, 0, 0, -3, -3, -3]
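The array contents can be checked with a few lines of standalone Java. This is a sketch under the same assumptions as the running example (no system fields, 3 left columns, 3 right columns); AdjustmentsDemo and buildAdjustments are hypothetical names wrapping the loop quoted above.

```java
import java.util.Arrays;

// Standalone version of the adjustments loop; the class and method names are hypothetical.
class AdjustmentsDemo {
  /** Fills positions [start, end) with offset and leaves the rest at 0. */
  static int[] buildAdjustments(int nTotalFields, int start, int end, int offset) {
    int[] adjustments = new int[nTotalFields];
    for (int i = start; i < end; i++) {
      adjustments[i] = offset;
    }
    return adjustments;
  }

  public static void main(String[] args) {
    // Right-side shift: start = 3, end = 6, offset = -3.
    System.out.println(Arrays.toString(buildAdjustments(6, 3, 6, -3)));
    // Left-side shift: start = 0, end = 3, offset = 0 (no system fields).
    System.out.println(Arrays.toString(buildAdjustments(6, 0, 3, 0)));
  }
}
```

The first line printed is [0, 0, 0, -3, -3, -3]. The second is all zeros, which is why a filter pushed to the left keeps its column indexes when there are no system fields.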
// the values in the comments are the arguments of the right-side call
private static RexNode shiftFilter(
    int start,   // nSysFields + nFieldsLeft: the number of left columns, i.e. the first right-side index
    int end,     // nTotalFields: the total number of columns
    int offset,  // -(nSysFields + nFieldsLeft): the offset
    RexBuilder rexBuilder,
    List<RelDataTypeField> joinFields,
    int nTotalFields,
    List<RelDataTypeField> rightFields,
    RexNode filter)
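When the field $3 is pushed down to the right side, a new RexInputRef has to be constructed. From the point of view of the table on the right of the join, its field list is also 0, 1, 2: the join's 3, 4, 5 become 0, 1, 2 once the offset of 3 is subtracted. So 3 → 0, the original $3 > 10 is rewritten as $0 > 10, and a new RexInputRef is produced.
RexInputConverter extends RexShuttle. It is worth looking at how this visitor visits a RexInputRef. In the end the new reference is created by makeInputRef: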
public RexInputRef makeInputRef(
    RelDataType type,
    int i) {
  type = SqlTypeUtil.addCharsetAndCollation(type, typeFactory);
  return new RexInputRef(i, type); // constructed directly with new
}
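The rewrite of $3 > 10 into $0 > 10 can be sketched as follows. This is a simplified illustration and not RexInputConverter itself: Ref, Comparison and shift are hypothetical names, the filter is assumed to be a single comparison between one column and an integer literal, and type handling is ignored.

```java
// Toy version of shifting an input reference by an adjustments array; names are hypothetical.
class ShiftDemo {
  /** Immutable column reference, printed as $index. */
  static final class Ref {
    final int index;
    Ref(int index) { this.index = index; }
    @Override public String toString() { return "$" + index; }
  }

  /** Immutable comparison of a column against an integer literal. */
  static final class Comparison {
    final Ref ref;
    final String operator;
    final int literal;

    Comparison(Ref ref, String operator, int literal) {
      this.ref = ref;
      this.operator = operator;
      this.literal = literal;
    }

    @Override public String toString() {
      return ref + " " + operator + " " + literal;
    }
  }

  /** Returns a copy of the filter whose reference is a newly built Ref at the shifted index. */
  static Comparison shift(Comparison filter, int[] adjustments) {
    int oldIndex = filter.ref.index;
    Ref newRef = new Ref(oldIndex + adjustments[oldIndex]);  // a new object, like new RexInputRef(i, type)
    return new Comparison(newRef, filter.operator, filter.literal);
  }

  public static void main(String[] args) {
    int[] adjustments = {0, 0, 0, -3, -3, -3};
    Comparison original = new Comparison(new Ref(3), ">", 10);
    System.out.println(original + "  becomes  " + shift(original, adjustments));
  }
}
```

The original filter is left untouched and a fresh reference is built for the shifted index, which matches the observation above that the new RexInputRef is simply constructed with new.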