Clearing Up the Inner-Table Scan Count in MySQL's BNLJ Algorithm

Latest recommended article published on 2024-06-05 17:45:14

逐梦草根

Tags: mysql, database

Article link: https://blog.csdn.net/qq_26656379/article/details/127826068

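Many online sources give the number of inner-table scans for the BNLJ (Block Nested-Loop Join) algorithm as:

R*used_column_size/join_buffer_size + 1

I puzzled over this answer for a long time and it still did not feel quite right, so I went to the official MySQL documentation. It describes the BNLJ algorithm with the pseudocode below, which I quote without translating: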

For the example join described previously for the NLJ algorithm (without buffering), the join is done as follows using join buffering:

for each row in t1 matching range {
  for each row in t2 matching reference key {
    store used columns from t1, t2 in join buffer
    if buffer is full {
      for each row in t3 {
        for each t1, t2 combination in join buffer {
          if row satisfies join conditions, send to client
        }
      }
      empty join buffer
    }
  }
}

if buffer is not empty {
  for each row in t3 {
    for each t1, t2 combination in join buffer {
      if row satisfies join conditions, send to client
    }
  }
}

If S is the size of each stored t1, t2 combination in the join buffer and C is the number of combinations in the buffer, the number of times table t3 is scanned is:

(S * C)/join_buffer_size + 1

The number of t3 scans decreases as the value of join_buffer_size increases, up to the point when join_buffer_size is large enough to hold all previous row combinations. At that point, no speed is gained by making it larger.
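To make the quoted formula concrete, here is a minimal Python sketch that evaluates it for a few buffer sizes. The function name `estimated_t3_scans` and the sample numbers are hypothetical, and reading the division as integer division is my assumption; the documentation does not say so explicitly.

```python
def estimated_t3_scans(s: int, c: int, join_buffer_size: int) -> int:
    """Evaluate (S * C) / join_buffer_size + 1.

    Assumption: the division is integer (floor) division.
    """
    return (s * c) // join_buffer_size + 1


# Hypothetical workload: 1000 buffered combinations of 100 bytes each,
# so S * C = 100000 bytes in total.
for size in (10_000, 50_000, 100_000, 200_000, 400_000):
    print(size, estimated_t3_scans(100, 1000, size))
```

Under these assumed numbers the estimate falls as the buffer grows and stops changing once the buffer is larger than S * C, which matches the documentation's remark that a bigger buffer brings no further gain past that point.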

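Read the pseudocode carefully, then look at that 1 again. My understanding is as follows:

What the 1 means: suppose the driving table contributes only a single column and has 10 rows in total, and the join buffer can hold 4 rows. Then 10/4 = 2 with integer division, so the buffer fills completely twice. But two full buffers only cover 8 rows, so one more, partially filled buffer is needed for the remaining 2 rows. That gives 10/4 + 1 = 3 buffer loads, and each load costs one scan of the inner table. Here 10 plays the role of S*C, and 4 is the join buffer size.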
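The 10-row example can be checked with a small simulation of the quoted pseudocode. This is only a sketch under simplifying assumptions: a single outer table stands in for the t1, t2 combinations, the buffer capacity is counted in rows rather than bytes, and the join condition is plain equality. The name `count_inner_scans` is hypothetical.

```python
def count_inner_scans(outer_rows, inner_rows, buffer_capacity):
    """Simulate a buffered nested-loop join and count inner-table scans."""
    scans = 0
    matches = []
    buffer = []

    def flush():
        nonlocal scans
        scans += 1                      # one full pass over the inner table
        for inner in inner_rows:
            for outer in buffer:
                if outer == inner:      # stand-in join condition (equality)
                    matches.append((outer, inner))
        buffer.clear()                  # "empty join buffer"

    for row in outer_rows:
        buffer.append(row)              # "store used columns in join buffer"
        if len(buffer) == buffer_capacity:
            flush()                     # "if buffer is full"
    if buffer:
        flush()                         # "if buffer is not empty"
    return scans, matches


scans, matches = count_inner_scans(range(10), range(5, 15), 4)
print(scans)         # 3 scans: two full buffers plus one partial buffer
print(len(matches))  # 5 matching pairs (values 5 through 9)
```

One detail this sketch suggests: when the row count divides evenly by the capacity (for example 8 rows with a capacity of 4), the final "buffer is not empty" branch never runs and the simulation reports 2 scans, whereas 8/4 + 1 would give 3. So the + 1 seems to account for the trailing partial buffer, and the formula reads best as an upper estimate.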