I just encountered a strange behaviour (bug?) in MySQL 5.7 that doesn't appear in MySQL 5.5.
So awkward, I need an example to explain it .
Create a table using a left join on 2 tables
Make sure the second table is empty (no records) but is constructed by having a static value written in one of it's fields.
The Left join with no conditions produces N lines (as expected)
The Left join with a condition that never matches ALSO produces N lines.
### EXAMPLE ###
## CREATE TABLES
create table PCPL (K1 int); ## Table 1
create table AUX (K2 int); ## Table 2
## FILL IN TABLES
insert into PCPL values (1),(2),(3); ## fill main table with 3 values
truncate table AUX; ## No need to do this, just to make things clearer
## TEST 1 : "Dry Left join" => RESULT OK : Resulting Table has 3 rows
select PCPL.K1 as K1 , DERIVED.K2 as K2
from PCPL
LEFT JOIN (select K2, 1 as staticValue from AUX) DERIVED
ON PCPL.K1 = DERIVED.K2;
+------+------+
| K1 | K2 |
+------+------+
| 1 | NULL |
| 2 | NULL |
| 3 | NULL |
+------+------+
3 rows in set (0,00 sec)
## TEST 2 : "Never matching condition" => STRANGE : Resulting Table NOT empty
select PCPL.K1 as K1 , DERIVED.K2 as K2
from PCPL
LEFT JOIN (select K2, 1 as staticValue from AUX) DERIVED
ON PCPL.K1 = DERIVED.K2
where staticValue=1; ##### THIS CONDITION IS NEVER MET SINCE TABLE AUX IS EMPTY
+------+------+
| K1 | K2 |
+------+------+
| 1 | NULL |
| 2 | NULL |
| 3 | NULL |
+------+------+
3 rows in set (0,00 sec)
THIS SHOULDN'T HAPPEN !
This behaviour does not occur with MySQL 5.5
Is this a bug or some parameter in 5.5 that I forgot to set in 5.7?
Thanks for your time!
解决方案
This specifically looks like a bug in subquery merging. This is a strategy that lets MySQL avoid the overhead of materializing subqueries.
Consider the following two queries (the first is the original):
select PCPL.K1 as K1, DERIVED.K2 as K2,
(CASE WHEN DERIVED.K2 IS NULL THEN 'is null' ELSE 'not null' END) as K2_null,
staticValue
(CASE WHEN staticValue IS NULL THEN 'is null' ELSE 'not null' END) as staticValue_null
from PCPL LEFT JOIN
(select K2, 1 as staticValue from AUX) DERIVED
ON PCPL.K1 = DERIVED.K2
where staticValue IS NOT NULL;
And this modified form where the only difference in DERIVED:
select PCPL.K1 as K1, DERIVED.K2 as K2,
(CASE WHEN DERIVED.K2 IS NULL THEN 'is null' ELSE 'not null' END) as K2_null,
staticValue
(CASE WHEN staticValue IS NULL THEN 'is null' ELSE 'not null' END) as staticValue_null
from PCPL LEFT JOIN
(select K2, 1 as staticValue from AUX order by k2 limit 3) DERIVED
-------------------------------------------^XXXXXXXXXXXXXXXXXX
ON PCPL.K1 = DERIVED.K2
where staticValue IS NOT NULL;
The purpose of order by k2 limit 3 is to force the materialization of the subquery. The first version returns three rows (incorrect). The second version returns zero rows (correct). In both cases, the subquery returns the same row set.
The difference is that the forced materialization fixes the problem. Hence, this looks like a bug in subquery merging.
You can read about subquery merging in the documentation.
A small editorial note. Sophisticated databases parse a query into a DAG (directed acyclic graph) that describes the processing. They then have sophisticated rules for pushing operations through the graph -- particularly filtering operations and column computations.
MySQL starts from a simpler model of the query. The developers are moving toward a more sophisticated model. Small bugs like this are not unexpected, but it will take a version or two to work them out.