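The ClickHouse Join Table Engine
Join queries over large volumes of data tend to get slow. A table with the Join engine keeps its data in memory in a form that is ready for lookups, which speeds up those queries.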
CREATE TABLE join_tb1 (
id UInt8,
name String,
time DateTime
) ENGINE = Log;
INSERT INTO TABLE join_tb1 VALUES
(1,'ClickHouse','2019-05-01 12:00:00'),
(2,'Spark', '2019-05-01 12:30:00'),
(3,'ElasticSearch','2019-05-01 13:00:00');
INSERT INTO TABLE join_tb1 VALUES (10,'StarRocks','2020-05-01 12:00:00');
CREATE TABLE id_join_tb1 (
id UInt8,
price UInt32,
time DateTime
) ENGINE = Join(ANY, LEFT, id);
INSERT INTO TABLE id_join_tb1 VALUES
(1,100,'2019-05-01 11:55:00'),
(1,105,'2019-05-01 11:10:00'),
(2,90,'2019-05-01 12:01:00'),
(3,80,'2019-05-01 13:10:00'),
(5,70,'2019-05-01 14:00:00'),
(6,60,'2019-05-01 13:50:00');
Using a regular JOIN for the lookup is not recommended, because the query speed does not improve:
SELECT id,name,price FROM join_tb1 ANY LEFT JOIN id_join_tb1 USING (id);
Using the joinGet function for the lookup is recommended, because it speeds up the query:
SELECT joinGet('id_join_tb1', 'price', toUInt8(1));
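The sample data contains two edge cases worth checking: key 1 was inserted into id_join_tb1 twice, and id 10 in join_tb1 has no matching row. The sketch below assumes the documented behaviour that a Join table created with ANY strictness ignores later rows for a key that already exists, and that, with default settings, joinGet returns the column default for a missing key while joinGetOrNull returns NULL. The column aliases are only illustrative.

```sql
-- Key 1 was inserted with prices 100 and 105. Under ANY strictness the
-- first inserted row is expected to be kept.
SELECT joinGet('id_join_tb1', 'price', toUInt8(1)) AS first_price;

-- Key 10 is absent from id_join_tb1. With default settings joinGet is
-- expected to fall back to the column default (0 for UInt32), which cannot
-- be told apart from a genuine price of 0.
SELECT joinGet('id_join_tb1', 'price', toUInt8(10)) AS default_price;

-- joinGetOrNull is expected to return NULL for the missing key instead.
SELECT joinGetOrNull('id_join_tb1', 'price', toUInt8(10)) AS missing_price;
```

If a missing key has to be distinguishable from a real zero price, joinGetOrNull is likely the safer choice.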
Prefixing a query with EXPLAIN shows its execution steps. Doing so reveals that the joinGet query has fewer steps than the LEFT JOIN query.
SELECT id, name, joinGet('id_join_tb1', 'price', id) AS price FROM join_tb1;
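The comparison can be reproduced roughly as follows. This is a minimal sketch that assumes a ClickHouse version with the EXPLAIN statement. The exact plan text differs between versions, so no output is shown here.

```sql
-- Plan for the regular join against the Join-engine table.
EXPLAIN
SELECT id, name, price
FROM join_tb1
ANY LEFT JOIN id_join_tb1 USING (id);

-- Plan for the same lookup expressed with joinGet.
EXPLAIN
SELECT id, name, joinGet('id_join_tb1', 'price', id) AS price
FROM join_tb1;
```

Counting the steps in each plan gives a quick structural comparison. Timing both queries on realistically sized tables would be needed to confirm the actual speed difference.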