今天给大家分享一个Greenplum中行存与列存的性能对比,在分析型业务中,如果对数据查询的列数不是很多,还是建议优先选用列存。本文涉及的测试用例就能看出列存恰当使用时,性能能提升3倍!!
测试结果对比:
SQL
行存响应(ms)
列存响应(ms)
SELECT cab_type, count(*) FROM trips GROUP BY cab_type;
9539.607
3060.362
SELECT passenger_count,avg(total_amount) FROM trips GROUP BY passenger_count;
7186.376
2512.066
SELECT passenger_count, extract(year from pickup_datetime) AS pickup_year, count(*) FROM trips GROUP BY passenger_count, pickup_year;
9430.053
5301.265
SELECT passenger_count,extract(year from pickup_datetime) AS pickup_year, cast(trip_distance as int) AS distance,count(*) AS the_count FROM trips GROUP BY passenger_count, pickup_year, distance ORDER BY pickup_year, the_count desc;
10616.358
6627.922
测试执行过程:
--行存--
postgres=# \timing on
Timing is on.
postgres=# SELECT cab_type,
count(*)
FROM trips
GROUP BY cab_type;
cab_type | count
----------+-----------
yellow | 400000000
(1 row)
Time: 9539.607 ms
postgres=# SELECT passenger_count,
postgres-# avg(total_amount)
postgres-# FROM trips
postgres-# GROUP BY passenger_count;
passenger_count | avg
-----------------+------------------
247 | 19.44
255 | 15.25
7 | 27.3425925925926
47 | 9
36 | 61.55
4 | 13.9527243251649
65 | 50.3
249 | 9.5
177 | 17
33 | 9
129 | 9.43333333333333
1 | 13.6593432049349
9 | 21.2552830188679
254 | 6.5
6 | 14.0393473844229
3 | 13.8312939694461
8 | 25.42975
0 | 10.6683842285075
208 | 7.24198473282443
5 | 13.7613386506018
58 | 20.74
10 | 38.0166666666667
66 | 19.3
2 | 14.3546975991445
(24 rows)
Time: 7186.376 ms
postgres=# SELECT passenger_count,
postgres-# extract(year from pickup_datetime) AS pickup_year,
postgres-# count(*)
postgres-# FROM trips
postgres-# GROUP BY passenger_count,
postgres-# pickup_year;
passenger_count | pickup_year | count
-----------------+-------------+-----------
7 | 2011 | 2
255 | 2013 | 2
3 | 2012 | 7716364
7 | 2013 | 35
47 | 2011 | 1
3 | 2014 | 117300
254 | 2012 | 1
10 | 2011 | 2
58 | 2011 | 1
2 | 2011 | 6864045
6 | 2012 | 4880708
2 | 2013 | 23517494
6 | 2014 | 111403
65 | 2012 | 1
249 | 2012 | 1
177 | 2012 | 1
9 | 2014 | 2
1 | 2012 | 123955714
5 | 2013 | 10034696
5 | 2011 | 3471211
9 | 2012 | 25
1 | 2014 | 2008131
8 | 2011 | 1
8 | 2013 | 25
0 | 2011 | 805380
4 | 2012 | 3752549
0 | 2013 | 5035
208 | 2013 | 13
208 | 2011 | 7
4 | 2014 | 55319
247 | 2012 | 1
3 | 2013 | 7315829
Time: 9430.053 ms
postgres=# SELECT passenger_count,
postgres-# extract(year from pickup_datetime) AS pickup_year,
postgres-# cast(trip_distance as int) AS distance,
postgres-# count(*) AS the_count
postgres-# FROM trips
postgres-# GROUP BY passenger_count,
postgres-# pickup_year,
postgres-# distance
postgres-# ORDER BY pickup_year,
postgres-# the_count desc;
passenger_count | pickup_year | distance | the_count
-----------------+-------------+----------+-----------
1 | 2011 | 1 | 10607963
1 | 2011 | 2 | 7729133
1 | 2011 | 3 | 3412674
2 | 2011 | 1 | 2309318
1 | 2011 | 4 | 2081972
1 | 2011 | 0 | 2016713
2 | 2011 | 2 | 1746024
5 | 2011 | 1 | 1245490
1 | 2011 | 5 | 1098301
5 | 2011 | 2 | 859082
2 | 2011 | 3 | 816196
1 | 2011 | 6 | 787657
3 | 2011 | 1 | 722827
3 | 2011 | 2 | 542156
2 | 2011 | 4 | 503827
1 | 2011 | 7 | 446286
5 | 2011 | 3 | 429471
1 | 2011 | 8 | 389943
1 | 2011 | 10 | 345441
4 | 2011 | 1 | 345021
1 | 2011 | 9 | 341023
2 | 2011 | 0 | 334394
0 | 2011 | 1 | 303759
6 | 2011 | 1 | 274382
2 | 2011 | 5 | 269895
4 | 2011 | 2 | 265926
3 | 2011 | 3 | 253013
5 | 2011 | 4 | 235683
1 | 2011 | 11 | 228015
0 | 2011 | 2 | 212231
6 | 2011 | 2 | 189166
2 | 2011 | 6 | 189150
Time: 10616.358 ms
postgres=#
--列存--
postgres=# SELECT cab_type,
postgres-# count(*)
postgres-# FROM trips
postgres-# GROUP BY cab_type;
cab_type | count
----------+-----------
yellow | 400000000
(1 row)
Time: 3060.362 ms
postgres=# SELECT passenger_count,
postgres-# avg(total_amount)
postgres-# FROM trips
postgres-# GROUP BY passenger_count;
passenger_count | avg
-----------------+------------------
36 | 61.55
4 | 13.952724325165
65 | 50.3
249 | 9.5
177 | 17
33 | 9
129 | 9.43333333333333
1 | 13.6593432049358
9 | 21.2552830188679
254 | 6.5
6 | 14.039347384423
3 | 13.8312939694462
0 | 10.6683842285075
208 | 7.24198473282443
8 | 25.42975
5 | 13.7613386506019
66 | 19.3
58 | 20.74
2 | 14.3546975991445
10 | 38.0166666666667
47 | 9
255 | 15.25
7 | 27.3425925925926
247 | 19.44
(24 rows)
Time: 2512.066 ms
postgres=# SELECT passenger_count,
postgres-# extract(year from pickup_datetime) AS pickup_year,
postgres-# count(*)
postgres-# FROM trips
postgres-# GROUP BY passenger_count,
postgres-# pickup_year;
passenger_count | pickup_year | count
-----------------+-------------+-----------
8 | 2011 | 1
4 | 2012 | 3752549
0 | 2011 | 805380
0 | 2013 | 5035
8 | 2013 | 25
208 | 2013 | 13
208 | 2011 | 7
4 | 2014 | 55319
247 | 2012 | 1
3 | 2011 | 2090860
3 | 2013 | 7315829
7 | 2012 | 17
66 | 2012 | 1
10 | 2012 | 1
6 | 2011 | 764407
2 | 2012 | 24900623
6 | 2013 | 6764789
2 | 2014 | 389248
129 | 2013 | 1
1 | 2011 | 30417318
5 | 2012 | 12063339
1 | 2013 | 121959711
33 | 2011 | 1
9 | 2013 | 26
129 | 2011 | 2
5 | 2014 | 162696
36 | 2011 | 1
4 | 2011 | 1018515
0 | 2012 | 1274858
4 | 2013 | 3582103
208 | 2012 | 107
8 | 2012 | 13
Time: 5301.265 ms
postgres=# SELECT passenger_count,
postgres-# extract(year from pickup_datetime) AS pickup_year,
postgres-# cast(trip_distance as int) AS distance,
postgres-# count(*) AS the_count
postgres-# FROM trips
postgres-# GROUP BY passenger_count,
postgres-# pickup_year,
postgres-# distance
postgres-# ORDER BY pickup_year,
postgres-# the_count desc;
passenger_count | pickup_year | distance | the_count
-----------------+-------------+----------+-----------
1 | 2011 | 1 | 10607963
1 | 2011 | 2 | 7729133
1 | 2011 | 3 | 3412674
2 | 2011 | 1 | 2309318
1 | 2011 | 4 | 2081972
1 | 2011 | 0 | 2016713
2 | 2011 | 2 | 1746024
5 | 2011 | 1 | 1245490
1 | 2011 | 5 | 1098301
5 | 2011 | 2 | 859082
2 | 2011 | 3 | 816196
1 | 2011 | 6 | 787657
3 | 2011 | 1 | 722827
3 | 2011 | 2 | 542156
2 | 2011 | 4 | 503827
1 | 2011 | 7 | 446286
5 | 2011 | 3 | 429471
1 | 2011 | 8 | 389943
1 | 2011 | 10 | 345441
4 | 2011 | 1 | 345021
1 | 2011 | 9 | 341023
2 | 2011 | 0 | 334394
0 | 2011 | 1 | 303759
6 | 2011 | 1 | 274382
2 | 2011 | 5 | 269895
4 | 2011 | 2 | 265926
3 | 2011 | 3 | 253013
5 | 2011 | 4 | 235683
1 | 2011 | 11 | 228015
0 | 2011 | 2 | 212231
6 | 2011 | 2 | 189166
2 | 2011 | 6 | 189150
Time: 6627.922 ms
postgres=#