set hive.optimize.sampling.orderby=true;
set hive.optimize.sampling.orderby.number=10000;
set hive.optimize.sampling.orderby.percent=0.1f;
A quick note on the Hive parameters for parallel ORDER BY sorting:
hive.optimize.sampling.orderby
Default Value: false
Added In: Hive 0.12.0 with HIVE-1402
Uses sampling on order-by clause for parallel execution.

hive.optimize.sampling.orderby.number
Default Value: 1000
Added In: Hive 0.12.0 with HIVE-1402
With hive.optimize.sampling.orderby=true, total number of samples to be obtained to calculate partition keys.

hive.optimize.sampling.orderby.percent
Default Value: 0.1
Added In: Hive 0.12.0 with HIVE-1402
With hive.optimize.sampling.orderby=true, probability with which a row will be chosen.
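
To make the roles of the two sampling knobs concrete, here is a minimal Python sketch of the general idea behind a sampled, range-partitioned sort. It is not Hive's code: the function names, the rule "stop sampling once the cap is reached", and the evenly spaced choice of cut points are all assumptions made for illustration. `percent` plays the role of hive.optimize.sampling.orderby.percent and `max_samples` the role of hive.optimize.sampling.orderby.number.

```python
import bisect
import random


def sample_rows(keys, percent, max_samples, rng):
    """Pick each key with probability `percent`, stopping at `max_samples`.

    Assumption: the cap simply ends sampling early; Hive may combine the
    two settings differently.
    """
    samples = []
    for key in keys:
        if len(samples) >= max_samples:
            break
        if rng.random() < percent:
            samples.append(key)
    return samples


def partition_keys(samples, num_reducers):
    """Choose num_reducers - 1 evenly spaced cut points from the sorted sample."""
    if len(samples) < num_reducers - 1:
        raise ValueError("not enough samples to compute partition keys")
    ordered = sorted(samples)
    step = len(ordered) / num_reducers
    return [ordered[int(step * i)] for i in range(1, num_reducers)]


def reducer_for(key, cuts):
    """Route a key to the reducer whose key range contains it."""
    return bisect.bisect_right(cuts, key)


def parallel_order_by(keys, num_reducers, percent=0.1, max_samples=1000, seed=0):
    """Sort `keys` by splitting them into ordered ranges, one per reducer."""
    rng = random.Random(seed)
    samples = sample_rows(keys, percent, max_samples, rng)
    cuts = partition_keys(samples, num_reducers)
    buckets = [[] for _ in range(num_reducers)]
    for key in keys:
        buckets[reducer_for(key, cuts)].append(key)
    # Each reducer sorts only its own range; because the ranges do not
    # overlap, concatenating the buckets in order gives a total order.
    return [sorted(bucket) for bucket in buckets]
```

In this sketch, a larger sample tends to give more balanced ranges across reducers, at the price of a longer sampling pass, which is the trade-off the two settings expose.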