Hive on Tez : How to control the number of Mappers and Reducers

Goal:

How to control the number of Mappers and Reducers in Hive on Tez.

Env:

Hive 2.1
Tez 0.8

Solution:

1. # of Mappers

Which Tez parameters control this?

  • tez.grouping.max-size (default 1073741824, i.e. 1 GB)
  • tez.grouping.min-size (default 52428800, i.e. 50 MB)
  • tez.grouping.split-count (not set by default; see the session example after this list)
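
These parameters can be overridden per session from the Hive CLI before running the query. A minimal sketch (the values are only for illustration and match the tests below):

hive> set tez.grouping.min-size=52428800;
hive> set tez.grouping.max-size=1073741824;
hive> set tez.grouping.split-count=13;

Normally only the min/max sizes are tuned; tez.grouping.split-count overrides the desired split count directly, as the third test below shows.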

Which log should we check when debugging the # of Mappers?
The DAG syslog in the DAG Application Master container directory.
Search for "grouper.TezSplitGrouper", for example:

# grep grouper.TezSplitGrouper syslog_dag_1475192050844_0026_1

2017-05-23 15:00:50,285 [INFO] [InputInitializer {Map 1} #0] |grouper.TezSplitGrouper|: Grouping splits in Tez

2017-05-23 15:00:50,288 [INFO] [InputInitializer {Map 1} #0] |grouper.TezSplitGrouper|: Desired numSplits: 59 lengthPerGroup: 97890789 numLocations: 4 numSplitsPerLocation: 40 numSplitsInGroup: 2 totalLength: 5775556608 numOriginalSplits: 161 . Grouping by length: true count: false nodeLocalOnly: false

2017-05-23 15:00:50,291 [INFO] [InputInitializer {Map 1} #0] |grouper.TezSplitGrouper|: Doing rack local after iteration: 18 splitsProcessed: 139 numFullGroupsInRound: 0 totalGroups: 68 lengthPerGroup: 73418096 numSplitsInGroup: 1

2017-05-23 15:00:50,291 [INFO] [InputInitializer {Map 1} #0] |grouper.TezSplitGrouper|: Allowing small groups after iteration: 19 splitsProcessed: 139 numFullGroupsInRound: 0 totalGroups: 68

2017-05-23 15:00:50,291 [INFO] [InputInitializer {Map 1} #0] |grouper.TezSplitGrouper|: Number of splits desired: 59 created: 69 splitsProcessed: 161

This means this Hive on Tez query eventually spawns 69 Mappers.
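
The numbers line up: lengthPerGroup 97890789 is roughly totalLength / desired numSplits = 5775556608 / 59, and since this falls between the 50 MB minimum and the 1 GB maximum, the default limits do not change the desired count. The created count (69) ends up a bit above the desired count (59) because, as the log shows, grouping falls back to rack-local and small groups near the end.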

If we set tez.grouping.max-size=tez.grouping.min-size=1073741824 (1 GB), here is the result:

# grep grouper.TezSplitGrouper syslog_dag_1475192050844_0030_1

2017-05-23 17:16:11,851 [INFO] [InputInitializer {Map 1} #0] |grouper.TezSplitGrouper|: Grouping splits in Tez

2017-05-23 17:16:11,852 [INFO] [InputInitializer {Map 1} #0] |grouper.TezSplitGrouper|: Desired splits: 6 too large.  Desired splitLength: 97890789 Min splitLength: 1073741824 New desired splits: 6 Final desired splits: 6 All splits have localhost: false Total length: 5775556608 Original splits: 161

2017-05-23 17:16:11,854 [INFO] [InputInitializer {Map 1} #0] |grouper.TezSplitGrouper|: Desired numSplits: 6 lengthPerGroup: 962592768 numLocations: 4 numSplitsPerLocation: 40 numSplitsInGroup: 26 totalLength: 5775556608 numOriginalSplits: 161 . Grouping by length: true count: false nodeLocalOnly: false

2017-05-23 17:16:11,856 [INFO] [InputInitializer {Map 1} #0] |grouper.TezSplitGrouper|: Doing rack local after iteration: 3 splitsProcessed: 135 numFullGroupsInRound: 0 totalGroups: 6 lengthPerGroup: 721944576 numSplitsInGroup: 19

2017-05-23 17:16:11,856 [INFO] [InputInitializer {Map 1} #0] |grouper.TezSplitGrouper|: Allowing small groups after iteration: 4 splitsProcessed: 135 numFullGroupsInRound: 0 totalGroups: 6

2017-05-23 17:16:11,856 [INFO] [InputInitializer {Map 1} #0] |grouper.TezSplitGrouper|: Number of splits desired: 6 created: 7 splitsProcessed: 161

This time only 7 Mappers are spawned because of "Min splitLength: 1073741824".
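
This matches expectations: the default split length (97890789 bytes per group) is now below the 1 GB minimum, so Tez recomputes the desired count as roughly totalLength / min-size = 5775556608 / 1073741824 ≈ 5.4, i.e. "New desired splits: 6" in the log, and grouping ends up creating 7.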

If we set tez.grouping.split-count=13, here is the result:

# grep grouper.TezSplitGrouper syslog_dag_1475192050844_0039_1

2017-05-24 16:27:05,523 [INFO] [InputInitializer {Map 1} #0] |grouper.TezSplitGrouper|: Grouping splits in Tez

2017-05-24 16:27:05,523 [INFO] [InputInitializer {Map 1} #0] |grouper.TezSplitGrouper|: Desired numSplits overridden by config to: 13

2017-05-24 16:27:05,526 [INFO] [InputInitializer {Map 1} #0] |grouper.TezSplitGrouper|: Desired numSplits: 13 lengthPerGroup: 444273585 numLocations: 4 numSplitsPerLocation: 40 numSplitsInGroup: 12 totalLength: 5775556608 numOriginalSplits: 161 . Grouping by length: true count: false nodeLocalOnly: false

2017-05-24 16:27:05,528 [INFO] [InputInitializer {Map 1} #0] |grouper.TezSplitGrouper|: Doing rack local after iteration: 5 splitsProcessed: 156 numFullGroupsInRound: 0 totalGroups: 14 lengthPerGroup: 333205184 numSplitsInGroup: 9

2017-05-24 16:27:05,528 [INFO] [InputInitializer {Map 1} #0] |grouper.TezSplitGrouper|: Allowing small groups after iteration: 6 splitsProcessed: 156 numFullGroupsInRound: 0 totalGroups: 14

2017-05-24 16:27:05,528 [INFO] [InputInitializer {Map 1} #0] |grouper.TezSplitGrouper|: Number of splits desired: 13 created: 15 splitsProcessed: 161

This time 15 Mappers are spawned because of "Desired numSplits overridden by config to: 13".

By the way, the query tested is "select count(*) from passwords".
Here "Original splits: 161" means there are 161 files in total:

# ls  passwords|wc -l

161

Here "Total length: 5775556608" means the table size is about 5.4G:

# du -b passwords

5775556769 passwords
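
(5,775,556,769 bytes / 1024^3 ≈ 5.38, hence "about 5.4G".)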


The detailed algorithm is in tez-mapreduce/src/main/java/org/apache/tez/mapreduce/grouper/TezSplitGrouper.java.
The Apache Tez wiki article "How initial task parallelism works" also explains the logic.

2. # of Reducers

As with a Hive on MR query, the parameters below control the # of Reducers; a rough estimate formula follows the list:

  • hive.exec.reducers.bytes.per.reducer (default 256000000)
  • hive.exec.reducers.max (default 1009)
  • hive.tez.auto.reducer.parallelism (default false)
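
Roughly speaking, when auto reducer parallelism is off, Hive estimates the reducer count for a vertex from its input data size, along the lines of:

# of Reducers ≈ min(hive.exec.reducers.max, ceil(estimated input bytes / hive.exec.reducers.bytes.per.reducer))

So doubling hive.exec.reducers.bytes.per.reducer roughly halves the count, and hive.exec.reducers.max caps it, which is what the tests below show.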

Take the query below as an example, and focus on "Reducer 2", which performs the join:

hive> select count(*) from passwords a, passwords b where a.col0=b.col1;

Query ID = mapr_20170524140623_bc36636e-c295-4e75-b7ac-fe066320dce1

Total jobs = 1

Launching Job 1 out of 1

Status: Running (Executing on YARN cluster with App id application_1475192050844_0034)

----------------------------------------------------------------------------------------------

        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED

----------------------------------------------------------------------------------------------

Map 1 .......... container     SUCCEEDED     69         69        0        0       0       0

Map 4 .......... container     SUCCEEDED     69         69        0        0       0       0

Reducer 2 ...... container     SUCCEEDED     45         45        0        0       1       0

Reducer 3 ...... container     SUCCEEDED      1          1        0        0       0       0

----------------------------------------------------------------------------------------------

VERTICES: 04/04  [==========================>>] 100%  ELAPSED TIME: 164.32 s

----------------------------------------------------------------------------------------------

OK

0

"Reducer 2" spawns 45 Reducers.

If we double hive.exec.reducers.bytes.per.reducer to 512000000, "Reducer 2" spawns only about half as many Reducers -- 23 this time.

hive> set hive.exec.reducers.bytes.per.reducer=512000000;

hive> select count(*) from passwords a, passwords b where a.col0=b.col1;

Query ID = mapr_20170524142206_d07caa6a-0061-43a6-b5e9-4f67880cf118

Total jobs = 1

Launching Job 1 out of 1

Tez session was closed. Reopening...

Session re-established.

Status: Running (Executing on YARN cluster with App id application_1475192050844_0035)

----------------------------------------------------------------------------------------------

        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED

----------------------------------------------------------------------------------------------

Map 1 .......... container     SUCCEEDED     69         69        0        0       0       0

Map 4 .......... container     SUCCEEDED     69         69        0        0       0       0

Reducer 2 ...... container     SUCCEEDED     23         23        0        0       0       0

Reducer 3 ...... container     SUCCEEDED      1          1        0        0       0       0

----------------------------------------------------------------------------------------------

VERTICES: 04/04  [==========================>>] 100%  ELAPSED TIME: 179.62 s

----------------------------------------------------------------------------------------------

OK

0


Of course, we can put a hard cap on the # of Reducers by setting hive.exec.reducers.max=10:

hive> set hive.exec.reducers.max=10;

hive> select count(*) from passwords a, passwords b where a.col0=b.col1;

Query ID = mapr_20170524142736_4367dee2-b695-4162-ad47-99d7ff2311bc

Total jobs = 1

Launching Job 1 out of 1

Status: Running (Executing on YARN cluster with App id application_1475192050844_0035)

----------------------------------------------------------------------------------------------

        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED

----------------------------------------------------------------------------------------------

Map 1 .......... container     SUCCEEDED     69         69        0        0       0       0

Map 4 .......... container     SUCCEEDED     69         69        0        0       0       0

Reducer 2 ...... container     SUCCEEDED     10         10        0        0       1       0

Reducer 3 ...... container     SUCCEEDED      1          1        0        0       0       0

----------------------------------------------------------------------------------------------

VERTICES: 04/04  [==========================>>] 100%  ELAPSED TIME: 153.35 s

----------------------------------------------------------------------------------------------

OK

0


Another feature is controlled by hive.tez.auto.reducer.parallelism. Its description in the Hive configuration reads:
"Turn on Tez' auto reducer parallelism feature. When enabled, Hive will still estimate data sizes and set parallelism estimates. Tez will sample source vertices' output sizes and adjust the estimates at runtime as necessary."

hive> set hive.tez.auto.reducer.parallelism = true;

hive> select count(*) from passwords a, passwords b where a.col0=b.col1;

Query ID = mapr_20170524143541_18b3c2b6-75a8-4fbb-8a0d-2cf354fd7a72

Total jobs = 1

Launching Job 1 out of 1

Status: Running (Executing on YARN cluster with App id application_1475192050844_0036)

----------------------------------------------------------------------------------------------

        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED

----------------------------------------------------------------------------------------------

Map 1 .......... container     SUCCEEDED     69         69        0        0       0       0

Map 4 .......... container     SUCCEEDED     69         69        0        0       0       0

Reducer 2 ...... container     SUCCEEDED     12         12        0        0       1       0

Reducer 3 ...... container     SUCCEEDED      1          1        0        0       0       0

----------------------------------------------------------------------------------------------

VERTICES: 04/04  [==========================>>] 100%  ELAPSED TIME: 143.69 s

----------------------------------------------------------------------------------------------

OK

0

From the DAG syslog file, we can see that "Reducer 2" initially planned to spawn 90 Reducers, which was then reduced to 12 at runtime:

[root@s3 container_e02_1475192050844_0036_01_000001]# grep "Reducer 2" syslog_dag_1475192050844_0036_1 |grep parallelism

2017-05-24 14:36:36,831 [INFO] [App Shared Pool - #0] |vertexmanager.ShuffleVertexManager|: Reducing auto parallelism for vertex: Reducer 2 from 90 to 12

2017-05-24 14:36:36,837 [INFO] [App Shared Pool - #0] |impl.VertexImpl|: Resetting vertex location hints due to change in parallelism for vertex: vertex_1475192050844_0036_1_02 [Reducer 2]

2017-05-24 14:36:36,841 [INFO] [App Shared Pool - #0] |impl.VertexImpl|: Vertex vertex_1475192050844_0036_1_02 [Reducer 2] parallelism set to 12 from 90
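
The initial figure of 90 is presumably the size-based estimate (45 in the first run) multiplied by hive.tez.max.partition.factor (default 2), which Hive uses to over-partition when auto reducer parallelism is enabled; Tez's ShuffleVertexManager then samples the actual output of the upstream Map vertices and shrinks the parallelism to 12 at runtime, as the log above shows.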
