spark.sql.sources.parallelPartitionsDiscovery.threshold

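After streamingdumping had been running for several days, every batch started to include one extra job (the one shown as 38/38). That job takes a long time and causes batches to back up. The driver log is as follows: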

2019-01-29 11:55:01,631 INFO datasources.PartitioningAwareFileIndex: Listing leaf files and directories in parallel under: hdfs://nameservice1/user/cobub3/parquet/day=20181210, hdfs://nameservice1/user/cobub3/parquet/day=20181211, hdfs://nameservice1/user/cobub3/parquet/day=20181212, hdfs://nameservice1/user/cobub3/parquet/day=20181213, hdfs://nameservice1/user/cobub3/parquet/day=20181214, hdfs://nameservice1/user/cobub3/parquet/day=20181215, hdfs://nameservice1/user/cobub3/parquet/day=20181216, hdfs://nameservice1/user/cobub3/parquet/day=20181217, hdfs://nameservice1/user/cobub3/parquet/day=20181218, hdfs://nameservice1/user/cobub3/parquet/day=20181219, hdfs://nameservice1/user/cobub3/parquet/day=20181220, hdfs://nameservice1/user/cobub3/parquet/day=20181221, hdfs://nameservice1/user/cobub3/parquet/day=20181222, hdfs://nameservice1/user/cobub3/parquet/day=20181223, hdfs://nameservice1/user/cobub3/parquet/day=20181224, hdfs://nameservice1/user/cobub3/parquet/day=20181225, hdfs://nameservice1/user/cobub3/parquet/day=20181226, hdfs://nameservice1/user/cobub3/parquet/day=20181227, hdfs://nameservice1/user/cobub3/parquet/day=20181228, hdfs://nameservice1/user/cobub3/parquet/day=20181229, hdfs://nameservice1/user/cobub3/parquet/day=20190102, hdfs://nameservice1/user/cobub3/parquet/day=20190103, hdfs://nameservice1/user/cobub3/parquet/day=20190104, hdfs://nameservice1/user/cobub3/parquet/day=20190105, hdfs://nameservice1/user/cobub3/parquet/day=20190106, hdfs://nameservice1/user/cobub3/parquet/day=20190107, hdfs://nameservice1/user/cobub3/parquet/day=20190108, hdfs://nameservice1/user/cobub3/parquet/day=20190109, hdfs://nameservice1/user/cobub3/parquet/day=20190110, hdfs://nameservice1/user/cobub3/parquet/day=20190111, hdfs://nameservice1/user/cobub3/parquet/day=20190112, hdfs://nameservice1/user/cobub3/parquet/day=20190113, hdfs://nameservice1/user/cobub3/parquet/day=20190114, hdfs://nameservice1/user/cobub3/parquet/day=20190115, 
hdfs://nameservice1/user/cobub3/parquet/day=20190116, hdfs://nameservice1/user/cobub3/parquet/day=20190117, hdfs://nameservice1/user/cobub3/parquet/day=20190118, hdfs://nameservice1/user/cobub3/parquet/day=20190119, hdfs://nameservice1/user/cobub3/parquet/day=20190120, hdfs://nameservice1/user/cobub3/parquet/day=20190121, hdfs://nameservice1/user/cobub3/parquet/day=20190122, hdfs://nameservice1/user/cobub3/parquet/day=20190123, hdfs://nameservice1/user/cobub3/parquet/day=20190124, hdfs://nameservice1/user/cobub3/parquet/day=20190125, hdfs://nameservice1/user/cobub3/parquet/day=20190126, hdfs://nameservice1/user/cobub3/parquet/day=20190128, hdfs://nameservice1/user/cobub3/parquet/day=20190129

2019-01-29 11:55:01,959 INFO spark.SparkContext: Starting job: save at StreamingDumping.scala:159

2019-01-29 11:55:01,981 INFO scheduler.DAGScheduler: Got job 0 (save at StreamingDumping.scala:159) with 47 output partitions

2019-01-29 11:55:01,983 INFO scheduler.DAGScheduler: Final stage: ResultStage 0 (save at StreamingDumping.scala:159)

2019-01-29 11:55:01,983 INFO scheduler.DAGScheduler: Parents of final stage: List()

2019-01-29 11:55:01,986 INFO scheduler.DAGScheduler: Missing parents: List()

2019-01-29 11:55:02,001 INFO scheduler.DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[10] at save at StreamingDumping.scala:159), which has no missing parents

2019-01-29 11:55:02,105 INFO memory.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 62.1 KB, free 912.2 MB)

2019-01-29 11:55:02,129 INFO memory.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 22.7 KB, free 912.2 MB)

2019-01-29 11:55:02,130 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.1.213:36623 (size: 22.7 KB, free: 912.3 MB)

2019-01-29 11:55:02,132 INFO spark.SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:996

2019-01-29 11:55:02,139 INFO scheduler.DAGScheduler: Submitting 47 missing tasks from ResultStage 0 (MapPartitionsRDD[10] at save at StreamingDumping.scala:159)

2019-01-29 11:55:02,142 INFO cluster.YarnScheduler: Adding task set 0.0 with 47 tasks

2019-01-29 11:55:02,229 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, tdhtest03, executor 3, partition 0, PROCESS_LOCAL, 6394 bytes)

2019-01-29 11:55:02,250 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, tdhtest02, executor 1, partition 1, PROCESS_LOCAL, 6394 bytes)

2019-01-29 11:55:02,264 INFO scheduler.TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, tdhtest01, executor 2, partition 2, PROCESS_LOCAL, 6394 bytes)

2019-01-29 11:55:02,275 INFO scheduler.TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, tdhtest03, executor 3, partition 3, PROCESS_LOCAL, 6394 bytes)

2019-01-29 11:55:02,286 INFO scheduler.TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, tdhtest02, executor 1, partition 4, PROCESS_LOCAL, 6394 bytes)

2019-01-29 11:55:02,296 INFO scheduler.TaskSetManager: Starting task 5.0 in stage 0.0 (TID 5, tdhtest01, executor 2, partition 5, PROCESS_LOCAL, 6394 bytes)

2019-01-29 11:55:02,305 INFO scheduler.TaskSetManager: Starting task 6.0 in stage 0.0 (TID 6, tdhtest03, executor 3, partition 6, PROCESS_LOCAL, 6394 bytes)

2019-01-29 11:55:02,313 INFO scheduler.TaskSetManager: Starting task 7.0 in stage 0.0 (TID 7, tdhtest02, executor 1, partition 7, PROCESS_LOCAL, 6394 bytes)

2019-01-29 11:55:02,320 INFO scheduler.TaskSetManager: Starting task 8.0 in stage 0.0 (TID 8, tdhtest01, executor 2, partition 8, PROCESS_LOCAL, 6394 bytes)

2019-01-29 11:55:02,327 INFO scheduler.TaskSetManager: Starting task 9.0 in stage 0.0 (TID 9, tdhtest03, executor 3, partition 9, PROCESS_LOCAL, 6394 bytes)

2019-01-29 11:55:02,335 INFO scheduler.TaskSetManager: Starting task 10.0 in stage 0.0 (TID 10, tdhtest02, executor 1, partition 10, PROCESS_LOCAL, 6394 bytes)

2019-01-29 11:55:02,342 INFO scheduler.TaskSetManager: Starting task 11.0 in stage 0.0 (TID 11, tdhtest01, executor 2, partition 11, PROCESS_LOCAL, 6394 bytes)

2019-01-29 11:55:04,203 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on tdhtest03:33600 (size: 22.7 KB, free: 912.3 MB)

2019-01-29 11:55:04,381 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on tdhtest01:56090 (size: 22.7 KB, free: 912.3 MB)

2019-01-29 11:55:04,798 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on tdhtest02:35515 (size: 22.7 KB, free: 912.3 MB)

2019-01-29 11:55:06,289 INFO scheduler.TaskSetManager: Starting task 12.0 in stage 0.0 (TID 12, tdhtest01, executor 2, partition 12, PROCESS_LOCAL, 6394 bytes)

2019-01-29 11:55:06,304 INFO scheduler.TaskSetManager: Starting task 13.0 in stage 0.0 (TID 13, tdhtest01, executor 2, partition 13, PROCESS_LOCAL, 6394 bytes)

2019-01-29 11:55:06,316 INFO scheduler.TaskSetManager: Starting task 14.0 in stage 0.0 (TID 14, tdhtest01, executor 2, partition 14, PROCESS_LOCAL, 6394 bytes)

2019-01-29 11:55:06,375 INFO scheduler.TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 4120 ms on tdhtest01 (executor 2) (1/47)

2019-01-29 11:55:06,377 INFO scheduler.TaskSetManager: Finished task 5.0 in stage 0.0 (TID 5) in 4090 ms on tdhtest01 (executor 2) (2/47)

2019-01-29 11:55:06,377 INFO schedul
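
The log above lists 47 `day=` directories, and the resulting job has 47 tasks. The default value of `spark.sql.sources.parallelPartitionsDiscovery.threshold` is 32, so once the number of input paths grows past it, Spark lists the leaf files in a distributed job instead of on the driver. A minimal sketch of raising the threshold follows. The value `64` and the app name are assumptions for illustration only, not a setting confirmed by the original post, and the exact comparison against the threshold may differ between Spark versions.

```scala
import org.apache.spark.sql.SparkSession

object ThresholdExample {
  def main(args: Array[String]): Unit = {
    // Hypothetical app name; 64 is an illustrative value, not a recommendation.
    val spark = SparkSession.builder()
      .appName("StreamingDumping")
      .config("spark.sql.sources.parallelPartitionsDiscovery.threshold", "64")
      .getOrCreate()

    // The same SQL option can also be changed on an existing session.
    spark.conf.set("spark.sql.sources.parallelPartitionsDiscovery.threshold", "64")

    // Read the value back to confirm what the session will use.
    println(spark.conf.get("spark.sql.sources.parallelPartitionsDiscovery.threshold"))

    spark.stop()
  }
}
```

With a higher threshold the listing runs serially on the driver, which avoids the extra job but can itself become slow as the number of partition directories keeps growing, so the value is worth revisiting over time.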
