Tez Configuration Parameters

Table 10.1. Tez Configuration Parameters

| Configuration Parameter | Description | Default Value |
| --- | --- | --- |
| tez.lib.uris | Location of the Tez jars and their dependencies. Tez applications download required jar files from this location, so it should be publicly accessible. | N/A |
| tez.am.log.level | Root logging level passed to the Tez Application Master. | INFO |
| tez.staging-dir | The staging directory used by Tez when application developers submit DAGs, or Directed Acyclic Graphs. Tez creates all temporary files for the DAG job in this directory. | /tmp/${user.name}/staging |
| tez.am.resource.memory.mb | The amount of memory in MB that YARN will allocate to the Tez Application Master. The size increases with the size of the DAG. | 1536 |
| tez.am.java.opts | Java options for the Tez Application Master process. The value specified for -Xmx should be less than the value of tez.am.resource.memory.mb, typically 512 MB less, to account for non-JVM memory in the process. | -server -Xmx1024m -Djava.net.preferIPv4Stack=true -XX:+UseNUMA -XX:+UseParallelGC |
| tez.am.shuffle-vertex-manager.min-src-fraction | In a Shuffle operation over a Scatter-Gather edge connection, Tez may start data consumer tasks before all the data producer tasks complete in order to overlap the shuffle IO. This parameter specifies the fraction of producer tasks that should complete before the first consumer tasks are scheduled. The fraction is expressed as a decimal, so the default value of 0.2 represents 20%. | 0.2 |
| tez.am.shuffle-vertex-manager.max-src-fraction | In a Shuffle operation over a Scatter-Gather edge connection, Tez may start data consumer tasks before all the data producer tasks complete in order to overlap the shuffle IO. This parameter specifies the fraction of producer tasks that should complete before all consumer tasks are scheduled. The number of consumer tasks ready for scheduling scales linearly between min-src-fraction and max-src-fraction. The fraction is expressed as a decimal, so the default value of 0.4 represents 40%. | 0.4 |
| tez.am.am-rm.heartbeat.interval-ms.max | Determines how frequently, in milliseconds, the Tez Application Master asks the YARN Resource Manager for resources. A low value can overload the Resource Manager. | 250 |
| tez.am.grouping.split-waves | Specifies the number of waves, or the percentage of queue container capacity, used to process a data set, where a value of 1 represents 100% of container capacity. The Tez Application Master considers this parameter value, the available cluster resources, and the resources required by the application to calculate parallelism, or the number of tasks to run. Processing queries with additional containers leads to lower latency. However, resource contention may occur if multiple users run large queries simultaneously. | Tez default: 1.4; Hive default: 1.7 |
| tez.am.grouping.min-size | Specifies the lower bound, in bytes, of the size of the primary input to each task when the Tez Application Master determines the parallelism of primary input reading tasks. This configuration property prevents input tasks from being too small, which prevents their parallelism from being too large. | 16777216 |
| tez.am.grouping.max-size | Specifies the upper bound, in bytes, of the size of the primary input to each task when the Tez Application Master determines the parallelism of primary input reading tasks. This configuration property prevents input tasks from being too large, which prevents their parallelism from being too small. | 1073741824 |
| tez.am.container.reuse.enabled | A container is the unit of resource allocation in YARN. This configuration parameter determines whether Tez will reuse the same container to run multiple tasks. Enabling this parameter improves performance by avoiding the overhead of reallocating container resources for every task. However, disable this parameter if the tasks contain memory leaks or use static variables. | true |
| tez.am.container.reuse.rack-fallback.enabled | Specifies whether to reuse containers for rack-local tasks. This configuration parameter is ignored unless tez.am.container.reuse.enabled is enabled. | true |
| tez.am.container.reuse.non-local-fallback.enabled | Specifies whether to reuse containers for non-local tasks. This configuration parameter is ignored unless tez.am.container.reuse.enabled is enabled. | true |
| tez.am.container.session.delay-allocation-millis | Determines how long, in milliseconds, a Tez session holds its containers while not actively servicing a query before releasing them. Specify a value of -1 to never release an idle container in a session. However, containers may still be released if they do not meet resource or locality needs. This configuration parameter is ignored unless tez.am.container.reuse.enabled is enabled. | 10000 (10 seconds) |
| tez.am.container.reuse.locality.delay-allocation-millis | The amount of time, in milliseconds, to wait before assigning a container to the next level of locality. The three levels of locality, in ascending order, are NODE, RACK, and NON_LOCAL. | 250 |
| tez.task.get-task.sleep.interval-ms.max | Determines the maximum amount of time, in milliseconds, that a container agent waits before asking the Tez Application Master for another task. Tez runs an agent on a container in order to remotely launch tasks. A low value may overload the Application Master. | 200 |
| tez.session.client.timeout.secs | Specifies the amount of time, in seconds, to wait for the Application Master to start when trying to submit a DAG from the client in session mode. | 180 |
| tez.session.am.dag.submit.timeout.secs | Specifies the amount of time, in seconds, that the Tez Application Master waits for a DAG to be submitted before shutting down. The value of this property is used when the Tez Application Master is running in session mode, which allows multiple DAGs to be submitted for execution. The idle time between DAG submissions should not exceed this time. | 300 |
| tez.runtime.intermediate-output.should-compress | Specifies whether Tez should compress intermediate output. | false |
| tez.runtime.intermediate-output.compress.codec | Specifies the codec to use when compressing intermediate output. This configuration property is ignored unless tez.runtime.intermediate-output.should-compress is enabled. | org.apache.hadoop.io.compress.SnappyCodec |
| tez.runtime.intermediate-input.is-compressed | Specifies whether intermediate input is compressed. | false |
| tez.runtime.intermediate-input.compress.codec | Specifies the codec to use when reading intermediate compressed input. This configuration property is ignored unless tez.runtime.intermediate-input.is-compressed is enabled. | org.apache.hadoop.io.compress.SnappyCodec |
| tez.yarn.ats.enabled | Specifies that Tez should start the TimelineClient for sending information to the Timeline Server. | false |
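
The guidance for tez.am.java.opts (keep -Xmx roughly 512 MB below tez.am.resource.memory.mb) can be checked with a few lines of Python. This is only an illustrative sketch: the helper names `xmx_mb` and `am_headroom_mb` are hypothetical and are not part of Tez or Hadoop, and the parsing covers only the common `-Xmx<number>[k|m|g]` form.

```python
import re
from typing import Optional

def xmx_mb(java_opts: str) -> Optional[int]:
    """Return the -Xmx heap size in MB, or None if no -Xmx flag is present."""
    match = re.search(r"-Xmx(\d+)([kKmMgG]?)", java_opts)
    if match is None:
        return None
    value, unit = int(match.group(1)), match.group(2).lower()
    # Without a suffix the JVM interprets the number as bytes.
    bytes_per_unit = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    return value * bytes_per_unit[unit] // 1024 ** 2

def am_headroom_mb(container_mb: int, java_opts: str) -> Optional[int]:
    """Memory left for non-JVM use: container size minus the JVM heap."""
    heap = xmx_mb(java_opts)
    return None if heap is None else container_mb - heap

# Default values taken from the table above.
DEFAULT_OPTS = ("-server -Xmx1024m -Djava.net.preferIPv4Stack=true "
                "-XX:+UseNUMA -XX:+UseParallelGC")
print(am_headroom_mb(1536, DEFAULT_OPTS))  # 512
```

With the defaults from the table, the headroom is exactly the 512 MB the description recommends. If tez.am.resource.memory.mb is raised for a large DAG, -Xmx would normally be raised with it so the gap stays about the same.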
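
The interaction between min-src-fraction and max-src-fraction is easier to see as a small calculation. The following Python sketch models only what the descriptions above state: no consumers below the minimum, all consumers at or above the maximum, and linear scaling in between. The function `consumers_ready` is a hypothetical name, and the real scheduling logic inside Tez may round or batch differently.

```python
def consumers_ready(completed_fraction: float,
                    total_consumers: int,
                    min_src_fraction: float = 0.2,
                    max_src_fraction: float = 0.4) -> int:
    """Approximate how many consumer tasks are eligible for scheduling.

    completed_fraction is the share of producer tasks that have finished,
    expressed as a decimal between 0 and 1.
    """
    if completed_fraction < min_src_fraction:
        return 0  # too few producers are done; nothing is scheduled yet
    if (completed_fraction >= max_src_fraction
            or max_src_fraction <= min_src_fraction):
        return total_consumers  # past the upper threshold; schedule everything
    # Linear interpolation between the two thresholds.
    progress = ((completed_fraction - min_src_fraction)
                / (max_src_fraction - min_src_fraction))
    return int(progress * total_consumers)

for done in (0.1, 0.2, 0.3, 0.4):
    print(done, consumers_ready(done, total_consumers=100))
```

Under this model, raising both fractions delays consumers until more producer output exists, while lowering them starts the shuffle earlier at the cost of consumers holding containers while they wait.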
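
As a rough illustration of how tez.am.grouping.split-waves, tez.am.grouping.min-size, and tez.am.grouping.max-size could combine, the Python sketch below starts from a target task count of waves times available containers and then keeps the input per task within the two size bounds. This is a deliberately simplified model built from the descriptions above, not the algorithm Tez actually uses; `grouped_task_count` and `available_containers` are hypothetical names.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)

def grouped_task_count(total_input_bytes: int,
                       available_containers: int,
                       split_waves: float = 1.4,
                       min_size: int = 16777216,
                       max_size: int = 1073741824) -> int:
    """Estimate the number of primary-input tasks under a simplified model."""
    # Target: enough tasks to fill the containers `split_waves` times over.
    desired_tasks = max(1, round(available_containers * split_waves))
    bytes_per_task = total_input_bytes / desired_tasks
    if bytes_per_task < min_size:
        # Tasks would be too small, so use fewer, larger tasks.
        return max(1, ceil_div(total_input_bytes, min_size))
    if bytes_per_task > max_size:
        # Tasks would be too large, so use more, smaller tasks.
        return ceil_div(total_input_bytes, max_size)
    return desired_tasks

GIB = 1024 ** 3
print(grouped_task_count(10 * GIB, available_containers=100))   # 140
print(grouped_task_count(1 * GIB, available_containers=100))    # 64
print(grouped_task_count(1024 * GIB, available_containers=10))  # 1024
```

In the first call the wave-based target fits within both bounds. In the second, the 16 MiB minimum caps parallelism at 64 tasks, and in the third the 1 GiB maximum forces 1024 tasks even though only 10 containers are available.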