CCAH-500 Question 9: How would you tune your io.sort.mb value to achieve maximum memory to disk I/O ratio?

Original post, 2016-06-01 17:40:09

9. You observed that the number of spilled records from map tasks far exceeds the number of map output records. Your child heap size is 1GB and your io.sort.mb value is set to 1000MB. How would you tune your io.sort.mb value to achieve maximum memory to disk I/O ratio?

A. For a 1GB child heap size an io.sort.mb of 128 MB will always maximize memory to disk I/O 

B. Increase the io.sort.mb to 1GB 

C. Decrease the io.sort.mb value to 0 

D. Tune the io.sort.mb value until you observe that the number of spilled records equals (or is as close to equals) the number of map output records. 

 

Answer: D 
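
The criterion in option D can be checked with a little arithmetic on two job counters. The sketch below is only an illustration: the function name `spill_ratio` and the sample counter values are hypothetical, and it assumes you have already read the "Spilled Records" and "Map output records" counters from a finished job.

```python
def spill_ratio(spilled_records: int, map_output_records: int) -> float:
    """Return spilled records divided by map output records."""
    if map_output_records <= 0:
        raise ValueError("map_output_records must be positive")
    return spilled_records / map_output_records


# Hypothetical counter values, for illustration only.
print(spill_ratio(3_000_000, 1_000_000))  # 3.0: on average each record hit disk 3 times
print(spill_ratio(1_000_000, 1_000_000))  # 1.0: each record was spilled exactly once
```

A ratio well above 1 suggests the sort buffer is too small for the map output, so the tuning goal in option D amounts to bringing this ratio as close to 1 as possible.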

 

Reference

http://www.aiotestking.com/cloudera/how-would-you-tune-your-iosortmb-value-to-achieve-maximum-memory-to-disk-io-ratio-3/

io.sort.mb - This sets the size of the memory buffer used during sort operations. This buffer is contained within the map/reduce task’s JVM heap as defined in mapred.child.java.opts. If this buffer is too small for the amount of input data, it can lead to intermediate spills to disk, which will later need to be read and merged. Increasing this value will reduce or eliminate the intermediate spills going to disk and reduce the overall I/O load on your system.
Default value: 100 MB
Recommended value: Use 1/4 to 1/2 of the map/reduce task Java heap size setting (in mapred.child.java.opts).
Auto-tuned value: 1/2 of the map/reduce Java heap size
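
As a rough illustration of the recommendation above, the following sketch computes the 1/4 to 1/2 range for a given child heap size. The helper name `recommended_io_sort_mb` is hypothetical, and integer division is an assumption made here for simplicity; treat the result as a starting point for tuning, not a definitive setting.

```python
from typing import Tuple


def recommended_io_sort_mb(child_heap_mb: int) -> Tuple[int, int]:
    """Return (low, high) bounds in MB: 1/4 and 1/2 of the child heap size."""
    if child_heap_mb <= 0:
        raise ValueError("child_heap_mb must be positive")
    return child_heap_mb // 4, child_heap_mb // 2


# A 1GB child heap, taken here as 1024 MB.
low, high = recommended_io_sort_mb(1024)
print(low, high)  # 256 512
```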

 

Reference from the book "Hadoop Operations" by Eric Sammer:

“The value of io.sort.mb is specified in megabytes and, by default, is 100.
Increasing the size of this buffer results in fewer spills to disk and, as a consequence, reduces the number of spill files that must be merged when the map task completes.

The io.sort.mb parameter is one way administrators and job developers can trade more memory for reduced disk IO.

The downside of this is that this buffer must be contained within the child task’s JVM heap allocation, as defined by mapred.child.java.opts.
For example, with a child heap size of 1GB and io.sort.mb set to 128, only 896MB is really available to the user’s code.

Remember that ultimately, all records output by map tasks must be spilled so,
in the ideal scenario, these numbers are equal.”
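
The heap arithmetic in the quote can be reproduced with a short sketch. The function name `heap_left_for_user_code` is hypothetical, 1GB is taken as 1024 MB (which matches the 896MB figure), and the calculation is a simplification that ignores any other JVM overhead.

```python
def heap_left_for_user_code(child_heap_mb: int, io_sort_mb: int) -> int:
    """Return the heap (in MB) left after reserving the sort buffer."""
    if io_sort_mb >= child_heap_mb:
        raise ValueError("io.sort.mb must be smaller than the child heap size")
    return child_heap_mb - io_sort_mb


print(heap_left_for_user_code(1024, 128))   # 896, the figure from the quote
print(heap_left_for_user_code(1024, 1000))  # 24, with the setting from the question
```

Under these assumptions, the 1000MB setting in the question leaves almost no heap for the map code itself, which is the trade-off the quote describes.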

Copyright notice: This is an original article by the blogger. Please credit the source when reposting.
