Show something

Dedicated to big data technologies; discussion is welcome. https://github.com/worgent

CCAH-500 Question 9: How would you tune your io.sort.mb value to achieve maximum memory to disk I/O ratio?

9. You observed that the number of spilled records from Map tasks far exceeds the number of map output records. Your child heap size is 1GB and your io.sort.mb value is set to 1000MB. How would you tune your io.sort.mb value to achieve maximum memory to disk I/O ratio?

A. For a 1GB child heap size an io.sort.mb of 128 MB will always maximize memory to disk I/O 

B. Increase the io.sort.mb to 1GB 

C. Decrease the io.sort.mb value to 0 

D. Tune the io.sort.mb value until you observe that the number of spilled records equals (or is as close to equals) the number of map output records. 

 

Answer: D 

 

Reference

http://www.aiotestking.com/cloudera/how-would-you-tune-your-iosortmb-value-to-achieve-maximum-memory-to-disk-io-ratio-3/

io.sort.mb - This sets the size of the memory buffer used during sort operations. This buffer is contained within the map/reduce task’s JVM heap as defined in mapred.child.java.opts. If this buffer is too small for the amount of input data, it can lead to intermediate spills to disk, which will later need to be read and merged. Increasing this value will reduce or eliminate the intermediate spills going to disk and reduce the overall I/O load on your system.
Default value: 100 MB
Recommended value: Use 1/4 to 1/2 of the map/reduce task Java heap size setting (in mapred.child.java.opts).
Auto-tuned value: 1/2 of the map/reduce Java heap size
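
As a rough illustration of the sizing rule above, the sketch below computes the 1/4 to 1/2 range for a given child heap, and how much heap is left for user code once the sort buffer is carved out. The function names and the integer-megabyte rounding are assumptions made for this example; they are not part of Hadoop.

```python
def recommended_sort_mb_range(heap_mb):
    """Return (low, high) for io.sort.mb as 1/4 and 1/2 of the task heap, in MB."""
    if heap_mb <= 0:
        raise ValueError("heap_mb must be positive")
    return heap_mb // 4, heap_mb // 2


def heap_left_for_user_code(heap_mb, sort_mb):
    """Heap remaining once the sort buffer is taken out of the task JVM heap."""
    if sort_mb >= heap_mb:
        raise ValueError("io.sort.mb must be smaller than the child heap")
    return heap_mb - sort_mb


if __name__ == "__main__":
    heap = 1024  # 1 GB child heap, as in the question
    low, high = recommended_sort_mb_range(heap)
    print(f"suggested io.sort.mb range: {low}-{high} MB")
    print(f"heap left with io.sort.mb=1000: {heap_left_for_user_code(heap, 1000)} MB")
```

With the settings from the question (1 GB heap, io.sort.mb of 1000), almost the entire heap is taken by the sort buffer, which is why simply raising the value further is not a sensible answer.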

 

Reference from the book "Hadoop Operations" by Eric Sammer:

“The value of io.sort.mb is specified in megabytes and, by default, is 100.
Increasing the size of this buffer results in fewer spills to disk and, as a consequence, reduces the number of spill files that must be merged when the map task completes.

The io.sort.mb parameter is one way administrators and job developers can trade more memory for reduced disk IO.

The downside of this is that this buffer must be contained within the child task’s JVM heap allocation, as defined by mapred.child.java.opts.
For example, with a child heap size of 1GB and io.sort.mb set to 128, only 896MB is really available to the user’s code.

Remember that ultimately, all records output by map tasks must be spilled so,
in the ideal scenario, these numbers are equal.”
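
The comparison described in the quote can be checked against a job's counters. The sketch below assumes the "Spilled Records" and "Map output records" values for the map tasks have already been read off the job's counter page by hand; the function names and the 1% tolerance are hypothetical choices for illustration only.

```python
def spill_ratio(spilled_records, map_output_records):
    """Ratio of spilled records to map output records (1.0 is the ideal)."""
    if map_output_records <= 0:
        raise ValueError("map_output_records must be positive")
    return spilled_records / map_output_records


def needs_tuning(spilled_records, map_output_records, tolerance=0.01):
    """True when records are being written to disk noticeably more than once."""
    return spill_ratio(spilled_records, map_output_records) > 1.0 + tolerance


if __name__ == "__main__":
    # Made-up counter values: every map output record was spilled three times.
    print(spill_ratio(3_000_000, 1_000_000))
    print(needs_tuning(3_000_000, 1_000_000))
```

A ratio well above 1.0 suggests that records are being spilled and re-merged more than once, which is the situation answer D asks you to tune away by adjusting io.sort.mb and re-checking the counters.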

Copyright notice: This is an original article by the blog author; please credit the source when reposting. https://blog.csdn.net/tianbaochao/article/details/51557736