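Reviewing some of the important JIRA issues in Hadoop's development history is very helpful for learning and understanding Hadoop's design principles and working mechanisms, and it shows how Hadoop committers improved the system step by step. Listed here are the JIRA issues I personally think are worth studying in depth.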
https://issues.apache.org/jira/browse/HADOOP-3136 Assign multiple tasks per TaskTracker heartbeat
https://issues.apache.org/jira/browse/HADOOP-3412 Refactor the scheduler out of the JobTracker
https://issues.apache.org/jira/browse/HADOOP-4664 Parallelize job initialization
https://issues.apache.org/jira/browse/MAPREDUCE-93 Job Tracker should prefer input-splits from overloaded racks
https://issues.apache.org/jira/browse/HADOOP-288 Distributed Cache
https://issues.apache.org/jira/browse/HADOOP-249 Task JVM reuse
https://issues.apache.org/jira/browse/HADOOP-6659 Switch RPC to use Avro
https://issues.apache.org/jira/browse/HADOOP-7773 Add support for protocol buffer based RPC engine
https://issues.apache.org/jira/browse/HADOOP-1230 Replace parameters with context objects
https://issues.apache.org/jira/browse/MAPREDUCE-279 Map-Reduce 2.0
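https://issues.apache.org/jira/browse/HDFS-2832 Enable support for heterogeneous storages in HDFS
https://issues.apache.org/jira/browse/HDFS-4949 Centralized cache management in HDFS
https://issues.apache.org/jira/browse/HDFS-347 Implementing short circuit reads with security
https://issues.apache.org/jira/browse/HDFS-1623 High Availability Framework for HDFS NN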
http://www.quora.com/What-are-the-most-interesting-Hadoop-JIRA-issues lists more interesting JIRA issues.