The table below shows which versions of Hadoop are supported by various HBase versions. Select the most appropriate version of Hadoop for your version of HBase. We are not in the Hadoop distro selection business. You can use Hadoop distributions from Apache, or learn about vendor distributions of Hadoop at http://wiki.apache.org/hadoop/Distributions%20and%20Commercial%20Support
Hadoop 2.x is better than Hadoop 1.x
Hadoop 2.x is faster and has more features, such as short-circuit reads, which will help improve your HBase random read profile, as well as important bug fixes that will improve your overall HBase experience. You should run Hadoop 2.x rather than Hadoop 1.x if you can.
Table 2.1. Hadoop version support matrix
Hadoop version | HBase-0.92.x | HBase-0.94.x | HBase-0.96.0 |
---|---|---|---|
Hadoop-0.20.205 | S | X | X |
Hadoop-0.22.x | S | X | X |
Hadoop-1.0.0-1.0.2[a] | S | S | X |
Hadoop-1.0.3+ | S | S | S |
Hadoop-1.1.x | NT | S | S |
Hadoop-0.23.x | X | S | NT |
Hadoop-2.0.x-alpha | X | NT | X |
Hadoop-2.1.0-beta | X | NT | S |
Hadoop-2.2.0 | X | NT | S |
Hadoop-2.x | X | NT | S |
[a] HBase requires Hadoop 1.0.3 at a minimum; there is an issue where we cannot find KerberosUtil when compiling against earlier versions of Hadoop.
Where:
S = supported and tested,
X = not supported,
NT = it should run, but has not been tested enough.
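As a quick way to query the matrix above, the following is a minimal sketch that transcribes Table 2.1 into a Python dictionary. The names `HBASE_VERSIONS`, `SUPPORT_MATRIX`, `hadoop_support`, and `hadoop_choices` are hypothetical helpers invented for this illustration; they are not part of HBase or Hadoop.

```python
# Support matrix transcribed from Table 2.1 (illustrative helper, not an HBase API).
# S = supported and tested, X = not supported, NT = should run, not tested enough.
HBASE_VERSIONS = ("HBase-0.92.x", "HBase-0.94.x", "HBase-0.96.0")

SUPPORT_MATRIX = {
    "Hadoop-0.20.205":    ("S", "X", "X"),
    "Hadoop-0.22.x":      ("S", "X", "X"),
    "Hadoop-1.0.0-1.0.2": ("S", "S", "X"),
    "Hadoop-1.0.3+":      ("S", "S", "S"),
    "Hadoop-1.1.x":       ("NT", "S", "S"),
    "Hadoop-0.23.x":      ("X", "S", "NT"),
    "Hadoop-2.0.x-alpha": ("X", "NT", "X"),
    "Hadoop-2.1.0-beta":  ("X", "NT", "S"),
    "Hadoop-2.2.0":       ("X", "NT", "S"),
    "Hadoop-2.x":         ("X", "NT", "S"),
}


def hadoop_support(hadoop, hbase):
    """Return the table entry (S, X or NT) for one Hadoop/HBase pairing."""
    row = SUPPORT_MATRIX[hadoop]
    return row[HBASE_VERSIONS.index(hbase)]


def hadoop_choices(hbase, status="S"):
    """List the Hadoop rows that have the given status for an HBase version."""
    col = HBASE_VERSIONS.index(hbase)
    return [hadoop for hadoop, row in SUPPORT_MATRIX.items() if row[col] == status]
```

For example, `hadoop_choices("HBase-0.96.0")` lists the rows marked S in the last column, and `hadoop_support("Hadoop-0.23.x", "HBase-0.96.0")` returns the single entry for that pairing.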
Replace the Hadoop Bundled With HBase!
Because HBase depends on Hadoop, it bundles an instance of the Hadoop jar under its lib directory. The bundled jar is ONLY for use in standalone mode. In distributed mode, it is critical that the version of Hadoop running on your cluster match the version under HBase. Replace the Hadoop jar found in the HBase lib directory with the Hadoop jar you are running on your cluster to avoid version mismatch issues. Make sure you replace the jar in HBase everywhere on your cluster. Hadoop version mismatch issues have various manifestations, but often it all looks like it's hung up.
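The jar swap described above could be scripted along the following lines. This is only a sketch under stated assumptions: the function name `replace_hadoop_jars` is hypothetical, the bundled jars are assumed to match the file name pattern `hadoop-*.jar`, and the caller is assumed to know where the cluster's Hadoop jars live. Check the actual contents of your HBase lib directory before running anything like this.

```python
import shutil
from pathlib import Path


def replace_hadoop_jars(hbase_lib, cluster_jars, dry_run=True):
    """Swap the Hadoop jars bundled in an HBase lib directory for the cluster's jars.

    hbase_lib:    path to the HBase lib directory
    cluster_jars: paths of the Hadoop jars actually running on the cluster
    dry_run:      when True, only report what would change
    Returns (removed, added) as sorted lists of file names.
    """
    lib = Path(hbase_lib)
    # Assumption: bundled Hadoop jars are named hadoop-*.jar.
    removed = sorted(p.name for p in lib.glob("hadoop-*.jar"))
    added = sorted(Path(jar).name for jar in cluster_jars)
    if not dry_run:
        # Remove the bundled jars first, then copy in the cluster's jars.
        for name in removed:
            (lib / name).unlink()
        for jar in cluster_jars:
            shutil.copy2(jar, lib / Path(jar).name)
    return removed, added
```

Running it with `dry_run=True` first shows which files would be removed and added without touching anything. Because the jar must be replaced everywhere on the cluster, the same step would need to run on every node that has an HBase install.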