I'm sharing my experience with CDH. (This is purely a personal recommendation.)

The CDH source code comes essentially from the Apache svn itself, but it does not mirror the Apache releases. A CDH release corresponds to a certain (or the latest) Apache release with a good number of patches on top. The majority of these patches are available in the Hadoop svn but may not be part of the current Apache Hadoop release.

The major advantages I saw with CDH are:
- Cloudera provides a tool, SCM, that more or less automatically sets up a Hadoop cluster for you.
- Cloudera bundles the Hadoop-related projects, which are pretty easy to install on any standard Linux box.
- Cloudera ensures that the CDH release and the Hadoop projects available for that release are compatible. (For example, you don't have to go through the hassle of finding an HBase release compatible with your Hadoop release, or of sorting out the integration between related projects.)
- A good number of large enterprises use CDH with Cloudera support. (Cloudera provides various support packages.)
- Since large enterprises depend on CDH, that in turn speaks to how well CDH is tested, given how large the impact would be if a bug arose. (In short, CDH is well tested.)
- Under Cloudera support you get help and suggestions from Cloudera's expert Hadoop engineers in fine-tuning your Hadoop platform, tools, applications, etc.
- When you go in for end-to-end enterprise solutions with Hadoop, you can even get advice from them on best practices at the code level. (You get the same from the Hadoop user groups as well, but here there is a dedicated, timeline-based commitment when you are a Cloudera customer.)
- If you don't have the best Hadoop resources in house, you may have a tough time handling failures on your cluster, fine-tuning it, updating it, optimizing your applications, etc. The Cloudera folks can shed light on almost all critical issues and help get them resolved under stringent SLAs.

None of these points says that the Apache releases are not great. Apache Hadoop is definitely the best and is the backbone of Hadoop. It is well tested as well. But when expert Hadoop resources are not available in house, you can face a lot of unexpected hurdles that you need to handle in a time-bound manner, and that is where you need Hadoop consultants. You would definitely get more valid points (some official comments) directly from the Cloudera engineers. Hope it helps!
Source: http://mail-archives.apache.org/mod_mbox/hive-user/201111.mbox/%3C1320337854.22398.YahooMailNeo@web121217.mail.ne1.yahoo.com%3E