Big Data | Hadoop
gjwang1983
[ Hadoop | Spark | Scala ] Setting Up a Scoobi Development Environment
Scoobi: an open-source Scala library for Hadoop MapReduce. It combines the simplicity of functional programming with the strength of distributed data processing powered by Hadoop. It can dramatica…
Original · 2015-04-01 11:11:21 · 1254 views · 0 comments
[Spark | Yarn | Hadoop] Spark Submit over Yarn
I use the pre-built package of Spark 1.0.2 for Hadoop 2.4.1. Edit conf/spark-env.sh:
    export HADOOP_CONF_DIR="/apache/hadoop/conf"
    export YARN_CONF_DIR="/apache/hadoop/conf"
    export SPARK_LIBRARY_PATH="/a…
Original · 2015-04-01 11:11:16 · 606 views · 0 comments
[ Hadoop | MapReduce ] Using CompositeInputSplit to Improve Join Efficiency
A map-side join is the most efficient way to join on Hadoop, because it avoids the shuffle and reduce phases. To join two large datasets this way, we can use a composite join.
Original · 2015-04-01 11:09:12 · 926 views · 0 comments
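
To make the idea behind a composite join concrete, here is a minimal plain-Java sketch of the merge step that a map-side join relies on. It is not the Hadoop API: the class name MergeJoinSketch and the method join are hypothetical, and the sketch assumes both inputs are already sorted by key and that each key appears at most once per input. In Hadoop, a composite join additionally expects both datasets to be partitioned identically, so that each map task can merge a matching pair of partitions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MergeJoinSketch {

    // Inner-joins two lists of {key, value} pairs.
    // Assumption: both lists are sorted by key in ascending order
    // and keys are unique within each list.
    public static List<String> join(List<String[]> left, List<String[]> right) {
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < left.size() && j < right.size()) {
            int cmp = left.get(i)[0].compareTo(right.get(j)[0]);
            if (cmp < 0) {
                i++;            // left key is smaller: no match, advance left
            } else if (cmp > 0) {
                j++;            // right key is smaller: no match, advance right
            } else {
                // Keys match: emit key, left value, right value (tab-separated)
                out.add(left.get(i)[0] + "\t" + left.get(i)[1] + "\t" + right.get(j)[1]);
                i++;
                j++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> users = Arrays.asList(
                new String[] {"u1", "alice"},
                new String[] {"u2", "bob"},
                new String[] {"u4", "dave"});
        List<String[]> orders = Arrays.asList(
                new String[] {"u2", "book"},
                new String[] {"u3", "pen"},
                new String[] {"u4", "lamp"});
        for (String row : join(users, orders)) {
            System.out.println(row);
        }
    }
}
```

Because each input is read once, front to back, no record has to be buffered or shuffled across the network, which is where the efficiency of a map-side join comes from.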