As usual, start with the official documentation:
https://www.cloudera.com/documentation.html
➜ 软件 scp -r spark-cdh-2-parcel root@47.112.138.102:/var/www/html/spark-cdh-2/
[root@hadoop001 spark-cdh-2-parcel]# mv SPARK2-2.2.0.cloudera2-1.cdh5.12.0.p0.232957-el7.parcel.sha1 SPARK2-2.2.0.cloudera2-1.cdh5.12.0.p0.232957-el7.parcel.sha
[root@hadoop001 spark-cdh-2-parcel]# mkdir /opt/cloudera/csd
[root@hadoop001 spark-cdh-2-parcel]# cp SPARK2_ON_YARN-2.2.0.cloudera2.jar /opt/cloudera/csd
[root@hadoop001 spark-cdh-2-parcel]# cd /opt/cloudera/
[root@hadoop001 cloudera]# cd csd/
[root@hadoop001 csd]# chown -R cloudera-scm:cloudera-scm SPARK2_ON_YARN-2.2.0.cloudera2.jar
[root@hadoop001 csd]# chmod 644 SPARK2_ON_YARN-2.2.0.cloudera2.jar
[root@hadoop001 csd]# /opt/cloudera-manager/cm-5.12.0/etc/init.d/cloudera-scm-server restart
Stopping cloudera-scm-server: [ OK ]
Starting cloudera-scm-server: [ OK ]
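Since the parcel's checksum file was renamed from .sha1 to .sha above, it can be worth confirming that the parcel still matches it before distributing it. The following is only a sketch: the function name verify_parcel is made up for illustration, and it assumes the .sha file holds the hex SHA-1 digest as its first whitespace-separated field.

```shell
#!/bin/sh
# Hypothetical helper: compare a parcel's SHA-1 with the digest stored in "<parcel>.sha".
# Assumption: the first field of the .sha file is the hex digest.
# Returns 0 on match, 1 on mismatch, 2 if either file cannot be read.
verify_parcel() {
  vp_parcel="$1"
  vp_expected="$(awk '{print $1; exit}' "${vp_parcel}.sha")" || return 2
  vp_actual="$(sha1sum "$vp_parcel" | awk '{print $1}')" || return 2
  [ -n "$vp_expected" ] && [ "$vp_expected" = "$vp_actual" ]
}
```

For example, run from the parcel directory: verify_parcel SPARK2-2.2.0.cloudera2-1.cdh5.12.0.p0.232957-el7.parcel && echo "checksum matches"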
After you click Finish, Cloudera Manager jumps back to the home page automatically; restart as prompted there.
Done!
Now let's run a Spark job.
[root@hadoop001 bin]# pwd
/opt/cloudera/parcels/SPARK2/lib/spark2/bin
[root@hadoop001 bin]# spark2-submit \
> --master yarn \
> --num-executors 1 \
> --executor-cores 1 \
> --executor-memory 1G \
> --class org.apache.spark.examples.SparkPi \
> /opt/cloudera/parcels/SPARK2/lib/spark2/examples/jars/spark-examples_2.11-2.2.0.cloudera2.jar 10
spark2-submit \
--master yarn \
--num-executors 1 \
--executor-cores 1 \
--executor-memory 1G \
--class org.apache.spark.examples.SparkPi \
/opt/cloudera/parcels/SPARK2/lib/spark2/examples/jars/spark-examples_2.11-2.2.0.cloudera2.jar 10
It fails with an error:
19/06/05 11:48:03 ERROR spark.SparkContext: Error initializing SparkContext.
java.lang.IllegalArgumentException: Required executor memory (1024+384 MB) is above the max threshold (1251 MB) of this cluster! Please check the values of 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb'.
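Solution
The job asks YARN for 1024 MB of executor memory plus 384 MB of overhead, 1408 MB in total, which exceeds the cluster's 1251 MB limit. In the YARN configuration in Cloudera Manager, raise 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb' (the two settings named in the error) above that total, then restart as prompted.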
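To see where the "1024+384" in the error comes from: Spark 2.2 on YARN by default adds an executor memory overhead equal to the larger of 384 MB and 10% of the executor memory. A rough sketch of that arithmetic follows; the function names required_container_mb and fits_cluster are invented here for illustration and are not part of Spark or CDH.

```shell
#!/bin/sh
# Hypothetical helper: memory YARN must grant per executor, in MB.
# Uses the Spark 2.2 default overhead: max(384, 10% of executor memory).
required_container_mb() {
  rc_executor_mb="$1"
  rc_overhead_mb=$(( rc_executor_mb / 10 ))
  if [ "$rc_overhead_mb" -lt 384 ]; then
    rc_overhead_mb=384
  fi
  echo $(( rc_executor_mb + rc_overhead_mb ))
}

# Hypothetical helper: succeeds if the request fits under the cluster limit (MB).
fits_cluster() {
  [ "$(required_container_mb "$1")" -le "$2" ]
}
```

With --executor-memory 1G, required_container_mb 1024 gives 1408, so fits_cluster 1024 1251 fails, which matches the error above. Any limit of at least 1408 MB would let this particular job through.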
After making the change, run the job again. It fails again, but don't panic:
19/06/05 12:00:31 ERROR spark.SparkContext: Error initializing SparkContext.
org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:279)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:260)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:240)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:162)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3530)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3513)
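This one is easy: root has no write access to /user, which is owned by hdfs:supergroup with mode drwxr-xr-x.
Switch users with su - hdfs and run the job again. In production, Spark jobs are always submitted as the hdfs user.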
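If the job does have to run as root (or some other user), a common alternative to switching users is to give that user its own home directory under /user in HDFS. This is not what the walkthrough here does; it is only a sketch of that alternative. The function name hdfs_home_cmds is hypothetical, it assumes sudo is available on the node, and it only prints the commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the HDFS commands that would create
# a home directory for a user and hand ownership to that user.
# Assumption: the commands are run via sudo as the hdfs superuser.
hdfs_home_cmds() {
  hh_user="$1"
  echo "sudo -u hdfs hdfs dfs -mkdir -p /user/${hh_user}"
  echo "sudo -u hdfs hdfs dfs -chown ${hh_user}:${hh_user} /user/${hh_user}"
}
```

Calling hdfs_home_cmds root prints the two commands for /user/root; paste them into a shell on a cluster node once they look right.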
Running as hdfs fixes it.
All done!