[Spark] Common Spark commands

:Point spark-shell at the Hadoop configuration files

#Use YARN_CONF_DIR or HADOOP_CONF_DIR to specify the directory holding the YARN or Hadoop configuration files

set HADOOP_HOME=D:\Big-File\Architecture\hadoop\hadoop-2.3.0
set HADOOP_CONF_DIR=D:\Big-File\Architecture\hadoop\hadoop-2.3.0\etc\hadoop

:Specify extra jars to load when starting spark-shell

bin\spark-shell --jars E:\DM\XXXXXXX-1.0.0.jar

:Specify the driver memory

bin\spark-shell --driver-memory 512m --verbose

:Spark UI address

http://192.168.1.5:4040/jobs/

:Run an application with spark-submit

bin\spark-submit --master local[4] --class com.test.mllib.XXXXXX E:\DM\XXXXX-1.0.0.jar 2 3 7 10 1300 1307
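Everything after the jar path (here `2 3 7 10 1300 1307`) is forwarded verbatim to the application's `main(String[] args)`. A minimal sketch of that hand-off (the real `com.test.mllib` class is the author's; `ArgsEcho` here is a hypothetical stand-in):

```java
import java.util.Arrays;

// Hypothetical entry point illustrating how spark-submit forwards trailing
// command-line values: everything after the jar path lands in args.
public class ArgsEcho {
    // Parse the forwarded arguments as integers, as a numeric app would.
    public static int[] parse(String[] args) {
        return Arrays.stream(args).mapToInt(Integer::parseInt).toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse(args)));
    }
}
```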

:java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwx------

https://issues.apache.org/jira/browse/SPARK-10528

Fix (works on Spark 1.6.0 and Spark 1.5.2):

1. Open a Command Prompt in admin mode
2. Create the directory d:/tmp/hive
3. winutils.exe chmod 777 /tmp/hive
4. Start spark-shell --master local[2]

:Log configuration in conf/log4j.properties

log4j.rootCategory=INFO, console,FILE

log4j.appender.FILE=org.apache.log4j.DailyRollingFileAppender
log4j.appender.FILE.Threshold=DEBUG
log4j.appender.FILE.file=E:/DM/Spark/spark-1.6.0-bin-hadoop2.6/spark.log
log4j.appender.FILE.DatePattern='.'yyyy-MM-dd
log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
log4j.appender.FILE.layout.ConversionPattern=[%-5p] [%d{yyyy-MM-dd HH:mm:ss}] [%C{1}:%M:%L] %m%n

:java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(IILjava/nio/ByteBuffer;ILjava/nio/ByteBuffer;IILjava/lang/String;JZ)V

Accessing Hadoop 2.6 from the spark-1.6-hadoop2.6 build fails with this error; switching to the spark-1.6-hadoop2.3 build works.

:"main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z

Workaround: patch the Hadoop source directly so the access check returns true.
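Concretely, the commonly cited hack (for local Windows development only, never production) is to copy `NativeIO.java` from the matching Hadoop release into your project, make the `access` method of the `Windows` inner class return `true` instead of calling the native `access0`, and let the patched class shadow the one in the Hadoop jar. A standalone sketch of the patched fragment (`AccessRight` is stubbed here so it compiles on its own; in the Hadoop tree it already exists):

```java
import java.io.IOException;

// Sketch of the patched method from org.apache.hadoop.io.nativeio.NativeIO
// (not the real Hadoop file). Development-only hack: it skips the real
// Windows file-permission check entirely.
public class NativeIOPatch {
    public enum AccessRight { READ, WRITE, EXECUTE } // stub for the sketch

    public static class Windows {
        public static boolean access(String path, AccessRight desiredAccess)
                throws IOException {
            // Original body calls the native method that lacks a JNI binding:
            //   return access0(path, desiredAccess.accessRight());
            return true; // patched: bypass the native permission check
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Windows.access("/tmp/hive", AccessRight.WRITE)); // prints true
    }
}
```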

:Test access to Hadoop

val textFile = sc.textFile("hdfs://localhost:19000/README.txt")
textFile.count

:Test submitting a job to YARN

export HADOOP_CONF_DIR=D:\Big-File\Architecture\hadoop\hadoop-2.3.0\etc\hadoop

bin\spark-submit --class com.test.mllib.test.WorkCountApp --master yarn  --deploy-mode client  --executor-memory 256M  --num-executors 1 E:\DM\code\projects\ch11-testit\target\ch11-testit-1.0.0.jar hdfs://localhost:19000/README.txt
bin\spark-submit --class org.apache.spark.examples.SparkPi --master yarn  --deploy-mode client  --executor-memory 128M  --num-executors 1  E:\DM\Spark\spark-1.6.0-bin-hadoop2.3\lib\spark-examples-1.6.0-hadoop2.3.0.jar   10
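The WorkCountApp jar above is the author's own code, but a word count reduces to the same split–group–count core everywhere. A plain-Java sketch of that core (in the actual Spark app the same shape would appear as `flatMap` → `mapToPair` → `reduceByKey` on a JavaRDD, which needs a running Spark context):

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Core word-count logic on plain Java streams: split on whitespace,
// group identical words, count each group.
public class WordCountSketch {
    public static Map<String, Long> count(String text) {
        return Arrays.stream(text.trim().split("\\s+"))
                     .collect(Collectors.groupingBy(Function.identity(),
                                                    Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(count("to be or not to be"));
    }
}
```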

:Cache the Spark assembly jar to avoid re-uploading it on every submit

http://blog.csdn.net/amber_amber/article/details/42081045

: Exit status: 1. Diagnostics: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException: 

http://zy19982004.iteye.com/blog/2031172
