Summary of the basic shell commands used in the Spark startup script

Latest recommended article published on 2024-04-14 03:14:48

zhaoweiwei369

Category: linux    Tags: shell, linux, spark

Article link: https://blog.csdn.net/qq_36297093/article/details/117746143

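Today I read the article "Spark源码分析之Spark Shell" by 王知无 (大数据技术与架构) and could not help thinking that this is practically a textbook template script. In this post I summarize the basic shell commands that the startup script relies on. Many thanks to the author, whose careful analysis gave me the approach and the material for this post. If anything here is wrong, please point it out so that we can all improve.

I try not to repeat the content of that blog and only summarize the basic commands. For a detailed analysis of the startup process, please visit the original blog.

First, here is the startup script: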

#!/usr/bin/env bash
 
# Shell script for starting the Spark Shell REPL
 
cygwin=false
case "`uname`" in
  CYGWIN*) cygwin=true;;
esac
 
# Enter posix mode for bash
set -o posix
 
if [ -z "${SPARK_HOME}" ]; then
  export SPARK_HOME="$(cd "`dirname "$0"`"/..; pwd)"
fi
 
export _SPARK_CMD_USAGE="Usage: ./bin/spark-shell [options]"
 
# SPARK-4161: scala does not assume use of the java classpath,
# so we need to add the "-Dscala.usejavacp=true" flag manually. We
# do this specifically for the Spark shell because the scala REPL
# has its own class loader, and any additional classpath specified
# through spark.driver.extraClassPath is not automatically propagated.
SPARK_SUBMIT_OPTS="$SPARK_SUBMIT_OPTS -Dscala.usejavacp=true"
 
function main() {
  if $cygwin; then
    # Workaround for issue involving JLine and Cygwin
    # (see http://sourceforge.net/p/jline/bugs/40/).
    # If you're using the Mintty terminal emulator in Cygwin, may need to set the
    # "Backspace sends ^H" setting in "Keys" section of the Mintty options
    # (see https://github.com/sbt/sbt/issues/562).
    stty -icanon min 1 -echo > /dev/null 2>&1
    export SPARK_SUBMIT_OPTS="$SPARK_SUBMIT_OPTS -Djline.terminal=unix"
    "${SPARK_HOME}"/bin/spark-submit --class org.apache.spark.repl.Main --name "Spark shell" "$@"
    stty icanon echo > /dev/null 2>&1
  else
    export SPARK_SUBMIT_OPTS
    "${SPARK_HOME}"/bin/spark-submit --class org.apache.spark.repl.Main --name "Spark shell" "$@"
  fi
}
 
# Copy restore-TTY-on-exit functions from Scala script so spark-shell exits properly even in
# binary distribution of Spark where Scala is not installed
exit_status=127
saved_stty=""
 
# restore stty settings (echo in particular)
function restoreSttySettings() {
  stty $saved_stty
  saved_stty=""
}
 
function onExit() {
  if [[ "$saved_stty" != "" ]]; then
    restoreSttySettings
  fi
  exit $exit_status
}
 
# to reenable echo if we are interrupted before completing.
trap onExit INT
 
# save terminal settings
saved_stty=$(stty -g 2>/dev/null)
# clear on error so we don't later try to restore them
if [[ ! $? ]]; then
  saved_stty=""
fi
 
main "$@"
 
# record the exit status lest it be overwritten:
# then reenable echo and propagate the code.
exit_status=$?
onExit

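The case command

The case statement is suited to multi-way branching. In the startup script the value being matched is the output of uname, and the pattern CYGWIN* matches any name that starts with CYGWIN.

        The format of a case statement is:

            case $variable in
              pattern1)
                command list 1
                ;;
              pattern2)
                command list 2
                ;;
              *)
                default command list
                ;;
            esac

        Structural rules of the case statement:

        The case line must end with the word "in", and each pattern must end with a right parenthesis ")".

        A double semicolon ";;" marks the end of a command list.

        A pattern may use square brackets for a range of characters, such as [0-9], and the vertical bar "|" to mean "or".

        The final "*)" is the default pattern: when none of the preceding patterns matches the variable, the command list after "*)" is executed.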
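
The rules above can be sketched in a few lines. The function name `classify` and its labels are made up for illustration and are not part of the Spark script; the sketch assumes bash.

```shell
#!/usr/bin/env bash

# classify is a hypothetical helper, used only to illustrate case patterns.
classify() {
  case "$1" in
    [0-9])    echo "digit" ;;        # bracket range: exactly one character 0-9
    yes|y|Y)  echo "affirmative" ;;  # "|" separates alternative patterns
    CYGWIN*)  echo "cygwin" ;;       # "*" matches any trailing characters
    *)        echo "other" ;;        # default branch when nothing else matched
  esac
}

classify 7
classify y
classify CYGWIN_NT-10.0
classify hello
```

Patterns are tried from top to bottom and only the first match runs, which is why the catch-all `*)` has to come last.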

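The uname command

uname prints the name of the operating system; see man uname for details.

Two commonly used forms:

uname -r prints the kernel release;

uname -a prints all available information.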

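The set command

set [-options] [-o option-name]
The options and their meanings are as follows:

Option	Description
-a	Mark variables that are created or modified for export to the environment
-b	Report the status of terminated background jobs immediately
-d	Disable the hash table that the shell uses to remember the location of commands (documented for old bash versions; not listed in the current bash manual, where set +h has this effect)
-e	Exit the shell immediately if a command returns a non-zero status
-f	Disable wildcard (pathname) expansion
-h	Remember the location of commands as they are looked up
-k	Place all assignment arguments in the environment of a command, not only those that precede the command name
-l	Save and restore the variable of a for loop (documented for old bash versions; not listed in the current bash manual)
-m	Enable monitor mode (job control)
-n	Test mode: read commands but do not execute them
-p	Enable privileged mode
-P	Use the physical file or directory instead of following symbolic links when executing commands such as cd
-t	Exit the shell after reading and executing one command
-u	Treat the use of an unset variable as an error
-v	Print shell input lines as they are read
-H	Enable "!"-style history expansion, so that "!" followed by a command number re-runs that history entry
-x	Print each command and its arguments before executing it
+<option>	Turn off an option that set has turned on; the opposite of -<option>
-o option	Set the option with the given name; most names duplicate the single-letter options above, so they are not listed here. The startup script uses set -o posix to switch bash to POSIX mode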

Other usage:

1. set: initialize the positional parameters
$ cat set-it.sh
#!/bin/bash
set first second third
echo $3 $2 $1

$ ./set-it.sh
third second first
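
As a small sketch of both uses of set, the block below re-initializes the positional parameters inside a function and then toggles a named option. The helper `count_fields` is hypothetical, and bash is assumed.

```shell
#!/usr/bin/env bash

# count_fields is a hypothetical helper: it re-initializes the positional
# parameters from a string and reports how many fields it was split into.
count_fields() {
  # "--" marks the end of options, so fields that start with "-" are safe.
  # $1 is left unquoted on purpose so that it is split on whitespace
  # (glob characters in it would also be expanded).
  set -- $1
  echo "$#"
}

count_fields "first second third"

# Options are switched on with "-o name" and off again with "+o name".
set -o nounset    # same effect as "set -u"
set +o nounset    # back to the default
```

Inside a function, set only replaces that function's own positional parameters, so the caller's `$1`, `$2`, and so on are left alone.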

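The if command

if [ condition ]; then
   statements executed when this condition holds
elif [ condition ]; then
   statements executed when this condition holds
else
   statements executed when none of the conditions above holds
fi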

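# File expressions
if [ -f file ]    true if file exists and is a regular file
if [ -d dir  ]    true if dir exists and is a directory
if [ -s file ]    true if file exists and is not empty
if [ -r file ]    true if file exists and is readable
if [ -w file ]    true if file exists and is writable
if [ -x file ]    true if file exists and is executable

# Integer expressions
if [ int1 -eq int2 ]    true if int1 equals int2
if [ int1 -ne int2 ]    true if int1 is not equal to int2
if [ int1 -ge int2 ]    true if int1 >= int2
if [ int1 -gt int2 ]    true if int1 > int2
if [ int1 -le int2 ]    true if int1 <= int2
if [ int1 -lt int2 ]    true if int1 < int2

# String expressions
if [ "$a" = "$b" ]                true if the two strings are equal (inside [ ] the "=" sign compares, it does not assign)
if [ "$string1" != "$string2" ]   true if string1 is not equal to string2
if [ -n "$string" ]               true if string is non-empty (non-zero length); the exit status is 0 (true)
if [ -z "$string" ]               true if string is empty
if [ "$string" ]                  true if string is non-empty (same as -n)

# Quote the variables: if $string is empty and unquoted, [ -n $string ] becomes [ -n ], which is true.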
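
A short sketch of an if/elif/else chain built from these tests. The function `describe` is hypothetical, and the operands are quoted so that an empty value cannot disappear from the test expression.

```shell
#!/usr/bin/env bash

# describe is a hypothetical helper that classifies its argument using
# string and file tests.
describe() {
  local value="$1"
  if [ -z "$value" ]; then
    echo "empty"
  elif [ -d "$value" ]; then
    echo "directory"
  elif [ -f "$value" ]; then
    echo "regular file"
  else
    echo "string"
  fi
}

describe ""
describe /
describe "no such path here"
```

The branches are checked in order, so the empty-string test comes first; otherwise an empty value would be handed to the file tests.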

$() ${} $(()) $

$()

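$() works like backquotes (` `) and performs command substitution.
For example: $(date "+%Y-%m-%d"; pwd). Multiple commands are separated by ";", and the last command needs no trailing ";".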
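
One practical advantage of $() over backquotes is that it nests without escaping. A minimal sketch, assuming bash; the helper `parent_name` is hypothetical:

```shell
#!/usr/bin/env bash

# parent_name is a hypothetical helper: it prints the name of the directory
# that contains the given path, using one substitution nested in another.
parent_name() {
  basename "$(dirname "$1")"
}

parent_name /opt/spark/bin/spark-shell

# Several commands may share one substitution; their outputs are concatenated
# and trailing newlines are removed from the result.
both=$(echo one; echo two)
echo "$both"
```

Neither dirname nor basename needs the path to exist; both only manipulate the string.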

${}

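${ } performs variable substitution
1. In most cases $var and ${var} are equivalent, but ${ } marks exactly where the variable name ends
Example:

[root@localhost ~]# A=Linux
[root@localhost ~]# echo $AB    # expands the variable AB (unset here, so an empty line is printed)

[root@localhost ~]# echo ${A}B    # expands the variable A, followed by the letter B
LinuxB
2. Extracting the path, file name, and extension
First assign a path to a variable: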
file=/dir1/dir2/dir3/my.file.txt

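Expression    Meaning    Result
${file#*/}    remove the first / and everything to its left    dir1/dir2/dir3/my.file.txt
[root@localhost ~]# echo ${file#*/}
dir1/dir2/dir3/my.file.txt

${file##*/}    remove the last / and everything to its left    my.file.txt
[root@localhost ~]# echo ${file##*/}
my.file.txt

${file#*.}    remove the first . and everything to its left    file.txt
[root@localhost ~]# echo ${file#*.}
file.txt

${file##*.}    remove the last . and everything to its left    txt
[root@localhost ~]# echo ${file##*.}
txt

${file%/*}    remove the last / and everything to its right    /dir1/dir2/dir3
[root@localhost ~]# echo ${file%/*}
/dir1/dir2/dir3

${file%%/*}    remove the first / and everything to its right    (empty)
[root@localhost ~]# echo ${file%%/*}
(empty)

${file%.*}    remove the last . and everything to its right    /dir1/dir2/dir3/my.file
[root@localhost ~]# echo ${file%.*}
/dir1/dir2/dir3/my.file

${file%%.*}    remove the first . and everything to its right    /dir1/dir2/dir3/my
[root@localhost ~]# echo ${file%%.*}
/dir1/dir2/dir3/my
A way to remember these:

# removes from the left (on the keyboard, # is to the left of $)
% removes from the right (on the keyboard, % is to the right of $)
A single symbol is the shortest match; a doubled symbol is the longest match
* matches the characters that are not wanted, that is, the part to remove
The delimiter character written next to * decides where the cut is made, and therefore which part is kept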
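
Put together, these expansions can split a path without calling any external program. The function `split_path` below is a hypothetical illustration, assuming bash:

```shell
#!/usr/bin/env bash

# split_path is a hypothetical helper that prints the directory, the file
# name, the name without its last extension, and the last extension,
# one per line.
split_path() {
  local path="$1"
  local dir="${path%/*}"      # shortest "/..." suffix removed
  local base="${path##*/}"    # longest ".../" prefix removed
  local stem="${base%.*}"     # shortest ".ext" suffix removed
  local ext="${base##*.}"     # longest "*." prefix removed
  printf '%s\n' "$dir" "$base" "$stem" "$ext"
}

split_path /dir1/dir2/dir3/my.file.txt
```

This sketch assumes the path contains at least one / and the file name at least one dot. When a pattern does not match, the expansion returns the value unchanged, so a name without a dot would be reported as its own extension.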

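3. Substrings and replacement

Expression    Meaning    Result
${file:0:5}    extract the leftmost 5 characters    /dir1
${file:5:5}    extract 5 consecutive characters starting after the 5th character    /dir2
${file/dir/path}    replace the first dir with path    /path1/dir2/dir3/my.file.txt
${file//dir/path}    replace every dir with path    /path1/path2/path3/my.file.txt
${#file}    length of the variable's value    27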
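
A brief sketch of substring, replacement, and length in use. The helper `to_env_name` and the property name are made up for illustration. The `${var:offset:length}` and `${var//pattern/replacement}` forms are bash extensions rather than POSIX sh features.

```shell
#!/usr/bin/env bash

# to_env_name is a hypothetical helper: it turns a dotted property name into
# an underscore-separated one by replacing every "." with "_".
to_env_name() {
  local name="$1"
  echo "${name//./_}"
}

to_env_name spark.driver.memory

prop=spark.driver.memory
echo "${prop:0:5}"     # the first five characters
echo "${#prop}"        # the length of the value
```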

$(( ))

1. $(( )) and integer arithmetic
[root@localhost ~]# echo $((2*3))
6
[root@localhost ~]# a=5;b=7;c=2
[root@localhost ~]# echo $((a+b*c))
19
[root@localhost ~]# echo $(($a+$b*$c))
19
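2. Base conversion
$(( )) can display a number written in another base as a decimal number. The usage is:
echo $((N#xx))
Here N is the base and xx is a value written in that base; the command prints that value converted to decimal.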

[root@localhost ~]# echo $((2#110))
6
[root@localhost ~]# echo $((16#2a))
42
[root@localhost ~]# echo $((8#11))
9
3. (( )) can change the value of a variable and compare values
[root@localhost ~]# a=5;b=7
[root@localhost ~]# ((a++))
[root@localhost ~]# echo $a
6
[root@localhost ~]# ((a--));echo $a
5
[root@localhost ~]# ((a<b));echo $?
0
[root@localhost ~]# ((a>b));echo $?
1
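
As a sketch of integer arithmetic in a script, the two hypothetical helpers below format a duration and convert a hexadecimal value. Both assume bash, since the N#xx notation is not available in every sh.

```shell
#!/usr/bin/env bash

# format_duration is a hypothetical helper that converts a number of seconds
# into H:MM:SS using only integer arithmetic.
format_duration() {
  local total=$1
  local h=$(( total / 3600 ))
  local m=$(( (total % 3600) / 60 ))
  local s=$(( total % 60 ))
  printf '%d:%02d:%02d\n' "$h" "$m" "$s"
}

format_duration 3725

# hex_to_dec is likewise hypothetical; it relies on the base#value notation.
hex_to_dec() {
  echo $(( 16#$1 ))
}

hex_to_dec ff
```

Division inside $(( )) truncates toward zero, which is what makes the hour and minute split work without any rounding step.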

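$# is the number of arguments passed to the script
$0 is the name of the script itself
$1 is the first argument passed to the shell script
$2 is the second argument passed to the shell script
$@ is the list of all arguments passed to the script; written as "$@", each argument remains a separate word
$* is all arguments passed to the script shown as a single string when written as "$*"; unlike the positional variables, it is not limited to nine arguments
$$ is the process ID of the shell running the script
$? is the exit status of the last command; 0 means success, any other value means an error

" " ` ` ' '

" " groups text into one word while still expanding variables and command substitutions inside it; ` ` runs the enclosed command and substitutes its output
' ' does no processing: everything inside is taken literally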
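
The three kinds of quotes, and the related difference between "$@" and "$*", can be sketched as follows. The startup script forwards its arguments with "$@". The helper functions here are hypothetical, and bash is assumed.

```shell
#!/usr/bin/env bash

name=Spark

single='Hello $name'                # single quotes: literal text
double="Hello $name"                # double quotes: the variable is expanded
substituted="Hello $(echo $name)"   # the command output is substituted
legacy="Hello `echo $name`"         # backquotes: older spelling of $( )

printf '%s\n' "$single" "$double" "$substituted" "$legacy"

# count_args is a hypothetical helper that reports how many arguments it got.
count_args() { echo "$#"; }

# forward_at and forward_star are hypothetical wrappers that pass their
# arguments on in the two different ways.
forward_at()   { count_args "$@"; }   # each argument stays a separate word
forward_star() { count_args "$*"; }   # all arguments are joined into one word

forward_at   --name "Spark shell"
forward_star --name "Spark shell"
```

Forwarding with "$@" keeps an argument such as "Spark shell" intact as one word, which is presumably why the script uses it when calling spark-submit.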

dirname

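The dirname command prints the directory part of a file path

[root@localnode3 test]# pwd
/home/xinghl/test
[root@localnode3 test]# ll
total 4
-rw-r--r-- 1 root root 27 Feb  17 10:48 test1
[root@localnode3 test]# dirname test1
.
What we actually want is that dot. In Linux, . is the current directory and .. is the parent directory, so cd ./.. enters the parent directory. The startup script changes to the directory that contains the script, moves up one level, and uses pwd to obtain the absolute path of that directory as SPARK_HOME.
Example:
export SPARK_HOME="$(cd "`dirname "$0"`"/..; pwd)"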
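
The same idea can be tried outside Spark. In the sketch below, `home_of` is a hypothetical function that takes the script path as an argument instead of reading $0, and the directory layout is created in a temporary directory purely for illustration.

```shell
#!/usr/bin/env bash

# home_of is a hypothetical helper that mirrors the SPARK_HOME line: given the
# path of a script, it prints the absolute path of the directory one level
# above the directory that contains the script.
home_of() {
  local script_path="$1"
  # The parentheses run cd in a subshell, so the caller's working directory
  # is left unchanged.
  (cd "$(dirname "$script_path")"/.. && pwd)
}

# Example layout, created in a temporary directory for illustration only.
# "pwd -P" resolves symbolic links so the paths compare cleanly later.
demo_root=$(cd "$(mktemp -d)" && pwd -P)
mkdir -p "$demo_root/bin"
touch "$demo_root/bin/spark-shell"

home_of "$demo_root/bin/spark-shell"
```

Using `&&` instead of `;` between cd and pwd is a deliberate variation on the original line: if cd fails, nothing is printed rather than the current directory.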

[ ] [[ ]] ( ) (( ))

For details, see:

https://blog.csdn.net/taiyang1987912/article/details/39551385
