Shell Programming Notes ( by quqi99 )

Author: Zhang Hua  Written/Published: 2011-04-06
Copyright: this article may be reproduced freely, but please keep a hyperlink to the original source together with the author information and this copyright notice.

( http://blog.csdn.net/quqi99 )

1 Three ways to call another script from a shell script (fork, exec, source)

       Let's first look at the differences between the three ways of calling another script from within a shell script (fork, exec, source):

  • fork ( /directory/script.sh ): fork is the most common way; you simply invoke script.sh by its path inside your own script. A sub-shell is started to run the called script while the parent shell keeps running, and control returns to the parent shell when the sub-shell finishes. The sub-shell inherits the parent shell's environment variables, but environment variables set in the sub-shell are not carried back to the parent shell.

  • exec (exec /directory/script.sh): unlike fork, exec does not start a sub-shell; the called script runs in the same shell as the parent script. However, once exec calls the new script, nothing after the exec line in the parent script is executed. This is the difference between exec and source.

  • source (source /directory/script.sh): unlike fork, source does not start a sub-shell either; the called script runs in the same shell, so variables and environment variables declared in the called script are available to the calling script afterwards.

You can see the difference between the three calling methods with the following two scripts:

1.sh 

#!/bin/bash
A=B
echo "PID for 1.sh before exec/source/fork:$$"
export A
echo "1.sh: \$A is $A"
case $1 in
       exec)
               echo "using exec..."
               exec ./2.sh ;;
       source)
               echo "using source..."
               . ./2.sh ;;
        *)
               echo "using fork by default..."
               ./2.sh ;;
esac
echo "PID for 1.sh after exec/source/fork:$$"
echo "1.sh: \$A is $A"

2.sh 

#!/bin/bash
echo "PID for 2.sh: $$"
echo "2.sh get \$A=$A from 1.sh"
A=C
export A
echo "2.sh: \$A is $A"
 
Sample run:
$ ./1.sh
PID for 1.sh before exec/source/fork:5845364
1.sh: $A is B
using fork by default...
PID for 2.sh: 5242940
2.sh get $A=B from 1.sh
2.sh: $A is C
PID for 1.sh after exec/source/fork:5845364
1.sh: $A is B
$ ./1.sh exec
PID for 1.sh before exec/source/fork:5562668
1.sh: $A is B
using exec...
PID for 2.sh: 5562668
2.sh get $A=B from 1.sh
2.sh: $A is C
$ ./1.sh source
PID for 1.sh before exec/source/fork:5156894
1.sh: $A is B
using source...
PID for 2.sh: 5156894
2.sh get $A=B from 1.sh
2.sh: $A is C
PID for 1.sh after exec/source/fork:5156894
1.sh: $A is C
$

2 Function calls

First look at an example: a mysqlExec function that executes MySQL statements, as follows:

source "mysql.conf"

mysqlExec(){
  sql=$1
  sqlOp=`echo ${sql:0:6} | tr A-Z a-z`
  if [ "$sqlOp" != "select" ]; then
    sql=$sql";select row_count();"
  fi
  # use a different mysql command depending on whether a password is set
  if [ -z "$MYSQL_PASSWORD" ]
  then
    $mysql $MYSQL_DATABASE -h$MYSQL_HOSTNAME -u$MYSQL_USERNAME -se "${sql};"
  else
    $mysql $MYSQL_DATABASE -h$MYSQL_HOSTNAME -u$MYSQL_USERNAME -p$MYSQL_PASSWORD -se "${sql};"
  fi
  status=$?
  if [ $status -eq 0 ]; then
    if [ "$sqlOp" != "select" ]; then
      log "OK $sql"
    fi
  else
    log "Occur DB Error, can retry in 3 seconds later -> $sql"
    sleep 3
    echo "DB_ERROR"
  fi
  return $status
}

Note the following points about calling functions:
1) A function can echo output, like echo "DB_ERROR" above. To capture the echoed value at the call site, do:
campaign=`mysqlExec "$sql"`
if [ "x$campaign" == "x" -o "$campaign" == "DB_ERROR" ]; then
  continue
fi
2) A function can also have a return value, like return $status above. The caller should read it via $?, e.g.:
if [ $? -eq 0 ]; then
  echo "zhanghua"
fi

3) If you want to return a value from a called function, you can do it like this:

   Call the gen_config function and pass in the name of a variable (a reference, note, not the variable's value): gen_config $host config_file

   Inside gen_config, return the value through the __resultvar variable:

   gen_config() {
         local __resultvar=$2
         eval $__resultvar="'$config'"
   }
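
As a self-contained sketch of this pattern (the host and config values below are illustrative, not from the original):

#!/bin/bash
# gen_config writes its result into the variable whose *name* is passed as $2
gen_config() {
    local host=$1
    local __resultvar=$2                        # name of the caller's variable
    local config="/etc/myapp/${host}.conf"      # hypothetical value to return
    eval $__resultvar="'$config'"               # assign to the caller's variable by name
}

gen_config myhost config_file
echo "$config_file"    # prints /etc/myapp/myhost.conf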

3 Local variables in shell are tricky

You will run into a big shell pitfall here: variables assigned inside a function are global by default, not local (unless they are explicitly declared with local). So if script A calls a function defined in script B, and that function assigns a variable vari (which you might wrongly assume is local if you are used to Java), and script A also has a variable named vari, then when the function returns, the function's vari has overwritten script A's vari. For example:
updateWithOptimisticLock(){
  rand=$1
  campaignId=$2
  seq=1
  updateVal=-1
  status=1
  while [ true ]; do
    if [ $seq -gt 3 ]; then
      log "FATAL ERROR, Update num error, $updatesql"
      break
    fi
    cur=`queryCur "$campaignId"`
    if [ "$cur" == "DB_ERROR" ]; then
      continue
    fi
    updateVal=$(($cur+$rand))
    if [ "$rand" == "0" ]; then
      break
    fi
    updatesql="update t_campaign_ set num=$updateVal where campaign_id=$campaignId and num=$cur"
    affectRows=`mysqlExec "$updatesql"`
    if [ "$affectRows" == "1" ]; then
      status=0
      break
    fi
    seq=$(($seq+1))
  done
  echo $updateVal
  return $status
}
The calling pseudo-code looks like this. The caller's seq variable gets clobbered by the seq variable inside the updateWithOptimisticLock function above, because nothing in a shell function is local by default (see the sketch after this block):
seq=1
while [ $seq -le $cycleNum ]; do
  updateWithOptimisticLock $rand $campaignId
  seq=$(($seq+1))
done
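
The fix is bash's local keyword, which scopes a variable to the function. A minimal sketch (the names here are illustrative):

#!/bin/bash
f() {
    local seq=100        # 'local' keeps this seq private to the function
    seq=$(($seq+1))
}

seq=1
f
echo $seq    # prints 1: the caller's seq is untouched because f declared its own with 'local'
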
4 Passing arguments with xargs
The pipe | in shell is powerful: it feeds the standard output of one command to the standard input of the next. But what if the next command does not read standard input and needs arguments instead? Use xargs. In the shell pipeline below, the -i option of xargs tells it to substitute the previous command's output for the {} placeholder:

find . | xargs zgrep "<URL>/Search?" | sed 's/.*q=\([-_*()~.%+0-9A-Za-z]*\).*/\1/' | sort -nr | uniq -c | sort -nr | head -1000 | xargs -i php -r "echo rawurldecode('{}').\"\n\";" > result.out &

Also, when modifying /etc/hosts and handling multiple hostname entries that begin with 127 or localhost (e.g. ubuntu.me.com  ubuntu), the filter /usr/bin/awk '$1 ~ /^127|localhost/ {print $0}' /etc/hosts is used so that IPv6 entries such as ff02::1 ip6-allnodes are not touched:

/usr/bin/awk '$1 ~ /^127|localhost/ {print $0}' /etc/hosts |awk '$1 ~ /^127|localhost/ {print $0}' | /bin/sed "s/\s*\(${CURRENT_HOSTNAME}\)\(\s*\)/\t${NETCFG_HOSTNAME}/g"

Of course find's -exec can achieve the same thing; the difference is that -exec with \; runs the command once per matched file, whereas xargs batches many arguments into a single invocation, which is usually faster and avoids argument-length limits:
find . -name "*.m4" -exec grep --color -H "catalina" {} \;
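
For comparison, a minimal sketch of the batching forms (assuming GNU find and xargs): -exec ... + batches arguments much like xargs, while -exec ... \; spawns one process per file:

# one grep process per .m4 file
find . -name "*.m4" -exec grep -H "catalina" {} \;
# arguments batched into as few grep invocations as possible (like xargs)
find . -name "*.m4" -exec grep -H "catalina" {} +
# equivalent batching with xargs, robust to spaces in file names
find . -name "*.m4" -print0 | xargs -0 grep -H "catalina"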

5 sed uses a cut-down regex dialect

The regex engine used by sed is not the same as the regex engine we normally use in Java; it is noticeably weaker.

For instance, the |sed 's/.*q=\([-_*()~.%+0-9A-Za-z]*\).*/\1/' in the previous section resorts to that long character-class enumeration precisely because many regex features are not supported there.

Update 2019-11-15

The statement above is not accurate: sed can take the -r option (extended regex) to avoid the enumeration, for example:

grep -r 'get_or_set_cached_cell_and_set_connections' var/log/nova/ |sed -r 's/.+(waited|held) (.+) inner.+/\2/g;t;d' |sort -nr |head -n 5
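
A minimal illustration of the difference (assuming GNU sed): in basic regex (BRE) the grouping and alternation operators must be backslash-escaped, while -r enables extended regex (ERE) where they work directly. Both commands below print 1.234:

echo "held 1.234 inner" | sed 's/.*\(held\|waited\) \(.*\) inner.*/\2/'
echo "held 1.234 inner" | sed -r 's/.*(held|waited) (.*) inner.*/\2/'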

nova service-list --bi nova-compute | grep nova-compute | cut -d ' ' -f 4 | xargs -n 1 -I {} ssh -o StrictHostKeyChecking=no  ubuntu@{} "date; hostname; zgrep MessagingTimeout /var/log/nova/nova-compute.log*; echo -e '-----------------------------\n'"

Arrays

readarray -t cookies<<<"`ls -1 /var/snap/ovs-stat/common/tmp/tmp.662kJWxfEg/juju-4f585d-sf00272961-cisco-7/ovs/bridges/br-int/flowinfo/cookies/| grep -v 6cc04af0837d3b42`"
for c in ${cookies[@]}; do grep -rl $c /var/snap/ovs-stat/common/tmp/tmp.662kJWxfEg/juju-4f585d-sf00272961-cisco-7/ovs/bridges/br-int/flowinfo/tables; done| sort| uniq -c

svcs=(
nova-compute-kvm
neutron-openvswitch
)
for svc in ${svcs[@]}; do
juju config $svc worker-multiplier
done
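
A minimal sketch of the two idioms used above, readarray to build an array from command output and looping over an array, with quoting that also survives values containing spaces:

#!/bin/bash
readarray -t lines <<<"$(ls -1 /etc | head -n 3)"   # one array element per output line
echo "count: ${#lines[@]}"
for l in "${lines[@]}"; do     # quote the expansion to keep each element intact
  echo "item: $l"
done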

Sorting and summing

cat ps.txt | sed 's/.*]\(.*\)/\1/g' | column -t | body sort -n -k4 -r
uid     tgid     total_vm  rss      pgtables_bytes  swapents  oom_score_adj  name
100112  832785   5760963   1650592  16789504        76362     0              beam.smp

cat ps.txt | sed 's/.*]\(.*\)/\1/g' | column -t | body sort -n -k4 -r | awk '{sum+=$4;} END{print sum;}'
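
Note that body above is not a standard command; it is a small helper that keeps the header line in place and applies the given command (here sort) only to the remaining rows. A common definition, assumed here rather than taken from the original, is:

body() {
  IFS= read -r header        # read and reprint the first line untouched
  printf '%s\n' "$header"
  "$@"                       # run the given command on the rest of the input
}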

A rabbitmq counting-and-sorting example

$ cat sos_commands/networking/netstat_-W_-neopa |head -n1
Proto Recv-Q Send-Q Local Address           Foreign Address         State       User       Inode      PID/Program name     Timer
tcp        0      0 127.0.0.53:53           0.0.0.0:*               LISTEN      101        1671174047 178/systemd-resolve  off (0.00/0/0)

$ cat sos_commands/networking/netstat_-W_-neopa| awk '/:5672/ { print $5 }' | awk -F: '{ a[$1]++ }; END { for (i in a) print i, a[i] }' |sort -n -k 2 -r |more |head -n 10
10.164.0.107 69
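
The counting idiom above relies on an awk associative array; a stripped-down sketch of just that part, with made-up input:

# count occurrences of the first ':'-separated field and print "value count"
printf '10.0.0.1:5672\n10.0.0.1:5673\n10.0.0.2:5672\n' | awk -F: '{ a[$1]++ } END { for (i in a) print i, a[i] }' | sort -k2 -nr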

Example - crawling

#!/bin/bash
# Usage: nohup proxychains4 ./crawl.sh > log.txt 2>&1 &
#set -x
[[ -f page.txt ]] && echo 'skip lynx' || lynx -dump ftp://ftp.hycom.org/datasets/force/ncep_cfsr/netcdf/ >page.txt
grep -r "ftp://" page.txt |awk '{print $2}' > urls.txt
readarray -t exist_files<<<"`ls .`"
for url in $(cat urls.txt)
do
  skip="0"
  name=$(echo $url |awk -F '/' '{print $NF}')
  readarray -t exist_files<<<"`ls .`"
  for f in ${exist_files[@]}; do
    [[ "$name" == "$f" ]] && skip="1"; break;
  done
  if [ "$skip" == "1" ]; then
    echo "skipping ${f}";
  else
    echo "download ${url}"; 
    wget -c ${url}
  fi
done

Improved version

#!/bin/bash
# Usage: nohup proxychains4 ./crawl.sh > log.txt 2>&1 &
#set -x
[[ -f page.txt ]] && echo 'skip creating lynx' || lynx -dump http://tds.hycom.org/thredds/catalog/datasets/force/ncep_cfsr/netcdf/catalog.html >page.txt
[[ -f urls.txt ]] && echo 'skip creating urls.txt' || grep -r "http://" page.txt |awk '{print $2}' |grep -E '\.nc$' > urls.txt
sed -i '/dswsfc/d' urls.txt
sed -i '/dlwsfc/d' urls.txt
readarray -t exist_files<<<"`ls . |grep -E '\.nc$'`";  #can't use multiple commands in [[ ]]
[[ -f skip.txt ]] && echo 'will use skip.txt' || printf "%s\n" "${exist_files[@]}" > skip.txt
readarray -t skip_files<<<"`cat skip.txt`"
for url in $(cat urls.txt)
do
  name=$(echo $url |awk -F '/' '{print $NF}')
  realurl='http://tds.hycom.org/thredds/fileServer/datasets/force/ncep_cfsr/netcdf/'$name
  if echo "${skip_files[@]}" | grep -q "$name"; then
    echo "skipping ${name}";
  else
    echo "download ${realurl}"; 
    #wget -c --limit-rate=3m ${realurl}
    wget -c ${realurl}
    #echo ${realurl} >> skip.txt
  fi
done

Getting the absolute directory of a script

export OS_CACERT=$(dirname "$(realpath -s "${BASH_SOURCE[0]}")")/ssl/openstack-ssl/results/cacert.pem
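
A minimal sketch of the same pattern, locating files relative to the script itself regardless of the caller's working directory (the paths are illustrative):

#!/bin/bash
# directory this script lives in, even when the script is invoked via a relative path
SCRIPT_DIR=$(dirname "$(realpath -s "${BASH_SOURCE[0]}")")
echo "would read: ${SCRIPT_DIR}/ssl/openstack-ssl/results/cacert.pem"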

Downloading meteorological data from abroad

#gdisk /dev/sdd1   #t, 0700
#sudo mkfs.ntfs -f -L win /dev/sdd1
#sudo ntfsfix /dev/sdd1
sudo mkfs.ext4 /dev/xvdb
sudo parted /dev/xvdb  #print
sudo mount /dev/xvdb /mnt/
sudo mkdir -p /mnt/ftp
sudo chown -R $USER /mnt/
sudo apt install curlftpfs -y
#sudo fusermount -zu /mnt/ftp
sudo curlftpfs ftp://ftp.hycom.org/datasets/global/GLBa0.08_rect/data/ /mnt/ftp
sudo curlftpfs -o rw,allow_other,uid=1000,gid=1000 ftp://ftp.hycom.org/datasets/global/GLBa0.08_rect/data/ /mnt/ftp
$ scp -i ~/.aws/zhhuabj.pem ubuntu@13.114.59.98:/mnt/ftp/uvel/rarchv.2012_205_00_3zu.nc4 /tmp/
rarchv.2012_205_00_3zu.nc4                                                                                                94%  236MB   7.6MB/s   00:01 ETA

# using direct_io and cache=no to avoid using disk
sudo bash -c 'cat >>/etc/fstab' <<EOF
curlftpfs#@ftp.hycom.org/datasets/global/GLBa0.08_rect/data/ /mnt/ftp fuse defaults,direct_io,cache=no,rw,allow_other,uid=1000,gid=1000,_netdev 0 0
EOF
sudo mount -a

rsync -avztur -e "ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ~/.aws/lxj.pem" --progress ubuntu@3.18.107.xxx:/mnt/ftp .

Note: with direct_io,cache=no added above, the mount always hangs, so remove them.

#debug curlftpfs
sudo fusermount -zu /mnt/ftp/2d
sudo curlftpfs -f -v -o debug,ftpfs_debug=3,rw,allow_other,uid=1000,gid=1000 ftp://ftp.hycom.org/datasets/global/GLBa0.08_rect/data/2d /mnt/ftp/2d
1, use lftp to mirror

# lftp -e "mirror --delete --only-newer --verbose --parallel=2" ftp://ftp.hycom.org/datasets/global/GLBa0.08_rect/data/
lftp -c "set ftp:list-options -a;
open 'ftp://ftp.hycom.org/datasets/global/GLBa0.08_rect/data/';
lcd /home/ubuntu/test;
cd 2d
mirror -c --use-cache --verbose --allow-chown --allow-suid --no-umask --verbose --parallel=2"

2, use wget to mirror

wget -c -m ftp://ftp.hycom.org/datasets/global/GLBa0.08_rect/data/
lynx -dump http://13.59.199.151/new_01hr/ > page.txt
grep -r "http://" page.txt |awk '{print $2}' |grep -E '\.nc$' > urls.txt
aria2c -x 10 -i urls.txt >/dev/null 2>/dev/null &

Running in parallel

Use xargs with -P to run up to N commands concurrently; the snippet below launches $PARALLEL_REQS echo processes in parallel:

token='111'
seq $PARALLEL_REQS | xargs -I {} -n1 -P$PARALLEL_REQS echo "$token"

#!/bin/bash -eux
PARALLEL_REQS=5
SLEEP_SECS=0.1
TEMPLOG=$(mktemp).log
K8S_ENDPOINT='/api'
export CLUSTER_NAME="juju-cluster"
APISERVER=$(kubectl config view -o jsonpath="{.clusters[?(@.name==\"$CLUSTER_NAME\")].cluster.server}")
TOKEN=$(kubectl get secrets -o jsonpath="{.items[?(@.metadata.annotations['kubernetes\.io/service-account\.name']=='default')].data.token}"|base64 --decode)

# trap ctrl-c and call ctrl_c()
trap ctrl_c INT
function ctrl_c() {
  local DEST=query-kubeapiserver.$(date '+%Y%m%d_%H%M').log
  mv $TEMPLOG $DEST
  echo "output at $DEST"
  exit 0
}

set +x
while true; do
  echo "in while loop"
  seq $PARALLEL_REQS | xargs -I {} -n1 -P$PARALLEL_REQS curl -s -X GET $APISERVER$K8S_ENDPOINT --header "Authorization: Bearer $TOKEN" --insecure >> $TEMPLOG 2>&1
# seq $PARALLEL_REQS | xargs -I {} -P$PARALLEL_REQS sudo etcd.etcdctl --cacert /root/cdk/etcd/client-ca.pem --cert /root/cdk/etcd/client-cert.pem --key /root/cdk/etcd/client-key.pem --endpoints=$ETCD_ENDPOINTS get -w json $KEY >> $TEMPLOG 2>&1
  echo "sleeping.."
  /bin/sleep $SLEEP_SECS
  echo "sleep done"
done

vimdiff

vimdiff <(cut -d' ' -f2- var/snap/maas/common/log/rsyslog/maas-enlisting-node/2021-11-11/192.168.32.106) <(cut -d' ' -f2- var/snap/maas/common/log/rsyslog/maas-enlisting-node/2021-11-11/192.168.32.107)
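
The <(...) process-substitution syntax above hands a command's output to vimdiff as if it were a file; a minimal sketch of the same idea (a.txt and b.txt are illustrative):

# diff the sorted contents of two files without creating temporary files
diff <(sort a.txt) <(sort b.txt)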

Extracting data from an awk field

# poll_loop|INFO|wakeup due to [POLLIN] on fd 3 (10.10.5.180:42162<->10.10.5.166:6642) at lib/stream-ssl.c:832 (100% CPU usage)
# sudo tail -f -n0 /var/log/ovn/ovn-controller.log | awk '/wakeup due to/ { system("sleep 5"); system("echo \"over\""); }'
sudo tail -f -n0 /var/log/ovn/ovn-controller.log | awk '/wakeup due to/ { system("sudo /usr/bin/perf record -p `pidof ovn-controller` -g --call-graph dwarf sleep 5"); system("mpstat -P ALL"); system("sudo pkill tail"); }'

Not sure why the following does not work; substr($11,2,length($11)-2) is meant to turn (100% into 100.
# extracting data from an awk field - https://unix.stackexchange.com/questions/468010/awk-extract-string-from-a-field
sudo tail -f -n0 /var/log/ovn/ovn-controller.log | awk '/wakeup due to/ && substr($11,2,length($11)-2)>99 { system("sudo /usr/bin/perf record -p `pidof ovn-controller` -g --call-graph dwarf sleep 5"); system("mpstat -P ALL"); system("sudo pkill tail"); }'
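
A likely explanation (my assumption, not from the original): substr() returns a string, and when awk compares a string that did not come straight from the input with a number, it falls back to a string comparison, so "100" > 99 becomes the lexical comparison "100" > "99", which is false. Forcing a numeric context with +0 should make the filter behave as intended; a simplified sketch:

sudo tail -f -n0 /var/log/ovn/ovn-controller.log | awk '/wakeup due to/ && substr($11,2,length($11)-2)+0 > 99 { system("mpstat -P ALL"); system("sudo pkill tail"); }'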

Single quotes vs double quotes in shell

Use single quotes when special characters (such as $ and command substitutions) must reach the remote side literally instead of being expanded by the local shell:

UNIT=vault/0
rid=certificates:34
juju run -u $UNIT 'for rid in $(relation-ids certificates); do echo $rid; done'
juju run -u $UNIT 'relation-get -r '$rid' - '${UNIT}'' > tmp
juju run -u $UNIT "relation-get -r \$rid - \$UNIT"
cat tmp |~ubuntu/sanity.py
cat tmp |~ubuntu/sanity.py |grep "Not After"
juju scp ~ubuntu/sanity.py $UNIT:/tmp/
juju run -u $UNIT 'for rid in $(relation-ids certificates); do echo $rid; relation-get -r $rid - '${UNIT}' | /tmp/sanity.py | grep "Not After"; done'
juju run -u $UNIT 'for rid in $(relation-ids certificates); do echo $rid; relation-get -r '$rid' - '${UNIT}' | /tmp/sanity.py | grep "Not After"; done'
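
A minimal illustration of the difference, with no juju involved (x is just an example variable): double quotes let the outer shell expand the variable before the command runs, while single quotes pass the text through so the inner shell expands it:

x=outer
bash -c "x=inner; echo $x"   # prints "outer": $x was expanded by the outer shell first
bash -c 'x=inner; echo $x'   # prints "inner": the inner shell expands $x itself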
