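Linux Command Line and Shell Scripting (17): Regular Expressions

What Is a Regular Expression

  • The regular expression engines in sed and gawk differ. sed understands POSIX basic regular expressions (BRE) by default, while gawk accepts most extended regular expression (ERE) pattern symbols and offers some features that sed lacks. Because of that extra work, gawk is often slower at processing data streams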
$ echo "The book are " | sed -n  '/book/p'
The book are 
$ echo "The book are " | sed -n  '/^book/p'  # anchor: book must appear at the start of the line

$ echo "The book ^are " | sed -n  '/book ^/p' # a ^ that is not at the start of the pattern is an ordinary character
The book ^are 
$ 

$ echo "The book good book" | sed -n  '/good/p'
The book good book
$ echo "The book good book" | sed -n  '/good$/p'  # anchor: good must appear at the end of the line
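Both anchors can be combined so that the pattern has to account for the whole line. A minimal sketch; the function name exact_line and the sample text are made up for illustration:

```shell
# Print only the input lines that consist of exactly "this is a test".
# exact_line is a hypothetical helper name, not part of sed.
exact_line() {
    sed -n '/^this is a test$/p'
}

printf '%s\n' "this is a test" "I said this is a test" | exact_line
```

Only the first line is printed, because the second one has extra text before the pattern.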
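  • Use a regular expression to filter blank lines out of a data stream
$ sed '/^$/d' data  # match lines with nothing between the start and the end, and delete them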
This is an added line.
This is the second added line
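The pattern ^$ only catches lines that are completely empty. Lines that hold nothing but spaces or tabs can be handled too, using a POSIX character class and the asterisk, both covered later in this article. A sketch, with strip_blank as a hypothetical helper name:

```shell
# Delete lines that are empty or contain only whitespace.
# strip_blank is a hypothetical helper name.
strip_blank() {
    sed '/^[[:space:]]*$/d'
}

printf 'one\n\n   \ntwo\n' | strip_blank
```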
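  • The dot matches any single character except a newline. A space counts as a character, which is why " at" on the fourth line of the file below matches .at
$ cat data
This is a test of line.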
The cat is sleeping.
That is a very nice hat.
This test is at line four.
$ sed -n '/.at/p' data
The cat is sleeping.
That is a very nice hat.
This test is at line four.

  • Square brackets [] define a character class
$ cat data
This is a test of line.
The cat is sleeping.
That is a very nice hat.
This test is at line four.
at ten o'clock we'll go home.

$ sed -n '/[ch]at/p' data  # matches cat or hat
The cat is sleeping.
That is a very nice hat.

$ sed -n '/[^ch]at/p' data # matches at preceded by any one character other than c or h
This test is at line four.

$ sed -n '/[c-f]at/p' data  # same as [cdef]at
The cat is sleeping.

$ sed -n '/[a-bh-k]at/p' data # same as [abhijk]at
That is a very nice hat.
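Character classes are a handy way to accept input regardless of capitalization. The sketch below assumes a yes/no prompt; is_yes is a hypothetical helper name:

```shell
# Succeed when the argument is "yes" in any mix of upper and lower case.
# is_yes is a hypothetical helper name.
is_yes() {
    [ -n "$(printf '%s\n' "$1" | sed -n '/^[Yy][Ee][Ss]$/p')" ]
}

for reply in yes YES Yes yesterday no
do
    if is_yes "$reply"
    then
        echo "$reply: accepted"
    else
        echo "$reply: rejected"
    fi
done
```

The anchors keep longer words such as yesterday from slipping through.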


  • Use a regular expression to select the lines that consist of exactly four digits
$ cat data
21234
1000
1001
1002
1003
1004
1005
45321
$ sed -n '/^[0123456789][0123456789][0123456789][0123456789]$/p' data
1000
1001
1002
1003
1004
1005
$ sed -n '/^[0-9][0-9][0-9][0-9]$/p' data  # the same pattern written with ranges
1000
1001
1002
1003
1004
1005
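Repeating [0-9] four times gets unwieldy for longer numbers. In sed's basic regular expressions an interval can be written with escaped braces instead. A sketch, with four_digits as a hypothetical helper name and made-up sample numbers:

```shell
# Print only the lines that consist of exactly four digits.
# \{4\} is the basic-regex interval: repeat the preceding item four times.
# four_digits is a hypothetical helper name.
four_digits() {
    sed -n '/^[0-9]\{4\}$/p'
}

printf '%s\n' 21234 1000 999 1005 | four_digits
```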
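  • Special character classes in regular expressions
$ echo "abc" | sed -n '/[[:digit:]]/p' # matches any digit from 0 to 9; abc has none, so nothing is printed
$ echo "abc" | sed -n '/[[:alpha:]]/p' # matches any letter, upper or lower case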
abc
$ echo "This is, a test" | sed -n '/[[:punct:]]/p' # matches punctuation characters
This is, a test
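The same classes work inside a sed substitution, not only in an address. As an illustration, the sketch below removes every punctuation character from its input; strip_punct is a hypothetical helper name:

```shell
# Remove all punctuation characters from each input line.
# strip_punct is a hypothetical helper name.
strip_punct() {
    sed 's/[[:punct:]]//g'
}

echo "This is, a test!" | strip_punct
```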
  • An asterisk means the preceding character may appear zero or more times
$ echo "ik" | sed -n '/ie*k/p'  # e appears zero times
ik
$ echo "iek" | sed -n '/ie*k/p' # e appears once
iek
$ echo "ieek" | sed -n '/ie*k/p' # e appears several times
ieek

$ echo "baat" | sed -n "/b[ae]*t/p"
baat
$ echo "baaeeet" | sed -n "/b[ae]*t/p"
baaeeet
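Pairing the dot with the asterisk gives .*, which matches any run of characters, including none. That is useful when two words must both appear on a line, in order, with anything in between. A sketch with made-up sample text; between is a hypothetical helper name:

```shell
# Print lines where "regular" is followed, anywhere later on the line,
# by "expression".
# between is a hypothetical helper name.
between() {
    sed -n '/regular.*expression/p'
}

printf '%s\n' "this is a regular pattern expression" "a regular line" | between
```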
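  • A question mark means the preceding character may appear zero times or once. It belongs to the extended pattern symbols, so the examples from here on use gawk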
$ echo "bt" | gawk '/be?t/{print $0}'
bt
$ echo "bet" | gawk '/be?t/{print $0}'
bet
$ echo "beet" | gawk '/be?t/{print $0}'
  • A plus sign means the preceding character must appear one or more times
$ echo "beet" | gawk '/be+t/{print $0}'
beet
$ echo "bet" | gawk '/be+t/{print $0}'
bet
$ echo "bt" | gawk '/be+t/{print $0}'
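The question mark and the plus sign combine naturally with character classes. The sketch below checks whether a string is an integer with an optional sign. It calls awk rather than gawk on the assumption that awk resolves to gawk or another POSIX-compatible awk; is_integer is a hypothetical helper name:

```shell
# Succeed when the argument is an optionally signed run of digits.
# [+-]? allows at most one sign, [0-9]+ requires at least one digit.
# is_integer is a hypothetical helper name.
is_integer() {
    printf '%s\n' "$1" | awk '/^[+-]?[0-9]+$/ { found = 1 } END { exit !found }'
}

for value in 42 -7 +15 4.2 abc
do
    if is_integer "$value"
    then
        echo "$value: integer"
    else
        echo "$value: not an integer"
    fi
done
```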
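  • Braces specify how many times the preceding character must appear. gawk 4.0 and later recognize intervals by default; older versions need the --re-interval option used here
$ echo "bet" | gawk --re-interval '/be{2}t/{print $0}'
$ echo "beet" | gawk --re-interval '/be{2}t/{print $0}' # e must appear exactly twice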
beet

$ echo "beet" | gawk --re-interval '/be{1,2}t/{print $0}'
beet
$ echo "bet" | gawk --re-interval '/be{1,2}t/{print $0}' # e must appear one or two times
bet

  • The pipe symbol lets a pattern offer two or more alternatives
$ echo "The cat is asleep " | gawk '/cat|dog/{print $0}'
The cat is asleep 
$ echo "The dog is asleep " | gawk '/cat|dog/{print $0}'
The dog is asleep 
$ echo "The sheep is asleep " | gawk '/cat|dog/{print $0}'
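One detail worth keeping in mind: the pipe binds more loosely than anchors, so /^cat|dog$/ reads as "starts with cat" or "ends with dog". Wrapping the alternatives in parentheses (described in the next item) applies both anchors to each alternative. A sketch, assuming awk is gawk or another POSIX-compatible awk; loose_match and strict_match are hypothetical helper names:

```shell
# Anchors attach to the nearest alternative only:
# ^cat|dog$ means "starts with cat" OR "ends with dog".
loose_match() {
    awk '/^cat|dog$/'
}

# Grouping makes both anchors apply to every alternative.
strict_match() {
    awk '/^(cat|dog)$/'
}

printf '%s\n' catalog hotdog cat dog | loose_match
echo "---"
printf '%s\n' catalog hotdog cat dog | strict_match
```

The first call prints all four words, the second only cat and dog.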
  • Parentheses group several characters so that special characters such as ?, * and + apply to the whole group
$ echo "Saturday" | gawk '/Sat(urday)?/{print $0}'
Saturday
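Without anchors, /Sat(urday)?/ matches any line that contains Sat, so the optional group does not restrict anything yet. Anchoring the pattern makes it accept exactly the two spellings. A sketch, assuming awk is gawk or another POSIX-compatible awk; is_saturday is a hypothetical helper name:

```shell
# Succeed only for "Sat" or "Saturday"; the group (urday) is optional
# as a whole, and the anchors rule out anything longer.
# is_saturday is a hypothetical helper name.
is_saturday() {
    printf '%s\n' "$1" | awk '/^Sat(urday)?$/ { found = 1 } END { exit !found }'
}

for word in Sat Saturday Saturn
do
    if is_saturday "$word"
    then
        echo "$word: match"
    else
        echo "$word: no match"
    fi
done
```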
  • Count the executable files in the directories listed in $PATH
#!/bin/bash
# Count the executable files in each directory listed in PATH.

# Print the number of executable files under the directory given as $1,
# descending into subdirectories.
function get_file_in_dir() {
    local path=$1
    local absfile result
    local count=0
    for absfile in "$path"/*
    do
        # An empty directory leaves the glob unexpanded; skip that entry
        [ -e "$absfile" ] || continue
        if [ -d "$absfile" ]
        then
            result=$(get_file_in_dir "$absfile")
            count=$(( count + result ))
        elif [ -x "$absfile" ]
        then
            count=$(( count + 1 ))
        else
            # Report on stderr so the message is not captured as the count
            echo "$absfile unknown" >&2
        fi
    done
    echo "$count"
}

# Replace every colon in PATH with a space so the for loop can split it
# (this assumes that no PATH entry contains whitespace)
paths=$(echo "$PATH" | sed 's/:/ /g')
for path in $paths
do
    count=0
    for absfile in "$path"/*
    do
        if [ -d "$absfile" ]    # entry is a directory
        then
            result=$(get_file_in_dir "$absfile")
            count=$(( count + result ))
        elif [ -x "$absfile" ]  # entry is executable
        then
            count=$(( count + 1 ))
        fi
    done
    echo "$path -- $count"
done