Python and shell regular expressions

H2223

Categories: python, shell, security. Tags: linux, centos, operations

Article link: https://blog.csdn.net/H2223/article/details/125872763

shell

Basics

[root@localhost ~]# mkdir b.txt
[root@localhost ~]# mkdir a.txt
[root@localhost ~]# ls ?.txt
a.txt:

b.txt:

[root@localhost ~]# mkdir bb.txt
[root@localhost ~]# mkdir cc.txt
[root@localhost ~]# ls ??.txt
bb.txt:

cc.txt:

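? matches exactly one character.

Each ? stands for one character, so ??.txt matches only names with exactly two characters before .txt. (The names here were created with mkdir, so they are directories, which is why ls prints each one followed by a colon.)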

[root@localhost ~]# mkdir cc.txt
[root@localhost ~]# mkdir bb.txt
[root@localhost ~]# mkdir ab.txt
[root@localhost ~]# mkdir a.txt
[root@localhost ~]# mkdir b.txt
[root@localhost ~]# ls *.txt
ab.txt:

a.txt:

bb.txt:

b.txt:

cc.txt:

[root@localhost ~]# ls a*.txt
ab.txt:

a.txt:

* matches any number of any characters, including none: a*.txt also matches a.txt, where nothing follows the a.
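The behaviour of ? and * can also be reproduced outside the shell. The following is a minimal sketch using Python's standard fnmatch module; the list of names is an assumption that mirrors the directories created above, and no real files are touched.

```python
import fnmatch

# Hypothetical file names mirroring the directories created above.
names = ["a.txt", "b.txt", "ab.txt", "bb.txt", "cc.txt"]

one_char = fnmatch.filter(names, "?.txt")    # exactly one character before .txt
two_chars = fnmatch.filter(names, "??.txt")  # exactly two characters before .txt
a_prefix = fnmatch.filter(names, "a*.txt")   # "a", then zero or more characters

print(one_char)   # ['a.txt', 'b.txt']
print(two_chars)  # ['ab.txt', 'bb.txt', 'cc.txt']
print(a_prefix)   # ['a.txt', 'ab.txt']
```

fnmatch.filter keeps the order of the input list, whereas ls sorts its output according to the locale.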

[root@localhost ~]# ls [ab].txt
a.txt:

b.txt:

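[] : matches any single character listed inside the brackets.

Strictly speaking, the patterns in this section are shell wildcards (globs), not regular expressions. A bracket expression stands for exactly one character, so [ab].txt does not match ab.txt.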

[root@localhost ~]# touch aaa bbb aba
[root@localhost ~]# ls ?[!a]?
aba  bbb

! or ^ as the first character inside the brackets negates the set: [!a] matches any single character except a.
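As a rough cross-check, the same bracket patterns can be tried with Python's fnmatch module. The names below are assumptions taken from the examples above. Note that fnmatch documents only the [!seq] form for negation, so the sketch avoids [^seq].

```python
import fnmatch

# Hypothetical names taken from the shell sessions above.
names = ["aaa", "bbb", "aba", "a.txt", "b.txt", "ab.txt"]

# [ab] stands for exactly one character, so ab.txt is not matched.
in_set = fnmatch.filter(names, "[ab].txt")

# Three characters in total, and the middle one must not be "a".
middle_not_a = fnmatch.filter(names, "?[!a]?")

print(in_set)        # ['a.txt', 'b.txt']
print(middle_not_a)  # ['bbb', 'aba']
```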

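[start-end] ranges

Bracket expressions have a shorthand, [start-end], that matches one character from a contiguous range. For example, [a-d] is equivalent to [abcd], and [0-9] is equivalent to [0123456789].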

[root@localhost ~]# ls [a-z].txt
a.txt:

b.txt:
[root@localhost ~]# ls [a]*.txt
ab.txt:

a.txt:

[a]* : matches names that begin with a, so [a]*.txt matches names that begin with a and end with .txt.
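Ranges can be sketched the same way. The helper glob_filter and the list of names are hypothetical; fnmatch.fnmatchcase is used so that the comparison is case-sensitive on every platform.

```python
import fnmatch

# Hypothetical names; no real files are involved.
names = ["a.txt", "b.txt", "ab.txt", "1.txt", "Z.txt"]

def glob_filter(pattern):
    """Return the names that match the glob pattern (hypothetical helper)."""
    return [name for name in names if fnmatch.fnmatchcase(name, pattern)]

print(glob_filter("[a-z].txt"))  # ['a.txt', 'b.txt']
print(glob_filter("[0-9].txt"))  # ['1.txt']
print(glob_filter("[a]*.txt"))   # ['a.txt', 'ab.txt']
```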

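Brace expansion

Braces of the form {a,b,c} expand into one word for each comma-separated value. Unlike the wildcards above, brace expansion does not look at existing file names.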

[root@localhost ~]# echo d{a,b,c,d,e}g
dag dbg dcg ddg deg
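To make the expansion rule concrete, here is a small Python sketch of it. The function expand_braces is a hypothetical helper written for illustration: it handles only simple comma-separated groups, not nested braces or {1..5} sequences, and it is not how the shell implements the feature.

```python
import re

def expand_braces(word):
    """Expand simple {a,b,c} groups in word, leftmost group first."""
    match = re.search(r"\{([^{}]*,[^{}]*)\}", word)
    if match is None:
        return [word]  # nothing left to expand
    prefix, suffix = word[:match.start()], word[match.end():]
    results = []
    for option in match.group(1).split(","):
        # Recurse so that any later groups in the word are expanded too.
        results.extend(expand_braces(prefix + option + suffix))
    return results

print(expand_braces("d{a,b,c,d,e}g"))  # ['dag', 'dbg', 'dcg', 'ddg', 'deg']
print(expand_braces("{a,b}{1,2}"))     # ['a1', 'a2', 'b1', 'b2']
```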

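Character classes that describe sets of characters (written inside brackets, for example [[:digit:]]):

[:alnum:]	letters and digits
[:alpha:]	letters (upper and lower case)
[:blank:]	space and tab
[:digit:]	digits
[:lower:]	lowercase letters
[:upper:]	uppercase letters
[:punct:]	punctuation characters
[:space:]	all whitespace, including newline and carriage return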
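Python's re module does not support the [:name:] notation, so the classes have to be spelled out there. The sketch below builds approximate ASCII-only equivalents from the string module; the POSIX_CLASSES table, the class_pattern helper, and the sample text are all assumptions made for illustration.

```python
import re
import string

# Assumed ASCII-only equivalents of the classes listed above.
POSIX_CLASSES = {
    "alnum": string.ascii_letters + string.digits,
    "alpha": string.ascii_letters,
    "blank": " \t",
    "digit": string.digits,
    "lower": string.ascii_lowercase,
    "upper": string.ascii_uppercase,
    "punct": string.punctuation,
    "space": string.whitespace,
}

def class_pattern(name):
    """Return a Python bracket expression roughly equivalent to [[:name:]]."""
    return "[" + re.escape(POSIX_CLASSES[name]) + "]"

text = "CentOS 7, kernel 3.10!"
print(re.findall(class_pattern("upper"), text))        # ['C', 'O', 'S']
print(re.findall(class_pattern("digit") + "+", text))  # ['7', '3', '10']
print(re.findall(class_pattern("punct"), text))        # [',', '.', '!']
```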

Regular expressions with the grep command

[root@localhost ~]# vim demo1
[root@localhost ~]# cat demo1
ggle
gogle
google
gooooooooooooooooooooooooooooooooooooooogle

Write the four lines above into demo1.

[root@localhost ~]# grep "g[o]*gle" demo1
ggle
gogle
google
gooooooooooooooooooooooooooooooooooooooogle

[root@localhost ~]# grep "g[o].*gle" demo1
gogle
google
gooooooooooooooooooooooooooooooooooooooogle

[root@localhost ~]# grep "g[o]\?gle" demo1
ggle
gogle


[root@localhost ~]# grep "g[o]\+gle" demo1
gogle
google
gooooooooooooooooooooooooooooooooooooooogle

[root@localhost ~]# grep "g[o]\{1,2\}gle" demo1
gogle
google

[root@localhost ~]# grep ^g demo1
ggle
gogle
google
gooooooooooooooooooooooooooooooooooooooogle

[root@localhost ~]# grep e$ demo1
ggle
gogle
google
gooooooooooooooooooooooooooooooooooooooogle

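* : matches the preceding expression zero or more times

[] : matches any single character listed inside the brackets

. : matches any single character, exactly once

.* : any character, any number of times, including zero

In the g[o].*gle example above, gogle is the case where .* matches zero characters.

\? : matches the preceding expression zero or one time

\+ : matches the preceding expression one or more times

\{x,y\} : matches the preceding expression x to y times

\{2,\} : matches the preceding expression two or more times

^g : matches lines that begin with g

e$ : matches lines that end with e

\? and \+ are GNU extensions to basic regular expressions. With grep -E, the operators ?, + and {x,y} are written without the backslashes.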
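For comparison, the grep examples above can be approximated in Python, where the quantifiers are written without backslashes. This is a sketch under assumptions: the lines list stands in for demo1 (with a shortened last line), and grep_lines is a hypothetical helper that mimics grep's "match anywhere in the line" behaviour with re.search.

```python
import re

# Stand-in for demo1; the long last line is shortened here.
lines = ["ggle", "gogle", "google", "gooooogle"]

def grep_lines(pattern):
    """Return the lines in which pattern matches somewhere (hypothetical helper)."""
    return [line for line in lines if re.search(pattern, line)]

print(grep_lines(r"go*gle"))      # ['ggle', 'gogle', 'google', 'gooooogle']
print(grep_lines(r"go.*gle"))     # ['gogle', 'google', 'gooooogle']
print(grep_lines(r"go?gle"))      # ['ggle', 'gogle']
print(grep_lines(r"go+gle"))      # ['gogle', 'google', 'gooooogle']
print(grep_lines(r"go{1,2}gle"))  # ['gogle', 'google']
print(grep_lines(r"^g.*e$"))      # ['ggle', 'gogle', 'google', 'gooooogle']
```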

Python regular expressions

Single-character matching

import re
# Raw strings (r'...') keep backslashes such as \d from being treated as string escapes.
a = 'gc++8g354fpythonsg435ffgdg3453javascript*&%(@#'
result = re.findall(r'\d+', a)   # ['8', '354', '435', '3453']
result1 = re.findall(r'\d', a)   # ['8', '3', '5', '4', '4', '3', '5', '3', '4', '5', '3']
result2 = re.findall(r'\d*', a)  # ['', '', '', '', '8', '', '354', '', '', '', '', '', '', '', '', '', '435', '', '', '', '', '', '3453', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
result3 = re.findall(r'\d?', a)  # ['', '', '', '', '8', '', '3', '5', '4', '', '', '', '', '', '', '', '', '', '4', '3', '5', '', '', '', '', '', '3', '4', '5', '3', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
result4 = re.findall(r'[0-9]', a)  # ['8', '3', '5', '4', '4', '3', '5', '3', '4', '5', '3']
result5 = re.findall(r'\w*', a)  # ['gc', '', '', '8g354fpythonsg435ffgdg3453javascript', '', '', '', '', '', '', '']

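[] : matches any one of the characters inside the brackets; for example, [0-9] matches any digit

\d : matches any digit

\w : matches any word character (letters, digits, and the underscore)

\s : matches any whitespace character (space, tab, newline, and so on)

* : matches the preceding item zero or more times

+ : matches the preceding item one or more times

? : matches the preceding item zero or one time

Because \d*, \d? and \w* can match zero characters, findall returns an empty string at every position where nothing else matches.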
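The example string above contains no whitespace or underscores, so a short extra sketch may help show what \w and \s cover. The sample text is an assumption chosen to include an underscore, a tab, and a newline.

```python
import re

# Hypothetical sample containing an underscore, a tab, and a newline.
text = "user_1 logged in\tat 10:42\n"

words = re.findall(r"\w+", text)  # runs of letters, digits, and underscores
spaces = re.findall(r"\s", text)  # each whitespace character separately

print(words)   # ['user_1', 'logged', 'in', 'at', '10', '42']
print(spaces)  # [' ', ' ', '\t', ' ', '\n']
```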

import re

s = 'abc adc aec afc agc ahc abb'
result = re.findall('a[^cf]c',s)    #['abc', 'adc', 'aec', 'agc', 'ahc']

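A ^ at the start of a bracket expression negates it: [^cf] matches any single character other than c or f. Unlike in shell wildcards, ! has no special meaning inside brackets in Python regular expressions.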

import re

a = 'python  13432 java 342 node'
result = re.findall('[a-z]{3,6}',a)      #['python', 'java', 'node']
result1 = re.findall('[a-z]{3,6}?',a)   #['pyt', 'hon', 'jav', 'nod']

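{x,y} : matches the preceding item x to y times

Quantifiers are greedy by default, so without ? the pattern takes as many characters as it can, up to 6.

With a trailing ? the quantifier becomes lazy and takes as few as it can, here 3.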
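The difference between greedy and lazy matching is often easiest to see with delimiters. A minimal sketch, using a made-up HTML fragment:

```python
import re

# Hypothetical HTML fragment used only for illustration.
html = "<b>bold</b><i>italic</i>"

greedy = re.findall(r"<.+>", html)  # .+ runs on to the last ">" in the string
lazy = re.findall(r"<.+?>", html)   # .+? stops at the first ">" it can

print(greedy)  # ['<b>bold</b><i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```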

Assertions

x(?=y) : matches x only if x is followed by y

(?<=y)x : matches x only if x is preceded by y

x(?!y) : matches x only if x is not followed by y

(?<!y)x : matches x only if x is not preceded by y
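The four assertions can be sketched side by side as follows. The two sample strings are assumptions; the \b word boundaries in the negative cases are there so that the engine cannot satisfy the assertion by matching only part of a number.

```python
import re

# Hypothetical sample strings.
after = "100 USD, 200 EUR, 300 USD"   # currency code after the number
before = "USD 100, EUR 200, USD 300"  # currency code before the number

followed = re.findall(r"\d+(?= USD)", after)           # x(?=y)
not_followed = re.findall(r"\d+\b(?! USD)", after)     # x(?!y)
preceded = re.findall(r"(?<=USD )\d+", before)         # (?<=y)x
not_preceded = re.findall(r"\b(?<!USD )\d+", before)   # (?<!y)x

print(followed)      # ['100', '300']
print(not_followed)  # ['200']
print(preceded)      # ['100', '300']
print(not_preceded)  # ['200']
```

In every case only the digits are returned; the text inside the assertion is checked but not consumed.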

import re

a = '<a target=_blank href="www.baidu.com">百度一下</a>百度知道'    # an <a> tag; the goal is to extract www.baidu.com
result = re.search('(?<=(href=")).{1,200}(?=(">))',a)   #<re.Match object; span=(23, 36), match='www.baidu.com'>

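(?<=(href=")) : a lookbehind; the match must be immediately preceded by href=", which is not part of the result

(?=(">)) : a lookahead; the match must be immediately followed by ">, which is not part of the result either

Together they extract www.baidu.com.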

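Groups: group(0) is the entire match

group(n) : the text captured by the n-th pair of parentheses

Putting ?: at the start of a group, as in (?:...), makes it non-capturing.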
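A short sketch of how group numbering works, including a non-capturing group. The pattern here is an illustrative assumption, not the one used in the lookaround example above.

```python
import re

tag = '<a target=_blank href="www.baidu.com">'

# Group 1: the whole host name. (?:www\.)? is non-capturing, so it gets no number.
# Group 2: the name between the optional "www." and ".com".
m = re.search(r'href="((?:www\.)?([a-z]+)\.com)"', tag)

print(m.group(0))  # href="www.baidu.com"
print(m.group(1))  # www.baidu.com
print(m.group(2))  # baidu
print(m.groups())  # ('www.baidu.com', 'baidu')
```

Groups are numbered by the position of their opening parenthesis, counting only the capturing ones.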
