SQL Regular Expressions

万物皆可休

Tags: regular expressions, sql, database

Original link: https://zhuanlan.zhihu.com/p/352635770

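1. Regular expressions

Differences between LIKE and REGEXP [1]
LIKE matches against the entire column value [2]. If the search text merely occurs somewhere inside the column value, LIKE does not find it and the row is not returned (unless wildcards are used).
REGEXP matches within the column value. If the search text occurs anywhere in the column value, REGEXP finds it and the row is returned. This is a very important difference. (With the anchors ^ and $, REGEXP can be made to match the entire column value rather than a part of it.)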
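
The difference can be checked without any table by comparing string literals. The following is a minimal sketch, assuming MySQL 8.0; the sample value 'Zhang San' is made up for illustration.

```sql
-- Compare LIKE and REGEXP on the same literal value.
SELECT
  'Zhang San' LIKE 'San'     AS like_plain,       -- 0: LIKE compares against the whole value
  'Zhang San' LIKE '%San%'   AS like_wildcard,    -- 1: wildcards allow text on either side
  'Zhang San' REGEXP 'San'   AS regexp_plain,     -- 1: REGEXP searches inside the value
  'Zhang San' REGEXP '^San$' AS regexp_anchored;  -- 0: ^ and $ force a whole-value match
```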

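Case sensitivity: regular expression matching in MySQL (since version 3.23.4) is not case-sensitive. To make a match case-sensitive, use the BINARY keyword, as in where post_name REGEXP BINARY 'Hello .000'. The MySQL 8.0 regexp functions also accept a match_type argument, where 'c' requests case-sensitive matching (see section 5).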

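2. LIKE

Wildcards: % matches any sequence of zero or more characters; _ matches exactly one character.

## Find customers whose name contains 腾
select name
from user_id
where name like '%腾%'

## Find customers whose family name is 张
select name
from user_id
where name like '张%'

## Find customers whose name ends with 峰
select name
from user_id
where name like '%峰'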
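
The queries above only use %. As a small sketch of how _ differs from %, the comparisons below run on made-up literals instead of the user_id table:

```sql
-- _ consumes exactly one character; % consumes any run, including an empty one.
SELECT
  'Tom'  LIKE 'T_m' AS one_char,    -- 1: one character sits between T and m
  'Team' LIKE 'T_m' AS two_chars,   -- 0: two characters sit between T and m
  'Team' LIKE 'T%m' AS any_chars,   -- 1: % covers both characters
  'Tm'   LIKE 'T%m' AS zero_chars;  -- 1: % also matches nothing at all
```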

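3. REGEXP

The supported patterns are listed in the table below:

Pattern	What it matches	Example	Meaning
^	Start of the string	select name from table_name where name regexp '^王'	Names whose family name is 王
$	End of the string	select name from table_name where name regexp '明$'	Names whose last character is 明
.	Any single character	select name from table_name where name regexp '.明.'	Names containing 明 with at least one character on each side
[…]	Any one character listed between the brackets	select name from table_name where name regexp '^[wzs]';	Names starting with w, z, or s
[^…]	Any one character not listed between the brackets	select name from table_name where name regexp '^[^wzs]';	Names starting with a character other than w, z, or s
p1|p2|p3	Alternation: any of p1, p2, or p3	select performance from table_name where performance regexp '^(A-|A|A\\+)$';	Values that are exactly A-, A, or A+ (the + must be escaped to be taken literally)
*	The preceding character zero or more times	'str*'	Matches st/str/strr/strrr...
?	The preceding character zero or one time	'str?'	Matches st/str
+	The preceding character one or more times	'str+'	Matches str/strr/strrr/strrrr...
{n}	The preceding character exactly n times	'str{2}'	Matches strr
{m,n}	The preceding character m to n times	'str{1,3}'	Matches str/strr/strrr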
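
The quantifiers in the table can be tried directly on literals. This is a sketch that assumes MySQL 8.0 with the default SQL mode (backslash escapes enabled); each pattern is anchored with ^ and $ so that the whole value has to match, which makes the quantifier's effect visible.

```sql
-- Anchored patterns: the whole literal must fit the pattern.
SELECT
  'st'     REGEXP '^str*$'        AS q_star,      -- 1: zero r's is allowed
  'strr'   REGEXP '^str?$'        AS q_optional,  -- 0: at most one r is allowed
  'st'     REGEXP '^str+$'        AS q_plus,      -- 0: at least one r is required
  'strr'   REGEXP '^str{2}$'      AS q_exact,     -- 1: exactly two r's
  'strrrr' REGEXP '^str{1,3}$'    AS q_range,     -- 0: four r's is more than three
  'A+'     REGEXP '^(A-|A|A\\+)$' AS q_alt;       -- 1: \\+ makes the plus sign literal
```

Without the anchors, every row above would return 1, because REGEXP is satisfied as soon as any part of the value matches.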

Examples:

## Find every name that starts with 'st'
SELECT name
FROM user
WHERE name REGEXP '^st'

## Find every name that ends with 'ok'
SELECT name
FROM user
WHERE name REGEXP 'ok$'

## Find every name that contains 'mar'
SELECT name
FROM user
WHERE name REGEXP 'mar'

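4. NOT REGEXP

NOT REGEXP is the MySQL operator that negates a pattern match. It compares the column value with the given pattern and returns the rows whose value does not match, so it finds the data that the pattern excludes.

## Find names whose family name is not 张
select *
from user
where name not regexp '^张'

## Find names that contain no English letters
select *
from user
where name not regexp '[a-zA-Z]'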
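
One detail worth checking is how NOT REGEXP treats NULL. The sketch below uses made-up literals rather than the user table, and assumes MySQL 8.0 with a case-insensitive default collation:

```sql
-- NOT REGEXP returns 1 for a non-match, 0 for a match, and NULL for NULL input.
SELECT
  'Zhang' NOT REGEXP '^Zh'      AS starts_with_zh,  -- 0: the pattern matches
  'Li'    NOT REGEXP '^Zh'      AS other_name,      -- 1: the pattern does not match
  '12345' NOT REGEXP '[a-zA-Z]' AS no_letters,      -- 1: no letter anywhere
  NULL    NOT REGEXP '^Zh'      AS null_name;       -- NULL: neither a match nor a non-match
```

Because a WHERE clause keeps only rows where the condition is true, rows whose name is NULL are returned by neither REGEXP nor NOT REGEXP.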

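5. The regexp_instr() function

regexp_instr() returns the starting index of the substring that matches the regular expression pattern, or 0 if nothing matches.

REGEXP_INSTR(expr, pat[, pos[, occurrence[, return_option[, match_type]]]])

expr	The source string
pat	The regular expression
pos	Optional. The position at which the search starts. Defaults to 1
occurrence	Optional. Which occurrence of a match to look for. Defaults to 1
return_option	Optional. The kind of position to return. 0 returns the position of the first character of the match; 1 returns the position immediately after the match. Defaults to 0
match_type	Optional. Refines how matching is performed, for example whether it is case-sensitive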

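## Find the position of a match inside a string
select regexp_instr('cat','at') result
Returns: result=2
## Position of 'at' when anchored at the start
select regexp_instr('cat','^at') result
Returns: result=0                          ## expected: 'cat' does not start with 'at', so nothing matches and 0 is returned
select regexp_instr('at','^at') result
Returns: result=1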

## The pos parameter: where the search starts
select regexp_instr('cat cat','cat',2) result
Returns: result=5
select
regexp_instr('cat cat','cat',2) result1,
regexp_instr('cat cat','cat',3) result2,
regexp_instr('cat cat','cat',4) result3
Returns: result1=result2=result3=5
select
REGEXP_INSTR('Cat City is SO Cute!', 'C.t', 1) result1,  ## . stands for any single character
REGEXP_INSTR('Cat City is SO Cute!', 'C.t', 2) result2,
REGEXP_INSTR('Cat City is SO Cute!', 'C.t', 6) result3
Returns: result1=1
      result2=5
      result3=16

## Use REGEXP_SUBSTR() to see which substring was matched
select
REGEXP_SUBSTR('Cat City is SO Cute!', 'C.t', 1) result1,  ## . stands for any single character
REGEXP_SUBSTR('Cat City is SO Cute!', 'C.t', 2) result2,
REGEXP_SUBSTR('Cat City is SO Cute!', 'C.t', 6) result3
Returns: result1=Cat
      result2=Cit
      result3=Cut

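## The occurrence parameter: which occurrence of the match to return
select
REGEXP_INSTR('Cat City is SO Cute!', 'C.t', 1,1) result1,  ## . stands for any single character
REGEXP_INSTR('Cat City is SO Cute!', 'C.t', 1,2) result2,
REGEXP_INSTR('Cat City is SO Cute!', 'C.t', 1,3) result3
Returns: result1=1
      result2=5
      result3=16  ## My reading: pos fixes the search start at the first character, and occurrence then selects the first, second, or third match found from there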

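## The return_option parameter: the kind of position to return. 0 returns the position of the first character of the match; 1 returns the position immediately after the match
select
REGEXP_INSTR('Cat City is SO Cute!', 'C.t',1,1,0) result1,  ## . stands for any single character
REGEXP_INSTR('Cat City is SO Cute!', 'C.t',1,1,1) result2
Returns: result1=1
      result2=4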

SELECT 
  REGEXP_INSTR('Cat City is SO Cute!', 'C.t', 1, 1, 0) 'result1',
  REGEXP_INSTR('Cat City is SO Cute!', 'C.t', 1, 2, 0) 'result2',
  REGEXP_INSTR('Cat City is SO Cute!', 'C.t', 1, 3, 0) 'result3'
UNION ALL
SELECT
  REGEXP_INSTR('Cat City is SO Cute!', 'C.t', 1, 1, 1),
  REGEXP_INSTR('Cat City is SO Cute!', 'C.t', 1, 2, 1),
  REGEXP_INSTR('Cat City is SO Cute!', 'C.t', 1, 3, 1);
## Result: 1, 5, 16; 4, 8, 19
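
Since return_option 0 gives the start of a match and return_option 1 gives the position just after it, subtracting the two yields the length of the match. A small sketch of that idea, assuming MySQL 8.0 and reusing the same sample string (the alias match_length is illustrative):

```sql
-- Length of the second 'C.t' match: position after the match minus its start.
SELECT
  REGEXP_INSTR('Cat City is SO Cute!', 'C.t', 1, 2, 1)
  - REGEXP_INSTR('Cat City is SO Cute!', 'C.t', 1, 2, 0) AS match_length;  -- 8 - 5 = 3
```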

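## The match_type parameter:
An example of case-sensitive versus case-insensitive matching. match_type is the sixth argument, so return_option has to be given as well, and the pattern is written in lowercase so that case makes a difference
select
  REGEXP_INSTR('Cat City is SO Cute!', 'c.t', 1, 1, 0, 'c') 'result1',
  REGEXP_INSTR('Cat City is SO Cute!', 'c.t', 1, 1, 0, 'i') 'result2'
Returns: result1=0
      result2=1                          ## 'c' is case-sensitive, so the lowercase c.t finds nothing; 'i' ignores case, so it matches Cat at position 1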

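6. regexp_like()

This function is used for pattern matching. It tests the given string against the pattern and returns 1 if the string matches, 0 otherwise.

## Form: select regexp_like(expr, pat)

select regexp_like('MCA', 'mca') result
result=1  ## not case-sensitive by default
select regexp_like('MCA', 'bcd') result
result=0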
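
regexp_like() also takes an optional third argument, match_type, which controls case sensitivity explicitly. A minimal sketch, assuming MySQL 8.0 and a case-insensitive default collation for the first column:

```sql
-- The third argument overrides the default case handling.
SELECT
  REGEXP_LIKE('MCA', 'mca')      AS default_match,   -- 1 under a case-insensitive collation
  REGEXP_LIKE('MCA', 'mca', 'i') AS ignore_case,     -- 1: 'i' ignores case
  REGEXP_LIKE('MCA', 'mca', 'c') AS case_sensitive;  -- 0: 'c' compares case exactly
```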

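7. The regexp_replace() function

Used for pattern matching. It replaces the parts of the given string that match the pattern.

regexp_replace(str, 'old string', 'new string')
## Replace
select
regexp_replace('zhangsan','san','si') result
result=zhangsi

select
regexp_replace('Java','Java','Mysql') result
result=Mysql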
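
By default every match is replaced. In MySQL 8.0 the function also accepts optional pos and occurrence arguments after the replacement string, which limit the replacement to one particular match. A sketch on a made-up literal:

```sql
-- With no occurrence given, all matches are replaced; occurrence 2 replaces only the second.
SELECT
  REGEXP_REPLACE('san zhangsan', 'san', 'si')       AS replace_all,     -- 'si zhangsi'
  REGEXP_REPLACE('san zhangsan', 'san', 'si', 1, 2) AS replace_second,  -- 'san zhangsi'
  REGEXP_REPLACE('san zhangsan', '^san', 'si')      AS replace_prefix;  -- 'si zhangsan'
```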

8. The regexp_substr() function

Used for pattern matching. It returns the substring of the given string that matches the pattern.

## Syntax
select regexp_substr(expr, pat[, pos[, occurrence[, match_type]]])

select
REGEXP_SUBSTR('Cat City is SO Cute!', 'C.t', 1) result1,  ## . stands for any single character
REGEXP_SUBSTR('Cat City is SO Cute!', 'C.t', 2) result2,
REGEXP_SUBSTR('Cat City is SO Cute!', 'C.t', 6) result3
Returns: result1=Cat
      result2=Cit
      result3=Cut
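
The occurrence argument works here the same way as in regexp_instr(). A short sketch, assuming MySQL 8.0, that picks words out of a made-up literal and shows what comes back when nothing matches:

```sql
-- Extract the first and the third run of lowercase letters; no match yields NULL.
SELECT
  REGEXP_SUBSTR('abc def ghi', '[a-z]+')       AS first_word,  -- 'abc'
  REGEXP_SUBSTR('abc def ghi', '[a-z]+', 1, 3) AS third_word,  -- 'ghi'
  REGEXP_SUBSTR('abc def ghi', '[0-9]+')       AS no_match;    -- NULL
```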

9. Other

a. To match a special character literally, escape it with the \ escape character
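
In practice the backslash has to be doubled inside a MySQL string literal, because the string parser consumes one backslash before the regular expression engine sees the pattern. A minimal sketch, assuming MySQL 8.0 with the default SQL mode (NO_BACKSLASH_ESCAPES not enabled):

```sql
-- 'a\\.b' reaches the regexp engine as a\.b, which matches a literal dot only.
SELECT
  'axb' REGEXP 'a.b'   AS unescaped_dot,  -- 1: . matches any character, including x
  'axb' REGEXP 'a\\.b' AS escaped_miss,   -- 0: x is not a literal dot
  'a.b' REGEXP 'a\\.b' AS escaped_hit;    -- 1: the literal dot matches
```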