PostgreSQL Study Notes: Fuzzy Matching with LIKE, SIMILAR TO, and POSIX Regular Expressions


Major_ZYH


Article link: https://blog.csdn.net/qq_32838955/article/details/105466577


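1. Introduction

PostgreSQL provides three ways to do fuzzy (pattern) matching:

1. The traditional SQL LIKE operator.
2. The SIMILAR TO operator from SQL99.
3. POSIX-style regular expressions.

The pattern-matching function substring is also available.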

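2. The LIKE operator

The traditional LIKE operator is fairly simple: a percent sign **"%"** matches any sequence of zero or more characters, and an underscore **"_"** matches exactly one character: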

postgres=# create table test(id int,note text);
CREATE TABLE
postgres=# insert into test values (1,'abcabeefg'),(2,'abxyz'),(3,'123abe'),(4,'ab_abeefg'),(5,'ab%abefg'),(6,'abcab%fg'),(7,'abcab_fg'),(8,'a%babefg');
INSERT 0 8
postgres=# select * from test ;
 id |   note    
----+-----------
  1 | abcabeefg
  2 | abxyz
  3 | 123abe
  4 | ab_abeefg
  5 | ab%abefg
  6 | abcab%fg
  7 | abcab_fg
  8 | a%babefg
(8 rows)

Fuzzy queries using **"%"** and **"_"**:

postgres=# select * from test where note like 'ab_ab%';
 id |   note    
----+-----------
  1 | abcabeefg
  4 | ab_abeefg
  5 | ab%abefg
  6 | abcab%fg
  7 | abcab_fg
(5 rows)

postgres=# select * from test where note like 'ab__ab%';
 id | note 
----+------
(0 rows)

postgres=# select * from test where note like 'ab___';
 id | note  
----+-------
  2 | abxyz
(1 row)

postgres=# select * from test where note like 'ab%ab%';
 id |   note    
----+-----------
  1 | abcabeefg
  4 | ab_abeefg
  5 | ab%abefg
  6 | abcab%fg
  7 | abcab_fg
(5 rows)

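As the results show, **"%"** matches zero or more characters, while **"_"** matches exactly one character.

If the string to be matched itself contains a percent sign **"%"** or an underscore **"_"**, put the escape character, a backslash **"\"**, in front of that character in the pattern, for example: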

postgres=# select * from test where note like 'ab\%%';
 id |   note   
----+----------
  5 | ab%abefg
(1 row)

postgres=# select * from test where note like 'ab\_%';
 id |   note    
----+-----------
  4 | ab_abeefg
(1 row)

If backslashes are cumbersome, the ESCAPE clause can set the escape character to something else, such as **#**:

postgres=# select * from test where note like 'ab#%%' escape '#';
 id |   note   
----+----------
  5 | ab%abefg
(1 row)

postgres=# select * from test where note like 'ab#_%' escape '#';
 id |   note    
----+-----------
  4 | ab_abeefg
(1 row)

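In addition, the escape character, whether it is the backslash **"\"** or a character chosen with the ESCAPE clause (such as **#**), can be written twice to cancel its special meaning, so that it is matched as an ordinary character.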

postgres=# insert into test values (9,'ab\abe');
INSERT 0 1
postgres=# insert into test values (10,'ab#abe');
INSERT 0 1
postgres=# select * from test where note like 'ab\\%';
 id |  note  
----+--------
  9 | ab\abe
(1 row)

postgres=# select * from test where note like 'ab##%' escape '#';
 id |  note  
----+--------
 10 | ab#abe
(1 row)

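A LIKE expression can be used not only in a WHERE clause but also in a SELECT list, to test whether a string matches a pattern. It returns true (t) on a match and false (f) otherwise: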

postgres=# select 'abcabefg' like '%be%';
 ?column? 
----------
 t
(1 row)

postgres=# select 'abcabefg' like '%hj%';
 ?column? 
----------
 f
(1 row)
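
The LIKE rules in this section can be modeled outside the database. The following is a minimal Python sketch, not PostgreSQL's implementation: the helper names like_to_regex and like are made up for illustration, and the translation assumes that "%" corresponds to the regex ".*", "_" corresponds to ".", and the escape character makes the next character literal.

```python
import re

def like_to_regex(pattern: str, escape: str = "\\") -> str:
    """Translate a LIKE pattern into a Python regex (illustrative model only)."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape:
            # The escape character makes the following character literal.
            i += 1
            if i >= len(pattern):
                raise ValueError("pattern must not end with the escape character")
            out.append(re.escape(pattern[i]))
        elif ch == "%":
            out.append(".*")   # zero or more characters
        elif ch == "_":
            out.append(".")    # exactly one character
        else:
            out.append(re.escape(ch))
        i += 1
    return "".join(out)

def like(value: str, pattern: str, escape: str = "\\") -> bool:
    # LIKE has to cover the whole string, so fullmatch is used instead of search.
    return re.fullmatch(like_to_regex(pattern, escape), value, re.DOTALL) is not None

# The ten rows of the test table used in this article.
notes = ["abcabeefg", "abxyz", "123abe", "ab_abeefg", "ab%abefg",
         "abcab%fg", "abcab_fg", "a%babefg", "ab\\abe", "ab#abe"]

print([n for n in notes if like(n, "ab___")])               # three arbitrary characters after "ab"
print([n for n in notes if like(n, "ab\\%%")])              # literal "%" after "ab"
print([n for n in notes if like(n, "ab#_%", escape="#")])   # literal "_" with a custom escape
print([n for n in notes if like(n, "ab##%", escape="#")])   # doubled escape matches a literal "#"
```

Because the model uses a full-string match, a pattern such as 'be' does not match 'abcabefg', while '%be%' does, which mirrors the behavior of the SELECT examples above.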

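3. SIMILAR TO regular expressions

SIMILAR TO uses the regular expressions defined by the SQL99 standard. SQL-standard regular expressions are a hybrid of LIKE notation and common regular expression notation.

Like LIKE, the SIMILAR TO operator succeeds only if the pattern matches the entire string. This differs from the usual regular expression behavior, where the pattern may match any part of the string. Also like LIKE, SIMILAR TO uses the percent sign and the underscore to match any sequence of characters and any single character, respectively. The backslash serves as the escape character, and a different one can be specified with the ESCAPE clause.

Beyond that, SIMILAR TO supports the following pattern-matching metacharacters, which work as in POSIX regular expressions:

(1) | : alternation; either of the two alternatives may match, similar to "or".
(2) * : repeats the preceding item zero or more times.
(3) + : repeats the preceding item one or more times.
(4) ? : repeats the preceding item zero or one time.
(5) {m} : repeats the preceding item exactly m times.
(6) {m,} : repeats the preceding item m or more times.
(7) {m,n} : repeats the preceding item at least m and at most n times.
(8) Parentheses () : group items into a single logical item.
(9) [...] : specifies a character class, just as in POSIX regular expressions.

Example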

postgres=# select * from test;
 id |   note    
----+-----------
  1 | abcabeefg
  2 | abxyz
  3 | 123abe
  4 | ab_abeefg
  5 | ab%abefg
  6 | abcab%fg
  7 | abcab_fg
  8 | a%babefg
  9 | ab\abe
 10 | ab#abe
(10 rows)

postgres=# select * from test where note similar to 'ab%';
 id |   note    
----+-----------
  1 | abcabeefg
  2 | abxyz
  4 | ab_abeefg
  5 | ab%abefg
  6 | abcab%fg
  7 | abcab_fg
  9 | ab\abe
 10 | ab#abe
(8 rows)

postgres=# select * from test where note similar to '%(123|yz)%';
 id |  note  
----+--------
  2 | abxyz
  3 | 123abe
(2 rows)

postgres=# select * from test where note similar to '%(ab|be)';
 id |   note   
----+----------
  3 | 123abe
  9 | ab\abe
 10 | ab#abe
(3 rows)
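
To make the "LIKE plus regex metacharacters" idea concrete, here is a rough Python model of SIMILAR TO. It is a sketch under stated assumptions rather than PostgreSQL's actual translation: the function names similar_to_regex and similar_to are hypothetical, bracket expressions are copied through with only naive handling, and Python's re module stands in for the database's regex engine.

```python
import re

def similar_to_regex(pattern: str, escape: str = "\\") -> str:
    """Translate a SIMILAR TO pattern into a Python regex (simplified model)."""
    out = []
    in_bracket = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape:
            # Escaped character: always literal.
            i += 1
            if i >= len(pattern):
                raise ValueError("pattern must not end with the escape character")
            out.append(re.escape(pattern[i]))
        elif in_bracket:
            # Inside [...] everything is copied as-is (naive handling).
            out.append(ch)
            if ch == "]":
                in_bracket = False
        elif ch == "[":
            in_bracket = True
            out.append(ch)
        elif ch == "%":
            out.append(".*")      # LIKE-style wildcard: any sequence
        elif ch == "_":
            out.append(".")       # LIKE-style wildcard: one character
        elif ch in "|*+?{}()":
            out.append(ch)        # metacharacters shared with POSIX regexes
        else:
            out.append(re.escape(ch))  # everything else, including ".", is literal
        i += 1
    return "".join(out)

def similar_to(value: str, pattern: str, escape: str = "\\") -> bool:
    # As with LIKE, the pattern has to match the entire string.
    return re.fullmatch(similar_to_regex(pattern, escape), value, re.DOTALL) is not None

notes = ["abcabeefg", "abxyz", "123abe", "ab_abeefg", "ab%abefg",
         "abcab%fg", "abcab_fg", "a%babefg", "ab\\abe", "ab#abe"]

print([n for n in notes if similar_to(n, "%(123|yz)%")])
print([n for n in notes if similar_to(n, "%(ab|be)")])
print(similar_to("abababab", "(ab)+"), similar_to("aaab", "a{3}b"))
```

Note that a plain "." is treated as a literal character in this model, since it is not among the metacharacters listed above.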

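4. POSIX regular expressions

POSIX regular expressions have the following pattern-matching operators:

(1) ~ : matches the regular expression, case sensitive.
(2) ~* : matches the regular expression, case insensitive.
(3) !~ : does not match the regular expression, case sensitive.
(4) !~* : does not match the regular expression, case insensitive.

POSIX regular expressions provide a more powerful means of pattern matching than the LIKE and SIMILAR TO operators: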

postgres=# select * from test where note ~ '\\';
 id |  note  
----+--------
  9 | ab\abe
(1 row)

postgres=# select * from test where note ~ '%';
 id |   note   
----+----------
  5 | ab%abefg
  6 | abcab%fg
  8 | a%babefg
(3 rows)

postgres=# select * from test where note ~ 'abx';
 id | note  
----+-------
  2 | abxyz
(1 row)

postgres=# select * from test where note ~ 'abce';
 id | note 
----+------
(0 rows)

postgres=# select * from test where note ~ 'abca';
 id |   note    
----+-----------
  1 | abcabeefg
  6 | abcab%fg
  7 | abcab_fg
(3 rows)

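Unlike LIKE and SIMILAR TO, a POSIX regular expression returns a result as soon as the pattern matches any part of the string.

To match only at the beginning or the end of the string, use the **"^"** and **"$"** metacharacters: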

postgres=# select * from test where note ~ 'xyz$';
 id | note  
----+-------
  2 | abxyz
(1 row)
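
The difference between partial and whole-string matching can be tried out with a small Python model of the four operators. This is only a sketch: regex_match is a hypothetical helper, and Python's re module is used as a stand-in whose syntax overlaps with, but is not identical to, PostgreSQL's regular expressions.

```python
import re

def regex_match(value: str, pattern: str, ignore_case: bool = False) -> bool:
    """Model of ~ (and ~* when ignore_case is True): true if any part of value matches."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(pattern, value, flags) is not None

notes = ["abcabeefg", "abxyz", "123abe", "ab_abeefg", "ab%abefg",
         "abcab%fg", "abcab_fg", "a%babefg", "ab\\abe", "ab#abe"]

# note ~ 'abca': the pattern may match anywhere inside the string.
print([n for n in notes if regex_match(n, "abca")])

# note ~ 'xyz$' and note ~ '^123': anchors restrict the match to the end or the start.
print([n for n in notes if regex_match(n, "xyz$")])
print([n for n in notes if regex_match(n, "^123")])

# note ~* 'ABX' ignores case, while note !~ 'ab' is simply the negation of ~.
print([n for n in notes if regex_match(n, "ABX", ignore_case=True)])
print([n for n in notes if not regex_match(n, "^ab")])
```

Wrapping a pattern in "^" and "$" gives whole-string behavior comparable to LIKE and SIMILAR TO; without the anchors, a match anywhere in the string is enough.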

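5. The pattern-matching function substring

PostgreSQL has a powerful function, substring, which can use regular expressions. It can be used in three ways:
(1) substring(string, number, number)
As in other languages, this form extracts a substring by position; the first number is the start position, counted from 1, and the second is the number of characters to take: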

postgres=# select * from test;
 id |   note    
----+-----------
  1 | abcabeefg
  2 | abxyz
  3 | 123abe
  4 | ab_abeefg
  5 | ab%abefg
  6 | abcab%fg
  7 | abcab_fg
  8 | a%babefg
  9 | ab\abe
 10 | ab#abe
 11 | abababab
(11 rows)

postgres=# select substring(note,4,3) from test where id=1;
 substring 
-----------
 abe
(1 row)
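
Since positions are counted from 1 here, while many programming languages count from 0, a short Python sketch may help with the conversion. The function name substring_by_position is made up for this illustration, and the treatment of start positions below 1 reflects an assumption about how such calls are clipped to the string.

```python
def substring_by_position(value: str, start: int, count: int) -> str:
    """Model of substring(string, start, count) with a 1-based start position."""
    if count < 0:
        raise ValueError("negative substring length not allowed")
    end = start + count        # 1-based position just past the last wanted character
    begin = max(start, 1)      # positions before 1 do not exist (assumed clipping)
    if end <= begin:
        return ""
    # Convert the 1-based, end-exclusive range to a 0-based Python slice.
    return value[begin - 1:end - 1]

print(substring_by_position("abcabeefg", 4, 3))  # same call as substring(note,4,3) for id=1
print(substring_by_position("abcabeefg", 1, 2))
```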

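(2) substring(string, string)
This form takes two arguments, both strings, and uses a POSIX regular expression.
PostgreSQL has two kinds of regular expressions: SQL regular expressions and POSIX regular expressions. POSIX regular expressions are the standard regular expressions found in most scripting languages, whereas SQL regular expressions follow the LIKE syntax of SQL. In an SQL regular expression the percent sign **"%"** stands for any number of characters, while a POSIX regular expression needs **".*"** for that.

SQL regular expressions also support the following syntax:
(1) | : alternation; either of the two alternatives may match. POSIX regular expressions support this as well.
(2) * : repeats the preceding item zero or more times.
(3) + : repeats the preceding item one or more times.
(4) Parentheses () : group items into a single logical item.
(5) [...] : specifies a character class.

SIMILAR TO also uses SQL regular expressions, whereas **"~"** uses POSIX regular expressions.

SIMILAR TO is true only when the pattern matches the entire string, whereas a POSIX regular expression is true as soon as the string contains a match.

The pattern in the two-string-argument form of substring is a POSIX regular expression: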

postgres=# select substring('asb34-dd',E'\\d+');
 substring 
-----------
 34
(1 row)

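If the pattern contains parentheses "()", this form of substring returns the part matched by the first parenthesized subexpression; otherwise it returns the part matched by the whole pattern.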
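
The role of parentheses in the two-argument form can be illustrated with a few lines of Python. This is a sketch, assuming Python's re module as a stand-in for PostgreSQL's regex engine; substring_regex is a hypothetical name, and None plays the role of SQL NULL when nothing matches.

```python
import re
from typing import Optional

def substring_regex(value: str, pattern: str) -> Optional[str]:
    """Model of substring(string, pattern) with a POSIX-style regular expression."""
    m = re.search(pattern, value)
    if m is None:
        return None                 # no match: the SQL function would return NULL
    if m.re.groups >= 1:
        return m.group(1)           # first parenthesized subexpression
    return m.group(0)               # no parentheses: the whole match

print(substring_regex("asb34-dd", r"\d+"))         # whole match, as in the example above
print(substring_regex("asb34-dd", r"([a-z]+)\d+")) # only the parenthesized letters
print(substring_regex("asb34-dd", r"x+"))          # no match
```

Parentheses therefore act as a way to pick one piece out of a larger match: the surrounding pattern provides context, and only the parenthesized part is returned.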

(3) substring(string, string, string) or substring(string from string for string)
This form of substring uses an SQL regular expression. The third argument specifies the escape character:

postgres=# select substring('asb34-dd','%#"[0-9]+#"%','#');
 substring 
-----------
 34
(1 row)

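In the example, the second string argument of substring is the pattern. The pattern must contain two markers, and the function returns the part of the string that matches the portion of the pattern between the two markers; as with SIMILAR TO, the pattern as a whole must match the entire string. Each marker consists of the escape character followed by a double quote; in this example the marker is #" .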
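
The marker mechanism can be sketched in Python as follows. This is a simplified model, not PostgreSQL's algorithm: substring_similar is a hypothetical name, only "%" and "_" are translated while all other pattern text is passed to Python's re module unchanged, a doubled escape character is not handled, and the parts before and after the markers are assumed to match as little text as possible.

```python
import re
from typing import Optional

def substring_similar(value: str, pattern: str, escape: str) -> Optional[str]:
    """Model of substring(string, pattern, escape) with escape-double-quote markers."""
    marker = escape + '"'
    parts = pattern.split(marker)
    if len(parts) != 3:
        raise ValueError("pattern must contain exactly two markers")

    def translate(part: str, lazy: bool) -> str:
        # Only the LIKE-style wildcards are rewritten in this simplified model.
        out = []
        for ch in part:
            if ch == "%":
                out.append(".*?" if lazy else ".*")
            elif ch == "_":
                out.append(".")
            else:
                out.append(ch)
        return "".join(out)

    before, wanted, after = parts
    regex = "(?:{})({})(?:{})".format(
        translate(before, lazy=True),   # text in front of the wanted part
        translate(wanted, lazy=False),  # the part that is returned
        translate(after, lazy=True),    # text behind the wanted part
    )
    # The pattern as a whole has to cover the entire string.
    m = re.fullmatch(regex, value, re.DOTALL)
    return m.group(1) if m else None

print(substring_similar("asb34-dd", '%#"[0-9]+#"%', "#"))  # the call from the example above
print(substring_similar("asb34-dd", '#"[0-9]+#"%', "#"))   # no leading %, so no full match
```

The second call returns None because, without the leading "%", the pattern would have to match from the very first character of the string.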

To be updated...
