PostgreSQL 大小写敏感的那些事

最新推荐文章于 2024-03-18 15:54:00 发布

qq_58251011

最新推荐文章于 2024-03-18 15:54:00 发布

阅读量4.9k

点赞数 2

原文链接：https://my.oschina.net/postgresqlchina/blog/5073939

版权

PostgreSQL和Oracle一样，默认都是大小写不敏感的，但两者仍然存在区别：

Oracle：默认是大小写不敏感，表名、字段名等不区分大小写，小写字母会自动转换为大写字母；需要用小写字母时需要使用双引号，或借助函數upper()和lower()；
PostgreSQL：默认是大小写不敏感，表名、字段名等不区分大小写，大写字母会自动转换为小写字母；需要用大写字母时需要使用双引号，或借助函數upper()和lower(www.rsxedu.com)；
1、表、列名中的大小写敏感
例如我们创建表test，表名写成test、Test、TEST结果均是一样的：

bill@bill=>create table test(Id int,INFO text);
CREATE TABLE
bill@bill=>select id,info from test;
id | info
----±-----
(0 rows)
那么如果我们想要指定表或者列名为大写该怎么办呢？

使用双引号即可。

但是可以看到这种方法也很麻烦，因为我们需要查询的时候也要用双引号，所以建议不要这么去使用。

bill@bill=>create table ‘TEST’(id int,info text);
CREATE TABLE
bill@bill=>select * from test;
ERROR: relation ‘test’ does not exist
LINE 1: select * from test;
^
bill@bill=>select * from TEST;
ERROR: relation ‘test’ does not exist
LINE 1: select * from TEST;
^
bill@bill=>select * from ‘TEST’;
id | info
----±-----
(0 rows)

2、查询数据中的大小写敏感
当我们进行数据匹配查询时，是区分大小写的。

例如：

bill@bill=>insert into test values(1,‘Bill’);
INSERT 0 1
bill@bill=>select * from test where info = ‘bill’;
id | info
----±-----
(0 rows)
bill@bill=>select * from test where info = ‘Bill’;
id | info
----±-----
1 | Bill
(1 row)
这主要是因为pg查询中是否区分大小写取决于操作符，例如我们上面例子中用到的=号。

Datum
texteq(PG_FUNCTION_ARGS)
{
Oid collid = PG_GET_COLLATION();
bool result;
check_collation_set(collid);
if (lc_collate_is_c(collid) ||www.meimeitu8.com
collid == DEFAULT_COLLATION_OID ||
pg_newlocale_from_collation(collid)->deterministic)
{
Datum arg1 = PG_GETARG_DATUM(0);
Datum arg2 = PG_GETARG_DATUM(1);
Size len1,
len2;
/*
* Since we only care about equality or not-equality, we can avoid all
* the expense of strcoll() here, and just do bitwise comparison. In
* fact, we don’t even have to do a bitwise comparison if we can show
* the lengths of the strings are unequal; which might save us from
* having to detoast one or both values.
*/
len1 = toast_raw_datum_size(arg1);
len2 = toast_raw_datum_size(arg2);
if (len1 != len2)
result = false;
else
{
text *targ1 = DatumGetTextPP(arg1);
text *targ2 = DatumGetTextPP(arg2);
result = (memcmp(VARDATA_ANY(targ1), VARDATA_ANY(targ2),
len1 - VARHDRSZ) == 0);
PG_FREE_IF_COPY(targ1, 0);
PG_FREE_IF_COPY(targ2, 1);
}
}
else
{
text *arg1 = PG_GETARG_TEXT_PP(0);
text *arg2 = PG_GETARG_TEXT_PP(1);
result = (text_cmp(arg1, arg2, collid) == 0);
PG_FREE_IF_COPY(arg1, 0);
PG_FREE_IF_COPY(arg2, 1);
}
PG_RETURN_BOOL(result);
}
可以看到，就是直接使用memcmp函数对字符直接进行比较，自然是会去区分大小写。所以想要不区分大小写除非我们数据写入的时候就不区分大小写。

因此我们可以使用citext模块来实现忽略大小写的查询：

bill@bill=>create extension citext ;
CREATE EXTENSION
bill@bill=>create table test(id int,info citext);
CREATE TABLE
bill@bill=>insert into test values(1,‘Bill’);
INSERT 0 1
bill@bill=>select * from test where info = ‘bill’;
id | info
----±-----
1 | Bill
(1 row)
3、数据排序中的大小写敏感
排序也是Oracle一样，默认是区分大小写的。

例如：

bill@bill=>create table test (c1 text);
CREATE TABLE
bill@bill=>insert into test values (‘a’),(‘b’),(‘c’),(‘A’),(‘B’),(‘C’);
INSERT 0 6
bill@bill=>select * from test order by c1;
c1

A
B
C
a
b
c
(6 rows)
不过从pg12开始支持不区分大小写，或者区分大小写的排序的collate。

CREATE COLLATION [ IF NOT EXISTS ] name (
[ LOCALE = locale, ]
[ LC_COLLATE = lc_collate, ]
[ LC_CTYPE = lc_ctype, ]
[ PROVIDER = provider, ]
[ DETERMINISTIC = boolean, ]
[ VERSION = version ]
)
CREATE COLLATION [ IF NOT EXISTS ] name FROM existing_collation
其中几个关键参数：

PROVIDER：指定用于与此排序规则相关的区域服务的提供程序。可能的值是: icu、libc。默认是libc。但若要设置大小写不敏感，目前只支持icu。
DETERMINISTIC：设置成not deterministic表示大小写不敏感。
例子：

bill@bill=> CREATE COLLATION case_insensitive (provider = icu, locale = ‘zh_Hans’, deterministic = false);
CREATE COLLATION
bill@bill=> select * from test order by c1 collate ‘case_insensitive’;
c1

a
A
b
B
c
C
(6 rows)
说到collate，不知道大家有没有遇过这种情况：

bill@bill=>select ‘a’ > ‘A’ collate ‘en_US’;
?column?

f
(1 row)
bill@bill=>select ‘a’ > ‘A’ collate ‘C’;
?column?

t
(1 row)
使用不同的collate字符串进行比较的结果竟然不一样。
因为上述字符串比较使用的是text_gt函数。当collate为C时，就是使用memcmp直接对被比较的字符串的比较，而collate为其它时，则会使用使用strcoll_l或strcoll进行比较。

/*
* Unfortunately, there is no strncoll(), so in the non-C locale case we
* have to do some memory copying. This turns out to be significantly
* slower, so we optimize the case where LC_COLLATE is C. We also try to
* optimize relatively-short strings by avoiding palloc/pfree overhead.
*/
同理，我们使用索引查询时也需要注意：创建索引时用的collate和查询时的collate一致才可以使用到索引。

4、其它场景下的大小写敏感
当我们使用数组类型时，是否区分大小写呢？

bill@bill=>select ‘a’ = any(array[‘A’,‘1’]);
?column?

f
(1 row)
可以看到和普通的字符串查询一样有区分大小写，那么我们该怎么去忽略大小写呢？

和字符串不同，我们没办法数据进行lower操作：

bill@bill=>select ‘a’ = lower(any(array[‘A’,‘1’]));
ERROR: syntax error at or near ‘any’
LINE 1: select ‘a’ = lower(any(array[‘A’,‘1’]));
^

不过我们可以自己写一个函数来将数据中的元素转换为小写。

bill@bill=>create or replace function lower(text[]) returns text[] as $KaTeX parse error: Can't use function '$' in math mode at position 6: bill$̲# select array_\dots$ language sql strict immutable;
CREATE FUNCTION
bill@bill=>select lower(array[‘A’,‘a’]);
lower

{a,a}
(1 row)
再次查询：

bill@bill=>select ‘a’ = any( lower(array[‘A’,‘1’]) );
?column?

t
(1 row)
当然除此之外还有其它方法，例如自定义操作符等等。

总结：
PostgreSQL中和Oracle一样是大小写不敏感的，不同的是其默认是将字符转换成小写。
目前pg中没有参数去从数据库级别设置大小写敏感，但是针对不同的场景我们还是可以利用pg自身强大的功能来解决。

参考链接：
https://www.postgresql.org/docs/14/citext.html
https://www.postgresql.org/docs/14/sql-createcollation.html

qq_58251011

关注

2
点赞
踩
5

收藏

觉得还不错? 一键收藏
0
评论
PostgreSQL 大小写敏感的那些事

PostgreSQL和Oracle一样，默认都是大小写不敏感的，但两者仍然存在区别：Oracle：默认是大小写不敏感，表名、字段名等不区分大小写，小写字母会自动转换为大写字母；需要用小写字母时需要使用双引号，或借助函數upper()和lower()；PostgreSQL：默认是大小写不敏感，表名、字段名等不区分大小写，大写字母会自动转换为小写字母；需要用大写字母时需要使用双引号，或借助函數upper()和lower(www.rsxedu.com)；1、表、列名中的大小写敏感例如我们创建表test
复制链接

扫一扫