一、问题背景
同事在一测试环境创建索引失败,报错如下:
postgres=# create index idx_tab_pacloud_messages_id_regionid_region_userid_username_mes_0911 on tab_pacloud_messages_0911 using btree (id,user_id,region_id);
NOTICE: identifier "idx_tab_pacloud_messages_id_regionid_region_userid_username_mes_0911" will be truncated to "idx_tab_pacloud_messages_id_regionid_region_userid_username_mes"
ERROR: relation "idx_tab_pacloud_messages_id_regionid_region_userid_username_mes" already exists
postgres=#
该sql是给tab_pacloud_messages_0911表,创建一个btree索引"idx_tab_pacloud_messages_id_regionid_region_userid_username_mes_0911",但是报错索引"idx_tab_pacloud_messages_id_regionid_region_userid_username_mes"已经存在
二、问题分析
创建索引idx_tab_pacloud_messages_id_regionid_region_userid_username_mes _0911,报错索引idx_tab_pacloud_messages_id_regionid_region_userid_username_mes 已经存在,这两者之间有什么关系呢?
查看后发现idx_tab_pacloud_messages_id_regionid_region_userid_username_mes 已经存在,是表tab_pacloud_messages的一个btree索引。
postgres=# \di idx_tab_pacloud_messages_id_regionid_region_userid_username_mes
List of relations
Schema | Name | Type | Owner | Table
--------+-----------------------------------------------------------------+-------+----------+----------------------
public | idx_tab_pacloud_messages_id_regionid_region_userid_username_mes | index | postgres | tab_pacloud_messages
正在创建索引的表tab_pacloud_messages_0911是tab_pacloud_messages的备份,正在创建的索引比原表索引名多了个后缀_0911。
这里想起了之前一篇案例,约束名超长被截断,怀疑索引名也是被截断了。
之前忽略了ERROR前抛出的NOTICE,其实已经提醒索引将被截断。
NOTICE: identifier "idx_tab_pacloud_messages_id_regionid_region_userid_username_mes_0911" will be truncated to "idx_tab_pacloud_messages_id_regionid_region_userid_username_mes"
PostgreSQL默认的对象名最大长度为64(实际为63,最后一位为 ‘\0’),刚好新建索引截断后和原表索引名相同,因此就报错索引已经存在了。
postgres=# select char_length('idx_tab_pacloud_messages_id_regionid_region_userid_username_mes');
char_length
-------------
63
(1 row)
postgres=#
报错的原因已经搞清楚了,解决办法也很容易,精简压缩下索引名就行了。那么索引是在sql执行的哪个环节,怎么样被截断的?
老办法,源代码分析+gdb跟踪验证
截断函数如下:
/*
* truncate_identifier() --- truncate an identifier to NAMEDATALEN-1 bytes.
*
* The given string is modified in-place, if necessary. A warning is
* issued if requested.
*
* We require the caller to pass in the string length since this saves a
* strlen() call in some common usages.
*/
void
truncate_identifier(char *ident, int len, bool warn)
{
/* NAMEDATALEN 这个宏为64 ,len为ident(这里为索引名)的字符长度68 */
if (len >= NAMEDATALEN)
{
/* 这里的len值经过函数运算,最终为NAMEDATALEN - 1 即63 */
len = pg_mbcliplen(ident, len, NAMEDATALEN - 1);
if (warn)
{
/*
* We avoid using %.*s here because it can misbehave if the data
* is not valid in what libc thinks is the prevailing encoding.
*/
/* 这里给定了一个长度为64的字符数组buf, 这个buf数组是为NOTICE做准备 */
char buf[NAMEDATALEN];
/* 这里将索引名copy至buf数组,拷贝长度为63个字符 */
memcpy(buf, ident, len);
buf[len] = '\0';
/* 这里就抛出NOTICE了,提示索引名ident将被截断为buf */
ereport(NOTICE,
(errcode(ERRCODE_NAME_TOO_LONG),
errmsg("identifier \"%s\" will be truncated to \"%s\"",
ident, buf)));
}
/*这里就将索引名截断了,ident[63]也就是第64个字符被赋值为结束符'\0'*/
ident[len] = '\0';
}
}
gdb跟踪:
省略不重要的步骤,首先来看进入Breakpoint后的stack
#0 truncate_identifier (ident=ident@entry=0x2cf2b48 "idx_tab_pacloud_messages_id_regionid_region_userid_username_mes_0911", len=68,
warn=warn@entry=true) at scansup.c:189
#1 0x00000000005766f4 in downcase_identifier (ident=0x2cf29dd "idx_tab_pacloud_messages_id_regionid_region_userid_username_mes_0911",
len=<optimized out>, warn=warn@entry=true, truncate=truncate@entry=true) at scansup.c:171
#2 0x000000000057670d in downcase_truncate_identifier (ident=<optimized out>, len=<optimized out>, warn=warn@entry=true) at scansup.c:133
#3 0x0000000000553217 in core_yylex (yylval_param=<optimized out>, yylloc_param=<optimized out>, yyscanner=0x2cf28c0) at scan.l:1042
#4 0x0000000000554495 in base_yylex (lvalp=lvalp@entry=0x7ffe7440c980, llocp=llocp@entry=0x7ffe7440c97c, yyscanner=yyscanner@entry=0x2cf28c0)
at parser.c:101
#5 0x000000000054fe8c in base_yyparse (yyscanner=yyscanner@entry=0x2cf28c0) at gram.c:25277
#6 0x0000000000554354 in raw_parser (str=<optimized out>) at parser.c:53
#7 0x00000000007264e2 in pg_parse_query (
query_string=query_string@entry=0x2cf20a0 "create index idx_tab_pacloud_messages_id_regionid_region_userid_username_mes_0911 on tab_pacloud_messages_0911 using btree (id,user_id,region_id);") at postgres.c:641
#8 0x0000000000726936 in exec_simple_query (
query_string=0x2cf20a0 "create index idx_tab_pacloud_messages_id_regionid_region_userid_username_mes_0911 on tab_pacloud_messages_0911 using btree (id,user_id,region_id);") at postgres.c:967
#9 0x0000000000727e02 in PostgresMain (argc=<optimized out>, argv=argv@entry=0x2d1c7b8, dbname=0x2d1c698 "postgres", username=<optimized out>)
at postgres.c:4182
#10 0x000000000047b14f in BackendRun (port=0x2d14940) at postmaster.c:4358
#11 BackendStartup (port=0x2d14940) at postmaster.c:4030
#12 ServerLoop () at postmaster.c:1707
#13 0x00000000006bee79 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x2cecc00) at postmaster.c:1380
#14 0x000000000047bb91 in main (argc=3, argv=0x2cecc00) at main.c:228
可以看到是在语法解析阶段进入了截断函数truncate_identifier ,stack最顶层显示,我们传入的索引名为idx_tab_pacloud_messages_id_regionid_region_userid_username_mes_0911,长度为68
再看truncate_identifier 函数中的截取逻辑
(gdb) n /* 这里经过pg_mbcliplen函数运算,len取值为63 */
191 len = pg_mbcliplen(ident, len, NAMEDATALEN - 1);
(gdb) n
192 if (warn)
(gdb) p len
$4 = 63
(gdb) n /* 拷贝ident至buf数组 */
200 memcpy(buf, ident, len);
(gdb) p ident /* 正在创建的索引 */
$10 = 0x2cf2b48 "idx_tab_pacloud_messages_id_regionid_region_userid_username_mes_0911"
(gdb) n
201 buf[len] = '\0';
(gdb) p buf
$14 = "idx_tab_pacloud_messages_id_regionid_region_userid_username_mes"
(gdb) n /* 抛出NOTICE */
202 ereport(NOTICE,
(gdb) /* 截断索引 */
207 ident[len] = '\0';
(gdb) n
209 }
(gdb) p len
$17 = 63
(gdb) p ident /* 索引已经被截断 */
$18 = 0x2cf2b48 "idx_tab_pacloud_messages_id_regionid_region_userid_username_mes"
三、总结反思
以上整个问题已经非常清晰了,这类问题应该还是比较常见的。所以在平时对于对象的命名,尽量要精简些,控制在63个字符长度以内。
另外对于截断函数处理这块,我感觉使用ereport抛出个NOTICE,然后进行截断,是否合适。是不是应该抛出个ERROR然后退出,因为截断的对象名不是我们想要的,是"不正确的",这个值得考虑。
调整对象名最大长度NAMEDATALEN不太合适,会降低性能。