官方原文
CONVERT:https://www.postgresql.org/docs/10/functions-string.html
查询PostgreSQL字符集
select pg_encoding_to_char(encoding) from pg_database where datname = 'your_database';
这里提供2个函数
--针对pg_database表
select pg_encoding_to_char(6) --UTF8 --编号转换为字符集
select pg_char_to_encoding('UTF8') --6 --字符集转换为编号
| bytea | Convert string to dest_encoding. The original encoding is specified by src_encoding. The string must be valid in this encoding. Conversions can be defined by CREATE CONVERSION. Also there are some predefined conversions. See Table 9-8 for available conversions. | convert( 'text_in_utf8', 'UTF8', 'LATIN1') | text_in_utf8represented in Latin-1 encoding (ISO 8859-1) |
convert(string bytea,src_encoding,
dest_encoding)
Convert string to dest_encoding. The original encoding is specified by src_encoding. The string must be valid in this encoding. Conversions can be defined by CREATE CONVERSION. Also there are some predefined conversions. See Table 9-8 for available conversions.
将字符串转换为dest_encoding。原始编码由src_encoding指定。字符串在这种编码中必须是有效的。转换可以通过创建转换来定义。还有一些预定义的转换。可用的转换见表9-8。
--For example
convert('text_in_utf8','UTF8', 'LATIN1')
-- value ,source code(源), target code(目标)
text_in_utf8represented in Latin-1 encoding (ISO 8859-1)
用Latin-1编码表示的text_in_utf8 (ISO 8859-1)
测试CONVERT正确性:
--把浪潮字符串由UTF8编码转为GBK编码
convert('浪潮','UTF8', 'GBK')
--通过MD5值验证准确性
select md5(convert('浪潮','UTF8','GBK')) as gbk,md5('浪潮') as utf8
---result---
gbk utf8
0dc1d920718b8e0d53940bf63ab5c64e a889ead0cee9b0adcf21310b789eab17
注:
这里最好先获取下数据库的字符集,然后再进行转码,最为安全
Table 9-8. Built-in Conversions
Conversion Name [a] | Source Encoding | Destination Encoding |
---|---|---|
ascii_to_mic | SQL_ASCII | MULE_INTERNAL |
ascii_to_utf8 | SQL_ASCII | UTF8 |
big5_to_euc_tw | BIG5 | EUC_TW |
big5_to_mic | BIG5 | MULE_INTERNAL |
big5_to_utf8 | BIG5 | UTF8 |
euc_cn_to_mic | EUC_CN | MULE_INTERNAL |
euc_cn_to_utf8 | EUC_CN | UTF8 |
euc_jp_to_mic | EUC_JP | MULE_INTERNAL |
euc_jp_to_sjis | EUC_JP | SJIS |
euc_jp_to_utf8 | EUC_JP | UTF8 |
euc_kr_to_mic | EUC_KR | MULE_INTERNAL |
euc_kr_to_utf8 | EUC_KR | UTF8 |
euc_tw_to_big5 | EUC_TW | BIG5 |
euc_tw_to_mic | EUC_TW | MULE_INTERNAL |
euc_tw_to_utf8 | EUC_TW | UTF8 |
gb18030_to_utf8 | GB18030 | UTF8 |
gbk_to_utf8 | GBK | UTF8 |
iso_8859_10_to_utf8 | LATIN6 | UTF8 |
iso_8859_13_to_utf8 | LATIN7 | UTF8 |
iso_8859_14_to_utf8 | LATIN8 | UTF8 |
iso_8859_15_to_utf8 | LATIN9 | UTF8 |
iso_8859_16_to_utf8 | LATIN10 | UTF8 |
iso_8859_1_to_mic | LATIN1 | MULE_INTERNAL |
iso_8859_1_to_utf8 | LATIN1 | UTF8 |
iso_8859_2_to_mic | LATIN2 | MULE_INTERNAL |
iso_8859_2_to_utf8 | LATIN2 | UTF8 |
iso_8859_2_to_windows_1250 | LATIN2 | WIN1250 |
iso_8859_3_to_mic | LATIN3 | MULE_INTERNAL |
iso_8859_3_to_utf8 | LATIN3 | UTF8 |
iso_8859_4_to_mic | LATIN4 | MULE_INTERNAL |
iso_8859_4_to_utf8 | LATIN4 | UTF8 |
iso_8859_5_to_koi8_r | ISO_8859_5 | KOI8R |
iso_8859_5_to_mic | ISO_8859_5 | MULE_INTERNAL |
iso_8859_5_to_utf8 | ISO_8859_5 | UTF8 |
iso_8859_5_to_windows_1251 | ISO_8859_5 | WIN1251 |
iso_8859_5_to_windows_866 | ISO_8859_5 | WIN866 |
iso_8859_6_to_utf8 | ISO_8859_6 | UTF8 |
iso_8859_7_to_utf8 | ISO_8859_7 | UTF8 |
iso_8859_8_to_utf8 | ISO_8859_8 | UTF8 |
iso_8859_9_to_utf8 | LATIN5 | UTF8 |
johab_to_utf8 | JOHAB | UTF8 |
koi8_r_to_iso_8859_5 | KOI8R | ISO_8859_5 |
koi8_r_to_mic | KOI8R | MULE_INTERNAL |
koi8_r_to_utf8 | KOI8R | UTF8 |
koi8_r_to_windows_1251 | KOI8R | WIN1251 |
koi8_r_to_windows_866 | KOI8R | WIN866 |
koi8_u_to_utf8 | KOI8U | UTF8 |
mic_to_ascii | MULE_INTERNAL | SQL_ASCII |
mic_to_big5 | MULE_INTERNAL | BIG5 |
mic_to_euc_cn | MULE_INTERNAL | EUC_CN |
mic_to_euc_jp | MULE_INTERNAL | EUC_JP |
mic_to_euc_kr | MULE_INTERNAL | EUC_KR |
mic_to_euc_tw | MULE_INTERNAL | EUC_TW |
mic_to_iso_8859_1 | MULE_INTERNAL | LATIN1 |
mic_to_iso_8859_2 | MULE_INTERNAL | LATIN2 |
mic_to_iso_8859_3 | MULE_INTERNAL | LATIN3 |
mic_to_iso_8859_4 | MULE_INTERNAL | LATIN4 |
mic_to_iso_8859_5 | MULE_INTERNAL | ISO_8859_5 |
mic_to_koi8_r | MULE_INTERNAL | KOI8R |
mic_to_sjis | MULE_INTERNAL | SJIS |
mic_to_windows_1250 | MULE_INTERNAL | WIN1250 |
mic_to_windows_1251 | MULE_INTERNAL | WIN1251 |
mic_to_windows_866 | MULE_INTERNAL | WIN866 |
sjis_to_euc_jp | SJIS | EUC_JP |
sjis_to_mic | SJIS | MULE_INTERNAL |
sjis_to_utf8 | SJIS | UTF8 |
tcvn_to_utf8 | WIN1258 | UTF8 |
uhc_to_utf8 | UHC | UTF8 |
utf8_to_ascii | UTF8 | SQL_ASCII |
utf8_to_big5 | UTF8 | BIG5 |
utf8_to_euc_cn | UTF8 | EUC_CN |
utf8_to_euc_jp | UTF8 | EUC_JP |
utf8_to_euc_kr | UTF8 | EUC_KR |
utf8_to_euc_tw | UTF8 | EUC_TW |
utf8_to_gb18030 | UTF8 | GB18030 |
utf8_to_gbk | UTF8 | GBK |
utf8_to_iso_8859_1 | UTF8 | LATIN1 |
utf8_to_iso_8859_10 | UTF8 | LATIN6 |
utf8_to_iso_8859_13 | UTF8 | LATIN7 |
utf8_to_iso_8859_14 | UTF8 | LATIN8 |
utf8_to_iso_8859_15 | UTF8 | LATIN9 |
utf8_to_iso_8859_16 | UTF8 | LATIN10 |
utf8_to_iso_8859_2 | UTF8 | LATIN2 |
utf8_to_iso_8859_3 | UTF8 | LATIN3 |
utf8_to_iso_8859_4 | UTF8 | LATIN4 |
utf8_to_iso_8859_5 | UTF8 | ISO_8859_5 |
utf8_to_iso_8859_6 | UTF8 | ISO_8859_6 |
utf8_to_iso_8859_7 | UTF8 | ISO_8859_7 |
utf8_to_iso_8859_8 | UTF8 | ISO_8859_8 |
utf8_to_iso_8859_9 | UTF8 | LATIN5 |
utf8_to_johab | UTF8 | JOHAB |
utf8_to_koi8_r | UTF8 | KOI8R |
utf8_to_koi8_u | UTF8 | KOI8U |
utf8_to_sjis | UTF8 | SJIS |
utf8_to_tcvn | UTF8 | WIN1258 |
utf8_to_uhc | UTF8 | UHC |
utf8_to_windows_1250 | UTF8 | WIN1250 |
utf8_to_windows_1251 | UTF8 | WIN1251 |
utf8_to_windows_1252 | UTF8 | WIN1252 |
utf8_to_windows_1253 | UTF8 | WIN1253 |
utf8_to_windows_1254 | UTF8 | WIN1254 |
utf8_to_windows_1255 | UTF8 | WIN1255 |
utf8_to_windows_1256 | UTF8 | WIN1256 |
utf8_to_windows_1257 | UTF8 | WIN1257 |
utf8_to_windows_866 | UTF8 | WIN866 |
utf8_to_windows_874 | UTF8 | WIN874 |
windows_1250_to_iso_8859_2 | WIN1250 | LATIN2 |
windows_1250_to_mic | WIN1250 | MULE_INTERNAL |
windows_1250_to_utf8 | WIN1250 | UTF8 |
windows_1251_to_iso_8859_5 | WIN1251 | ISO_8859_5 |
windows_1251_to_koi8_r | WIN1251 | KOI8R |
windows_1251_to_mic | WIN1251 | MULE_INTERNAL |
windows_1251_to_utf8 | WIN1251 | UTF8 |
windows_1251_to_windows_866 | WIN1251 | WIN866 |
windows_1252_to_utf8 | WIN1252 | UTF8 |
windows_1256_to_utf8 | WIN1256 | UTF8 |
windows_866_to_iso_8859_5 | WIN866 | ISO_8859_5 |
windows_866_to_koi8_r | WIN866 | KOI8R |
windows_866_to_mic | WIN866 | MULE_INTERNAL |
windows_866_to_utf8 | WIN866 | UTF8 |
windows_866_to_windows_1251 | WIN866 | WIN |
windows_874_to_utf8 | WIN874 | UTF8 |
euc_jis_2004_to_utf8 | EUC_JIS_2004 | UTF8 |
utf8_to_euc_jis_2004 | UTF8 | EUC_JIS_2004 |
shift_jis_2004_to_utf8 | SHIFT_JIS_2004 | UTF8 |
utf8_to_shift_jis_2004 | UTF8 | SHIFT_JIS_2004 |
euc_jis_2004_to_shift_jis_2004 | EUC_JIS_2004 | SHIFT_JIS_2004 |
shift_jis_2004_to_euc_jis_2004 | SHIFT_JIS_2004 | EUC_JIS_2004 |
23.3.1. Supported Character Sets
Table 23-1 shows the character sets available for use in PostgreSQL.
Table 23-1. PostgreSQL Character Sets
Name | Description | Language | Server? | Bytes/Char | Aliases |
---|---|---|---|---|---|
BIG5 | Big Five | Traditional Chinese | No | 1-2 | WIN950, Windows950 |
EUC_CN | Extended UNIX Code-CN | Simplified Chinese | Yes | 1-3 | |
EUC_JP | Extended UNIX Code-JP | Japanese | Yes | 1-3 | |
EUC_JIS_2004 | Extended UNIX Code-JP, JIS X 0213 | Japanese | Yes | 1-3 | |
EUC_KR | Extended UNIX Code-KR | Korean | Yes | 1-3 | |
EUC_TW | Extended UNIX Code-TW | Traditional Chinese, Taiwanese | Yes | 1-3 | |
GB18030 | National Standard | Chinese | No | 1-4 | |
GBK | Extended National Standard | Simplified Chinese | No | 1-2 | WIN936, Windows936 |
ISO_8859_5 | ISO 8859-5, ECMA 113 | Latin/Cyrillic | Yes | 1 | |
ISO_8859_6 | ISO 8859-6, ECMA 114 | Latin/Arabic | Yes | 1 | |
ISO_8859_7 | ISO 8859-7, ECMA 118 | Latin/Greek | Yes | 1 | |
ISO_8859_8 | ISO 8859-8, ECMA 121 | Latin/Hebrew | Yes | 1 | |
JOHAB | JOHAB | Korean (Hangul) | No | 1-3 | |
KOI8R | KOI8-R | Cyrillic (Russian) | Yes | 1 | KOI8 |
KOI8U | KOI8-U | Cyrillic (Ukrainian) | Yes | 1 | |
LATIN1 | ISO 8859-1, ECMA 94 | Western European | Yes | 1 | ISO88591 |
LATIN2 | ISO 8859-2, ECMA 94 | Central European | Yes | 1 | ISO88592 |
LATIN3 | ISO 8859-3, ECMA 94 | South European | Yes | 1 | ISO88593 |
LATIN4 | ISO 8859-4, ECMA 94 | North European | Yes | 1 | ISO88594 |
LATIN5 | ISO 8859-9, ECMA 128 | Turkish | Yes | 1 | ISO88599 |
LATIN6 | ISO 8859-10, ECMA 144 | Nordic | Yes | 1 | ISO885910 |
LATIN7 | ISO 8859-13 | Baltic | Yes | 1 | ISO885913 |
LATIN8 | ISO 8859-14 | Celtic | Yes | 1 | ISO885914 |
LATIN9 | ISO 8859-15 | LATIN1 with Euro and accents | Yes | 1 | ISO885915 |
LATIN10 | ISO 8859-16, ASRO SR 14111 | Romanian | Yes | 1 | ISO885916 |
MULE_INTERNAL | Mule internal code | Multilingual Emacs | Yes | 1-4 | |
SJIS | Shift JIS | Japanese | No | 1-2 | Mskanji, ShiftJIS, WIN932, Windows932 |
SHIFT_JIS_2004 | Shift JIS, JIS X 0213 | Japanese | No | 1-2 | |
SQL_ASCII | unspecified (see text) | any | Yes | 1 | |
UHC | Unified Hangul Code | Korean | No | 1-2 | WIN949, Windows949 |
UTF8 | Unicode, 8-bit | all | Yes | 1-4 | Unicode |
WIN866 | Windows CP866 | Cyrillic | Yes | 1 | ALT |
WIN874 | Windows CP874 | Thai | Yes | 1 | |
WIN1250 | Windows CP1250 | Central European | Yes | 1 | |
WIN1251 | Windows CP1251 | Cyrillic | Yes | 1 | WIN |
WIN1252 | Windows CP1252 | Western European | Yes | 1 | |
WIN1253 | Windows CP1253 | Greek | Yes | 1 | |
WIN1254 | Windows CP1254 | Turkish | Yes | 1 | |
WIN1255 | Windows CP1255 | Hebrew | Yes | 1 | |
WIN1256 | Windows CP1256 | Arabic | Yes | 1 | |
WIN1257 | Windows CP1257 | Baltic | Yes | 1 | |
WIN1258 | Windows CP1258 | Vietnamese | Yes | 1 | ABC, TCVN, TCVN5712, VSCII |