关于汉字排序的问题源于曾经做过的114号码百事通查询,当时要求提供按笔画、部首、拼音的多种查询方式,拼音排序直接按名称字段进行排序时总有些是是而非的错误,百思不得其解;同时笔画、部首排序也摸不着头脑。。。
数字和英文混合排序
1,3,A,B,101,301,AB,BA
要求按照数字,字符分别进行排序,同时在统一类别数据中,首先按数字ASC,再次按长度排序,要求结果如下
1,101,3,301,A,AB,B,BA或者
1,3,101,301,A,B,AB,BA
但是很奇怪,第一个排序似乎在合并两个视图之后,排序规则有点乱了。。。[@more@]
CREATE TABLE TestNumberChar
(
SortName VARCHAR2(20)
)
INSERT INTO TestNumberChar
VALUES('1');
INSERT INTO TestNumberChar
VALUES('3');
INSERT INTO TestNumberChar
VALUES('A');
INSERT INTO TestNumberChar
VALUES('B');
INSERT INTO TestNumberChar
VALUES('101');
INSERT INTO TestNumberChar
VALUES('301');
INSERT INTO TestNumberChar
VALUES('AB');
INSERT INTO TestNumberChar
VALUES('BA');
COMMIT;
CREATE VIEW vSortNumber as
SELECT * FROM TestNumberChar
WHERE SUBSTR(SortName,1,1) IN
('0','1','2','3','4','5','6','7','8','9')
ORDER BY LENGTH(SortName),SortName
CREATE VIEW vSortChar as
SELECT * FROM TestNumberChar
WHERE SUBSTR(SortName,1,1) NOT IN
('0','1','2','3','4','5','6','7','8','9')
ORDER BY LENGTH(SortName),SortName
SELECT * FROM vSortNumber
UNION
SELECT * FROM vSortChar
Order
Sort Result
1
1
2
101
3
3
4
301
5
A
6
AB
7
B
8
BA
SELECT * FROM
TestNumberChar
ORDER BY TO_NUMBER
(
SUBSTR
(
SortName,1,LENGTH(SortName)-NVL
(
LENGTH
(
REPLACE
(
TRANSLATE(SortName,'0123456789.','00000000000'),'0'
)
),0
)
)
)
SELECT
SortName A,
TRANSLATE(SortName,'0123456789.','00000000000')
B ,
REPLACE(TRANSLATE(SortName,'0123456789.','00000000000'),'0') C,
LENGTH(REPLACE(TRANSLATE(SortName,'0123456789.','00000000000'),
'0'))
D,
NVL(LENGTH(REPLACE(TRANSLATE(SortName,'0123456789.',
'00000000000'),'0')),0)
E,
LENGTH(SortName)-NVL(LENGTH(REPLACE(TRANSLATE(SortName,
'0123456789.','00000000000'),'0')),0)
F,
SUBSTR(SortName,1,LENGTH(SortName)-
NVL(LENGTH(REPLACE(TRANSLATE(SortName,'0123456789.',
'00000000000'),'0')),0))
G,
TO_NUMBER(SUBSTR(SortName,1,
LENGTH(SortName)-NVL(LENGTH(REPLACE(
TRANSLATE(SortName,'0123456789.','00000000000'),'0')),0)))
H
FROMTestNumberChar
ORDER BY
TO_NUMBER(SUBSTR(SortName,1,
LENGTH(SortName)-NVL(LENGTH(REPLACE(TRANSLATE(SortName,
'0123456789.','00000000000'),'0')),0)))
Sort
A
B
C
D
E
F
G
H
1
1
0
0
1
1
1
2
3
0
0
1
3
3
3
101
000
0
3
101
101
4
301
000
0
3
301
301
5
A
A
A
1
1
0
6
B
B
B
1
1
0
7
BA
BA
BA
2
2
0
8
AB
AB
AB
2
2
0
关于汉字字符排序
CREATE TABLE TestHZ
(
SortName VARCHAR2(20)
)
INSERT INTO TestHZ VALUES ('你');
INSERT INTO TestHZ VALUES ('我');
INSERT INTO TestHZ VALUES ('他');
INSERT INTO TestHZ VALUES ('鸡');
INSERT INTO TestHZ VALUES ('鸭');
INSERT INTO TestHZ VALUES ('鹅');
INSERT INTO TestHZ VALUES ('铑');
COMMIT;
The NLSSORT Function
The NLSSORT
function enables you to use any linguistic sort for an ORDER BY clause. It replaces a character
string with the equivalent sort string used by the linguistic sort mechanism so
that sorting the replacement strings produces the desired sorting sequence. For
a binary sort, the sort string is the same as the input string.
The kind of linguistic sort used by an ORDER BY clause is determined by the NLS_SORT session parameter, but it can be
overridden by explicitly using the NLSSORT
function.
SELECT* FROM TestHZ
--按拼音排序SELECT * FROM TestHZ
ORDER BY
SortName
--按笔画(第一顺序)、部首(第二顺序)排序SELECT * FROM TestHZ
ORDER BY
NLSSORT(SortName,'NLS_SORT = SCHINESE_STROKE_M')
--按部首(第一顺序)、拼音(第二顺序)排序
SELECT* FROM TestHZ
ORDER BY
NLSSORT(SortName,'NLS_SORT = SCHINESE_RADICAL_M')
--按拼音排序
SELECT* FROM TestHZ
ORDER BY
NLSSORT(SortName,'NLS_SORT = SCHINESE_PINYIN_M')
粗略的看一下结果,Sort By和Sort By NLSSORT(XXX,’NLS_SORT
= SCHINESE_PINYIN_M’)排序结果差不多,实际上,排序不同主要是针对一些比较生僻的汉字而言的,如“铑”。具体问题出在哪里,暂时也不清楚。。。