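Recently at work I ran into problems with Chinese character encoding, passing Chinese parameters, and garbled Chinese in AJAX return values. I fought with it for a whole night and came away with a few lessons, summarized below in the hope that they help anyone else confused by the same issues!
After a lot of searching online, I found that encoding functions do exist, both for JavaScript and for ASP. This article first summarizes a few useful encoding and decoding functions for ASP. The code follows:
Group 1: VB_URLEncode() and VB_URLDecode()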
' ==============================================================
' Purpose: wraps the built-in encoding method of the ASP Server object
' Note:    there is no matching built-in decoding method
' ==============================================================
Function VB_URLEncode(enStr)
    VB_URLEncode = Server.URLEncode(enStr)
End Function

' ==============================================================
' Purpose: decoder for Server.URLEncode()
' Note:    this function is not yet complete.
'          When this page is UTF-8 encoded and the source string
'          contains a substring such as "编码aa测aa试", the function
'          cannot decode the output of VB_URLEncode().
'          When this page is GB2312 encoded, the function works correctly.
' ==============================================================
Function VB_URLDecode(enStr)
    Dim deStr, strSpecial
    Dim c, i, v
    deStr = ""
    ' Characters that are treated as single-byte escapes
    strSpecial = "!""#$%&'()*+,.-_/:;<=>?@[\]^`{|}~%"
    For i = 1 To Len(enStr)
        c = Mid(enStr, i, 1)
        If c = "%" Then
            v = Eval("&h" + Mid(enStr, i + 1, 2))
            If InStr(strSpecial, Chr(v)) > 0 Then
                ' Single-byte character: consume "%XX"
                deStr = deStr & Chr(v)
                i = i + 2
            Else
                ' Double-byte character: consume "%XX%YY"
                v = Eval("&h" + Mid(enStr, i + 1, 2) + Mid(enStr, i + 4, 2))
                deStr = deStr & Chr(v)
                i = i + 5
            End If
        ElseIf c = "+" Then
            ' Server.URLEncode() writes a space as "+"
            deStr = deStr & " "
        Else
            deStr = deStr & c
        End If
    Next
    VB_URLDecode = deStr
End Function
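As a cross-check outside ASP, the same decoding can be sketched in Python. This is not the code from this article: it assumes the input was produced on a GB2312 page, and the helper name decode_gb2312_query is made up for illustration. Instead of pairing escapes by hand, it first turns every %XX into a raw byte and then lets the gb2312 codec decide how many bytes make up one character.

```python
from urllib.parse import quote, unquote_to_bytes

def decode_gb2312_query(encoded):
    """Decode a URL-encoded string, assuming its bytes are GB2312."""
    # "+" stands for a space in this style of encoding.
    raw = unquote_to_bytes(encoded.replace("+", " "))
    return raw.decode("gb2312")

source = "##testingTest$$##编码aa测aa试aa##!!67&#=;"
# safe="" asks quote() to escape punctuation such as "/" as well.
encoded = quote(source, safe="", encoding="gb2312")
print(encoded)
print(decode_gb2312_query(encoded))
```

Python's quote() does not follow exactly the same escaping rules as Server.URLEncode(), so treat the encoded output as an approximation for other inputs.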
Group 2: VB_GBtoUTF8() and VB_UTF8toGB()
The code for VB_GBtoUTF8() is as follows:
' ===========================================
' Purpose: encode Chinese characters, converting from GB2312 to UTF-8
' Note:    inverse of VB_UTF8toGB().
'          The encoded form can be used to pass data between pages,
'          but it does not display correctly in an HTML page;
'          decode it with VB_UTF8toGB() first.
' ===========================================
Function VB_GBtoUTF8(szInput)
    Dim wch, uch, szRet
    Dim x
    Dim nAsc
    ' Return immediately if the input is empty
    If szInput = "" Then
        VB_GBtoUTF8 = szInput
        Exit Function
    End If
    ' Start converting
    For x = 1 To Len(szInput)
        ' Use Mid to take one character at a time
        wch = Mid(szInput, x, 1)
        ' AscW returns the Unicode character code of each character
        ' (Asc returns the ANSI character code instead; note the difference)
        nAsc = AscW(wch)
        If nAsc < 0 Then nAsc = nAsc + 65536

        If (nAsc And &HFF80) = 0 Then
            ' 0000 - 007F: ASCII, copied unchanged
            szRet = szRet & wch
        ElseIf (nAsc And &HF800) = 0 Then
            ' 0080 - 07FF: two-byte template, each byte with its own "%"
            uch = "%" & Hex((nAsc \ 2 ^ 6) Or &HC0) & "%" & _
                  Hex(nAsc And &H3F Or &H80)
            szRet = szRet & uch
        Else
            ' 0800 - FFFF: three-byte template
            uch = "%" & Hex((nAsc \ 2 ^ 12) Or &HE0) & "%" & _
                  Hex((nAsc \ 2 ^ 6) And &H3F Or &H80) & "%" & _
                  Hex(nAsc And &H3F Or &H80)
            szRet = szRet & uch
        End If
    Next
    VB_GBtoUTF8 = szRet
End Function
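The bit templates used by VB_GBtoUTF8() can be sketched in Python as follows. This is an illustration rather than a port: the function name utf8_percent_encode is made up, and code points above U+FFFF are deliberately left out, as they are in the VBScript version.

```python
def utf8_percent_encode(text):
    """Percent-encode non-ASCII characters as UTF-8, leaving ASCII unchanged."""
    def esc(byte):
        return "%{:02X}".format(byte)

    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            # 0000 - 007F: ASCII, copied unchanged
            out.append(ch)
        elif cp < 0x800:
            # 0080 - 07FF: 110xxxxx 10xxxxxx
            out.append(esc(0xC0 | (cp >> 6)) + esc(0x80 | (cp & 0x3F)))
        elif cp < 0x10000:
            # 0800 - FFFF: 1110xxxx 10xxxxxx 10xxxxxx
            out.append(esc(0xE0 | (cp >> 12))
                       + esc(0x80 | ((cp >> 6) & 0x3F))
                       + esc(0x80 | (cp & 0x3F)))
        else:
            raise ValueError("code points above U+FFFF are outside this sketch")
    return "".join(out)

print(utf8_percent_encode("编码aa测aa试"))
# The built-in encoder gives the same bytes for a single character.
print("".join("%{:02X}".format(b) for b in "编".encode("utf-8")))
```

The boundary between the two-byte and three-byte templates is U+0800, which is why the range test matters for characters just above U+07FF.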
The code for VB_UTF8toGB() is as follows:
' ===========================================
' Purpose: decode Chinese characters, converting from UTF-8 to GB2312
' Note:    decoder for VB_GBtoUTF8()
' ===========================================
Function VB_UTF8toGB(UTFStr)
    Dim Dig, GBStr
    For Dig = 1 To Len(UTFStr)
        ' A UTF-8 encoded character starts with "%"
        If Mid(UTFStr, Dig, 1) = "%" Then
            ' Convert only if a full "%XX%XX%XX" group (9 characters) remains
            If Len(UTFStr) >= Dig + 8 Then
                GBStr = GBStr & ConvChinese(Mid(UTFStr, Dig, 9))
                Dig = Dig + 8
            Else
                GBStr = GBStr & Mid(UTFStr, Dig, 1)
            End If
        Else
            GBStr = GBStr & Mid(UTFStr, Dig, 1)
        End If
    Next
    VB_UTF8toGB = GBStr
End Function

' Convert a UTF-8 encoded group such as "%E7%BC%96" to a Chinese character
Function ConvChinese(x)
    Dim A, i, j, DigS, Unicode
    A = Split(Mid(x, 2), "%")
    ' Expand each hexadecimal byte to an 8-bit binary string
    For i = 0 To UBound(A)
        A(i) = c16to2(A(i))
    Next
    For i = 0 To UBound(A) - 1
        ' The position of the first "0", minus one, is the number of bytes
        DigS = InStr(A(i), "0")
        Unicode = ""
        For j = 1 To DigS - 1
            If j = 1 Then
                ' Lead byte: drop the length prefix
                A(i) = Right(A(i), Len(A(i)) - DigS)
                Unicode = Unicode & A(i)
            Else
                ' Continuation byte: drop the leading "10"
                i = i + 1
                A(i) = Right(A(i), Len(A(i)) - 2)
                Unicode = Unicode & A(i)
            End If
        Next

        If Len(c2to16(Unicode)) = 4 Then
            ConvChinese = ConvChinese & ChrW(Int("&H" & c2to16(Unicode)))
        Else
            ConvChinese = ConvChinese & Chr(Int("&H" & c2to16(Unicode)))
        End If
    Next
End Function

' Convert a binary string to a hexadecimal string
Function c2to16(x)
    Dim i
    For i = 1 To Len(x) Step 4
        c2to16 = c2to16 & Hex(c2to10(Mid(x, i, 4)))
    Next
End Function

' Convert a binary string to a decimal number
Function c2to10(x)
    Dim i
    c2to10 = 0
    If x = "0" Then Exit Function
    For i = 0 To Len(x) - 1
        If Mid(x, Len(x) - i, 1) = "1" Then c2to10 = c2to10 + 2 ^ i
    Next
End Function

' Convert a hexadecimal string to a binary string
Function c16to2(x)
    Dim i, tempstr
    For i = 1 To Len(Trim(x))
        tempstr = c10to2(CInt(Int("&h" & Mid(x, i, 1))))
        ' Pad each hexadecimal digit to 4 bits
        Do While Len(tempstr) < 4
            tempstr = "0" & tempstr
        Loop
        c16to2 = c16to2 & tempstr
    Next
End Function

' Convert a decimal number to a binary string
Function c10to2(x)
    Dim mysign, DigS, tempnum, i
    mysign = Sgn(x)
    x = Abs(x)
    ' Find how many binary digits are needed
    DigS = 1
    Do
        If x < 2 ^ DigS Then
            Exit Do
        Else
            DigS = DigS + 1
        End If
    Loop
    tempnum = x

    For i = DigS To 1 Step -1
        If tempnum >= 2 ^ (i - 1) Then
            tempnum = tempnum - 2 ^ (i - 1)
            c10to2 = c10to2 & "1"
        Else
            c10to2 = c10to2 & "0"
        End If
    Next
    If mysign = -1 Then c10to2 = "-" & c10to2
End Function
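The decoding direction can be written more compactly with bit masks instead of binary strings. The Python sketch below is only an illustration under the same assumption VB_UTF8toGB() makes, namely that every "%" starts a three-byte group; the function names are made up.

```python
def decode_three_byte(group):
    """Decode one "%XX%XX%XX" group into a single character."""
    parts = group[1:].split("%")
    if not group.startswith("%") or len(parts) != 3:
        raise ValueError("expected a %XX%XX%XX group: " + group)
    b1, b2, b3 = (int(p, 16) for p in parts)
    if (b1 & 0xF0) != 0xE0 or (b2 & 0xC0) != 0x80 or (b3 & 0xC0) != 0x80:
        raise ValueError("not a three-byte UTF-8 sequence: " + group)
    # Keep 4 payload bits from the lead byte and 6 from each continuation byte.
    code_point = ((b1 & 0x0F) << 12) | ((b2 & 0x3F) << 6) | (b3 & 0x3F)
    return chr(code_point)

def utf8_percent_decode(encoded):
    """Decode a string in which every "%" starts a three-byte group."""
    out = []
    i = 0
    while i < len(encoded):
        group = encoded[i:i + 9]
        if encoded[i] == "%" and len(group) == 9:
            out.append(decode_three_byte(group))
            i += 9
        else:
            out.append(encoded[i])
            i += 1
    return "".join(out)

print(utf8_percent_decode("%E7%BC%96%E7%A0%81aa%E6%B5%8Baa%E8%AF%95"))
```

A literal "%" in the input, or a two-byte sequence, raises ValueError here instead of being decoded, which makes the three-byte assumption visible.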
The test code is as follows:
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=gb2312" />
<title>Character encoding test</title>
</head>
<style type="text/css">
body { margin: 20px 10px; line-height: 140%; font-size: 12px; color: blue; }
</style>
<body>
<%
On Error Resume Next
str = "##testingTest$$##编码aa测aa试aa##!!67&#=;"
Response.Write("Source string: " & str & "<BR>")
str1 = VB_URLEncode(str)
str2 = VB_URLDecode(str1)
Response.Write("VB_URLEncode: " & str1 & "<BR>")
Response.Write("VB_URLDecode: " & str2 & "<BR>")
If str2 = str Then Response.Write("Result ==> decoded correctly. URLEncode escapes every character in this string except English letters (upper and lower case) and digits; each Chinese character takes 2 bytes and each non-Chinese character 1 byte<BR>")
Response.Write("-------------------------------------------------------<BR>")
str3 = VB_GBtoUTF8(str)
str4 = VB_UTF8toGB(str3)
Response.Write("VB_GBtoUTF8: " & str3 & "<BR>")
Response.Write("VB_UTF8toGB: " & str4 & "<BR>")
If str4 = str Then Response.Write("Result ==> decoded correctly. GBtoUTF8 encodes only the Chinese characters, using 3 bytes per Chinese character<BR>")
Response.End()
%>
</body>
</html>
The test results are as follows:
VB_URLEncode: %23%23testingTest%24%24%23%23%B1%E0%C2%EBaa%B2%E2aa%CA%D4aa%23%23%21%2167%26%23%3D%3B
VB_URLDecode: ##testingTest$$##编码aa测aa试aa##!!67&#=;
Result ==> decoded correctly. URLEncode escapes every character in this string except English letters (upper and lower case) and digits; each Chinese character takes 2 bytes and each non-Chinese character 1 byte
-------------------------------------------------------
VB_GBtoUTF8: ##testingTest$$##%E7%BC%96%E7%A0%81aa%E6%B5%8Baa%E8%AF%95aa##!!67&#=;
VB_UTF8toGB: ##testingTest$$##编码aa测aa试aa##!!67&#=;
Result ==> decoded correctly. GBtoUTF8 encodes only the Chinese characters, using 3 bytes per Chinese character
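The test code above runs fine when the page is encoded as gb2312. If you switch the page encoding to utf-8, VB_URLDecode() decodes incorrectly in some cases, and the test string above is one of them. The details are described in the comments in that function's source.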
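One plausible reading of this failure, assuming Server.URLEncode() emits UTF-8 bytes on a UTF-8 page: each Chinese character then becomes three escapes instead of two, while VB_URLDecode() always consumes non-ASCII escapes in pairs. The Python sketch below only counts the escapes in each uninterrupted run; the function name is made up for illustration.

```python
import re
from urllib.parse import quote

def escape_run_lengths(encoded):
    """Count the %XX escapes in each uninterrupted run of escapes."""
    runs = re.findall(r"(?:%[0-9A-Fa-f]{2})+", encoded)
    return [len(run) // 3 for run in runs]

text = "编码aa测aa试"
gb = quote(text, safe="", encoding="gb2312")
utf8 = quote(text, safe="", encoding="utf-8")

print(gb, escape_run_lengths(gb))
print(utf8, escape_run_lengths(utf8))
```

In the GB2312 form every run has an even length, so pairing always lines up. In the UTF-8 form the single character "测" produces a run of three, so a pairing decoder would try to combine its last escape with characters from the plain "aa" that follows.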
With these two groups of functions, it is easy to avoid garbled Chinese when passing parameters in URLs and when passing data between pages.
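The underlying rule is that both sides must agree on the encoding of the escaped bytes. A small Python sketch of passing a Chinese parameter in a URL is given here for illustration; the host, page name, and parameter names are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical page and parameter names, used only for illustration.
params = {"keyword": "编码测试", "page": "1"}
url = "http://example.com/search.asp?" + urlencode(params, encoding="gb2312")
print(url)

# The receiving side must decode with the same encoding.
query = urlsplit(url).query
received = parse_qs(query, encoding="gb2312")
print(received["keyword"][0])

# Decoding with a different encoding yields garbled text.
garbled = parse_qs(query, encoding="utf-8")
print(garbled["keyword"][0])
```

The last call shows the kind of garbling described in this article: the bytes are valid GB2312 but not valid UTF-8, so the decoded text no longer matches the original.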
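That should be detailed enough. I have only covered two groups of ASP encoding and decoding functions here and have not dug into the others. There is also the question of how to decode an ASP-encoded string in JavaScript, which matters most in AJAX applications. I am still working through the technical details, so that will be left for the next article!
If anything here is incorrect, feel free to throw bricks! Bowing and leaving the stage...
Attached are the complete functions and test code: charEncode.rar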