Preface
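Character encoding, also called a character set code, maps each character of a character set to an object in a specified set (typically a number or a byte sequence) so that text can be stored on a computer and transmitted over a network. Common representations include ASCII, Unicode code values written in decimal (base 10), Unicode code values written in hexadecimal (base 16), GBK, and UTF-8.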
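To make the idea of "a character mapped to a number" concrete, here is a minimal sketch that uses only the built-in `ChrW`, `AscW`, and `Hex` functions. The procedure name `ShowCodeValues` is made up for illustration; the output goes to the Immediate window.

```vba
Sub ShowCodeValues()
    Dim ch As String
    ch = ChrW(&H4E2D)                ' the character 中 (U+4E2D)
    Debug.Print AscW(ch)             ' decimal code value: 20013
    Debug.Print Hex(AscW(ch))        ' hexadecimal code value: 4E2D
    Debug.Print AscW("A")            ' ASCII letter A: 65
End Sub
```

The same character therefore has one Unicode code value, which can be written in either base; GBK and UTF-8 are different byte sequences for that same character.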
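In programming, Chinese characters frequently have to be converted to different encodings for human-computer interaction. I spent some time searching online, found several code snippets for these conversions, and modified them slightly so that they can be reused in future projects. If anything here infringes on your rights, please contact me and I will remove it.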
I. Code
1. Converting Chinese characters to decimal (base-10) Unicode code values
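Function strToUnicode10(str As String) As String
    Dim i As Long
    For i = 1 To Len(str)
        ' AscW returns a signed Integer, so mask it to the unsigned range 0 to 65535
        strToUnicode10 = strToUnicode10 & " " & (AscW(Mid(str, i, 1)) And &HFFFF&)
    Next
End Function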
2. Converting Chinese characters to hexadecimal (base-16) Unicode code values
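Function strToUnicode16(str As String) As String ' sketch by analogy with strToUnicode10; the name is an assumption
    Dim i As Long
    For i = 1 To Len(str)
        strToUnicode16 = strToUnicode16 & " " & Hex(AscW(Mid(str, i, 1)) And &HFFFF&)
    Next
End Function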
3. Converting Chinese characters to GBK
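Function strToGBK(str As String) As String
    ' Asc uses the system ANSI code page, so the result is GBK only
    ' when that code page is 936 (Simplified Chinese)
    Dim i As Long
    For i = 1 To Len(str)
        strToGBK = strToGBK & " " & Hex(Asc(Mid(str, i, 1)))
    Next
End Function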
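As an alternative to calling `Asc` one character at a time, the sketch below converts the whole string with `StrConv(..., vbFromUnicode)` and prints each resulting byte as two hex digits. This is a different technique from the function above, and the name `strToAnsiHex` is made up for illustration. It rests on the same assumption: the bytes are GBK only when the system ANSI code page is 936; on other systems, characters that the code page cannot represent typically come out as `3F` (a question mark).

```vba
Function strToAnsiHex(str As String) As String
    Dim bytes() As Byte
    Dim i As Long
    If Len(str) = 0 Then Exit Function
    ' Convert from VBA's internal UTF-16 to the system ANSI code page
    bytes = StrConv(str, vbFromUnicode)
    For i = LBound(bytes) To UBound(bytes)
        ' Pad to two hex digits so that a byte such as &H0A prints as "0A"
        strToAnsiHex = strToAnsiHex & " " & Right$("0" & Hex(bytes(i)), 2)
    Next
End Function
```

On a code page 936 system, `strToAnsiHex(ChrW(&H4E2D))` (the character 中) should yield the bytes `D6 D0`, which matches the `D6D0` that `strToGBK` produces for that character.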
4. Converting Chinese characters to UTF-8
Function strToUtf8(str As String) As String
    Dim wch As String
    Dim uch As String
    Dim szRet As String
    Dim x As Long
    Dim inputLen As Long
    Dim nAsc As Long
    If str = "" Then
        strToUtf8 = str
        Exit Function
    End If
    inputLen = Len(str)
    For x = 1 To inputLen
        wch = Mid(str, x, 1)
        nAsc = AscW(wch)
        ' AscW returns a signed Integer, so add 65536 to negative values
        If nAsc < 0 Then nAsc = nAsc + 65536
        If nAsc < &H80 Then
            ' ASCII (below 128) is copied unchanged
            szRet = szRet & wch
        ElseIf nAsc < &H800 Then
            ' U+0080 to U+07FF: two bytes, 110xxxxx 10xxxxxx
            uch = "%" & Hex((nAsc \ 2 ^ 6) Or &HC0) & "%" & _
                  Hex(nAsc And &H3F Or &H80)
            szRet = szRet & uch
        Else
            ' U+0800 to U+FFFF: three bytes, 1110xxxx 10xxxxxx 10xxxxxx
            ' (surrogate pairs, i.e. characters above U+FFFF, are not combined here)
            uch = "%" & Hex((nAsc \ 2 ^ 12) Or &HE0) & "%" & _
                  Hex((nAsc \ 2 ^ 6) And &H3F Or &H80) & "%" & _
                  Hex(nAsc And &H3F Or &H80)
            szRet = szRet & uch
        End If
    Next
    strToUtf8 = szRet
End Function
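The function above works on one UTF-16 code unit at a time, so it covers the Basic Multilingual Plane, which includes the common Chinese characters. Characters above U+FFFF (rare CJK extension characters, emoji) are stored in VBA strings as a surrogate pair and need four UTF-8 bytes. The following is a minimal sketch of one way to handle them, not part of the original code; the names `codePointToUtf8` and `strToUtf8Full` are made up for illustration.

```vba
' Percent-encoded UTF-8 for a single code point (0 to &H10FFFF).
Function codePointToUtf8(cp As Long) As String
    If cp < &H80 Then
        ' ASCII is copied unchanged, as in strToUtf8
        codePointToUtf8 = ChrW(cp)
    ElseIf cp < &H800 Then
        codePointToUtf8 = "%" & Hex((cp \ &H40) Or &HC0) & _
                          "%" & Hex((cp And &H3F) Or &H80)
    ElseIf cp < &H10000 Then
        codePointToUtf8 = "%" & Hex((cp \ &H1000) Or &HE0) & _
                          "%" & Hex(((cp \ &H40) And &H3F) Or &H80) & _
                          "%" & Hex((cp And &H3F) Or &H80)
    Else
        ' Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        codePointToUtf8 = "%" & Hex((cp \ &H40000) Or &HF0) & _
                          "%" & Hex(((cp \ &H1000) And &H3F) Or &H80) & _
                          "%" & Hex(((cp \ &H40) And &H3F) Or &H80) & _
                          "%" & Hex((cp And &H3F) Or &H80)
    End If
End Function

Function strToUtf8Full(str As String) As String
    Dim i As Long, hi As Long, lo As Long
    i = 1
    Do While i <= Len(str)
        ' Mask because AscW returns a signed Integer
        hi = AscW(Mid(str, i, 1)) And &HFFFF&
        ' A high surrogate followed by a low surrogate forms one code point
        If hi >= &HD800& And hi <= &HDBFF& And i < Len(str) Then
            lo = AscW(Mid(str, i + 1, 1)) And &HFFFF&
            If lo >= &HDC00& And lo <= &HDFFF& Then
                hi = &H10000 + (hi - &HD800&) * &H400 + (lo - &HDC00&)
                i = i + 1
            End If
        End If
        strToUtf8Full = strToUtf8Full & codePointToUtf8(hi)
        i = i + 1
    Loop
End Function
```

For example, `strToUtf8Full(ChrW(&H4E2D))` gives `%E4%B8%AD` for 中, and `strToUtf8Full(ChrW(&HD83D) & ChrW(&HDE00))` gives `%F0%9F%98%80` for U+1F600. The trailing `&` on literals such as `&HD800&` matters: without it VBA reads the value as a negative 16-bit Integer.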
II. Screenshots of the results