.NET Character Sets and Encoding/Decoding

0 .NET character set encoding and decoding

.NET stores strings internally as Unicode (UTF-16). To convert a string to another encoding such as GBK or UTF-8, use the Encoding class.

using System.Text;


void PrintBytes(byte[] bytes)
{
    foreach (var b in bytes)
    {
        Console.Write("{0:X} ", b);
    }
    Console.WriteLine();
}



Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

string str = "主账号";

var gbkBytes = Encoding.GetEncoding("gbk").GetBytes(str);       // encode as GBK
var utf8Bytes = Encoding.UTF8.GetBytes(str);                    // encode as UTF-8
var unicodeBytes = Encoding.Unicode.GetBytes(str);              // encode as Unicode (UTF-16LE)

PrintBytes(gbkBytes);
PrintBytes(utf8Bytes);
PrintBytes(unicodeBytes);

var gbkStr = Encoding.GetEncoding("gbk").GetString(gbkBytes);   // decode from GBK
var utf8Str = Encoding.UTF8.GetString(utf8Bytes);               // decode from UTF-8
var unicodeStr = Encoding.Unicode.GetString(unicodeBytes);      // decode from Unicode (UTF-16LE)

Console.WriteLine(gbkStr);
Console.WriteLine(utf8Str);
Console.WriteLine(unicodeStr);

Output:

D6 F7 D5 CB BA C5
E4 B8 BB E8 B4 A6 E5 8F B7
3B 4E 26 8D F7 53
主账号
主账号
主账号
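
The example above goes through a string on the way from one encoding to another. As a small complement, here is a sketch that re-encodes the GBK bytes directly to UTF-8 with Encoding.Convert. The class and method names (TranscodeDemo, GbkToUtf8) are my own and only serve the illustration.

```csharp
using System;
using System.Text;

public static class TranscodeDemo
{
    // Re-encodes GBK bytes as UTF-8 bytes in one call.
    public static byte[] GbkToUtf8(byte[] gbkBytes)
    {
        // Code-page encodings such as GBK are not available on .NET Core / .NET 5+
        // until the code pages provider has been registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding gbk = Encoding.GetEncoding("gbk");
        return Encoding.Convert(gbk, Encoding.UTF8, gbkBytes);
    }

    public static void Main()
    {
        byte[] gbkBytes = { 0xD6, 0xF7, 0xD5, 0xCB, 0xBA, 0xC5 }; // "主账号" in GBK
        byte[] utf8Bytes = GbkToUtf8(gbkBytes);

        // BitConverter.ToString formats the bytes as hyphen-separated hex:
        // E4-B8-BB-E8-B4-A6-E5-8F-B7
        Console.WriteLine(BitConverter.ToString(utf8Bytes));
    }
}
```

The resulting bytes are the same UTF-8 sequence that the output above shows for Encoding.UTF8.GetBytes.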

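When calling a C++ API, string handling inevitably raises character-encoding questions. The tests below look at how strings are encoded when they are marshaled to and from a C++ API.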

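1 On Windows

The first tests ran .NET on Windows, where strings are marshaled as ANSI by default (GBK on the test machine, judging by the results below). The CharSet specified in DllImport applies to string parameters passed directly to the imported function. The CharSet values and what the C++ API received are listed below:

1.1 Request functions: C# => C++

1.1.1 String parameters of a function

Input text: 主账号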

CharSet    | Charset received by the C++ API | Bytes received
(not set)  | GBK                             | D6 F7 D5 CB BA C5
Ansi       | GBK                             | D6 F7 D5 CB BA C5
Unicode    | Unicode                         | 3B 4E 26 8D F7 53
Auto       | Unicode                         | 3B 4E 26 8D F7 53
None       | GBK                             | D6 F7 D5 CB BA C5

Test interface code:

    // The same import was tested with each of the following five settings in turn

	[DllImport(LibName, CallingConvention = CallingConvention.StdCall)]
	public static extern void RegisterFront(string ip, int port);

	[DllImport(LibName, CharSet=CharSet.Ansi, CallingConvention = CallingConvention.StdCall)]
	public static extern void RegisterFront(string ip, int port);

	[DllImport(LibName, CharSet=CharSet.Unicode, CallingConvention = CallingConvention.StdCall)]
	public static extern void RegisterFront(string ip, int port);

	[DllImport(LibName, CharSet=CharSet.Auto, CallingConvention = CallingConvention.StdCall)]
	public static extern void RegisterFront(string ip, int port);

	[DllImport(LibName, CharSet=CharSet.None, CallingConvention = CallingConvention.StdCall)]
	public static extern void RegisterFront(string ip, int port);
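
To see these byte sequences without a native library at hand, the Marshal class can perform the string conversions on its own. The sketch below is not the author's test harness: NativeStringProbe and its methods are hypothetical names, and I am assuming that Marshal.StringToHGlobalAnsi applies the same platform-dependent ANSI conversion as CharSet.Ansi (the system code page on Windows, UTF-8 on Linux).

```csharp
using System;
using System.Runtime.InteropServices;

public static class NativeStringProbe
{
    // Copies bytes from unmanaged memory up to, but not including, the first 0x00.
    private static byte[] ReadUntilNul(IntPtr p)
    {
        int len = 0;
        while (Marshal.ReadByte(p, len) != 0) len++;
        byte[] buf = new byte[len];
        Marshal.Copy(p, buf, 0, len);
        return buf;
    }

    // Platform-dependent "ANSI" conversion (assumed to match CharSet.Ansi).
    public static byte[] AsAnsi(string s)
    {
        IntPtr p = Marshal.StringToHGlobalAnsi(s);
        try { return ReadUntilNul(p); }
        finally { Marshal.FreeHGlobal(p); }
    }

    // Explicit UTF-8 conversion, independent of the platform default.
    public static byte[] AsUtf8(string s)
    {
        IntPtr p = Marshal.StringToCoTaskMemUTF8(s);
        try { return ReadUntilNul(p); }
        finally { Marshal.FreeCoTaskMem(p); }
    }

    // UTF-16 conversion, two bytes per char, as CharSet.Unicode would pass it.
    public static byte[] AsUtf16(string s)
    {
        IntPtr p = Marshal.StringToHGlobalUni(s);
        try
        {
            byte[] buf = new byte[s.Length * 2];
            Marshal.Copy(p, buf, 0, buf.Length);
            return buf;
        }
        finally { Marshal.FreeHGlobal(p); }
    }

    public static void Main()
    {
        const string text = "主账号";
        Console.WriteLine("ANSI:   " + BitConverter.ToString(AsAnsi(text)));  // depends on the platform
        Console.WriteLine("UTF-8:  " + BitConverter.ToString(AsUtf8(text)));  // E4-B8-BB-E8-B4-A6-E5-8F-B7
        Console.WriteLine("UTF-16: " + BitConverter.ToString(AsUtf16(text))); // 3B-4E-26-8D-F7-53
    }
}
```

Only the ANSI line is expected to change between machines, which is the platform dependence the tests in this article measure.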
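
1.1.2 Strings inside struct parameters

The CharSet set in DllImport applies only to string parameters passed directly to the function. When a parameter is an object, the encoding of its string fields has to be set where the wrapper type is defined, through the CharSet of its StructLayout attribute. In my tests, the CharSet values and what the C++ API received were as follows:

Input text: 主账号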

CharSet    | Charset received by the C++ API | Bytes received
(not set)  | GBK                             | D6 F7 D5 CB BA C5
Ansi       | GBK                             | D6 F7 D5 CB BA C5
Unicode    | null                            |
Auto       | null                            |
None       | GBK                             | D6 F7 D5 CB BA C5

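The results are similar to those above, except that the transfer failed with Unicode (and with Auto, which resolves to Unicode on Windows). This is presumably a type mismatch: with CharSet.Unicode the corresponding field on the C++ side should be a wchar_t array rather than a char array.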
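
The type mismatch can be made visible from the managed side alone. The sketch below uses two hypothetical structs (AnsiRecord and WideRecord, not part of the tested API) that differ only in CharSet, and prints the native size and field offset that the marshaler computes for each. With CharSet.Unicode every ByValTStr character takes two bytes, so the fields after the string move, and a C++ struct declared with char arrays would read them from the wrong place.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical structs for illustration; they differ only in CharSet.
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi)]
public struct AnsiRecord
{
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 8)]
    public string Name;      // marshaled as char[8]
    public int ChannelID;
}

[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
public struct WideRecord
{
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 8)]
    public string Name;      // marshaled as 8 two-byte characters
    public int ChannelID;
}

public static class LayoutProbe
{
    // Marshals a struct to unmanaged memory and returns the raw native bytes.
    public static byte[] ToNativeBytes<T>(T value) where T : struct
    {
        int size = Marshal.SizeOf<T>();
        IntPtr p = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.StructureToPtr(value, p, false);
            byte[] buf = new byte[size];
            Marshal.Copy(p, buf, 0, size);
            return buf;
        }
        finally { Marshal.FreeHGlobal(p); }
    }

    public static void Main()
    {
        Console.WriteLine("Ansi:    size={0}, ChannelID offset={1}",
            Marshal.SizeOf<AnsiRecord>(), Marshal.OffsetOf<AnsiRecord>("ChannelID"));
        Console.WriteLine("Unicode: size={0}, ChannelID offset={1}",
            Marshal.SizeOf<WideRecord>(), Marshal.OffsetOf<WideRecord>("ChannelID"));

        // The Unicode layout starts with the UTF-16LE bytes of the string.
        byte[] wide = ToNativeBytes(new WideRecord { Name = "主账号", ChannelID = 1 });
        Console.WriteLine(BitConverter.ToString(wide, 0, 6)); // 3B-4E-26-8D-F7-53
    }
}
```

With default packing I would expect 12 bytes with ChannelID at offset 8 for the Ansi struct, and 20 bytes with ChannelID at offset 16 for the Unicode one.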

Test interface and struct definitions:

[StructLayout(LayoutKind.Sequential, CharSet=CharSet.Ansi)]
public class StepReqAddPrimaryAccount
{
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 9)]
	public string? TradingDay;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)]
	public string? PrimaryAccountID;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? PrimaryAccountName;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? BrokerPassword;
	public int ChannelID;
	public bool IsAllowLogin;
	public AccountStatusType AccountStatus;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? Password;
	public int RiskGroupID;
	public int CommissionGroupID;
}
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
public class StepReqUpdatePrimaryAccount
{
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 9)]
	public string? TradingDay;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)]
	public string? PrimaryAccountID;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? PrimaryAccountName;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? BrokerPassword;
	public int ChannelID;
	public bool IsAllowLogin;
	public AccountStatusType AccountStatus;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? Password;
	public int RiskGroupID;
	public int CommissionGroupID;
}
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.None)]
public class StepReqAddAccount
{
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 9)]
	public string? TradingDay;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)]
	public string? AccountID;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? AccountName;
	public AccountStatusType AccountStatus;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? Password;
	public int TradeGroupID;
	public int RiskGroupID;
	public int CommissionGroupID;
}
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Auto)]
public class StepReqUpdateAccount
{
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 9)]
	public string? TradingDay;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)]
	public string? AccountID;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? AccountName;
	public AccountStatusType AccountStatus;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? Password;
	public int TradeGroupID;
	public int RiskGroupID;
	public int CommissionGroupID;
}
	[DllImport(LibName, CharSet=CharSet.Ansi, CallingConvention = CallingConvention.StdCall)]
	public static extern int ReqAddPrimaryAccount(StepReqAddPrimaryAccount reqAddPrimaryAccount, int requestID);
	[DllImport(LibName, CharSet = CharSet.Unicode, CallingConvention = CallingConvention.StdCall)]
	public static extern int ReqUpdatePrimaryAccount(StepReqUpdatePrimaryAccount reqUpdatePrimaryAccount, int requestID);
	[DllImport(LibName, CharSet = CharSet.Auto, CallingConvention = CallingConvention.StdCall)]
	public static extern int ReqAddAccount(StepReqAddAccount reqAddAccount, int requestID);
	[DllImport(LibName, CharSet = CharSet.None, CallingConvention = CallingConvention.StdCall)]
    public static extern int ReqUpdateAccount(StepReqUpdateAccount reqUpdateAccount, int requestID);

1.2 Callback functions: C++ => C#

Return value: correct

Encoding on the C++ side: GBK

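CharSet    | Charset received by the C# callback | Decoding result
(not set)  | GBK                                 | Correct
Ansi       | GBK                                 | Correct
Unicode    | null                                | Garbled
Auto       | null                                | Garbled
None       | GBK                                 | Correct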

Encoding on the C++ side: UTF-8

Related test code:

// CharSet was set in turn to: not set, Ansi, Unicode, Auto, None
[StructLayout(LayoutKind.Sequential, CharSet=CharSet.Ansi)]
public class StepRspInfo
{
	public int ErrorID;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 256)]
	public string? ErrorMsg;
}

Callback delegate definition:

[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate void OnRspAdminUserLogin(StepRspAdminUserLogin? rspAdminUserLogin, StepRspInfo? rspInfo, int requestID, bool isLast);

2 On Linux

2.1 Request functions: C# => C++

2.1.1 String parameters of a function

Input text: 主账号

CharSet    | Charset received by the C++ API | Bytes received
(not set)  | UTF-8                           | E4 B8 BB E8 B4 A6 E5 8F B7
Ansi       | UTF-8                           | E4 B8 BB E8 B4 A6 E5 8F B7
Unicode    | Unicode                         | 3B 4E 26 8D F7 53
Auto       | UTF-8                           | E4 B8 BB E8 B4 A6 E5 8F B7
None       | UTF-8                           | E4 B8 BB E8 B4 A6 E5 8F B7

Test interface code:

    // The same import was tested with each of the following five settings in turn

	[DllImport(LibName, CallingConvention = CallingConvention.StdCall)]
	public static extern void RegisterFront(string ip, int port);

	[DllImport(LibName, CharSet=CharSet.Ansi, CallingConvention = CallingConvention.StdCall)]
	public static extern void RegisterFront(string ip, int port);

	[DllImport(LibName, CharSet=CharSet.Unicode, CallingConvention = CallingConvention.StdCall)]
	public static extern void RegisterFront(string ip, int port);

	[DllImport(LibName, CharSet=CharSet.Auto, CallingConvention = CallingConvention.StdCall)]
	public static extern void RegisterFront(string ip, int port);

	[DllImport(LibName, CharSet=CharSet.None, CallingConvention = CallingConvention.StdCall)]
	public static extern void RegisterFront(string ip, int port);
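
2.1.2 Strings inside struct parameters

The encoding of string fields inside a struct is set through the CharSet of the StructLayout attribute. The CharSet values and what the C++ API received were as follows:

Input text: 主账号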

CharSet    | Charset received by the C++ API | Bytes received
(not set)  | UTF-8                           | E4 B8 BB E8 B4 A6 E5 8F B7
Ansi       | UTF-8                           | E4 B8 BB E8 B4 A6 E5 8F B7
Unicode    | null                            |
Auto       | UTF-8                           | E4 B8 BB E8 B4 A6 E5 8F B7
None       | UTF-8                           | E4 B8 BB E8 B4 A6 E5 8F B7

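The results are similar to those above, except that the transfer failed with Unicode. This is presumably a type mismatch: with CharSet.Unicode the corresponding field on the C++ side should be a wchar_t array rather than a char array.

Test interface and struct definitions: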

[StructLayout(LayoutKind.Sequential, CharSet=CharSet.Ansi)]
public class StepReqAddPrimaryAccount
{
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 9)]
	public string? TradingDay;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)]
	public string? PrimaryAccountID;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? PrimaryAccountName;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? BrokerPassword;
	public int ChannelID;
	public bool IsAllowLogin;
	public AccountStatusType AccountStatus;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? Password;
	public int RiskGroupID;
	public int CommissionGroupID;
}
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
public class StepReqUpdatePrimaryAccount
{
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 9)]
	public string? TradingDay;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)]
	public string? PrimaryAccountID;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? PrimaryAccountName;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? BrokerPassword;
	public int ChannelID;
	public bool IsAllowLogin;
	public AccountStatusType AccountStatus;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? Password;
	public int RiskGroupID;
	public int CommissionGroupID;
}
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.None)]
public class StepReqAddAccount
{
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 9)]
	public string? TradingDay;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)]
	public string? AccountID;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? AccountName;
	public AccountStatusType AccountStatus;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? Password;
	public int TradeGroupID;
	public int RiskGroupID;
	public int CommissionGroupID;
}
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Auto)]
public class StepReqUpdateAccount
{
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 9)]
	public string? TradingDay;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)]
	public string? AccountID;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? AccountName;
	public AccountStatusType AccountStatus;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 64)]
	public string? Password;
	public int TradeGroupID;
	public int RiskGroupID;
	public int CommissionGroupID;
}
	[DllImport(LibName, CharSet=CharSet.Ansi, CallingConvention = CallingConvention.StdCall)]
	public static extern int ReqAddPrimaryAccount(StepReqAddPrimaryAccount reqAddPrimaryAccount, int requestID);
	[DllImport(LibName, CharSet = CharSet.Unicode, CallingConvention = CallingConvention.StdCall)]
	public static extern int ReqUpdatePrimaryAccount(StepReqUpdatePrimaryAccount reqUpdatePrimaryAccount, int requestID);
	[DllImport(LibName, CharSet = CharSet.Auto, CallingConvention = CallingConvention.StdCall)]
	public static extern int ReqAddAccount(StepReqAddAccount reqAddAccount, int requestID);
	[DllImport(LibName, CharSet = CharSet.None, CallingConvention = CallingConvention.StdCall)]
    public static extern int ReqUpdateAccount(StepReqUpdateAccount reqUpdateAccount, int requestID);

2.2 Callback functions: C++ => C#

Return value: correct

Encoding on the C++ side: GBK

CharSet    | C# decoding result
(not set)  | Garbled
Ansi       | Garbled
Unicode    | Garbled
Auto       | Garbled
None       | Garbled

Encoding on the C++ side: UTF-8

Related test code:

// CharSet was set in turn to: not set, Ansi, Unicode, Auto, None
[StructLayout(LayoutKind.Sequential, CharSet=CharSet.Ansi)]
public class StepRspInfo
{
	public int ErrorID;
	[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 256)]
	public string? ErrorMsg;
}

Callback delegate definition:

[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate void OnRspAdminUserLogin(StepRspAdminUserLogin? rspAdminUserLogin, StepRspInfo? rspInfo, int requestID, bool isLast);
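
The GBK table in this section shows every CharSet producing garbled text on Linux. One possible way around this, which I did not test against the real API, is to receive the message as raw bytes and decode it explicitly. StepRspInfoRaw and RawStringDecoder below are hypothetical names, and how the struct is passed in the delegate signature would still have to match the native side.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical variant of StepRspInfo that keeps ErrorMsg as raw bytes.
[StructLayout(LayoutKind.Sequential)]
public struct StepRspInfoRaw
{
    public int ErrorID;
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 256)]
    public byte[] ErrorMsg;
}

public static class RawStringDecoder
{
    // Decodes a fixed-size, NUL-terminated buffer with the given encoding.
    public static string Decode(byte[] buffer, Encoding encoding)
    {
        int len = Array.IndexOf(buffer, (byte)0);
        if (len < 0) len = buffer.Length; // no terminator: use the whole buffer
        return encoding.GetString(buffer, 0, len);
    }

    public static void Main()
    {
        // GBK is only available after registering the code pages provider.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Simulate what a GBK-speaking C++ side would have written.
        var info = new StepRspInfoRaw { ErrorID = 1, ErrorMsg = new byte[256] };
        byte[] gbk = { 0xD6, 0xF7, 0xD5, 0xCB, 0xBA, 0xC5 };
        gbk.CopyTo(info.ErrorMsg, 0);

        Console.WriteLine(Decode(info.ErrorMsg, Encoding.GetEncoding("gbk"))); // 主账号
    }
}
```

Because the encoding is chosen in code, the result no longer depends on the platform's default ANSI conversion.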

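3 Conclusion

When interoperating with a C++ API, I recommend communicating in GBK when running on Windows and in UTF-8 when running on Linux. These are the encodings that the default (ANSI) marshaling produced on each platform in the tests above.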
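
As a rough sketch of how this recommendation could be applied in code, the helper below picks the encoding by platform and encodes a string into a fixed-size, NUL-terminated buffer. InteropEncoding, ForCurrentPlatform and ToFixedBuffer are names I made up for the illustration, and hard-coding "gbk" on Windows assumes the C++ side really uses GBK, as it did in my tests.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

public static class InteropEncoding
{
    // GBK on Windows, UTF-8 elsewhere, following the conclusion above.
    public static Encoding ForCurrentPlatform()
    {
        if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
            return Encoding.GetEncoding("gbk");
        }
        return Encoding.UTF8;
    }

    // Encodes text into a zero-padded buffer of exactly `size` bytes,
    // always leaving at least one byte for the terminating NUL.
    public static byte[] ToFixedBuffer(string text, Encoding encoding, int size)
    {
        byte[] encoded = encoding.GetBytes(text);
        if (encoded.Length > size - 1)
            throw new ArgumentException("Encoded text does not fit in the buffer.", nameof(text));

        byte[] buffer = new byte[size]; // new arrays are zero-initialized
        encoded.CopyTo(buffer, 0);
        return buffer;
    }

    public static void Main()
    {
        Encoding encoding = ForCurrentPlatform();
        byte[] buffer = ToFixedBuffer("主账号", encoding, 64);
        Console.WriteLine("{0}: {1}", encoding.WebName,
            BitConverter.ToString(buffer, 0, encoding.GetByteCount("主账号")));
    }
}
```

The size check matters because the same text needs 6 bytes in GBK but 9 in UTF-8, so a field that is large enough on Windows may be too small on Linux.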
