Using Unicode with MySQL++ on Windows

ecjtuync

Category: Openssl & cryptlib research. Tags: mysql, windows, string, query, insert, encoding

Article link: https://blog.csdn.net/ecjtuync/article/details/3368144

This is a summary of how to use Unicode with MySQL++ on Windows.

[problem]

from http://lists.mysql.com/plusplus/5989

Hi,
I have a problem using the Query class from mysqlpp with wchar_t. I hope
somebody can help me with this.
I want to insert Unicode text into a table.

Here is the sample code:

	mysqlpp::Connection con( "test", "localhost",  "test", "" );
	mysqlpp::Query query = con.query();
	char *		test = "abcdefghij";
	wchar_t *	wtest = L"abcdefghij";

	query << "INSERT INTO tes VALUES( '" << test << "')";
	query.execute();

	query << "INSERT INTO tes VALUES( '" << wtest << "')";
	query.execute();



The char test gives the right result: 'abcdefghij' is inserted into the db.
But the wchar_t wtest does not give the right result; the record is filled with
'004A00F4'.
It seems Query does not accept Unicode.
So how can I insert Unicode using mysqlpp?
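
The value '004A00F4' looks like a pointer printed in hex rather than text. That is consistent with how narrow C++ streams treated wide strings before C++20: a char-based stream has no operator<< for const wchar_t*, so the pointer converts to const void* and its address is written instead. Below is a minimal sketch of that effect, assuming Query behaves like an ordinary std::ostream at this point. The helper names stream_as_pointer and stream_as_text are made up for illustration, and the conversion is written explicitly so that the code also compiles under C++20, where the implicit form is rejected.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: writes a wide string to a narrow stream the way
// pre-C++20 overload resolution did, that is, as a const void* address.
std::string stream_as_pointer(const wchar_t* wide)
{
    std::ostringstream out;
    out << static_cast<const void*>(wide);  // writes the address, not the text
    return out.str();
}

// The narrow-string case for comparison: the characters are written as-is.
std::string stream_as_text(const char* narrow)
{
    std::ostringstream out;
    out << narrow;
    return out.str();
}
```

The exact address format depends on the standard library, so the sketch only shows that the text itself never reaches the stream.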

[solution]

from http://tangentsoft.net/mysql++/doc/html/userman/unicode.html

6.3. Unicode on Windows

Each Windows API function that takes a string actually comes in two versions. One version supports only 1-byte “ANSI” characters (a superset of ASCII); the names of these functions end in 'A'. Windows also supports the 2-byte subset of Unicode called UCS-2 (Windows 2000 and later interpret these 2-byte units as UTF-16). Some call these “wide” characters, so the names of the other set of functions end in 'W'. The MessageBox() API, for instance, is actually a macro, not a real function. If you define the UNICODE macro when building your program, the MessageBox() macro evaluates to MessageBoxW(); otherwise, to MessageBoxA().
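
The macro selection described above can be imitated in portable C++. This is only a sketch of the mechanism, not the real Windows headers: ShowTextA, ShowTextW, ShowText, and TEXT_LIT are hypothetical stand-ins for a pair such as MessageBoxA/MessageBoxW, the MessageBox macro, and a TEXT-style literal macro.

```cpp
#include <string>

// Hypothetical stand-ins for an 'A'/'W' function pair.
inline std::string ShowTextA(const char* text)
{
    return std::string("A:") + text;
}

inline std::wstring ShowTextW(const wchar_t* text)
{
    return std::wstring(L"W:") + text;
}

// The generic name is a macro chosen at preprocessing time,
// depending on whether UNICODE is defined for the build.
#ifdef UNICODE
#  define ShowText ShowTextW
#  define TEXT_LIT(s) L##s
#else
#  define ShowText ShowTextA
#  define TEXT_LIT(s) s
#endif
```

Calling ShowText(TEXT_LIT("hi")) therefore compiles to a different function depending on the build settings, which is why the same source can produce either a narrow or a wide program.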

Since MySQL uses the UTF-8 Unicode encoding and Windows uses UCS-2, you must convert data when passing text between MySQL++ and the Windows API. There’s no point in trying for portability here (no other OS I’m aware of uses UCS-2), so you might as well use platform-specific functions to do this translation. Since version 2.2.2, MySQL++ has shipped with two Visual C++ specific examples showing how to do this in a GUI program. (In earlier versions of MySQL++, we did Unicode conversion in the console mode programs, but this was unrealistic.)

How you handle Unicode data depends on whether you’re using the native Windows API, or the newer .NET API. First, the native case:

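// Convert a C string in UTF-8 format to UCS-2 format.
// nOutLen is the size of pcOut in wide characters. LPTSTR is wchar_t*
// only when UNICODE is defined, which this function assumes.
void ToUCS2(LPTSTR pcOut, int nOutLen, const char* kpcIn)
{
  // -1 tells the API that kpcIn is null-terminated.
  MultiByteToWideChar(CP_UTF8, 0, kpcIn, -1, pcOut, nOutLen);
}

// Convert a UCS-2 string to C string in UTF-8 format.
// nOutLen is the size of pcOut in bytes.
void ToUTF8(char* pcOut, int nOutLen, LPCWSTR kpcIn)
{
  WideCharToMultiByte(CP_UTF8, 0, kpcIn, -1, pcOut, nOutLen, 0, 0);
}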

These functions leave out some important error checking, so see examples/vstudio/mfc/mfc_dlg.cpp for the complete version.
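
For readers who want to see what the conversion itself involves, or who need it away from the Windows API, here is a minimal hand-written sketch of a UTF-16 to UTF-8 encoder. It is not the MySQL++ or Win32 implementation: the function name utf16_to_utf8 is made up for this example, it takes a std::u16string instead of an LPCWSTR, and replacing unpaired surrogates with U+FFFD is just one possible error policy. The resulting std::string is the kind of value that could be streamed into a Query in place of the wchar_t* from the question above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: encode UTF-16 text as UTF-8.
// Unpaired surrogates are replaced with U+FFFD (an assumed error policy).
std::string utf16_to_utf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: it needs a low surrogate right after it.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;  // the low surrogate has been consumed
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // low surrogate with no preceding high surrogate
        }

        // Emit one to four bytes depending on the size of the code point.
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

On Windows, where wchar_t is 16 bits wide, the same logic applies to wide strings, although in practice WideCharToMultiByte() with CP_UTF8 does this work for you.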

If you’re building a .NET application (perhaps because you’re using Windows Forms), it’s better to use the .NET libraries for this:

// Convert a C string in UTF-8 format to a .NET String in UCS-2 format.
String^ ToUCS2(const char* utf8)
{
  return gcnew String(utf8, 0, strlen(utf8), System::Text::Encoding::UTF8);
}

// Convert a .NET String in UCS-2 format to a C string in UTF-8 format.
System::Void ToUTF8(char* pcOut, int nOutLen, String^ sIn)
{
  array<Byte>^ bytes = System::Text::Encoding::UTF8->GetBytes(sIn);
  nOutLen = Math::Min(nOutLen - 1, bytes->Length);
  System::Runtime::InteropServices::Marshal::Copy(bytes, 0,
    IntPtr(pcOut), nOutLen);
  pcOut[nOutLen] = '\0';
}

Unlike the native API versions, these examples are complete, since the .NET platform handles a lot of things behind the scenes for us. We don’t need any error-checking code for such simple routines.

All of this assumes you’re using Windows NT or one of its direct descendants: Windows 2000, Windows XP, Windows Vista, or any “Server” variant of Windows. Windows 95 and its descendants (98, ME, and CE) do not support UCS-2. They still have the 'W' APIs for compatibility, but they just smash the data down to 8-bit and call the 'A' version for you.

from examples/vstudio/mfc/mfc_dlg.cpp

ToUCS2

// Convert a C string in UTF-8 format to UCS-2 format.
bool
CExampleDlg::ToUCS2(LPTSTR pcOut, int nOutLen, const char* kpcIn)
{
	if (strlen(kpcIn) > 0) {
		// Do the conversion normally
		return MultiByteToWideChar(CP_UTF8, 0, kpcIn, -1, pcOut,
				nOutLen) > 0;
	}
	else if (nOutLen > 1) {