Many C++ Windows programmers get confused over what bizarre identifiers like TCHAR and LPCTSTR are.
In general, a character can be represented in 1 byte or 2 bytes. Let's say a 1-byte character is an ANSI character - all English characters are represented through this encoding. And let's say a 2-byte character is a Unicode character.
The Visual C++ compiler supports char and wchar_t as data types for ANSI and Unicode characters, respectively.
What if you want your C/C++ code to be independent of the character encoding/mode used?
Suggestion: Use generic data types and names to represent characters and strings.
For example, instead of replacing:
char cResponse; // 'Y' or 'N'
char sUsername[64];
// str* functions
with
wchar_t cResponse; // 'Y' or 'N'
wchar_t sUsername[64];
// wcs* functions
In order to support multiple languages (i.e. Unicode) in your program, you can simply code it in a more generic manner:
#include <TCHAR.H> // Implicit or explicit include
TCHAR cResponse; // 'Y' or 'N'
TCHAR sUsername[64];
// _tcs* functions
The following project setting on the General page describes which Character Set is to be used for compilation:
(General -> Character Set)
This way, when your project is being compiled as Unicode, TCHAR would translate to wchar_t. If it is being compiled as ANSI/MBCS, it would be translated to char. You are free to use char and wchar_t directly, and project settings will not affect any direct use of these keywords.
TCHAR is defined as:
#ifdef _UNICODE
typedef wchar_t TCHAR;
#else
typedef char TCHAR;
#endif
The macro _UNICODE is defined when Character Set is set to "Use Unicode Character Set", and therefore TCHAR would mean wchar_t. When Character Set is set to "Use Multi-Byte Character Set", TCHAR would mean char.
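The same selection mechanism can be tried on any compiler with a small portable sketch. The names DEMO_UNICODE, demo_tchar and demo_bytes_for are hypothetical stand-ins (for _UNICODE and TCHAR) chosen so the example builds without TCHAR.H; this is an illustration of the idea, not the header's actual content.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins: DEMO_UNICODE plays the role of _UNICODE,
// demo_tchar plays the role of TCHAR.
#ifdef DEMO_UNICODE
typedef wchar_t demo_tchar;
#else
typedef char demo_tchar;
#endif

// Number of bytes occupied by `count` characters of the selected type.
inline std::size_t demo_bytes_for(std::size_t count)
{
    assert(count > 0);
    return count * sizeof(demo_tchar);
}
```

Calling demo_bytes_for(64) gives the storage a 64-character buffer needs in the current build; compiling again with -DDEMO_UNICODE (or /DDEMO_UNICODE) switches the character type without touching the calling code.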
Likewise, to support multiple character sets using a single code base, and possibly supporting multiple languages, use specific functions (macros). Instead of using strcpy, strlen, strcat or wcscpy, wcslen, wcscat, you should use the _tcscpy, _tcslen, _tcscat functions.
As you know, strlen is prototyped as:
size_t strlen(const char*);
And wcslen is prototyped as:
size_t wcslen(const wchar_t* );
You had better use _tcslen, which is logically prototyped as:
size_t _tcslen(const TCHAR* );
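WC stands for Wide Character, so wcs means wide-character string. In the same way, _tcs means _T character string, and _T may be char or wchar_t, logically.
But, in reality, _tcslen (and the other _tcs functions) are not functions at all, but macros. They are defined simply as:
#ifdef _UNICODE
#define _tcslen wcslen
#else
#define _tcslen strlen
#endif
You should refer to TCHAR.H to look up more macro definitions like this.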
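To see the macro mapping at work without TCHAR.H, here is a portable sketch. DEMO_UNICODE, demo_tchar, demo_tcslen, DEMO_T and demo_length are all hypothetical names invented for this example; they only mimic the pattern shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical stand-ins for _UNICODE, TCHAR, _tcslen and _T.
#ifdef DEMO_UNICODE
typedef wchar_t demo_tchar;
#define demo_tcslen std::wcslen
#define DEMO_T(x) L##x
#else
typedef char demo_tchar;
#define demo_tcslen std::strlen
#define DEMO_T(x) x
#endif

// Length in characters (not bytes) of a string in the build's character type.
inline std::size_t demo_length(const demo_tchar* text)
{
    assert(text != nullptr);
    return demo_tcslen(text);   // expands to strlen or wcslen at compile time
}
```

A call such as demo_length(DEMO_T("CodeProject")) yields the character count in either build, because the preprocessor has already replaced demo_tcslen with the matching library function before the compiler sees it.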
You might ask why they are defined as macros, and not implemented as functions instead. The reason is simple: a library or DLL may export a single function with a given name and prototype (ignore the overloading concept of C++). For instance, when you export a function as:
void _TPrintChar(char);
How is the client supposed to call it as:
void _TPrintChar(wchar_t);
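_TPrintChar cannot be magically converted into a function taking a 2-byte character. There have to be two separate functions:
void PrintCharA(char); // A = ANSI
void PrintCharW(wchar_t); // W = Wide character
And a simple macro, as defined below, would hide the difference:
#ifdef _UNICODE
#define _TPrintChar PrintCharW
#else
#define _TPrintChar PrintCharA
#endif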
The client would simply call it as:
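TCHAR cChar;
_TPrintChar(cChar);
Note that both TCHAR and _TPrintChar would map to either Unicode or ANSI, and therefore cChar and the argument to the function would be either char or wchar_t.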
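The A/W pairing can be sketched in portable code as follows. PrintCharA and PrintCharW are given stub bodies here (they just report the byte width of what they received instead of printing), and DEMO_UNICODE, demo_tchar, TPrintChar and demo_client_call are hypothetical names for this example only.

```cpp
#include <cassert>
#include <cstddef>

#ifdef DEMO_UNICODE
typedef wchar_t demo_tchar;
#else
typedef char demo_tchar;
#endif

// Two separate functions, as a library would have to export them.
// Stub bodies: each returns the byte width of the character it received.
inline std::size_t PrintCharA(char c)    { assert(c != '\0');  return sizeof c; }  // A = ANSI
inline std::size_t PrintCharW(wchar_t c) { assert(c != L'\0'); return sizeof c; }  // W = wide

// The macro picks one of them at compile time.
#ifdef DEMO_UNICODE
#define TPrintChar PrintCharW
#else
#define TPrintChar PrintCharA
#endif

// Client code is written once and follows the build setting.
inline std::size_t demo_client_call()
{
    demo_tchar cChar = 'Y';
    return TPrintChar(cChar);
}
```

The client never names the A or W variant; the variable type and the function it is passed to change together, which is the point of pairing the typedef with the macro.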
Macros avoid these complications, and allow us to use either the ANSI or the Unicode function for characters and strings. Most of the Windows functions that take a string or a character are implemented this way, and for the programmer's convenience, only one function (a macro!) is enough. SetWindowText is one example:
// WinUser.H
#ifdef UNICODE
#define SetWindowText SetWindowTextW
#else
#define SetWindowText SetWindowTextA
#endif // !UNICODE
There are very few functions that do not have macros, and are available only with the W or A suffix. One example is ReadDirectoryChangesW, which doesn't have an ANSI equivalent.
You all know that we use double quotation marks to represent strings. A string represented in this manner is an ANSI string, with each character taking 1 byte. Example:
"This is ANSI String. Each letter takes 1 byte."
The string text given above is not Unicode. To represent Unicode text, you need to prefix it with L. An example:
L"This is Unicode string. Each letter would take 2 bytes, including spaces."
Note the L at the beginning of the string, which makes it a Unicode string.
In general, the size of a string in bytes would be a multiple of sizeof(TCHAR).
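The byte sizes can be checked directly with sizeof on string literals. This is a minimal sketch with hypothetical function names; note that it uses sizeof(wchar_t) rather than a hard-coded 2, because wchar_t is 2 bytes with Visual C++ but is commonly 4 bytes on other compilers.

```cpp
#include <cassert>
#include <cstddef>

// sizeof on a string literal counts every character plus the terminating null.
inline std::size_t demo_narrow_bytes()
{
    return sizeof("Unicode");             // 7 letters + null, 1 byte each
}

inline std::size_t demo_wide_bytes()
{
    const std::size_t bytes = sizeof(L"Unicode");
    assert(bytes % sizeof(wchar_t) == 0); // always a whole number of characters
    return bytes;                         // 8 characters * sizeof(wchar_t)
}
```

Dividing either result by the size of its character type gives the same count of 8, which is why APIs can talk about characters while allocators talk about bytes.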
When you need to express a hard-coded string, you can use:
"ANSI String"; // ANSI
L"Unicode String"; // Unicode
_T("Either string, depending on compilation"); // ANSI or Unicode
// or use TEXT macro, if you need more readability
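The non-prefixed string is an ANSI string, the L-prefixed string is Unicode, and a string specified in _T or TEXT would be either, depending on compilation. Again, _T and TEXT are nothing but macros, and are defined as:
// SIMPLIFIED
#ifdef _UNICODE
#define _T(c) L##c
#define TEXT(c) L##c
#else
#define _T(c) c
#define TEXT(c) c
#endif
The ## symbol is the token pasting operator, which would turn _T("Unicode") into L"Unicode", where the string passed is the argument to the macro - if _UNICODE is defined. If _UNICODE is not defined, _T("Unicode") would simply mean "Unicode". The token pasting operator exists even in the C language, and is not specific to VC++ or character encoding.
Note that these macros can be used for strings as well as characters. _T('R') would turn into L'R' or simply 'R'; the former is a Unicode character, the latter is ANSI.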
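Token pasting on a character literal can be verified at compile time. In this sketch DEMO_WIDE and DEMO_NARROW are hypothetical macros that reproduce the two branches of the simplified definition, so both results can be inspected in one build.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical macros mirroring the two branches of the simplified _T.
#define DEMO_WIDE(x)   L##x   // pastes the L prefix onto the literal
#define DEMO_NARROW(x) x      // leaves the literal untouched

// The pasted literal really is a wide character; the other one is a char.
static_assert(std::is_same<decltype(DEMO_WIDE('R')), wchar_t>::value,
              "L'R' has type wchar_t");
static_assert(std::is_same<decltype(DEMO_NARROW('R')), char>::value,
              "'R' has type char");

// Both literals spell the same letter; only the storage type differs.
inline bool demo_same_letter()
{
    assert(sizeof(DEMO_WIDE('R')) == sizeof(wchar_t));
    return static_cast<wchar_t>(DEMO_NARROW('R')) == DEMO_WIDE('R');
}
```

Because the pasting happens in the preprocessor, it only works on literal tokens written at the call site, never on the contents of a variable.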
No, you cannot use these macros to convert variables (string or character) into Unicode/non-Unicode text. The following is not valid:
char c = 'C';
char str[16] = "CodeProject";
_T(c);
_T(str);
The last two lines would compile successfully in an ANSI (Multi-Byte) build, since _T(x) would simply be x, and therefore _T(c) and _T(str) would come out as c and str, respectively. But when you build it with the Unicode character set, it would fail to compile:
error C2065: 'Lc' : undeclared identifier
error C2065: 'Lstr' : undeclared identifier
I would not like to insult your intelligence by describing why and what those errors are.
There exists a set of conversion routines to convert MBCS to Unicode and vice versa, which I will explain soon.
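As a rough illustration of what a run-time conversion looks like (as opposed to the compile-time _T macro), here is a sketch built on the standard C library routine mbstowcs. This is a swapped-in, portable technique and not necessarily the set of routines the article has in mind; demo_to_wide is a hypothetical helper name, and the result depends on the current C locale.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Converts a narrow (multi-byte) string to a wide string at run time.
// Returns an empty string if the input is not valid in the current locale.
inline std::wstring demo_to_wide(const char* narrow)
{
    assert(narrow != nullptr);
    // A wide string never needs more characters than the input has bytes.
    std::wstring wide(std::strlen(narrow), L'\0');
    const std::size_t written = std::mbstowcs(&wide[0], narrow, wide.size());
    if (written == static_cast<std::size_t>(-1))
        return std::wstring();
    wide.resize(written);   // drop unused trailing slots
    return wide;
}
```

Unlike _T(str), this works on variables, because the conversion is done by code that runs, not by pasting a prefix onto a token. Visual C++ may emit a deprecation warning for mbstowcs and suggest its secure variant.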
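String classes, like MFC/ATL's CString, implement two versions using a macro. There are two classes, named CStringA for ANSI and CStringW for Unicode. When you use CString (which is a macro/typedef), it yields one of the two.
The TCHAR type definition was for a single character. You can definitely declare an array of TCHAR. What if you want to express a character pointer, or a const character pointer - which one of the following?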
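// ANSI characters
void foo_ansi(char*);
void foo_ansi(const char*);
char* pString;
// Unicode/wide-string
void foo_uni(WCHAR*);
void foo_uni(const WCHAR*);
WCHAR* pString;
// Independent
void foo_char(TCHAR*);
void foo_char(const TCHAR*);
TCHAR* pString;
After reading about TCHAR, you would select the last one as your choice. There are better alternatives available to represent such string pointers; for those, you just need to include Windows.h.
NOTE: If your project implicitly or explicitly includes Windows.h, you need not include TCHAR.H.
The pointer types and their replacements are: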
- char* replacement: LPSTR
- const char* replacement: LPCSTR
- WCHAR* replacement: LPWSTR
- const WCHAR* replacement: LPCWSTR (C before W, since const is before WCHAR)
- TCHAR* replacement: LPTSTR
- const TCHAR* replacement: LPCTSTR
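The list above amounts to a handful of typedefs. The sketch below re-declares four of them inside a hypothetical demo namespace so that it compiles without Windows.h and does not clash with the real definitions; the real headers may spell them differently, and demo_strlen is an invented helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Local re-declarations for illustration only (assumed to mirror the Windows names).
namespace demo {
typedef char*          LPSTR;    // pointer to string
typedef const char*    LPCSTR;   // pointer to constant string
typedef wchar_t*       LPWSTR;   // pointer to wide string
typedef const wchar_t* LPCWSTR;  // pointer to constant wide string
}

static_assert(std::is_same<demo::LPCSTR, const char*>::value,
              "LPCSTR is just another name for const char*");
static_assert(std::is_same<demo::LPCWSTR, const wchar_t*>::value,
              "LPCWSTR is just another name for const wchar_t*");

// A strlen-style function written with the typedef instead of const char*.
inline std::size_t demo_strlen(demo::LPCSTR text)
{
    assert(text != nullptr);
    return std::strlen(text);
}
```

Since the typedefs are plain aliases, a string literal or a const char* variable can be passed wherever demo::LPCSTR is expected, with no conversion involved.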
Now, I hope you understand the following signatures:
BOOL SetCurrentDirectory(LPCTSTR lpPathName);
DWORD GetCurrentDirectory(DWORD nBufferLength, LPTSTR lpBuffer);
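Continuing. You must have seen some functions/methods asking you to pass a number of characters, or returning a number of characters. With GetCurrentDirectory, for example, you need to pass the number of characters, and not the number of bytes:
TCHAR sCurrentDir[255];
// Pass 255 and not 255*2
GetCurrentDirectory(255, sCurrentDir);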
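The character-count convention can be illustrated without the Windows API. In this sketch demo_get_dir is a hypothetical function written in the style of GetCurrentDirectory (capacity first, buffer second), and demo_countof is an invented helper that returns an array's element count so the caller never passes a byte size by mistake.

```cpp
#include <cassert>
#include <cstddef>

// Number of elements in a fixed-size array, whatever the element type.
template <typename T, std::size_t N>
constexpr std::size_t demo_countof(T (&)[N]) { return N; }

// Hypothetical API: capacity is given in characters, not bytes.
// Copies a fixed sample path and returns the characters written (excluding null),
// or 0 if the buffer is too small.
inline std::size_t demo_get_dir(std::size_t capacityInChars, wchar_t* buffer)
{
    assert(buffer != nullptr);
    const wchar_t path[] = L"C:\\";
    const std::size_t length = demo_countof(path) - 1;
    if (capacityInChars <= length)
        return 0;
    for (std::size_t i = 0; i <= length; ++i)   // includes the terminating null
        buffer[i] = path[i];
    return length;
}

inline std::size_t demo_query()
{
    wchar_t dir[255];
    return demo_get_dir(demo_countof(dir), dir);  // 255, not sizeof(dir)
}
```

Passing sizeof(dir) instead would overstate the capacity by a factor of sizeof(wchar_t) and invite a buffer overrun in a real API.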
On the other side, if you need to allocate a number of characters, you must allocate the proper number of bytes. In C++, you can simply use new:
LPTSTR pBuffer; // TCHAR*
pBuffer = new TCHAR[128]; // Allocates 128 or 256 BYTES, depending on compilation.
But if you use memory allocation functions like malloc, LocalAlloc, GlobalAlloc, etc., you must specify the number of bytes!
pBuffer = (TCHAR*) malloc(128 * sizeof(TCHAR));
Typecasting the return value is required, as you know. The expression in malloc's argument ensures that it allocates the desired number of bytes, and makes room for the desired number of characters.
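Here is the byte-count rule as a small portable sketch, using wchar_t directly so that it builds without TCHAR.H. The helper name demo_alloc_chars is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Allocates room for `count` wide characters with malloc.
// malloc takes BYTES, so the character count is scaled by sizeof(wchar_t).
// Returns nullptr on failure; the caller releases the memory with std::free.
inline wchar_t* demo_alloc_chars(std::size_t count)
{
    assert(count > 0);
    wchar_t* buffer = static_cast<wchar_t*>(std::malloc(count * sizeof(wchar_t)));
    if (buffer != nullptr)
        buffer[0] = L'\0';   // start out as an empty string
    return buffer;
}
```

A caller would write something like wchar_t* p = demo_alloc_chars(128); and later std::free(p);. Forgetting the sizeof factor would allocate only a fraction of the space the 128 characters need.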
License
This article, along with any associated source code and files, is licensed under