Secure String Functions in the C Run-Time Library


How to Get More Control When Performing String Operations


In addition to the new secure string functions, the C run-time library has some new functions that provide more control when performing string manipulations. For example, you can control the filler values or how truncation is performed. Naturally, both ANSI (A) and Unicode (W) versions of these functions are available. Here are the prototypes for some of them (many more exist that are not shown here):


HRESULT StringCchCat(PTSTR pszDest, size_t cchDest, PCTSTR pszSrc);
HRESULT StringCchCatEx(PTSTR pszDest, size_t cchDest, PCTSTR pszSrc,
   PTSTR *ppszDestEnd, size_t *pcchRemaining, DWORD dwFlags);
HRESULT StringCchCopy(PTSTR pszDest, size_t cchDest, PCTSTR pszSrc);
HRESULT StringCchCopyEx(PTSTR pszDest, size_t cchDest, PCTSTR pszSrc,
   PTSTR *ppszDestEnd, size_t *pcchRemaining, DWORD dwFlags);
HRESULT StringCchPrintf(PTSTR pszDest, size_t cchDest,
   PCTSTR pszFormat, ...);
HRESULT StringCchPrintfEx(PTSTR pszDest, size_t cchDest,
   PTSTR *ppszDestEnd, size_t *pcchRemaining, DWORD dwFlags,
   PCTSTR pszFormat, ...);


You'll notice that all the functions shown have "Cch" in their name. This stands for count of characters, and you'll typically use the _countof macro to get this value. There is also a set of functions that have "Cb" in their name, such as StringCbCat(Ex), StringCbCopy(Ex), and StringCbPrintf(Ex). These functions expect the size argument as a count of bytes instead of a count of characters. You'll typically use the sizeof operator to get this value.
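The difference between the two size conventions can be seen in plain C. The following is a minimal sketch, not code from StrSafe.h: COUNT_OF is a hypothetical stand-in for the MSVC-specific _countof macro, and wchar_t is assumed as the character type.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for _countof: number of elements in an array. */
#define COUNT_OF(arr) (sizeof(arr) / sizeof((arr)[0]))

/* The size you would pass to a "Cch" function: a count of characters. */
static size_t cch_of_demo_buffer(void)
{
    wchar_t szBuffer[10];
    return COUNT_OF(szBuffer);
}

/* The size you would pass to a "Cb" function: a count of bytes. */
static size_t cb_of_demo_buffer(void)
{
    wchar_t szBuffer[10];
    return sizeof(szBuffer);
}
```

For a 10-element wide-character buffer, the character count is 10 while the byte count is 10 times the size of one wchar_t, so mixing up the two conventions would misstate the buffer size.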

All these functions return an HRESULT with one of the values shown in Table 2-2.


Table 2-2: HRESULT Values for Safe String Functions

HRESULT Value    Description

S_OK
 Success. The destination buffer contains the source string and is terminated by '\0'.

STRSAFE_E_INVALID_PARAMETER
 Failure. The NULL value has been passed as a parameter.

STRSAFE_E_INSUFFICIENT_BUFFER
 Failure. The given destination buffer was too small to contain the entire source string.



Unlike the secure (_s suffixed) functions, these functions perform truncation when a buffer is too small. You can detect this situation when STRSAFE_E_INSUFFICIENT_BUFFER is returned. As you can see in StrSafe.h, the value of this code is 0x8007007a, and it is treated as a failure by the SUCCEEDED/FAILED macros. However, in that case, the part of the source buffer that could fit into the writable destination buffer has been copied, and the last available character is set to '\0'. So, in the previous example, szBuffer would contain the string "012345678" if StringCchCopy is used instead of _tcscpy_s. Notice that the truncation feature might or might not be what you need, depending on what you are trying to achieve, and this is why it is treated as a failure (by default). For example, if you are building a path by concatenating different pieces of information, a truncated result is unusable. If you are building a message for user feedback, truncation could be acceptable. It's up to you to decide how to handle a truncated result.
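The truncation behavior described above can be modeled in portable C. This is only a sketch of the semantics, not the StrSafe.h implementation: demo_copy_truncate and the DEMO_ constants are hypothetical names, the buffer is narrow (char) for simplicity, and only the insufficient-buffer value is taken from the text.

```c
#include <stddef.h>

/* Hypothetical status codes for this sketch. The last one uses the value
   the text quotes for STRSAFE_E_INSUFFICIENT_BUFFER. */
#define DEMO_OK                     0x00000000UL
#define DEMO_E_INVALID_PARAMETER    0x80070057UL
#define DEMO_E_INSUFFICIENT_BUFFER  0x8007007aUL

/* Copies src into dest, truncating if needed, and always terminates dest. */
static unsigned long demo_copy_truncate(char *dest, size_t cchDest,
                                        const char *src)
{
    size_t i = 0;

    if (dest == NULL || src == NULL || cchDest == 0)
        return DEMO_E_INVALID_PARAMETER;

    /* Copy what fits, always leaving one slot for the terminator. */
    while (i < cchDest - 1 && src[i] != '\0') {
        dest[i] = src[i];
        i++;
    }
    dest[i] = '\0';

    /* If source characters are left over, the result was truncated. */
    return (src[i] == '\0') ? DEMO_OK : DEMO_E_INSUFFICIENT_BUFFER;
}
```

Copying "0123456789" into a 10-character buffer with this model leaves "012345678" in the buffer and reports the insufficient-buffer code, so the caller can choose between using the truncated text and rejecting it.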


Last but not least, you'll notice that an extended (Ex) version exists for many of the functions shown earlier. These extended versions take three additional parameters, which are described in Table 2-3.


Table 2-3: Extended Version Parameters

Parameters and Values    Description

size_t* pcchRemaining
 Pointer to a variable that indicates the number of unused characters in the destination buffer. The copied terminating '\0' character is not counted. For example, if one character is copied into a buffer that is 10 characters wide, 9 is returned even though you won't be able to use more than 8 characters without truncation. If pcchRemaining is NULL, the count is not returned.

LPTSTR* ppszDestEnd
 If ppszDestEnd is non-NULL, it points to the terminating '\0' character at the end of the string contained by the destination buffer.

DWORD dwFlags
 One or more of the following values separated by '|'.

STRSAFE_FILL_BEHIND_NULL
 If the function succeeds, the low byte of dwFlags is used to fill the rest of the destination buffer, just after the terminating '\0' character. (See the comment about STRSAFE_FILL_BYTE just after this table for more details.)

STRSAFE_IGNORE_NULLS
 Treats NULL string pointers like empty strings (TEXT("")).

STRSAFE_FILL_ON_FAILURE
 If the function fails, the low byte of dwFlags is used to fill the entire destination buffer except the first '\0' character used to set an empty string result. (See the comment about STRSAFE_FILL_BYTE just after this table for more details.) In the case of a STRSAFE_E_INSUFFICIENT_BUFFER failure, any character in the string being returned is replaced by the filler byte value.

STRSAFE_NULL_ON_FAILURE
 If the function fails, the first character of the destination buffer is set to '\0' to define an empty string (TEXT("")). In the case of a STRSAFE_E_INSUFFICIENT_BUFFER failure, any truncated string is overwritten.

STRSAFE_NO_TRUNCATION
 As in the case of STRSAFE_NULL_ON_FAILURE, if the function fails, the destination buffer is set to an empty string (TEXT("")). In the case of a STRSAFE_E_INSUFFICIENT_BUFFER failure, any truncated string is overwritten.



  Note  Even if STRSAFE_NO_TRUNCATION is used as a flag, the characters of the source string are still copied, up to the last available character of the destination buffer. Then both the first and the last characters of the destination buffer are set to '\0'. This matters only if, for security purposes, you don't want to keep garbage data.

One last detail relates to an earlier remark. In Figure 2-4, the 0xfd value is used to replace all the characters after the '\0', up to the end of the destination buffer. With the Ex version of these functions, you can choose whether you want this filling operation to occur (it is expensive, especially if the destination buffer is large) and with which byte value. If you add STRSAFE_FILL_BEHIND_NULL to dwFlags, the remaining characters are set to '\0'. When you replace STRSAFE_FILL_BEHIND_NULL with the STRSAFE_FILL_BYTE macro, the given byte value is used to fill up the remaining values of the destination buffer.
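The extra outputs of the Ex versions and the fill step can be illustrated with a small portable model. This is a sketch under stated assumptions, not the StrSafe.h code: demo_copy_ex is a hypothetical helper, it works on char buffers, it takes the filler byte directly instead of decoding dwFlags, and it does not model truncation or the failure flags.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a successful Ex-style copy.
   Returns 0 on success, -1 if src does not fit (truncation is not modeled). */
static int demo_copy_ex(char *dest, size_t cchDest, const char *src,
                        char **ppszDestEnd, size_t *pcchRemaining,
                        unsigned char fillByte)
{
    size_t len = strlen(src);
    size_t remaining;

    if (len >= cchDest)
        return -1;

    memcpy(dest, src, len + 1);      /* copy the string and its terminator */

    /* Unused characters; the slot holding the terminator is still counted. */
    remaining = cchDest - len;

    /* Fill everything after the terminator with the chosen byte. */
    if (remaining > 1)
        memset(dest + len + 1, fillByte, remaining - 1);

    if (ppszDestEnd != NULL)
        *ppszDestEnd = dest + len;   /* points at the terminating '\0' */
    if (pcchRemaining != NULL)
        *pcchRemaining = remaining;

    return 0;
}
```

Copying a one-character string into a 10-character buffer with this model reports 9 remaining characters, matching the example in Table 2-3, and leaves the eight characters after the terminator set to the filler byte.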


Windows String Functions

Windows also offers various functions for manipulating strings. Many of these functions, such as lstrcat and lstrcpy, are now deprecated because they do not detect buffer overrun problems. Also, the ShlwApi.h file defines a number of handy string functions that format operating system-related numeric values, such as StrFormatKBSize and StrFormatByteSize. See http://msdn2.microsoft.com/en-us/library/ms538658.aspx for a description of shell string handling functions.


It is common to want to compare strings for equality or for sorting. The best functions to use for this are CompareString(Ex) and CompareStringOrdinal. You use CompareString(Ex) to compare strings that will be presented to the user in a linguistically correct manner. Here is the prototype of the CompareString function:


int CompareString(
   LCID locale,
   DWORD dwCmdFlags,
   PCTSTR pString1,
   int cch1,
   PCTSTR pString2,
   int cch2);

This function compares two strings. The first parameter to CompareString specifies a locale ID (LCID), a 32-bit value that identifies a particular language. CompareString uses this LCID to compare the two strings by checking the meaning of the characters as they apply to a particular language. A linguistically correct comparison produces results much more meaningful to an end user. However, this type of comparison is slower than doing an ordinal comparison. You can get the locale ID of the calling thread by calling the Windows GetThreadLocale function:


LCID GetThreadLocale();

The second parameter of CompareString identifies flags that modify the method used by the function to compare the two strings. Table 2-4 shows the possible flags.


Table 2-4: Flags Used by the CompareString Function

Flag    Description

NORM_IGNOREKANATYPE
 Do not differentiate between hiragana and katakana characters.

NORM_IGNORESYMBOLS
 Ignore symbols.

NORM_IGNOREWIDTH
 Do not differentiate between a single-byte character and the same character as a double-byte character.

SORT_STRINGSORT
 Treat punctuation the same as symbols.

The remaining four parameters of CompareString specify the two strings and their respective lengths in characters (not in bytes). If you pass a negative value for the cch1 parameter, the function assumes that the pString1 string is zero-terminated and calculates the length of the string. This also is true for the cch2 parameter with respect to the pString2 string. If you need more advanced linguistic options, you should take a look at the CompareStringEx function.


To compare programmatic strings (such as pathnames, registry keys/values, XML elements/attributes, and so on), use CompareStringOrdinal:


int CompareStringOrdinal(
   PCWSTR pString1,
   int cchCount1,
   PCWSTR pString2,
   int cchCount2,
   BOOL bIgnoreCase);

This function performs a code-point comparison without regard to the locale, and therefore it is fast. And because programmatic strings are not typically shown to an end user, this function makes the most sense. Notice that only Unicode strings are expected by this function.
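To make the idea of a code-point comparison concrete, here is a portable sketch of what such a comparison looks like. It is not the Windows implementation: demo_compare_ordinal is a hypothetical function, it assumes both strings are zero-terminated, and it uses the C library's towupper for case folding, which is an assumption that may not match the casing rules Windows applies.

```c
#include <wchar.h>
#include <wctype.h>

/* Hypothetical model of an ordinal comparison. Returns 1 (less than),
   2 (equal), or 3 (greater than), the result values the text describes. */
static int demo_compare_ordinal(const wchar_t *s1, const wchar_t *s2,
                                int ignoreCase)
{
    for (;; s1++, s2++) {
        wint_t c1 = (wint_t)*s1;
        wint_t c2 = (wint_t)*s2;

        if (ignoreCase) {
            c1 = towupper(c1);
            c2 = towupper(c2);
        }

        /* Compare raw code values; no locale or linguistic rules apply. */
        if (c1 != c2)
            return (c1 < c2) ? 1 : 3;
        if (c1 == 0)
            return 2;   /* both strings ended at the same position */
    }
}
```

Because the loop only compares numeric character values, "Key" sorts after "KEY" when case is respected (lowercase 'e' has a higher code than 'E'), which is fine for registry keys or pathnames but is not an order an end user would expect.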


The CompareString and CompareStringOrdinal functions' return values are unlike the return values you get back from the C run-time library's *cmp string comparison functions. CompareString(Ordinal) returns 0 to indicate failure, CSTR_LESS_THAN (defined as 1) to indicate that pString1 is less than pString2, CSTR_EQUAL (defined as 2) to indicate that pString1 is equal to pString2, and CSTR_GREATER_THAN (defined as 3) to indicate that pString1 is greater than pString2. To make things slightly more convenient, if the functions succeed, you can subtract 2 from the return value to make the result consistent with the result of the C run-time library functions (-1, 0, and +1).
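The subtract-2 convention can be wrapped in a small helper. The sketch below is portable C with hypothetical names (the DEMO_ constants and demo_to_crt_order); the numeric values are the ones given in the text, and the failure case is kept separate so that a failed call is not mistaken for "less than".

```c
#include <stddef.h>

/* Result values as described in the text (hypothetical DEMO_ names). */
enum {
    DEMO_CSTR_FAILURE      = 0,
    DEMO_CSTR_LESS_THAN    = 1,
    DEMO_CSTR_EQUAL        = 2,
    DEMO_CSTR_GREATER_THAN = 3
};

/* Maps a CompareString-style result to the -1/0/+1 convention of the
   C run-time *cmp functions. Sets *pfOk to 0 if the comparison failed. */
static int demo_to_crt_order(int cstrResult, int *pfOk)
{
    if (cstrResult == DEMO_CSTR_FAILURE) {
        if (pfOk != NULL)
            *pfOk = 0;
        return 0;
    }
    if (pfOk != NULL)
        *pfOk = 1;
    return cstrResult - 2;
}
```

Checking the success flag before using the returned order matters because subtracting 2 from a failure result would otherwise produce -2, a value outside the range callers of the *cmp functions expect.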






