Windows Via C/C++: Learning While Translating (5) Working with Characters and Strings, Part 4

Direwolf

Secure String Functions in the C Run-Time Library

How to Get More Control When Performing String Operations

In addition to the new secure string functions, the C run-time library has some new functions that provide more control when performing string manipulations. For example, you can control the filler values or how truncation is performed. The C run time offers both ANSI (A) and Unicode (W) versions of these functions. Here are the prototypes for some of them (many more exist that are not shown here):

HRESULT StringCchCat(PTSTR pszDest, size_t cchDest, PCTSTR pszSrc);
HRESULT StringCchCatEx(PTSTR pszDest, size_t cchDest, PCTSTR pszSrc,
   PTSTR *ppszDestEnd, size_t *pcchRemaining, DWORD dwFlags);

HRESULT StringCchCopy(PTSTR pszDest, size_t cchDest, PCTSTR pszSrc);
HRESULT StringCchCopyEx(PTSTR pszDest, size_t cchDest, PCTSTR pszSrc,
   PTSTR *ppszDestEnd, size_t *pcchRemaining, DWORD dwFlags);

HRESULT StringCchPrintf(PTSTR pszDest, size_t cchDest,
   PCTSTR pszFormat, ...);
HRESULT StringCchPrintfEx(PTSTR pszDest, size_t cchDest,
   PTSTR *ppszDestEnd, size_t *pcchRemaining, DWORD dwFlags,
   PCTSTR pszFormat, ...);

You'll notice that all the functions shown have "Cch" in their names. This stands for count of characters, and you'll typically use the _countof macro to get this value. There is also a set of functions that have "Cb" in their names, such as StringCbCat(Ex), StringCbCopy(Ex), and StringCbPrintf(Ex). These functions expect the size argument as a count of bytes instead of a count of characters. You'll typically use the sizeof operator to get this value.
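The difference between the two size conventions can be sketched in portable C++. This is only an illustration: CountOf is a hypothetical stand-in for the _countof macro, and SizeConventionsDemo is a name invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical portable stand-in for the _countof macro:
// yields the number of elements in a fixed-size array.
template <typename T, std::size_t N>
constexpr std::size_t CountOf(const T (&)[N]) { return N; }

// Shows the two size conventions for the same buffer.
void SizeConventionsDemo() {
    wchar_t szBuffer[10] = L"";

    // "Cch" functions expect the capacity in characters.
    std::size_t cchBuffer = CountOf(szBuffer);

    // "Cb" functions expect the capacity in bytes.
    std::size_t cbBuffer = sizeof(szBuffer);

    assert(cchBuffer == 10);
    assert(cbBuffer == cchBuffer * sizeof(wchar_t));
}
```

With wide characters the two values differ, so passing sizeof to a "Cch" function would overstate the capacity of the buffer.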

All these functions return an HRESULT with one of the values shown in Table 2-2.

Table 2-2: HRESULT Values for Safe String Functions

HRESULT Value	Description
S_OK	Success. The destination buffer contains the source string and is terminated by '\0'.
STRSAFE_E_INVALID_PARAMETER	Failure. The NULL value has been passed as a parameter.
STRSAFE_E_INSUFFICIENT_BUFFER	Failure. The given destination buffer was too small to contain the entire source string.

Unlike the secure (_s suffixed) functions, these functions do perform truncation when a buffer is too small. You can detect this situation because STRSAFE_E_INSUFFICIENT_BUFFER is returned. As you can see in StrSafe.h, the value of this code is 0x8007007a, and it is treated as a failure by the SUCCEEDED/FAILED macros. However, in that case, the part of the source buffer that could fit into the writable destination buffer has been copied, and the last available character is set to '\0'. So, in the previous example, szBuffer would contain the string "012345678" if StringCchCopy is used instead of _tcscpy_s. Truncation might or might not be what you need, depending on what you are trying to achieve, and this is why it is treated as a failure by default. For example, if you are building a path by concatenating different pieces of information, a truncated result is unusable. If you are building a message for user feedback, truncation could be acceptable. It's up to you to decide how to handle a truncated result.
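The truncation behavior described above can be mimicked with a small portable sketch. CopyTruncating, Failed, and the k-prefixed constants are hypothetical names made up for this illustration; only the numeric status values are the ones StrSafe.h uses. It is not the real StringCchCopy, and its NULL handling simply follows Table 2-2.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Status codes with the same numeric values as in StrSafe.h;
// the k-prefixed names are local to this sketch.
constexpr std::uint32_t kOk                 = 0x00000000u; // S_OK
constexpr std::uint32_t kInvalidParameter   = 0x80070057u; // STRSAFE_E_INVALID_PARAMETER
constexpr std::uint32_t kInsufficientBuffer = 0x8007007Au; // STRSAFE_E_INSUFFICIENT_BUFFER

// Hypothetical stand-in that mimics the described behavior:
// copy what fits, always terminate, report truncation as a failure.
std::uint32_t CopyTruncating(char* pszDest, std::size_t cchDest, const char* pszSrc) {
    if (pszDest == nullptr || pszSrc == nullptr || cchDest == 0)
        return kInvalidParameter;

    std::size_t i = 0;
    while (i < cchDest - 1 && pszSrc[i] != '\0') {
        pszDest[i] = pszSrc[i];
        ++i;
    }
    pszDest[i] = '\0';  // the last available character becomes the terminator

    return (pszSrc[i] == '\0') ? kOk : kInsufficientBuffer;
}

// Mirrors the idea behind the FAILED macro: the high bit marks a failure.
constexpr bool Failed(std::uint32_t hr) { return (hr & 0x80000000u) != 0; }

// Reproduces the 10-character buffer example from the text.
void TruncationDemo() {
    char szBuffer[10];
    std::uint32_t hr = CopyTruncating(szBuffer, 10, "0123456789");
    assert(hr == kInsufficientBuffer && Failed(hr));
    assert(std::strcmp(szBuffer, "012345678") == 0);
}
```

The caller still has to decide whether the truncated text is usable: a path builder would treat the failure as fatal, while a status message might accept the shortened string.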

Last but not least, you'll notice that an extended (Ex) version exists for many of the functions shown earlier. These extended versions take three additional parameters, which are described in Table 2-3.

Table 2-3: Extended Version Parameters

Parameters and Values	Description
size_t* pcchRemaining	Pointer to a variable that indicates the number of unused characters in the destination buffer. The copied terminating '\0' character is not counted. For example, if one character is copied into a buffer that is 10 characters wide, 9 is returned even though you won't be able to use more than 8 characters without truncation. If pcchRemaining is NULL, the count is not returned.
LPTSTR* ppszDestEnd	If ppszDestEnd is non-NULL, it points to the terminating '\0' character at the end of the string contained by the destination buffer.
DWORD dwFlags	One or more of the following values separated by '|'.
STRSAFE_FILL_BEHIND_NULL	If the function succeeds, the low byte of dwFlags is used to fill the rest of the destination buffer, just after the terminating '\0' character. (See the comment about STRSAFE_FILL_BYTE just after this table for more details.)
STRSAFE_IGNORE_NULLS	Treats NULL string pointers like empty strings (TEXT("")).
STRSAFE_FILL_ON_FAILURE	If the function fails, the low byte of dwFlags is used to fill the entire destination buffer except the first '\0' character used to set an empty string result. (See the comment about STRSAFE_FILL_BYTE just after this table for more details.) In the case of a STRSAFE_E_INSUFFICIENT_BUFFER failure, any character in the string being returned is replaced by the filler byte value.
STRSAFE_NULL_ON_FAILURE	If the function fails, the first character of the destination buffer is set to '\0' to define an empty string (TEXT("")). In the case of a STRSAFE_E_INSUFFICIENT_BUFFER failure, any truncated string is overwritten.
STRSAFE_NO_TRUNCATION	As in the case of STRSAFE_NULL_ON_FAILURE, if the function fails, the destination buffer is set to an empty string (TEXT("")). In the case of a STRSAFE_E_INSUFFICIENT_BUFFER failure, any truncated string is overwritten.

Note: Even if STRSAFE_NO_TRUNCATION is used as a flag, the characters of the source string are still copied, up to the last available character of the destination buffer. Then both the first and the last characters of the destination buffer are set to '\0'. This is not really important unless, for security purposes, you don't want to keep garbage data.

There is one last detail to mention, related to a remark made earlier in the chapter. In Figure 2-4, the 0xfd value is used to replace all the characters after the '\0', up to the end of the destination buffer. With the Ex version of these functions, you can choose whether you want this expensive filling operation (especially if the destination buffer is large) to occur and with which byte value. If you add STRSAFE_FILL_BEHIND_NULL to dwFlags, the remaining characters are set to '\0'. When you replace STRSAFE_FILL_BEHIND_NULL with the STRSAFE_FILL_BYTE macro, the given byte value is used to fill up the remaining values of the destination buffer.
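To make the extra outputs of the Ex versions more concrete, here is a portable sketch of the success path only. CopyWithExtras and ExtrasDemo are hypothetical names, and the two trailing parameters (fillBehindNull, fillByte) are an assumption standing in for the dwFlags encoding; the real functions pack both into dwFlags.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the successful path of an Ex-style copy.
// It reports the end pointer and the remaining count as Table 2-3 describes
// and optionally fills the space behind the terminator with fillByte.
// Returns false (leaving the buffer untouched) if the source does not fit.
bool CopyWithExtras(char* pszDest, std::size_t cchDest, const char* pszSrc,
                    char** ppszDestEnd, std::size_t* pcchRemaining,
                    bool fillBehindNull, unsigned char fillByte) {
    if (pszDest == nullptr || pszSrc == nullptr || cchDest == 0) return false;

    std::size_t cchSrc = std::strlen(pszSrc);
    if (cchSrc >= cchDest) return false;  // no room for the terminator

    std::memcpy(pszDest, pszSrc, cchSrc);
    pszDest[cchSrc] = '\0';

    if (fillBehindNull) {
        // Overwrite everything after the terminator with the chosen byte.
        std::memset(pszDest + cchSrc + 1, fillByte, cchDest - cchSrc - 1);
    }
    if (ppszDestEnd != nullptr)   *ppszDestEnd = pszDest + cchSrc;
    if (pcchRemaining != nullptr) *pcchRemaining = cchDest - cchSrc;  // terminator not counted as used
    return true;
}

// Reproduces the "one character in a 10-character buffer" example.
void ExtrasDemo() {
    char szBuffer[10];
    char* pszEnd = nullptr;
    std::size_t cchRemaining = 0;

    bool ok = CopyWithExtras(szBuffer, 10, "A", &pszEnd, &cchRemaining, true, 0xFD);
    assert(ok);
    assert(pszEnd == szBuffer + 1 && *pszEnd == '\0');
    assert(cchRemaining == 9);
    assert(static_cast<unsigned char>(szBuffer[9]) == 0xFD);
}
```

The end pointer is handy for appending without rescanning the string, and the fill step is the part that becomes expensive when the destination buffer is large.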

Windows String Functions

Windows also offers various functions for manipulating strings. Many of these functions, such as lstrcat and lstrcpy, are now deprecated because they do not detect buffer overrun problems. Also, the ShlwApi.h file defines a number of handy string functions that format operating system-related numeric values, such as StrFormatKBSize and StrFormatByteSize. See http://msdn2.microsoft.com/en-us/library/ms538658.aspx for a description of shell string handling functions.

It is common to want to compare strings for equality or for sorting. The best functions to use for this are CompareString(Ex) and CompareStringOrdinal. You use CompareString(Ex) to compare strings that will be presented to the user in a linguistically correct manner. Here is the prototype of the CompareString function:

int CompareString(
   LCID locale,
   DWORD dwCmpFlags,
   PCTSTR pString1,
   int cch1,
   PCTSTR pString2,
   int cch2);

This function compares two strings. The first parameter to CompareString specifies a locale ID (LCID), a 32-bit value that identifies a particular language. CompareString uses this LCID to compare the two strings by checking the meaning of the characters as they apply to a particular language. A linguistically correct comparison produces results much more meaningful to an end user. However, this type of comparison is slower than doing an ordinal comparison. You can get the locale ID of the calling thread by calling the Windows GetThreadLocale function:

LCID GetThreadLocale();

The second parameter of CompareString identifies flags that modify the method used by the function to compare the two strings. Table 2-4 shows the possible flags.

Table 2-4: Flags Used by the CompareString Function

Flag	Meaning
NORM_IGNORECASE, LINGUISTIC_IGNORECASE	Ignore case differences.
NORM_IGNOREKANATYPE	Do not differentiate between hiragana and katakana characters.
NORM_IGNORENONSPACE, LINGUISTIC_IGNOREDIACRITIC	Ignore nonspacing characters.
NORM_IGNORESYMBOLS	Ignore symbols.
NORM_IGNOREWIDTH	Do not differentiate between a single-byte character and the same character as a double-byte character.
SORT_STRINGSORT	Treat punctuation the same as symbols.

The remaining four parameters of CompareString specify the two strings and their respective lengths in characters (not in bytes). If you pass a negative value for the cch1 parameter, the function assumes that the pString1 string is zero-terminated and calculates the length of the string. The same is true for the cch2 parameter with respect to the pString2 string. If you need more advanced linguistic options, take a look at the CompareStringEx function.

To compare programmatic strings (such as pathnames, registry keys/values, XML elements/attributes, and so on), use CompareStringOrdinal:

int CompareStringOrdinal(
   PCWSTR pString1,
   int cchCount1,
   PCWSTR pString2,
   int cchCount2,
   BOOL bIgnoreCase);

This function performs a code-point comparison without regard to the locale, and therefore it is fast. Because programmatic strings are not typically shown to an end user, it is the most sensible choice for them. Notice that this function accepts only Unicode strings.
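What a locale-independent code-point comparison amounts to can be sketched in portable C++. CompareOrdinalSketch is a hypothetical function, not the Windows API. Two assumptions are made for the sketch: a negative count means "zero-terminated" (borrowing the convention described for CompareString), and case folding is limited to ASCII letters, which is a deliberate simplification.

```cpp
#include <cassert>
#include <cstddef>

// Same numeric values as the CSTR_* results; the names are local to this sketch.
constexpr int kLessThan = 1, kEqual = 2, kGreaterThan = 3;

// Length in code units of a zero-terminated UTF-16 string.
std::size_t LengthOf(const char16_t* psz) {
    std::size_t n = 0;
    while (psz[n] != u'\0') ++n;
    return n;
}

// ASCII-only upper-casing (a simplification made for this sketch).
char16_t AsciiUpper(char16_t ch) {
    return (ch >= u'a' && ch <= u'z') ? static_cast<char16_t>(ch - (u'a' - u'A')) : ch;
}

// Compares code units one by one, with no locale involved. Returns 0 on failure.
int CompareOrdinalSketch(const char16_t* pString1, int cchCount1,
                         const char16_t* pString2, int cchCount2,
                         bool bIgnoreCase) {
    if (pString1 == nullptr || pString2 == nullptr) return 0;

    std::size_t cch1 = (cchCount1 < 0) ? LengthOf(pString1) : static_cast<std::size_t>(cchCount1);
    std::size_t cch2 = (cchCount2 < 0) ? LengthOf(pString2) : static_cast<std::size_t>(cchCount2);
    std::size_t cchCommon = (cch1 < cch2) ? cch1 : cch2;

    for (std::size_t i = 0; i < cchCommon; ++i) {
        char16_t c1 = bIgnoreCase ? AsciiUpper(pString1[i]) : pString1[i];
        char16_t c2 = bIgnoreCase ? AsciiUpper(pString2[i]) : pString2[i];
        if (c1 != c2) return (c1 < c2) ? kLessThan : kGreaterThan;
    }
    if (cch1 == cch2) return kEqual;
    return (cch1 < cch2) ? kLessThan : kGreaterThan;  // the shorter string sorts first
}

void OrdinalDemo() {
    assert(CompareOrdinalSketch(u"HKEY", -1, u"hkey", -1, true) == kEqual);
    assert(CompareOrdinalSketch(u"HKEY", -1, u"hkey", -1, false) == kLessThan);
    assert(CompareOrdinalSketch(u"abc", 2, u"abd", 2, false) == kEqual);
    assert(CompareOrdinalSketch(u"ab", -1, u"abc", -1, false) == kLessThan);
}
```

Because the loop only compares numeric code-unit values, the result does not depend on the user's language settings, which is exactly what programmatic strings need.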

The return values of the CompareString and CompareStringOrdinal functions are unlike the return values you get back from the C run-time library's *cmp string comparison functions. CompareString(Ordinal) returns 0 to indicate failure, CSTR_LESS_THAN (defined as 1) to indicate that pString1 is less than pString2, CSTR_EQUAL (defined as 2) to indicate that pString1 is equal to pString2, and CSTR_GREATER_THAN (defined as 3) to indicate that pString1 is greater than pString2. To make things slightly more convenient, if the functions succeed, you can subtract 2 from the return value to make the result consistent with the result of the C run-time library functions (-1, 0, and +1).
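The "subtract 2" conversion is easy to get wrong if the failure value 0 is not checked first, since 0 - 2 would look like a valid "less than" result. A minimal sketch, with ToCrtResult and kCompareFailed as hypothetical names introduced here:

```cpp
#include <cassert>

// CompareString(Ordinal) returns 0 on failure; the name is local to this sketch.
constexpr int kCompareFailed = 0;

// Maps a successful result (1, 2, 3) onto the CRT convention (-1, 0, +1).
// Returns false and leaves *pnCrtStyle untouched when the comparison failed.
bool ToCrtResult(int nCompareResult, int* pnCrtStyle) {
    if (nCompareResult == kCompareFailed || pnCrtStyle == nullptr) return false;
    *pnCrtStyle = nCompareResult - 2;
    return true;
}

void ConversionDemo() {
    int n = 99;
    assert(ToCrtResult(1, &n) && n == -1);  // CSTR_LESS_THAN
    assert(ToCrtResult(2, &n) && n == 0);   // CSTR_EQUAL
    assert(ToCrtResult(3, &n) && n == 1);   // CSTR_GREATER_THAN
    assert(!ToCrtResult(0, &n) && n == 1);  // failure leaves n unchanged
}
```

Keeping the failure check separate from the converted value avoids treating an error as an ordering.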