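Source: http://www.jeffhung.net/blog/articles/jeffhung/1064/
This article is adapted from JeffHung.Blog. Because technical terminology differs between Taiwan and mainland China, I have changed some of the original terms, and I have also revised passages that I found unclear or imprecise, based on my own understanding. Thanks to jeffhung for exploring this problem.
While writing a program recently, I found that snprintf() behaves differently under different compilers. (I ran into the same problem when porting a program to a terminal device: the platform emulator and the device used different compilers, and the difference caused a serious bug.)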
- int sprintf(
- char *buffer
- , const char *format
- [, argument] ... );
- int snprintf(
- char *buffer
- , size_t count
- , const char *format
- [, argument] ... );
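Like printf()/fprintf()/sprintf(), snprintf() takes a format-control string plus a variable number of additional arguments, and substitutes those arguments, in order and in the specified format, for the fields in the format-control string that begin with % (format specifiers, e.g. %c, %d). printf() and fprintf() send the result to stdout or to a given FILE stream, while sprintf() stores it in its first argument, a C-style string buffer. However, sprintf() has no way of knowing the size of the destination buffer, so if the result is longer than the buffer, a buffer overflow occurs. snprintf() therefore takes an additional second parameter, the size of the buffer, to avoid this problem.
What does snprintf() write when the buffer is too small?
In normal use snprintf() is convenient, but what happens when the buffer is too small to hold the whole formatted string? The following program fills a buffer buf entirely with 'x', sets the last character to the null character, calls snprintf(), and then prints the return value and the buffer contents: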
- #include <stdio.h>
- #include <string.h>
- #ifdef _MSC_VER
- # define snprintf _snprintf
- #endif
- int main()
- {
- char buf[16];
- int ret;
- memset(buf, 'x', sizeof(buf)); // fill with 'x'
- buf[(sizeof(buf) / sizeof(buf[0])) - 1] = 0; // make last char null
- ret = snprintf(buf, 4, "%s", "0123456789");
- printf("ret: %d\n", ret);
- printf("buf: %s\n", buf);
- return 0;
- }
With GCC, the output is:
- //ret: 10
- //buf: 012
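With VC6, however, the result is surprisingly different: as discussed below, the return value is negative, and buf holds "0123" with no null character appended, followed by the untouched 'x' fill.
snprintf() is not a standard function in C89, but it was at least adopted into the standard in C99.
What each source says
Let us first look at what each source says:
The FreeBSD man page describes snprintf() as follows: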
PRINTF(3) FreeBSD Library Functions Manual PRINTF(3)
NAME
printf, fprintf, sprintf, snprintf, asprintf, vprintf, vfprintf,
vsprintf, vsnprintf, vasprintf -- formatted output conversion
LIBRARY
Standard C Library (libc, -lc)
SYNOPSIS
#include <stdio.h>
...
int
snprintf(char * restrict str, size_t size, const char * restrict format,
...);
...
int
vsnprintf(char * restrict str, size_t size, const char * restrict format,
va_list ap);
DESCRIPTION
...
These functions return the number of characters printed (not including
the trailing `\0' used to end output to strings) or a negative value if
an output error occurs, except for snprintf() and vsnprintf(), which
return the number of characters that would have been printed if the size
were unlimited (again, not including the final `\0').
...
The snprintf() and vsnprintf() functions will write at most size-1 of the
characters printed into the output string (the size'th character then
gets the terminating `\0'); if the return value is greater than or equal
to the size argument, the string was too short and some of the printed
characters were discarded. The output is always null-terminated.
...
The MSDN documentation for Microsoft Visual C++ says (Visual C++ 6.0):
_snprintf,
...
Write formatted data to a string.
int _snprintf( char *buffer, size_t count, const char *format [, argument] ... );
...
Return Value
_snprintf returns the number of bytes stored in buffer, not counting the terminating null character. If the number of bytes required to store the data exceeds count, then count bytes of data are stored in buffer and a negative value is returned.
...
Parameters
buffer
Storage location for output
count
Maximum number of characters to store
format
Format-control string
argument
Optional arguments
Remarks
The _snprintf function formats and stores count or fewer characters and values (including a terminating null character that is always appended unless count is zero or the formatted string length is greater than or equal to count characters) in buffer. Each argument (if any) is converted and output according to the corresponding format specification in format. The format consists of ordinary characters and has the same form and function as the format argument for printf. If copying occurs between strings that overlap, the behavior is undefined.
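The MSDN documentation for Visual Studio 2005 explains this more clearly (len is the length of the formatted data string, not including the terminating null):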
If len < count, then len characters are stored in buffer, a null-terminator is appended, and len is returned.
If len = count, then len characters are stored in buffer, no null-terminator is appended, and len is returned.
If len > count, then count characters are stored in buffer, no null-terminator is appended, and a negative value is returned.
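The three documented cases can be mimicked on top of a C99 vsnprintf(), which makes them easy to experiment with on a non-Microsoft compiler. The sketch below is a model written for illustration only: legacy_snprintf_model is a hypothetical name, it is not Microsoft's implementation, and it assumes -1 as the negative value, which is what the VC6 runs later in this article print.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of the documented _snprintf rules, built on a C99
 * vsnprintf(). Assumes buffer is non-NULL and holds at least count bytes. */
int legacy_snprintf_model(char *buffer, size_t count, const char *format, ...)
{
    va_list ap, ap2;
    int len;
    char *tmp;

    va_start(ap, format);
    va_copy(ap2, ap);                       /* second copy for the real pass */
    len = vsnprintf(NULL, 0, format, ap);   /* full formatted length */
    va_end(ap);
    if (len < 0) { va_end(ap2); return -1; }

    tmp = malloc((size_t)len + 1);
    if (tmp == NULL) { va_end(ap2); return -1; }
    vsnprintf(tmp, (size_t)len + 1, format, ap2);
    va_end(ap2);

    if ((size_t)len < count) {
        memcpy(buffer, tmp, (size_t)len + 1);   /* len chars plus '\0' */
    } else if ((size_t)len == count) {
        memcpy(buffer, tmp, count);             /* exact fit, no '\0' */
    } else {
        memcpy(buffer, tmp, count);             /* truncated, no '\0' */
        len = -1;                               /* assumed negative value */
    }
    free(tmp);
    return len;
}
```

Calling it with count 4 and the argument "0123456789" on a buffer pre-filled with 'x' leaves "0123" followed by 'x' and returns a negative value, matching the len > count case.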
And C99 says:
7.19.6.5 The snprintf function
1 Synopsis
#include <stdio.h>
int snprintf(char * restrict s, size_t n,
const char * restrict format, ...);
Description
2 The snprintf function is equivalent to fprintf, except that the output is
written into an array (specified by argument s) rather than to a stream.
If n is zero, nothing is written, and s may be a null pointer. Otherwise,
output characters beyond the n-1st are discarded rather than being written
to the array, and a null character is written at the end of the characters
actually written into the array. If copying takes place between objects
that overlap, the behavior is undefined.
Returns
3 The snprintf function returns the number of characters that would have
been written had n been sufficiently large, not counting the terminating
null character, or a negative value if an encoding error occurred. Thus,
the null-terminated output has been completely written if and only if the
returned value is nonnegative and less than n.
Behavioral difference 1: the return value of snprintf()
First, regarding the return value, C99's wording is somewhat convoluted:
The snprintf function returns the number of characters that would have been written had n been sufficiently large, not counting the terminating null character, or a negative value if an encoding error occurred.
To make sure I was not misreading the English, I asked lukhnos, who confirmed that C99 means the following: no matter how large the n passed to snprintf() is, snprintf() returns the length that would be written to the destination buffer s if n were large enough. In other words, the return value is the length of the formatted data string determined by the format-control string and the optional arguments, not counting the '\0' that terminates a C-style string.
FreeBSD's behavior is consistent with C99: "... except for snprintf() and vsnprintf(), which return the number of characters that would have been printed if the size were unlimited ...".
Microsoft Visual C++, however, differs from C99: "If the number of bytes required to store the data exceeds count, then count bytes of data are stored in buffer and a negative value is returned." That is, whenever the buffer is too small, a negative value is returned.
Behavioral difference 2: the data snprintf() actually writes
Besides the return value, Microsoft Visual C++ also differs in what is actually written. In the opening example, GCC's snprintf() stores "012\0" in buf, 4 characters including the null character, whereas Microsoft Visual C++ stores "0123" in buf with no null character appended. Had the example not taken special precautions, printing buf would go wrong.
C99 says: "output characters beyond the n-1st are discarded rather than being written to the array, and a null character is written at the end of the characters actually written into the array." In other words, at most n characters are written, including the appended null character.
The FreeBSD man page says the same: "The snprintf() and vsnprintf() functions will write at most size-1 of the characters printed into the output string (the size'th character then gets the terminating `\0');..." So in the example program, GCC writes 4 - 1 characters, "012", followed by a null character, giving "012\0". This behavior conforms to C99.
MSDN's description, on the other hand, does not match C99: "If the number of bytes required to store the data exceeds count, then count bytes of data are stored in buffer..." Judging from the example's output, the 4 characters "0123" are stored in buf, but since no null character is appended, the rest of buf is still 'x', ending with the final null character that the example set to keep the printing of buf from running past the end.
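With a C99-conforming snprintf(), the rule quoted from the standard ("completely written if and only if the returned value is nonnegative and less than n") translates directly into a check. The following is a minimal sketch; greet_into is a hypothetical helper name, and the code assumes C99 semantics, so it does not apply to the legacy _snprintf.

```c
#include <stdio.h>

/* Hypothetical helper, assuming a C99 snprintf().
 * Returns 1 if the greeting was written completely,
 *         0 if it was truncated (dst is still null-terminated),
 *        -1 if snprintf() reported an error. */
int greet_into(char *dst, size_t n, const char *name)
{
    int ret = snprintf(dst, n, "Hello, %s!\n", name);

    if (ret < 0)
        return -1;                     /* encoding or output error */
    return ((size_t)ret < n) ? 1 : 0;  /* ret >= n means truncation */
}
```

With a 16-byte buffer, "sign" fits (13 characters plus the null character), while "jeffhung" needs 17 characters and is cut down to the first 15.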
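Why returning the output length matters
In fact, C99 requires snprintf() to return the length of the formatted data string (as determined by the format-control string and the optional arguments) whether or not the buffer is large enough. This design is very useful, because most of the time we cannot know in advance whether the buffer is big enough.
Therefore, with C99 or a C library that conforms to it (that is, a C implementation that returns the length of the formatted data string), programs are usually written like this: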
- #include <stdio.h>
- #include <stdlib.h>
- #ifdef _MSC_VER
- # include <io.h>
- # define STDOUT_FILENO 1
- # define write _write
- # define snprintf _snprintf
- #else
- # include <sys/types.h>
- # include <sys/uio.h>
- # include <unistd.h>
- #endif
- /** Write a "Hello, <name>!\n" message to file descriptor STDOUT_FILENO. */
- void write_hello(const char* name)
- {
- char* pbuf = 0;
- int size;
- // Get required size, and allocate enough memory
- size = snprintf(0, 0, "Hello, %s!\n", name);
- pbuf = (char*)malloc(size + 1);
- if (pbuf == NULL) return; // allocation failed, nothing to write
- // Do the formatting
- snprintf(pbuf, size + 1, "Hello, %s!\n", name);
- // Write formatted string
- write(STDOUT_FILENO, pbuf, size);
- // Free allocated memory
- free(pbuf);
- }
- int main()
- {
- write_hello("sign"); // 4 chars: total 13 chars when write
- write_hello("jeffhung"); // 8 chars: total 17 chars when write
- write_hello("Honorificabilitudinitatibus"); // 27 chars: total 36 chars when write
- return 0;
- }
- // --[OUTPUT(GCC)]------------------------------------------------------------
- // Hello, sign!
- // Hello, jeffhung!
- // Hello, Honorificabilitudinitatibus!
- // --[OUTPUT(VC6)]------------------------------------------------------------
- // (crashed)
- // --[OUTPUT(VS2005)]------------------------------------------------------------
- // Hello, sign!
- // Hello, jeffhung!
- // Hello, Honorificabilitudinitatibus!
Note on write():
The first argument of write() is a file or device handle (file descriptor), where
0: standard input, stdin
1: standard output, stdout
2: standard error, stderr
See the VS2005 <stdio.h>:
- #ifndef _STDSTREAM_DEFINED
- #define stdin (&__iob_func()[0])
- #define stdout (&__iob_func()[1])
- #define stderr (&__iob_func()[2])
- #define _STDSTREAM_DEFINED
- #endif
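As shown above, VS2005 still produces the correct output. I verified that the C library (C Run-Time Libraries) of VS2005 handles _snprintf differently from that of VC6.
When the first argument of _snprintf is 0, running the program under the VC6 debugger in Debug mode produces a “Debug Assertion Failed… ASSERT(string != NULL)…” error, while in Release mode the program simply crashes. In other words, the implementers of the MS VC6 C library assumed that 0 would never be passed in. In VS2005 the problem no longer appears: presumably the MS CRT (C Run-Time Libraries) developers recognized the issue and made the function return the length of the formatted data string when the destination buffer is NULL (oddly, the MSDN documentation does not mention this).
But when a too-small buffer does not get us the size that is actually required, we can only resort to trial and error to find the size we really need: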
- #include <stdio.h>
- #include <stdlib.h>
- #ifdef _MSC_VER
- # include <io.h>
- # define STDOUT_FILENO 1
- # define write _write
- # define snprintf _snprintf
- #else
- # include <sys/types.h>
- # include <sys/uio.h>
- # include <unistd.h>
- #endif
- /** Write a "Hello, <name>!\n" message to file descriptor STDOUT_FILENO. */
- void write_hello(const char* name)
- {
- char* pbuf = 0;
- int size = 0;
- int len;
- do {
- size += 16;
- printf("[DEBUG] size == %d\n", size);
- // Allocate a buffer, don't know whether it is big enough or not
- pbuf = (char*)realloc(pbuf, size); // will do malloc if pbuf is NULL
- // ------------------------------------------------------------------
- // Do the formatting
- // ------------------------------------------------------------------
- // MSDN:
- // Let len be the length of the formatted data string (not including
- // the terminating null).
- // - If len < count, then len characters are stored in buffer,
- // a null-terminator is appended,
- // and len is returned.
- // - If len = count, then len characters are stored in buffer,
- // no null-terminator is appended,
- // and len is returned.
- // - If len > count, then count characters are stored in buffer,
- // no null-terminator is appended,
- // and a negative value is returned.
- // ------------------------------------------------------------------
- // Since snprintf in VC may not append a null-terminator, we pass
- // (size - 1) as the 2nd parameter and reserve the last buffer
- // element for appending the null-terminator by our self.
- // ------------------------------------------------------------------
- len = snprintf(pbuf, (size - 1), "Hello, %s!\n", name);
- printf("[DEBUG] len == %d\n", len);
- } while (len < 0);
- pbuf[len] = '\0';
- // Write formatted string
- write(STDOUT_FILENO, pbuf, len);
- // Free allocated memory
- free(pbuf);
- }
- int main()
- {
- write_hello("sign"); // 4 chars: total 13 chars when write
- write_hello("jeffhung"); // 8 chars: total 17 chars when write
- write_hello("Honorificabilitudinitatibus"); // 27 chars: total 36 chars when write
- return 0;
- }
- // --[OUTPUT(GCC)]------------------------------------------------------------
- // [DEBUG] size == 16
- // [DEBUG] len == 13
- // Hello, sign!
- // [DEBUG] size == 16
- // [DEBUG] len == 17
- // Hello, jeffhun[DEBUG] size == 16
- // [DEBUG] len == 36
- // Hello, Honorif
- // --[OUTPUT(VC6)]------------------------------------------------------------
- // [DEBUG] size == 16
- // [DEBUG] len == 13
- // Hello, sign!
- // [DEBUG] size == 16
- // [DEBUG] len == -1
- // [DEBUG] size == 32
- // [DEBUG] len == 17
- // Hello, jeffhung!
- // [DEBUG] size == 16
- // [DEBUG] len == -1
- // [DEBUG] size == 32
- // [DEBUG] len == -1
- // [DEBUG] size == 48
- // [DEBUG] len == 36
- // Hello, Honorificabilitudinitatibus!
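When we say "hello" to "jeffhung", a buf of length 16 is not large enough, so the loop runs one extra time. If the step by which buf grows is far smaller than the length of the given name string, the loop runs many times, repeatedly calling realloc() and snprintf() and wasting time. Note also that this loop only suits the VC behavior: a C99 snprintf() never returns a negative value on truncation, so under GCC the loop exits after a single pass, and the code then indexes and writes pbuf using a len larger than the buffer, which is why the GCC output above is cut short.
This is why C99's requirement that snprintf() return the expected output length (strlen(formatted_data_string)), whether or not n is large enough, is such an efficient design.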
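The cost of the trial-and-error loop can be worked out with a little arithmetic. In that loop the k-th pass passes count = k * step - 1 to _snprintf, and the documented rules report success once len <= count, so the number of passes is the smallest k with k * step >= len + 1. A minimal sketch follows; growth_calls is a hypothetical name introduced only for this calculation.

```c
/* Hypothetical helper: number of _snprintf calls made by a loop that
 * grows the buffer by `step` bytes per pass and passes (size - 1) as
 * count. Equals ceil((len + 1) / step) for len >= 0 and step > 0. */
int growth_calls(int len, int step)
{
    return (len + step) / step;
}
```

For step 16 this gives 1, 2 and 3 calls for lengths 13, 17 and 36, which matches the VC6 trace above. With a C99 snprintf() the count is at most 2 regardless of length: one call to measure and one to format.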
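The solution
A local buffer is more efficient than a dynamic one; we also want to cover both the standard and the nonstandard snprintf() behavior; and finally we want to call snprintf() as few times as possible. The program above can therefore be modified as follows: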
- #include <stdio.h>
- #include <stdlib.h>
- #ifdef _MSC_VER
- # include <io.h>
- # define STDOUT_FILENO 1
- # define write _write
- # define snprintf _snprintf
- #else
- # include <sys/types.h>
- # include <sys/uio.h>
- # include <unistd.h>
- #endif
- /** Write a "Hello, <name>!\n" message to file descriptor STDOUT_FILENO. */
- void write_hello(const char* name)
- {
- char buf[16];
- char* pbuf = buf;
- int pbuf_size = sizeof(buf);
- int len = 0;
- int again = 0;
- printf("[DEBUG] name == \"%s\"\n", name);
- do {
- if (again) {
- #ifdef _MSC_VER
- pbuf_size += sizeof(buf);
- #else
- pbuf_size = len + 1;
- #endif
- pbuf = (pbuf == buf) ? (char *)malloc(pbuf_size)
- : (char *)realloc(pbuf, pbuf_size);
- }
- printf("[DEBUG] pbuf_size == %d\n", pbuf_size);
- len = snprintf(pbuf, pbuf_size, "Hello, %s!\n", name);
- printf("[DEBUG] len == %d\n", len);
- } while (again = ((len < 0) || (pbuf_size <= len)));
- #ifdef _MSC_VER
- pbuf[len] = '\0';
- #endif
- printf("[DEBUG] {%d} %s", len, pbuf); // to verify the null-terminator
- write(STDOUT_FILENO, pbuf, len);
- if (pbuf != buf) {
- printf("[DEBUG] free pbuf\n");
- free(pbuf);
- }
- }
- int main()
- {
- write_hello("sign"); // 4 chars: total 13 chars when write
- write_hello("jeffhung"); // 8 chars: total 17 chars when write
- write_hello("Honorificabilitudinitatibus"); // 27 chars: total 36 chars when write
- return 0;
- }
- // --[OUTPUT(GCC)]------------------------------------------------------------
- // [DEBUG] name == "sign"
- // [DEBUG] pbuf_size == 16
- // [DEBUG] len == 13
- // [DEBUG] {13} Hello, sign!
- // Hello, sign!
- // [DEBUG] name == "jeffhung"
- // [DEBUG] pbuf_size == 16
- // [DEBUG] len == 17
- // [DEBUG] pbuf_size == 18
- // [DEBUG] len == 17
- // [DEBUG] {17} Hello, jeffhung!
- // Hello, jeffhung!
- // [DEBUG] free pbuf
- // [DEBUG] name == "Honorificabilitudinitatibus"
- // [DEBUG] pbuf_size == 16
- // [DEBUG] len == 36
- // [DEBUG] pbuf_size == 37
- // [DEBUG] len == 36
- // [DEBUG] {36} Hello, Honorificabilitudinitatibus!
- // Hello, Honorificabilitudinitatibus!
- // [DEBUG] free pbuf
- // --[OUTPUT(VC6)]------------------------------------------------------------
- // [DEBUG] name == "sign"
- // [DEBUG] pbuf_size == 16
- // [DEBUG] len == 13
- // [DEBUG] {13} Hello, sign!
- // Hello, sign!
- // [DEBUG] name == "jeffhung"
- // [DEBUG] pbuf_size == 16
- // [DEBUG] len == -1
- // [DEBUG] pbuf_size == 32
- // [DEBUG] len == 17
- // [DEBUG] {17} Hello, jeffhung!
- // Hello, jeffhung!
- // [DEBUG] free pbuf
- // [DEBUG] name == "Honorificabilitudinitatibus"
- // [DEBUG] pbuf_size == 16
- // [DEBUG] len == -1
- // [DEBUG] pbuf_size == 32
- // [DEBUG] len == -1
- // [DEBUG] pbuf_size == 48
- // [DEBUG] len == 36
- // [DEBUG] {36} Hello, Honorificabilitudinitatibus!
- // Hello, Honorificabilitudinitatibus!
- // [DEBUG] free pbuf
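Written this way, preprocessing directives are needed in only two places to resolve the differences in snprintf() behavior.
However, having to recall how write_hello() was written and copy it every time we use snprintf() would be silly. We can instead rework write_hello() and, with the help of vsnprintf(), write strprintf(), which works like snprintf() but writes to a C++ std::string instead of a buffer: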
- #include <stdio.h>
- #include <stdlib.h>
- #include <stdarg.h>
- #include <string>
- #ifdef _MSC_VER
- # include <io.h>
- # define STDOUT_FILENO 1
- # define write _write
- # define snprintf _snprintf
- # define vsnprintf _vsnprintf
- #else
- # include <sys/types.h>
- # include <sys/uio.h>
- # include <unistd.h>
- #endif
- std::string strprintf(const char* fmt, ...)
- {
- char buf[16];
- char* pbuf = buf;
- int pbuf_size = sizeof(buf);
- int len = 0;
- int again = 0;
- va_list ap;
- do {
- if (again) {
- #ifdef _MSC_VER
- pbuf_size += sizeof(buf);
- #else
- pbuf_size = len + 1;
- #endif
- pbuf = (char*)((pbuf == buf) ? (char *)malloc(pbuf_size)
- : (char *)realloc(pbuf, pbuf_size));
- }
- printf("[DEBUG] pbuf_size == %d\n", pbuf_size);
- // Restart the argument list on every pass: a va_list must not be
- // reused after vsnprintf() has consumed it
- va_start(ap, fmt);
- len = vsnprintf(pbuf, pbuf_size, fmt, ap);
- va_end(ap);
- printf("[DEBUG] len == %d\n", len);
- } while (again = ((len < 0) || (pbuf_size <= len)));
- #ifdef _MSC_VER
- pbuf[len] = '\0';
- #endif
- printf("[DEBUG] {%d} %s", len, pbuf); // to verify the null-terminator
- std::string str(pbuf);
- if (pbuf != buf) {
- printf("[DEBUG] free pbuf\n");
- free(pbuf);
- }
- return str;
- }
- void write_hello(const char* name)
- {
- // 9 chars: counting ending \n,
- // but not counting %s replacement and null-terminator
- std::string hello = strprintf("Hello, %s!\n", name);
- write(STDOUT_FILENO, hello.c_str(), hello.length());
- }
- int main()
- {
- write_hello("sign"); // 4 chars: total 13 chars when write
- write_hello("jeffhung"); // 8 chars: total 17 chars when write
- write_hello("Honorificabilitudinitatibus"); // 27 chars: total 36 chars when write
- return 0;
- }
- // --[OUTPUT(C99)]------------------------------------------------------------
- // [DEBUG] pbuf_size == 16
- // [DEBUG] len == 13
- // [DEBUG] {13} Hello, sign!
- // Hello, sign!
- // [DEBUG] pbuf_size == 16
- // [DEBUG] len == 17
- // [DEBUG] pbuf_size == 18
- // [DEBUG] len == 17
- // [DEBUG] {17} Hello, jeffhung!
- // [DEBUG] free pbuf
- // Hello, jeffhung!
- // [DEBUG] pbuf_size == 16
- // [DEBUG] len == 36
- // [DEBUG] pbuf_size == 37
- // [DEBUG] len == 36
- // [DEBUG] {36} Hello, Honorificabilitudinitatibus!
- // [DEBUG] free pbuf
- // Hello, Honorificabilitudinitatibus!
- // --[OUTPUT(VC)]-------------------------------------------------------------
- // [DEBUG] pbuf_size == 16
- // [DEBUG] len == 13
- // [DEBUG] {13} Hello, sign!
- // Hello, sign!
- // [DEBUG] pbuf_size == 16
- // [DEBUG] len == -1
- // [DEBUG] pbuf_size == 32
- // [DEBUG] len == 17
- // [DEBUG] {17} Hello, jeffhung!
- // [DEBUG] free pbuf
- // Hello, jeffhung!
- // [DEBUG] pbuf_size == 16
- // [DEBUG] len == -1
- // [DEBUG] pbuf_size == 32
- // [DEBUG] len == -1
- // [DEBUG] pbuf_size == 48
- // [DEBUG] len == 36
- // [DEBUG] {36} Hello, Honorificabilitudinitatibus!
- // [DEBUG] free pbuf
- // Hello, Honorificabilitudinitatibus!
This way, as long as we are using C++, we can conveniently use strprintf() to draw on the power of the printf family to produce formatted strings.
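For callers that stay in plain C, the same local-buffer-first idea can be sketched without std::string. The code below is an illustration under stated assumptions, not part of the original article: strprintf_c is a hypothetical name, it assumes a C99-conforming vsnprintf() (so it does not cover the legacy _vsnprintf behavior), and it returns a malloc'd string that the caller must free().

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical C counterpart of strprintf(); assumes a C99 vsnprintf().
 * Returns a malloc'd, null-terminated string, or NULL on failure. */
char *strprintf_c(const char *fmt, ...)
{
    char local[16];   /* small stack buffer tried first */
    char *out;
    int len;
    va_list ap;

    va_start(ap, fmt);
    len = vsnprintf(local, sizeof local, fmt, ap);
    va_end(ap);
    if (len < 0)
        return NULL;                       /* formatting error */

    out = malloc((size_t)len + 1);         /* exact size, known from len */
    if (out == NULL)
        return NULL;

    if ((size_t)len < sizeof local) {
        memcpy(out, local, (size_t)len + 1);   /* first pass was complete */
    } else {
        va_start(ap, fmt);                 /* restart the argument list */
        vsnprintf(out, (size_t)len + 1, fmt, ap);
        va_end(ap);
    }
    return out;
}
```

A call such as strprintf_c("Hello, %s!\n", name) needs at most two formatting passes, because the first pass already reports the exact length.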