How is strtok() implemented?

Strictly speaking, this question applies to ANSI C in general, not to any particular compiler.

As everyone knows, strtok() splits a string at the given delimiters and hands the pieces back one at a time. Typical usage:
char input[] = "abc,defgh,ij,klm";
char *p;
p = strtok(input, ",");
if (p) printf("%s\n", p); // prints "abc"

p = strtok(NULL, ",");
if (p) printf("%s\n", p); // prints "defgh"

p = strtok(NULL, ",");
if (p) printf("%s\n", p); // prints "ij"

p = strtok(NULL, ",");
if (p) printf("%s\n", p); // prints "klm"

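The first call to strtok() is easy to understand: the function overwrites the comma after abc with a NUL character ('\0') and returns a pointer p equal to &input[0], so printing it shows "abc".
The later calls look strange: the string argument is a null pointer!
Why is it used this way? How does strtok remember that the string from the previous call was input?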

Finding the source code:
>man 3 strtok
... ...
LIBRARY
Standard C Library (libc, -lc)
... ...

>strings /usr/lib/libc.a | grep strtok
strtok_r
__strtok_r
... ...
$FreeBSD: src/lib/libc/string/strtok.c,v 1.9 2002/09/07 02:53:19 tjr Exp $
... ...

>cat /usr/src/lib/libc/string/strtok.c
/*-
* Copyright (c) 1998 Softweyr LLC. All rights reserved.
*
* strtok_r, from Berkeley strtok
* Oct 13, 1998 by Wes Peters <wes@softweyr.com>
*
* Copyright (c) 1988, 1993
* The Regents of the University of California. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notices, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notices, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* 3. All advertising materials mentioning features or use of this software
* must display the following acknowledgement:
* This product includes software developed by Softweyr LLC, the
* University of California, Berkeley, and its contributors.
* 4. Neither the name of the University nor the names of its contributors
* may be used to endorse or promote products derived from this software
* without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY SOFTWEYR LLC, THE REGENTS AND CONTRIBUTORS
* ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
* PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SOFTWEYR LLC, THE
* REGENTS, OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
* TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
* PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
* LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
* NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
* SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/

#if defined(LIBC_SCCS) && !defined(lint)
static char sccsid[] = "@(#)strtok.c 8.1 (Berkeley) 6/4/93";
#endif /* LIBC_SCCS and not lint */
#include <sys/cdefs.h>
__FBSDID("$FreeBSD: src/lib/libc/string/strtok.c,v 1.9 2002/09/07 02:53:19 tjr Exp $");

#include <stddef.h>
#ifdef DEBUG_STRTOK
#include <stdio.h>
#endif
#include <string.h>

char *__strtok_r(char *, const char *, char **);

__weak_reference(__strtok_r, strtok_r);

char *
__strtok_r(char *s, const char *delim, char **last)
{
    char *spanp, *tok;
    int c, sc;

    if (s == NULL && (s = *last) == NULL)
        return (NULL);

    /*
     * Skip (span) leading delimiters (s += strspn(s, delim), sort of).
     */
cont:
    c = *s++;
    for (spanp = (char *)delim; (sc = *spanp++) != 0;) {
        if (c == sc)
            goto cont;
    }

    if (c == 0) { /* no non-delimiter characters */
        *last = NULL;
        return (NULL);
    }
    tok = s - 1;

    /*
     * Scan token (scan for delimiters: s += strcspn(s, delim), sort of).
     * Note that delim must have one NUL; we stop if we see that, too.
     */
    for (;;) {
        c = *s++;
        spanp = (char *)delim;
        do {
            if ((sc = *spanp++) == c) {
                if (c == 0)
                    s = NULL;
                else
                    s[-1] = '\0';
                *last = s;
                return (tok);
            }
        } while (sc != 0);
    }
    /* NOTREACHED */
}

char *
strtok(char *s, const char *delim)
{
    static char *last; /* a static variable: keeps the scan position between calls */

    return (__strtok_r(s, delim, &last));
}

#ifdef DEBUG_STRTOK
/*
* Test the tokenizer.
*/
int
main(void)
{
    char blah[80], test[80];
    char *brkb, *brkt, *phrase, *sep, *word;

    sep = "\\/:;=-";
    phrase = "foo";

    printf("String tokenizer test:\n");
    strcpy(test, "This;is.a:test:of=the/string\\tokenizer-function.");
    for (word = strtok(test, sep); word; word = strtok(NULL, sep))
        printf("Next word is \"%s\".\n", word);
    strcpy(test, "This;is.a:test:of=the/string\\tokenizer-function.");

    for (word = strtok_r(test, sep, &brkt); word;
        word = strtok_r(NULL, sep, &brkt)) {
        strcpy(blah, "blah:blat:blab:blag");

        for (phrase = strtok_r(blah, sep, &brkb); phrase;
            phrase = strtok_r(NULL, sep, &brkb))
            printf("So far we're at %s:%s\n", word, phrase);
    }

    return (0);
}

#endif /* DEBUG_STRTOK */


>cat /usr/include/string.h | grep strtok
char *strtok(char * __restrict, const char * __restrict);
char *strtok_r(char *, const char *, char **);

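Note:
restrict tells the compiler that, inside this function, the object a pointer refers to is accessed only through that pointer (or pointers derived from it), so the two arguments cannot alias each other. This leaves the compiler free to optimize, which lets C approach Fortran's numeric performance.
restrict is supported only from C99 onward.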

GCC supports C99, but C99 is not its default standard; to use C99 syntax, add -std=c99 to the compiler flags. (I did that and still got error: invalid use of `restrict' with gcc version 3.4.2 [FreeBSD] 20040728.)

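In my opinion, it is better to use strtok_r than strtok: as the source above shows, strtok keeps its position in a single static variable, so it is not reentrant, and two interleaved tokenizations overwrite each other's state.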
