Why strlen Is So Efficient

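A quick, preliminary performance comparison on Solaris produced the following timings (using strlen as the example):

strlen execution time:    32762 ms
my_strlen execution time: 491836 ms

strlen execution time:    35075 ms
my_strlen execution time: 770397 ms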
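The my_strlen that was timed is not shown here. As a point of reference, a hand-written version usually looks something like the sketch below. The name my_strlen_naive and its body are assumptions made for illustration, not the code that was actually measured.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical byte-at-a-time strlen (illustration only).
   Each character costs one load, one compare and one branch. */
size_t my_strlen_naive(const char *s)
{
    assert(s != NULL);          /* precondition: a valid C string */

    const char *p = s;
    while (*p != '\0')
        ++p;
    return (size_t)(p - s);
}
```

A loop of this shape does one test per byte, whereas the library routine examines a whole machine word per iteration, which is where most of the gap in the timings would be expected to come from.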

1 /* Return the length of the null-terminated string STR.  Scan for
2    the null terminator quickly by testing four bytes at a time.  */
3 size_t strlen (str)  const char *str;
4 {
5         const char *char_ptr;
6         const unsigned long int *longword_ptr;
7         unsigned long int longword, magic_bits, himagic, lomagic;
8
9         /* Handle the first few characters by reading one character at a time.
10            Do this until CHAR_PTR is aligned on a longword boundary.  */
11
12         for (char_ptr = str; ((unsigned long int) char_ptr
13              & (sizeof (longword) - 1)) != 0;
14              ++char_ptr)
15                 if (*char_ptr == '\0')
16                         return char_ptr - str;
17
18         /* All these elucidatory comments refer to 4-byte longwords,
19            but the theory applies equally well to 8-byte longwords.  */
20
21         longword_ptr = (unsigned long int *) char_ptr;
22
23         himagic = 0x80808080L;
24         lomagic = 0x01010101L;
25
26         if (sizeof (longword) > 8)
27                 abort ();
28
29         /* Instead of the traditional loop which tests each character,
30            we will test a longword at a time.  The tricky part is testing
31            if *any of the four* bytes in the longword in question are zero.  */
32
33         for (;;)
34         {
35                 longword = *longword_ptr++;
36
37                 if ( ((longword - lomagic) & himagic) != 0)
38                 {
39                         /* Which of the bytes was the zero?  If none of them were, it was
40                            a misfire; continue the search.  */
41
42                         const char *cp = (const char *) (longword_ptr - 1);
43
44                         if (cp[0] == 0)
45                                 return cp - str;
46                         if (cp[1] == 0)
47                                 return cp - str + 1;
48                         if (cp[2] == 0)
49                                 return cp - str + 2;
50                         if (cp[3] == 0)
51                                 return cp - str + 3;
52                         if (sizeof (longword) > 4)
53                         {
54                                 if (cp[4] == 0)
55                                         return cp - str + 4;
56                                 if (cp[5] == 0)
57                                         return cp - str + 5;
58                                 if (cp[6] == 0)
59                                         return cp - str + 6;
60                                 if (cp[7] == 0)
61                                         return cp - str + 7;
62                         }
63                 }
64         }
65 }

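Scanning several bytes at a time raises two problems:
1) The C standard library has to be highly portable and must run correctly on nearly every architecture. Reading four bytes at a time (as an unsigned long int) therefore brings memory alignment into play: the address of the first character of the string passed in is not necessarily 4-byte aligned.
2) How to test four bytes at once and find out whether any one of them is all zero bits. This is the tricky part.
Lines 12 to 21 of the listing deal with the first problem: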
for (char_ptr = str; ((unsigned long int) char_ptr
     & (sizeof (longword) - 1)) != 0;
     ++char_ptr)
        if (*char_ptr == '\0')
                return char_ptr - str;
/* All these elucidatory comments refer to 4-byte longwords,
   but the theory applies equally well to 8-byte longwords.  */
longword_ptr = (unsigned long int *) char_ptr;
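The alignment test in that loop is a mask of the low address bits: when the word size is a power of two, address & (size - 1) is zero exactly when the address is a multiple of the size. The sketch below isolates that calculation. The function name bytes_until_aligned is hypothetical, and it uses uintptr_t from <stdint.h> where the library code casts to unsigned long int.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bytes to step forward from p to reach the next address
   that is a multiple of align. align must be a power of two so that
   (align - 1) is a mask of the low address bits. */
size_t bytes_until_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    uintptr_t addr = (uintptr_t)p;
    uintptr_t misalignment = addr & (align - 1);

    return misalignment == 0 ? 0 : (size_t)(align - misalignment);
}
```

With 4-byte words the result is at most 3, so the byte-by-byte prologue of strlen runs at most three times before the word loop takes over (at most seven times with 8-byte words).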

himagic = 0x80808080L;
lomagic = 0x01010101L;

himagic   1000 0000 1000 0000 1000 0000 1000 0000
lomagic   0000 0001 0000 0001 0000 0001 0000 0001

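longword  1000 0001 1000 0001 1000 0001 1000 0001

Evaluated with the condition expression, this value also satisfies != 0, even though none of its bytes is zero. Is the author's logic flawed? On second thought, no: the test is a fast filter, not a proof. strlen itself accepts arbitrary byte values, and a byte of 0x81 or above can trigger the test without being zero. The code treats that as a "misfire" (see the comment on lines 39 and 40 of the listing): it rechecks the bytes one at a time and, if none is zero, carries on with the search. The fast path pays off in the common case, where every character is an ASCII code in the range [0, 127], so the top bit of every byte is 0 and longword looks like this: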
longword  0xxx xxxx 0xxx xxxx 0xxx xxxx 0xxx xxxx

longword 0000 0001 0000 0001 0000 0001 0000 0001

longword 0000 0000 0000 0001 0000 0001 0000 0001
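The bit patterns above can be fed through the test directly. The sketch below is only an illustration: the names may_contain_zero, has_zero_byte and check_examples are made up here, and the word is fixed at 4 bytes with uint32_t rather than unsigned long int as in the library code.

```c
#include <assert.h>
#include <stdint.h>

#define HIMAGIC UINT32_C(0x80808080)
#define LOMAGIC UINT32_C(0x01010101)

/* Nonzero when the fast test fires for the 4-byte word w. */
int may_contain_zero(uint32_t w)
{
    return ((uint32_t)(w - LOMAGIC) & HIMAGIC) != 0;
}

/* Slow reference check: nonzero when w really has a zero byte. */
int has_zero_byte(uint32_t w)
{
    for (int i = 0; i < 4; ++i)
        if (((w >> (8 * i)) & 0xFF) == 0)
            return 1;
    return 0;
}

void check_examples(void)
{
    /* 0x01010101 - 0x01010101 = 0: no top bit set, test stays quiet. */
    assert(!may_contain_zero(UINT32_C(0x01010101)));

    /* 0x00010101 - 0x01010101 wraps to 0xFF000000: the zero byte
       turns into 0xFF, its top bit is set, and the test fires. */
    assert(may_contain_zero(UINT32_C(0x00010101)));

    /* 0x81818181 - 0x01010101 = 0x80808080: the test fires although
       no byte is zero. This is the misfire case. */
    assert(may_contain_zero(UINT32_C(0x81818181)));
    assert(!has_zero_byte(UINT32_C(0x81818181)));
}
```

Subtracting 1 from a byte in [1, 127] never sets its top bit and never borrows from its neighbour, so for plain ASCII text the test fires only when a zero byte is really present. For 8-byte longwords both constants have to be widened to 0x8080808080808080 and 0x0101010101010101 before the loop, a step that is not part of the listing above.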
