Exploring <ctype.h>

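        Our product has an ID that the user can set. It used to be restricted to eight Arabic digits, and the value was validated when it was set. Because that value space turned out to be too small, the ID was recently redefined as a string of eight characters, each of which may be a digit or a letter, i.e. {0-9, A-Z, a-z}. The check itself is simple: calling isalnum() from ctype.h is enough. The declaration you usually find online looks like Listing 1:

Listing 1: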

int isalnum (int __c);
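
As a rough illustration of the requirement above, the check could be sketched as follows. The helper name id_is_valid and the constant ID_LEN are hypothetical (they are not from the product code), and the sketch assumes the default "C" locale, where isalnum() accepts only 0-9, A-Z and a-z.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

#define ID_LEN 8  /* assumed fixed ID length, taken from the requirement */

/* Hypothetical helper: true if id is exactly ID_LEN characters long
   and every character is a digit or a letter. */
bool id_is_valid(const char *id)
{
    if (id == NULL)
        return false;

    for (size_t i = 0; i < ID_LEN; i++) {
        /* Cast to unsigned char first: passing a negative char value to
           a <ctype.h> function is undefined behavior. A '\0' that shows
           up early fails isalnum(), so short strings are rejected
           without reading past the terminator. */
        if (!isalnum((unsigned char)id[i]))
            return false;
    }
    return id[ID_LEN] == '\0';  /* reject strings longer than ID_LEN */
}
```

With this sketch, id_is_valid("A1b2C3d4") is true, while "1234567" (too short), "123456789" (too long) and "1234-678" (contains '-') are all rejected.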

But wait. Open the ctype.h header, press Ctrl+F, and a second isalnum() turns up, this time as a macro:

Listing 2:

#define isalnum(__c)	(__ctype_lookup(__c)&(_U|_L|_N))

A closer look at ctype.h shows a whole series of related macro definitions:

Listing 3:

#define	_U	01
#define	_L	02
#define	_N	04
#define	_S	010
#define _P	020
#define _C	040
#define _X	0100
#define	_B	0200

#ifdef __HAVE_LOCALE_INFO__
const char *__locale_ctype_ptr (void);
#else
#define __locale_ctype_ptr()	_ctype_
#endif

# define __CTYPE_PTR	(__locale_ctype_ptr ())

#ifndef __cplusplus
/* These macros are intentionally written in a manner that will trigger
   a gcc -Wall warning if the user mistakenly passes a 'char' instead
   of an int containing an 'unsigned char'.  Note that the sizeof will
   always be 1, which is what we want for mapping EOF to __CTYPE_PTR[0];
   the use of a raw index inside the sizeof triggers the gcc warning if
   __c was of type char, and sizeof masks side effects of the extra __c.
   Meanwhile, the real index to __CTYPE_PTR+1 must be cast to int,
   since isalpha(0x100000001LL) must equal isalpha(1), rather than being
   an out-of-bounds reference on a 64-bit machine.  */
#define __ctype_lookup(__c) ((__CTYPE_PTR+sizeof(""[__c]))[(int)(__c)])

#define	isalpha(__c)	(__ctype_lookup(__c)&(_U|_L))
#define	isupper(__c)	((__ctype_lookup(__c)&(_U|_L))==_U)
#define	islower(__c)	((__ctype_lookup(__c)&(_U|_L))==_L)
#define	isdigit(__c)	(__ctype_lookup(__c)&_N)
#define	isxdigit(__c)	(__ctype_lookup(__c)&(_X|_N))
#define	isspace(__c)	(__ctype_lookup(__c)&_S)
#define ispunct(__c)	(__ctype_lookup(__c)&_P)
#define isalnum(__c)	(__ctype_lookup(__c)&(_U|_L|_N))
#define isprint(__c)	(__ctype_lookup(__c)&(_P|_U|_L|_N|_B))
#define	isgraph(__c)	(__ctype_lookup(__c)&(_P|_U|_L|_N))
#define iscntrl(__c)	(__ctype_lookup(__c)&_C)

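Following the macros all the way down, the trail ends at _ctype_, which the header never defines as a macro, so it is not expanded any further. So what exactly is _ctype_?

Searching the intranet turned up nothing. In the end, going through a proxy, I found a definition of _ctype_ in the source archive of a small operating system hosted on the website of the State University of New York at Buffalo. It is an array of char, defined as in Listing 4:

Listing 4: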


#include <ctype.h>

char _ctype_[] = {
        0,
        _C,     _C,     _C,     _C,     _C,     _C,     _C,     _C,
        _C,     _S,     _S,     _S,     _S,     _S,     _C,     _C,
        _C,     _C,     _C,     _C,     _C,     _C,     _C,     _C,
        _C,     _C,     _C,     _C,     _C,     _C,     _C,     _C,
        _S,     _P,     _P,     _P,     _P,     _P,     _P,     _P,
        _P,     _P,     _P,     _P,     _P,     _P,     _P,     _P,
#ifdef linux
        _D,     _D,     _D,     _D,     _D,     _D,     _D,     _D,
        _D,     _D,     _P,     _P,     _P,     _P,     _P,     _P,
#else
        _N,     _N,     _N,     _N,     _N,     _N,     _N,     _N,
        _N,     _N,     _P,     _P,     _P,     _P,     _P,     _P,
#endif
        _P,     _U|_X,  _U|_X,  _U|_X,  _U|_X,  _U|_X,  _U|_X,  _U,
        _U,     _U,     _U,     _U,     _U,     _U,     _U,     _U,
        _U,     _U,     _U,     _U,     _U,     _U,     _U,     _U,
        _U,     _U,     _U,     _P,     _P,     _P,     _P,     _P,
        _P,     _L|_X,  _L|_X,  _L|_X,  _L|_X,  _L|_X,  _L|_X,  _L,
        _L,     _L,     _L,     _L,     _L,     _L,     _L,     _L,
        _L,     _L,     _L,     _L,     _L,     _L,     _L,     _L,
        _L,     _L,     _L,     _P,     _P,     _P,     _P,     _C
};

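        Only at this point does the full mechanism behind macros such as isalnum(__c) become visible. The library defines a 129-byte array. Its first element is 0, and the remaining 128 elements hold the character class (digit, uppercase letter, lowercase letter, hexadecimal digit, ...) of each ASCII code from 0 to 127. The class flags are the ones defined in Listing 5.

Listing 5: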

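#define	_U	01    // uppercase letter
#define	_L	02    // lowercase letter
#define	_N	04    // decimal digit
#define	_S	010   // whitespace character
#define _P	020   // punctuation character
#define _C	040   // control character
#define _X	0100  // hexadecimal letter A-F or a-f (isxdigit tests _X|_N)
#define	_B	0200  // blank (the space character); tested by isprint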

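        This precomputed table is what makes character classification fast. isalpha(), isdigit(), isspace() and the other classification macros each reduce to one array lookup followed by one bitwise AND against the flags of interest.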
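
The lookup-and-mask idea can be sketched in a few lines of standalone C. Everything here is illustrative: cls_table, cls_init, cls_isalnum and the CLS_ flags are hypothetical names that only mirror _ctype_, isalnum and _U/_L/_N. The table is filled by a function instead of a static initializer to keep the sketch short, and EOF is assumed to be -1.

```c
/* Same bit values as _U, _L and _N in the header. */
enum { CLS_U = 01, CLS_L = 02, CLS_N = 04 };

/* Slot 0 is reserved for EOF (assumed to be -1);
   slots 1..128 hold the flags for ASCII codes 0..127. */
static unsigned char cls_table[1 + 128];

/* Fill in only the letter and digit flags; every other slot stays 0.
   The loops rely on ASCII, where each range is contiguous. */
void cls_init(void)
{
    for (int c = '0'; c <= '9'; c++) cls_table[c + 1] |= CLS_N;
    for (int c = 'A'; c <= 'Z'; c++) cls_table[c + 1] |= CLS_U;
    for (int c = 'a'; c <= 'z'; c++) cls_table[c + 1] |= CLS_L;
}

/* One lookup plus one AND. The +1 shifts the index so that -1 lands
   on slot 0. Unlike the real macro, this sketch range-checks c. */
int cls_isalnum(int c)
{
    if (c < -1 || c > 127)
        return 0;
    return cls_table[c + 1] & (CLS_U | CLS_L | CLS_N);
}
```

After cls_init(), cls_isalnum('A') yields CLS_U, cls_isalnum('7') yields CLS_N, and cls_isalnum('-') and cls_isalnum(-1) both yield 0.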

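        Taking isalnum(__c) as the example, the expansion goes as follows:

 isalnum(__c) expands to (__ctype_lookup(__c)&(_U|_L|_N)); __ctype_lookup(__c) in turn expands to ((__CTYPE_PTR+sizeof(""[__c]))[(int)(__c)]); and __CTYPE_PTR finally expands to _ctype_ (when __HAVE_LOCALE_INFO__ is not defined).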

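According to the comment in the code, sizeof(""[__c]) is always 1 whatever the type or value of __c: the expression ""[__c] has type char, and sizeof does not evaluate its operand. That offset of 1 is what maps EOF to element 0 of the table. The sizeof also makes gcc -Wall warn when a plain char is passed by mistake. Separately, the (int) cast on the real index keeps a wide argument such as 0x100000001LL from indexing out of bounds on a 64-bit machine.

Listing 6: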

/* These macros are intentionally written in a manner that will trigger
   a gcc -Wall warning if the user mistakenly passes a 'char' instead
   of an int containing an 'unsigned char'.  Note that the sizeof will
   always be 1, which is what we want for mapping EOF to __CTYPE_PTR[0];
   the use of a raw index inside the sizeof triggers the gcc warning if
   __c was of type char, and sizeof masks side effects of the extra __c.
   Meanwhile, the real index to __CTYPE_PTR+1 must be cast to int,
   since isalpha(0x100000001LL) must equal isalpha(1), rather than being
   an out-of-bounds reference on a 64-bit machine.  */
#define __ctype_lookup(__c) ((__CTYPE_PTR+sizeof(""[__c]))[(int)(__c)])
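
The two properties that comment relies on can be checked with a small sketch. The names next_index, sizeof_skips_evaluation and narrowed_index are hypothetical. Note that converting an out-of-range long long to int is implementation-defined in C; the result 1 for 0x100000001LL is what gcc and clang produce with a 32-bit int, which is the assumption here.

```c
#include <stddef.h>

static int calls = 0;

/* Has a visible side effect, so we can tell whether it ever ran. */
static int next_index(void)
{
    calls++;
    return 0;
}

/* sizeof only inspects the type of its operand. ""[...] has type char,
   so the result is 1, and next_index() is never actually called. */
int sizeof_skips_evaluation(void)
{
    size_t s = sizeof(""[next_index()]);
    return s == 1 && calls == 0;
}

/* The same narrowing the macro applies to its real index. */
int narrowed_index(long long wide)
{
    return (int)wide;
}
```

Under those assumptions, sizeof_skips_evaluation() returns 1 and narrowed_index(0x100000001LL) returns 1, which matches the comment's claim that isalpha(0x100000001LL) must equal isalpha(1).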

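In the end, __ctype_lookup(__c) simplifies to (_ctype_+1)[(int)(__c)], that is, _ctype_[__c+1].

isalnum(__c) therefore expands to _ctype_[__c+1]&(_U|_L|_N). For a character that is neither a digit nor a letter the result is 0; for a digit or a letter the result is one of _U, _L or _N. The result is nonzero for a match but not necessarily 1, so callers should test it against 0 rather than compare it with 1.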
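
Because the nonzero value is a flag rather than 1, code that wants a plain yes/no answer can collapse it explicitly. A minimal sketch, assuming the default "C" locale and using the hypothetical wrapper name is_alnum_char:

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical wrapper: turns the flag value into a plain bool. */
bool is_alnum_char(char ch)
{
    /* Nonzero means "digit or letter". Which nonzero value comes back
       (_U, _L or _N with the table shown above) is an implementation
       detail, so compare against 0 only. The unsigned char cast avoids
       passing a negative value to isalnum(). */
    return isalnum((unsigned char)ch) != 0;
}
```

Written this way, a comparison such as is_alnum_char('a') == true behaves as expected, whereas isalnum('a') == 1 would be false with the table above, since the macro yields _L (02) for a lowercase letter.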
