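PHP's base_convert function converts numbers between arbitrary bases (anything from 2 to 36); that much is common knowledge. So, without actually running it, use common sense to predict what this line outputs: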
echo base_convert('http://demon.tw', 16, 10);
If you answered 222, congratulations, you got it right. That line is in fact equivalent to this one:
echo base_convert('de', 16, 10);
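In other words, base_convert ignores every character that is not a valid digit in the source base. Let's look at the C source of base_convert to see why. The function is defined in ext/standard/math.c in the PHP source tree (the listing below uses PHP 5-era internals):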
/* {{{ proto string base_convert(string number, int frombase, int tobase)
   Converts a number in a string from any base <= 36 to any base <= 36 */
PHP_FUNCTION(base_convert)
{
    zval **number, **frombase, **tobase, temp;
    char *result;

    if (ZEND_NUM_ARGS() != 3 || zend_get_parameters_ex(3, &number, &frombase, &tobase) == FAILURE) {
        WRONG_PARAM_COUNT;
    }
    convert_to_string_ex(number);
    convert_to_long_ex(frombase);
    convert_to_long_ex(tobase);
    if (Z_LVAL_PP(frombase) < 2 || Z_LVAL_PP(frombase) > 36) {
        php_error_docref(NULL TSRMLS_CC, E_WARNING, "Invalid `from base' (%ld)", Z_LVAL_PP(frombase));
        RETURN_FALSE;
    }
    if (Z_LVAL_PP(tobase) < 2 || Z_LVAL_PP(tobase) > 36) {
        php_error_docref(NULL TSRMLS_CC, E_WARNING, "Invalid `to base' (%ld)", Z_LVAL_PP(tobase));
        RETURN_FALSE;
    }

    if(_php_math_basetozval(*number, Z_LVAL_PP(frombase), &temp) != SUCCESS) {
        RETURN_FALSE;
    }
    result = _php_math_zvaltobase(&temp, Z_LVAL_PP(tobase) TSRMLS_CC);
    RETVAL_STRING(result, 0);
}
The first several lines just parse and validate the arguments. The real work is done by _php_math_basetozval and _php_math_zvaltobase. _php_math_basetozval is defined as follows:
/* {{{ _php_math_basetozval */
/*
 * Convert a string representation of a base(2-36) number to a zval.
 */
PHPAPI int _php_math_basetozval(zval *arg, int base, zval *ret)
{
    long num = 0;
    double fnum = 0;
    int i;
    int mode = 0;
    char c, *s;
    long cutoff;
    int cutlim;

    if (Z_TYPE_P(arg) != IS_STRING || base < 2 || base > 36) {
        return FAILURE;
    }

    s = Z_STRVAL_P(arg);

    cutoff = LONG_MAX / base;
    cutlim = LONG_MAX % base;

    for (i = Z_STRLEN_P(arg); i > 0; i--) {
        c = *s++;

        /* might not work for EBCDIC */
        if (c >= '0' && c <= '9')
            c -= '0';
        else if (c >= 'A' && c <= 'Z')
            c -= 'A' - 10;
        else if (c >= 'a' && c <= 'z')
            c -= 'a' - 10;
        else
            continue;

        if (c >= base)
            continue;

        switch (mode) {
        case 0: /* Integer */
            if (num < cutoff || (num == cutoff && c <= cutlim)) {
                num = num * base + c;
                break;
            } else {
                fnum = num;
                mode = 1;
            }
            /* fall-through */
        case 1: /* Float */
            fnum = fnum * base + c;
        }
    }

    if (mode == 1) {
        ZVAL_DOUBLE(ret, fnum);
    } else {
        ZVAL_LONG(ret, num);
    }
    return SUCCESS;
}
/* }}} */
That is a lot of code to wade through; the part that matters is this loop:
    for (i = Z_STRLEN_P(arg); i > 0; i--) {
        c = *s++;

        /* might not work for EBCDIC */
        if (c >= '0' && c <= '9')
            c -= '0';
        else if (c >= 'A' && c <= 'Z')
            c -= 'A' - 10;
        else if (c >= 'a' && c <= 'z')
            c -= 'a' - 10;
        else
            continue;

        if (c >= base)
            continue;
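The loop walks the string one character at a time. Any character outside [0-9a-zA-Z] simply hits continue and the loop moves on to the next iteration, so such characters have no effect on the conversion. Likewise, when c is greater than or equal to base the loop also skips to the next iteration, so letters and digits that are not valid in the given base do not affect the result either. That is why 'http://demon.tw' in base 16 collapses to 'de': every other character is either not alphanumeric or has a digit value of 16 or more. So is this a bug in base_convert, or did the designers do it on purpose?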
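To make the skipping behaviour easier to experiment with outside of PHP, here is a minimal standalone C sketch of the same idea. It is not PHP's code: the names digit_value, parse_lenient and parse_strict are hypothetical, the overflow-to-double fallback from the original is left out, and parse_strict is only an illustration of what a "reject invalid input" design could look like, not something PHP provides.

```c
/* Hypothetical helper (not part of PHP): returns the numeric value of c
 * in the given base, or -1 if c is not a valid digit for that base.
 * Like the original, this assumes an ASCII-compatible character set. */
static int digit_value(char c, int base)
{
    int v;

    if (c >= '0' && c <= '9')
        v = c - '0';
    else if (c >= 'A' && c <= 'Z')
        v = c - 'A' + 10;
    else if (c >= 'a' && c <= 'z')
        v = c - 'a' + 10;
    else
        return -1;              /* not alphanumeric at all */

    return v < base ? v : -1;   /* alphanumeric, but too large for base */
}

/* Lenient parse: silently skips invalid characters, mirroring the two
 * continue statements in the loop above. Overflow is NOT handled here. */
static long parse_lenient(const char *s, int base)
{
    long num = 0;

    for (; *s != '\0'; s++) {
        int v = digit_value(*s, base);
        if (v < 0)
            continue;           /* ignore the character and move on */
        num = num * base + v;
    }
    return num;
}

/* Strict parse (an assumed alternative design): stops at the first
 * invalid character. Returns 0 on success and stores the value in *out,
 * or returns -1 and leaves *out untouched. Overflow is NOT handled. */
static int parse_strict(const char *s, int base, long *out)
{
    long num = 0;

    for (; *s != '\0'; s++) {
        int v = digit_value(*s, base);
        if (v < 0)
            return -1;
        num = num * base + v;
    }
    *out = num;
    return 0;
}
```

With these helpers, parse_lenient("http://demon.tw", 16) keeps only 'd' (13) and 'e' (14) and returns 13 * 16 + 14 = 222, the same value as parse_lenient("de", 16), while parse_strict rejects the URL at its first character. Which of the two behaviours is "right" is exactly the question raised above.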