1. Problem code
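static int is_valid_ip(const char *ip)
{
    int section = 0;
    int dot = 0;
    int last = -1;
    while (*ip) // the loop stops at '\0', whose ASCII code is 0
    {
        if (*ip == '.') { // on '.', validate the section just read
            dot++;
            if (dot > 3)
            {
                return 0;
            } // at most 3 '.'; more than that is invalid
            if (section >= 0 && section <= 255)
            // the 255 test should come first: as the code below shows, a signed number
            // (a '-' character) is rejected immediately, so section < 0 cannot occur
            {
                section = 0; // reset
            }
            else
            {
                return 0; // invalid section
            }
        }
        else if (*ip >= '0' && *ip <= '9') // current character is a digit
        {
            section = section * 10 + *ip - '0'; // accumulate: old value times 10 plus the current digit
            if (last == '0') // previous character was '0'
            {
                return 0; // invalid, because two zeros in a row should not appear
            }
        }
        else // any other character makes the IP invalid
        {
            return 0;
        }
        last = *ip;
        ip++;
    }
    if (section >= 0 && section <= 255) { // check that the fourth section is valid
        if (3 == dot) {
            section = 0;
            return 1;
        }
    }
    return 0;
}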
2. Main problems
First, the sections are never counted, so an invalid address of the form ".XXX.XXX.XXX" passes the check;
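Second, the check on last produces false rejections: for an address such as 101, last is '0' when the final 1 is read, so the address is judged invalid (100 is rejected the same way). The check should be tied to the current section instead: once a section has started with '0', no further digit may follow it. Testing section == 0 alone is not enough, because it cannot tell an empty section from one that has already read a single '0', so the code also has to know whether the current section has read any digit yet. With that, the check is correct regardless of whether the previous character was a '.';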
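Third, runs of consecutive dots are not detected, so an invalid address of the form "XXX..XXX.XXX" passes the check;
Fourth, an invalid address of the form "XXX.XXX.XXX." passes the check (parsing the final dot resets section to 0, which then passes the range check at the end).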
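The four problems can be reproduced with a small harness. The function below is a condensed copy of the code from section 1 with the same logic; the names is_valid_ip_orig and demo_problems are chosen for this sketch only, and the concrete test strings are illustrative stand-ins for the XXX patterns.

```c
#include <assert.h>

/* Condensed copy of the section 1 function; the logic is unchanged.
 * The name is_valid_ip_orig is an assumption made for this sketch. */
int is_valid_ip_orig(const char *ip)
{
    int section = 0;
    int dot = 0;
    int last = -1;

    while (*ip) {
        if (*ip == '.') {
            dot++;
            if (dot > 3)
                return 0;
            if (section < 0 || section > 255)
                return 0;
            section = 0;
        } else if (*ip >= '0' && *ip <= '9') {
            section = section * 10 + *ip - '0';
            if (last == '0')
                return 0;
        } else {
            return 0;
        }
        last = *ip;
        ip++;
    }
    return (section >= 0 && section <= 255 && dot == 3) ? 1 : 0;
}

/* Each assert documents the observed (not the desired) behaviour. */
void demo_problems(void)
{
    assert(is_valid_ip_orig("192.168.1.1") == 1); /* valid, accepted        */
    assert(is_valid_ip_orig(".1.2.3") == 1);      /* problem 1: accepted    */
    assert(is_valid_ip_orig("101.1.1.1") == 0);   /* problem 2: rejected    */
    assert(is_valid_ip_orig("1..2.3") == 1);      /* problem 3: accepted    */
    assert(is_valid_ip_orig("1.2.3.") == 1);      /* problem 4: accepted    */
}
```

Calling demo_problems() from a test program should complete without tripping an assertion, which confirms that three invalid forms are accepted and one valid address is rejected.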
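3. Characteristics of an IPv4 address
1. The string length is between 7 and 15 (1*4+3=7, 3*4+3=15); [pre-check, avoids spending a lot of time parsing an overlong invalid IP]
2. There are four numeric segments; [can be checked by counting: 4]
3. Adjacent numeric segments are separated by exactly one '.'; [given four segments, can be checked by counting: 3]
4. The string both starts and ends with a numeric segment; [follows automatically once 2 and 3 hold]
5. Each numeric segment is valid, i.e. 0 ≤ XXX ≤ 255; the leading digit may be 0 if and only if the segment's value is 0, in which case the segment must be a single digit. [While reading a segment, if the leading digit is 0, the next character must not be another digit];
6. Only '0'~'9' and '.' may appear anywhere in the string.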
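Characteristics 1 and 6 can be checked in a single bounded pass before any real parsing. The following is a minimal sketch, not a full validator; the names ipv4_prefilter and demo_prefilter are hypothetical. It stops scanning as soon as the 16th character is seen, so an overlong string costs at most 16 steps.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pre-filter: returns 1 when the string is 7..15 characters
 * long and consists only of '0'..'9' and '.', otherwise 0.
 * It never looks past the 16th character. */
int ipv4_prefilter(const char *s)
{
    size_t n = 0;

    if (s == NULL)
        return 0;
    while (s[n] != '\0') {
        char c = s[n];
        if (!((c >= '0' && c <= '9') || c == '.'))
            return 0;           /* characteristic 6 violated */
        if (++n > 15)
            return 0;           /* longer than 15 characters */
    }
    return n >= 7;              /* shorter than 7 is also invalid */
}

void demo_prefilter(void)
{
    assert(ipv4_prefilter("1.2.3.4") == 1);          /* 7 chars, shortest form */
    assert(ipv4_prefilter("255.255.255.255") == 1);  /* 15 chars, longest form */
    assert(ipv4_prefilter("1.2.3") == 0);            /* too short              */
    assert(ipv4_prefilter("1234.1234.1234.1") == 0); /* 16 chars, too long     */
    assert(ipv4_prefilter("1.2.3.x") == 0);          /* illegal character      */
}
```

A string that passes this filter is only a candidate: something like "......." still gets through, so characteristics 2 to 5 must be checked by the actual parser.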
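4. Control strategy
First pre-screen the string by length, then parse it character by character. A state variable records what the next character must be: a dot or a digit. Note the asymmetry: a digit may be followed by either a digit or a dot, but a dot must be followed by a digit, so a third state is needed to express "the next character may be a digit or a dot". During parsing, check the range of section and the dot count; at the end, check the number of segments and the number of dots.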
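One way this strategy might look in code is sketched below. It is an illustration under assumptions, not the revised code promised for the next post: the enum and function names are made up here, and the separate length pre-check is left out because the per-segment limit of 255 and the dot limit of 3 already reject an overlong string within its first 16 characters.

```c
#include <assert.h>
#include <stddef.h>

/* What the next character is allowed to be (names are hypothetical). */
enum next_char {
    NEED_DIGIT,     /* at the start, or right after a '.'            */
    NEED_DOT,       /* after a segment that is exactly "0"           */
    DIGIT_OR_DOT    /* after a non-zero digit                        */
};

int is_valid_ipv4_sketch(const char *ip)
{
    enum next_char expect = NEED_DIGIT;
    int section = 0;    /* value of the segment being read   */
    int sections = 0;   /* number of segments started so far */
    int dots = 0;

    if (ip == NULL)
        return 0;

    for (; *ip != '\0'; ip++) {
        if (*ip == '.') {
            if (expect == NEED_DIGIT)       /* leading dot or ".." */
                return 0;
            if (++dots > 3)
                return 0;
            section = 0;
            expect = NEED_DIGIT;
        } else if (*ip >= '0' && *ip <= '9') {
            if (expect == NEED_DOT)         /* digit after a leading zero */
                return 0;
            if (expect == NEED_DIGIT)       /* first digit of a segment */
                sections++;
            section = section * 10 + (*ip - '0');
            if (section > 255)              /* checked per digit, so no overflow */
                return 0;
            /* section is 0 here only when the segment is a single '0'. */
            expect = (section == 0) ? NEED_DOT : DIGIT_OR_DOT;
        } else {
            return 0;                       /* illegal character */
        }
    }
    /* Must end on a digit, with exactly four segments and three dots. */
    return expect != NEED_DIGIT && sections == 4 && dots == 3;
}

void demo_validator(void)
{
    assert(is_valid_ipv4_sketch("0.0.0.0") == 1);
    assert(is_valid_ipv4_sketch("101.1.1.1") == 1);
    assert(is_valid_ipv4_sketch(".1.2.3") == 0);
    assert(is_valid_ipv4_sketch("1..2.3") == 0);
    assert(is_valid_ipv4_sketch("1.2.3.") == 0);
    assert(is_valid_ipv4_sketch("011.1.1.1") == 0);
}
```

Because the range check runs after every digit, section never grows beyond four digits, which also removes the signed-overflow risk of accumulating an arbitrarily long digit run.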
5. Reference source code
Code source: source code
/*
* inet_aton.c,v 1.3 1993/05/19 03:39:32 jch Exp
*/
/* Gated Release 3.5 */
/* Copyright (c) 1990,1991,1992,1993,1994,1995 by Cornell University. All */
/* rights reserved. Refer to Particulars and other Copyright notices at */
/* the end of this file. */
/* */
#include <sys/types.h>
#include <netinet/in.h>
/*
* Check whether "cp" is a valid ascii representation
* of an Internet address and convert to a binary address.
* Returns 1 if the address is valid, 0 if not.
* This replaces inet_addr, the return value from which
* cannot distinguish between failure and a local broadcast address.
*/
int
inet_aton(const char *cp, struct in_addr *ap)
{
    int dots = 0;
    register u_long acc = 0, addr = 0;
    do {
        register char cc = *cp;
        switch (cc) {
        case '0':
        case '1':
        case '2':
        case '3':
        case '4':
        case '5':
        case '6':
        case '7':
        case '8':
        case '9':
            acc = acc * 10 + (cc - '0');
            break;
        case '.':
            if (++dots > 3) {
                return 0;
            }
            /* Fall through */
        case '\0':
            if (acc > 255) {
                return 0;
            }
            addr = addr << 8 | acc;
            acc = 0;
            break;
        default:
            return 0;
        }
    } while (*cp++) ;
    /* Normalize the address */
    if (dots < 3) {
        addr <<= 8 * (3 - dots) ;
    }
    /* Store it if requested */
    if (ap) {
        ap->s_addr = htonl(addr);
    }
    return 1;
}
/*
* ------------------------------------------------------------------------
*
* GateD, Release 3.5
*
* Copyright (c) 1990,1991,1992,1993,1994,1995 by Cornell University.
* All rights reserved.
*
* THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY
* EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT
* LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY
* AND FITNESS FOR A PARTICULAR PURPOSE.
*
* Royalty-free licenses to redistribute GateD Release
* 3 in whole or in part may be obtained by writing to:
*
* GateDaemon Project
* Information Technologies/Network Resources
* 200 CCC
* Cornell University
* Ithaca, NY 14853-2601 USA
*
* GateD is based on Kirton's EGP, UC Berkeley's routing
* daemon (routed), and DCN's HELLO routing Protocol.
* Development of GateD has been supported in part by the
* National Science Foundation.
*
* Please forward bug fixes, enhancements and questions to the
* gated mailing list: gated-people@gated.cornell.edu.
*
* ------------------------------------------------------------------------
*
* Portions of this software may fall under the following
* copyrights:
*
* Copyright (c) 1988 Regents of the University of California.
* All rights reserved.
*
* Redistribution and use in source and binary forms are
* permitted provided that the above copyright notice and
* this paragraph are duplicated in all such forms and that
* any documentation, advertising materials, and other
* materials related to such distribution and use
* acknowledge that the software was developed by the
* University of California, Berkeley. The name of the
* University may not be used to endorse or promote
* products derived from this software without specific
* prior written permission. THIS SOFTWARE IS PROVIDED
* ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES,
* INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
* MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
*/
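Its strength is that the switch inside the loop is concise, easy to follow, and efficient. Its weakness is just as obvious: it validates only the value of each segment and the number of dots, and ignores both where the dots sit relative to the segments and how many segments there are, so many invalid values pass the check. In other words, this is elegant but incomplete code: every valid IP passes, and some invalid ones pass too.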
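For example, invalid addresses of the form .XXX.XXX.XXX, XXX.XXX.XXX, XXX..XXX.XXX, or XXX.XXX. all pass (a leading or trailing dot is accepted as long as the total number of dots does not exceed 3); unconventional addresses such as 011.001.001.001 pass as well; and overlong input is not pre-screened.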
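These acceptances can be checked without touching the system's own inet_aton. The sketch below restates only the validation logic of the GateD loop; the name gated_aton_accepts is made up for this example, and the address assembly and the htonl() store are omitted because only the return value matters here.

```c
#include <assert.h>

/* Validation-only restatement of the GateD inet_aton loop quoted above.
 * Hypothetical name; digits accumulate, and the range check happens only
 * at a '.' or at the terminating '\0', exactly as in the original. */
int gated_aton_accepts(const char *cp)
{
    int dots = 0;
    unsigned long acc = 0;

    do {
        char cc = *cp;
        if (cc >= '0' && cc <= '9') {
            acc = acc * 10 + (unsigned long)(cc - '0');
        } else if (cc == '.' || cc == '\0') {
            if (cc == '.' && ++dots > 3)
                return 0;
            if (acc > 255)
                return 0;
            acc = 0;
        } else {
            return 0;
        }
    } while (*cp++);
    return 1;
}

/* Each assert documents the observed (not the desired) behaviour. */
void demo_gated(void)
{
    assert(gated_aton_accepts("192.168.1.1") == 1);     /* valid             */
    assert(gated_aton_accepts(".1.2.3") == 1);          /* leading dot       */
    assert(gated_aton_accepts("1.2.3") == 1);           /* only 3 segments   */
    assert(gated_aton_accepts("1..2.3") == 1);          /* consecutive dots  */
    assert(gated_aton_accepts("1.2.") == 1);            /* trailing dot      */
    assert(gated_aton_accepts("011.001.001.001") == 1); /* leading zeros     */
    assert(gated_aton_accepts("256.1.1.1") == 0);       /* range is checked  */
    assert(gated_aton_accepts(".1.2.3.4") == 0);        /* 4 dots: rejected  */
}
```

The last assertion shows the one positional case the dot counter does catch: a leading dot in front of four full segments pushes the count to 4 and is rejected.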
The next post will present:
1. the modified code;
2. the official 1993 version of the code;
3. the newer official version of the code.