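I wanted to find out whether any short .com domains are still unregistered, specifically every 3- and 4-character combination of the letters a-z and the digits 0-9. So I got to work. A Google search showed that Wanwang (万网) provides a lookup API:
http://panda.www.net.cn/cgi-bin/check.cgi?area_domain=google.com
Replace the trailing google.com with the domain you want to check. The result comes back as an XML page like this: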
<?xml version="1.0" encoding="gb2312"?>
<property>
<returncode>200</returncode>
<key>google.com</key>
<original>211 : Domain name is not available</original>
</property>
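The page above is what comes back for a domain that is already registered; the page below is for a domain that is not registered. The status codes are 211 and 210 respectively.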
<?xml version="1.0" encoding="gb2312"?>
<property>
<returncode>200</returncode>
<key>googleloveyou.com</key>
<original>210 : Domain name is available</original>
</property>
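The two responses differ only in the status code at the start of the original element, so a small helper can classify a response body. The following is a minimal sketch and not part of the original script: the function name parse_status and the labels it returns are assumptions made for illustration, and it expects the response already decoded to text.

```python
import re

# Matches the numeric code that opens the <original> element,
# e.g. "210 : Domain name is available".
_STATUS_RE = re.compile(r"<original>\s*(\d+)\s*:")


def parse_status(xml_text):
    """Return 'available' for 210, 'taken' for 211, otherwise None."""
    match = _STATUS_RE.search(xml_text)
    if match is None:
        return None
    return {"210": "available", "211": "taken"}.get(match.group(1))


# Sample body copied from the "not registered" response shown above.
sample = (
    '<?xml version="1.0" encoding="gb2312"?>'
    "<property><returncode>200</returncode>"
    "<key>googleloveyou.com</key>"
    "<original>210 : Domain name is available</original>"
    "</property>"
)
print(parse_status(sample))  # prints: available
```

Returning None for anything that is neither 210 nor 211 keeps blocked or malformed replies separate from a real "taken" answer.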
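With the lookup API in hand, the next problem was how to generate every combination of letters and digits of length 3, 4, or 5. This was the hard part. Generating all 3-character combinations is easy enough, but handling lengths 3, 4, and 5 with the same code is trickier. Googling turned up nothing useful, so I drew diagrams, thought it over, asked friends for ideas, and finally solved it. The idea is to use numbers in place of letters,
for example string = "abcdefghijklmnopqrstuvwxyz1234567890"
The 4-character names are then generated in the order aaaa aaab aaac aaad .... aaba aabb ... ... abaa abab and so on, which as numbers is 0000 0001 0002 0003 .... 0010 0011 ... ... 0100 0101. Each number is an index into string. When a position reaches the length of string, it is reset to 0 and the position to its left is incremented by 1, like a carry. Finally, the list of numbers is converted back into letters. The script follows: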
#!/usr/bin/python
# coding: utf-8
# author: GuangHongwei
# date: 2014/7/28
# mail:
import re
import time

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib import urlopen  # Python 2, as in the original script

api = "http://panda.www.net.cn/cgi-bin/check.cgi?area_domain=%s"  # API address
string = "abcdefghijklmnopqrstuvwxyz1234567890"  # all candidate characters
string_len = len(string)  # size of the alphabet
fname = 'name.txt'  # unregistered domains are appended to this file
suffix = '.com'  # domain suffix
domain_lenth_range = range(3, 5)  # name lengths: 3 up to, but not including, 5


def min(num):
    """Return the first index list, e.g. [0, 0, 0] (shadows the built-in min)."""
    return [0] * num


def max(num, max_num):
    """Return the last index list, e.g. [35, 35, 35] (shadows the built-in max)."""
    return [max_num] * num


def num_2_string(name, string):
    """Convert a list of indexes into the corresponding string."""
    return ''.join(string[i] for i in name)


def is_ava(domain):
    """Return True if the domain is not registered."""
    # The API declares gb2312, so decode before matching.
    data = urlopen(api % domain).read().decode('gb2312', 'replace')
    ava_pattern = re.compile(r'<original>(.*) : .*</original>')
    result = ava_pattern.findall(data)
    if '210' in result:
        print('%s ---------> Ok' % domain)
        return True
    elif '211' in result:
        print('%s ---------> No' % domain)
        return False
    else:
        # Neither code was found, e.g. the API refused the request.
        print('Forbidden')
        return False


def domain_name(num):
    """Generate every name of length num, in index order."""
    name = min(num)
    last = max(num, string_len - 1)
    while True:
        yield num_2_string(name, string)
        if name == last:
            break
        name[num - 1] += 1  # advance the rightmost position
        while string_len in name:
            # Carry: reset the overflowed position and bump the one to its left.
            index = name.index(string_len)
            name[index] = 0
            name[index - 1] += 1


def run(domain_lenth):
    """Check every name of the given length; write unregistered ones to the file."""
    with open(fname, 'a') as f:
        for domain in domain_name(domain_lenth):
            domain += suffix
            if is_ava(domain):
                f.write('%s\n' % domain)
                f.flush()
            time.sleep(0.5)


if __name__ == '__main__':
    # Final step: run once for each name length.
    for i in domain_lenth_range:
        run(i)
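The carry logic of domain_name can be tried in isolation on a tiny alphabet. The sketch below is an illustrative rewrite, not the author's code: the name index_names is hypothetical, and the comparison with the standard library's itertools.product is added only to suggest that both approaches produce the same sequence.

```python
import itertools


def index_names(alphabet, length):
    """Yield every string of the given length, counting like an odometer."""
    base = len(alphabet)
    digits = [0] * length  # each entry is an index into alphabet
    while True:
        yield "".join(alphabet[d] for d in digits)
        # Add one to the rightmost position and propagate carries leftwards.
        pos = length - 1
        while pos >= 0:
            digits[pos] += 1
            if digits[pos] < base:
                break
            digits[pos] = 0
            pos -= 1
        if pos < 0:  # every position overflowed, so the sequence is complete
            return


names = list(index_names("ab1", 2))
print(names)

# The same sequence via itertools.product, a standard-library alternative.
expected = ["".join(p) for p in itertools.product("ab1", repeat=2)]
print(names == expected)
```

With the script's 36-character alphabet this gives 36^3 = 46,656 names of length 3 and 36^4 = 1,679,616 names of length 4.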
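The script can be downloaded from the attachment and run as is. It prints each domain with Ok or No as it is checked and appends the unregistered ones to name.txt.
One last warning: if you query Wanwang's API too often, with too short an interval between requests, your IP will be banned very quickly, so be careful.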
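One way to lower the risk of a ban is to lengthen the pause whenever the API stops giving a usable answer. The following is a rough sketch under stated assumptions: check is a hypothetical callable that returns True, False, or None (None meaning no 210 or 211 code was found, as with a Forbidden page), and the delay values are arbitrary choices, not limits published by Wanwang.

```python
import time


def check_with_backoff(check, domain, base_delay=0.5, max_delay=60.0,
                       max_tries=5, sleep=time.sleep):
    """Call check(domain), doubling the pause after each unusable reply."""
    delay = base_delay
    for _ in range(max_tries):
        result = check(domain)
        if result is not None:
            sleep(base_delay)  # normal pacing between successful lookups
            return result
        sleep(delay)  # back off before retrying
        delay = min(delay * 2, max_delay)
    return None  # still no usable answer after max_tries attempts


# Demonstration with a fake checker that is blocked twice, then answers.
replies = iter([None, None, True])
pauses = []
outcome = check_with_backoff(lambda d: next(replies), "example.com",
                             sleep=pauses.append)
print(outcome, pauses)
```

Pacing matters for the total run time too: at the script's 0.5-second pause, the 46,656 + 1,679,616 = 1,726,272 names of length 3 and 4 need about 863,000 seconds of sleeping alone, which is roughly 10 days.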