Batch-downloading RFC documents (Python implementation)

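There are a lot of RFC documents, and sometimes I want to browse them without a network connection, so the only option is to keep a local copy.
Looking at the address list, the range is roughly: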
http://www.networksorcery.com/enp/rfc/rfc1000.txt
...
http://www.networksorcery.com/enp/rfc/rfc6409.txt

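Ha, a perfect fit for batch downloading, and the first thing that came to mind was Xunlei...
But when I tried it, I found that it only supports three-digit expansion (I was using Xunlei 7), and the files I wanted happened to be numbered with four digits...
Out of frustration I decided to write a downloader myself!
Python suits this kind of task well: the principle is simple and the code is short, so here it is.
The code is as follows: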

#! /usr/bin/python
'''
File : getRFC.py
Author : Mike
E-Mail : Mike_Zhang@live.com
'''
# Written for Python 2 (print statement, raw_input, urllib.urlretrieve).
import urllib,os,shutil,sys,time

def downloadHtmlPage(url,tmpf = ''):
    # Use the last path component of the URL as the file name unless tmpf is given.
    i = url.rfind('/')
    fileName = url[i+1:]
    if tmpf : fileName = tmpf
    print url,"->",fileName
    urllib.urlretrieve(url,fileName)
    print 'Downloaded ',fileName
    time.sleep(0.2)
    return fileName

# http://www.networksorcery.com/enp/rfc/rfc1000.txt
# http://www.networksorcery.com/enp/rfc/rfc6409.txt
if __name__ == '__main__':
    addr = 'http://www.networksorcery.com/enp/rfc'
    dirPath = "RFC"
    #startIndex = 1000
    startIndex = int(raw_input('start : '))
    #endIndex = 6409
    endIndex = int(raw_input('end : '))
    if startIndex > endIndex :
        print 'Input error!'
        sys.exit(1)  # stop here; the loop below would have nothing to do
    if not os.path.exists(dirPath):
        os.makedirs(dirPath)
    logFile = open("log.txt","w")
    for i in range(startIndex,endIndex+1):
        try:
            t_url = '%s/rfc%d.txt' % (addr,i)
            fileName = downloadHtmlPage(t_url)
            oldName = './'+fileName
            newName = './'+dirPath+'/'+fileName
            if os.path.exists(oldName):
                shutil.move(oldName,newName)
                print 'Moved ',oldName,' to ',newName
        except Exception:
            # Log the failure and move on to the next document.
            msgLog = 'get %s failed!' % (i)
            print msgLog
            logFile.write(msgLog+'\n')
            continue
    logFile.close()
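
The script above targets Python 2. For readers on Python 3, the same idea might be sketched roughly as follows. This is not the original author's code: the names `rfc_url` and `download_range`, the `fetch` and `delay` parameters, and the choice to save straight into the target directory (instead of downloading and then moving) are all assumptions made for illustration.

```python
import os
import time
import urllib.request

BASE_URL = 'http://www.networksorcery.com/enp/rfc'  # same address as in the script above


def rfc_url(number, base=BASE_URL):
    """Build the URL for one RFC, e.g. .../rfc1000.txt."""
    return '%s/rfc%d.txt' % (base, number)


def download_range(start, end, dest_dir='RFC',
                   fetch=urllib.request.urlretrieve, delay=0.2):
    """Download rfc<start>..rfc<end> into dest_dir; return the numbers that failed.

    `fetch` is a hypothetical hook (any callable taking url and target path),
    so the loop can be tried out without touching the network.
    """
    if start > end:
        raise ValueError('start must not be greater than end')
    os.makedirs(dest_dir, exist_ok=True)
    failed = []
    for number in range(start, end + 1):
        url = rfc_url(number)
        target = os.path.join(dest_dir, url.rsplit('/', 1)[1])
        try:
            fetch(url, target)  # save directly into dest_dir, no move step
            print('Downloaded', target)
        except OSError as err:  # urllib's URLError and HTTPError derive from OSError
            print('get %d failed: %s' % (number, err))
            failed.append(number)
        time.sleep(delay)  # short pause between requests, as in the original
    return failed
```

Calling `download_range(1000, 6409)` would then cover the range mentioned at the top of the post, and the returned list plays the role of log.txt.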

Besides RFC documents, this program can do other things with a few changes, such as batch-downloading MP3s, e-books, and so on.
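
As a rough illustration of that kind of change, the numbered part of the URL can be pulled out into a template. This is only a sketch: `expand_urls`, its `width` parameter, and the example.com address are hypothetical and do not come from the original script.

```python
def expand_urls(template, start, end, width=0):
    """Yield template filled with each number in [start, end], zero-padded to width."""
    for number in range(start, end + 1):
        yield template.format(str(number).zfill(width))


# Hypothetical address, used only to show the expansion.
urls = list(expand_urls('http://example.com/music/track{}.mp3', 1, 3, width=3))
for u in urls:
    print(u)
```

Each generated URL can then be handed to the same download function as before. The `width` argument controls zero-padding, so numbering of any length works.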

OK, that's all. I hope it helps.

Reposted from: https://my.oschina.net/u/3579120/blog/1532749
