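I'm writing a small multithreaded HTTP file downloader and would like to shrink the pool of available threads as the code encounters errors.
The errors in question are specific HTTP errors returned when the web server is not allowing any more connections.
For example, if I set up a pool of 5 threads, each thread attempts to open its own connection and download a chunk of the file. The server may only allow 2 connections and will, I believe, return 503 errors. I want to detect this and shut down a thread, eventually limiting the size of the pool to the 2 connections that the server allows.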
Can I make a thread stop itself?
Is self.Thread_stop() sufficient?
Do I also need to join()?
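On the stop-itself question: a thread ends as soon as its run() method returns, so one common pattern is to return from the worker loop instead of calling a private stop method. The following is a minimal sketch under stated assumptions, not code from this downloader: it is written for Python 3 (the queue module is named Queue in Python 2), and StoppableWorker and the None sentinel are hypothetical names chosen for illustration.

```python
import queue
import threading


class StoppableWorker(threading.Thread):
    """Hypothetical worker that ends itself by returning from run()."""

    def __init__(self, work_q, results):
        threading.Thread.__init__(self)
        self.work_q = work_q
        self.results = results

    def run(self):
        while True:
            item = self.work_q.get()
            try:
                if item is None:      # sentinel: this thread should stop
                    return            # returning from run() ends the thread
                self.results.append(item * 2)
            finally:
                self.work_q.task_done()


work_q = queue.Queue()
results = []
worker = StoppableWorker(work_q, results)
worker.start()
for n in (1, 2, 3):
    work_q.put(n)
work_q.put(None)       # ask the worker to stop
worker.join()          # only needed because we want to wait for it here
print(results, worker.is_alive())
```

In this sketch join() does not stop anything; it only blocks the caller until the thread has finished on its own.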
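Here is my worker class, which does the downloading. It grabs jobs from the queue to process and, once a block is downloaded, dumps the result into resultQ, where the main thread picks it up and saves it to the file.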
This is where I'd like to detect an HTTP 503 and stop/kill/remove a thread from the available pool and, of course, re-add the failed chunk to the queue so that the remaining threads can process it.

    import socket
    import threading
    import urllib2

    # NOTE: `headers` is assumed to be a module-level dict defined elsewhere.

    class Downloader(threading.Thread):
        def __init__(self, queue, resultQ, file_name):
            threading.Thread.__init__(self)
            self.workQ = queue
            self.resultQ = resultQ
            self.file_name = file_name

        def run(self):
            while True:
                block_num, url, start, length = self.workQ.get()
                print 'Starting Queue #: %s' % block_num
                print start
                print length

                # Download the file
                self.download_file(url, start, length)

                # Tell queue that this task is done
                print 'Queue #: %s finished' % block_num
                self.workQ.task_done()

        def download_file(self, url, start, length):
            request = urllib2.Request(url, None, headers)
            if length == 0:
                return None
            request.add_header('Range', 'bytes=%d-%d' % (start, start + length))

            # Keep retrying until the connection opens
            while 1:
                try:
                    data = urllib2.urlopen(request)
                except urllib2.URLError as u:
                    print "Connection did not start with", u
                else:
                    break

            chunk = ''
            block_size = 1024
            remaining_blocks = length

            # Read the response in block_size pieces
            while remaining_blocks > 0:
                if remaining_blocks >= block_size:
                    fetch_size = block_size
                else:
                    fetch_size = int(remaining_blocks)

                try:
                    data_block = data.read(fetch_size)
                    if len(data_block) == 0:
                        print "Connection: [TESTING]: 0 sized block" + \
                            " fetched."
                    if len(data_block) != fetch_size:
                        print "Connection: len(data_block) != length" + \
                            ", but continuing anyway."
                        # Short read: abandon this block and re-enter the worker loop
                        self.run()
                        return
                except socket.timeout as s:
                    print "Connection timed out with", s
                    self.run()
                    return

                remaining_blocks -= fetch_size
                chunk += data_block

            # Hand the finished block to the main thread
            self.resultQ.put([start, chunk])
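To tell a 503 apart from other connection failures, one option is to check the error's status code: HTTPError is a subclass of URLError and carries the HTTP status in its code attribute. A minimal sketch, assuming Python 3 names (in Python 2 the same classes are urllib2.HTTPError and urllib2.URLError); is_server_busy is a hypothetical helper and the URL is a placeholder that is never contacted.

```python
import urllib.error


def is_server_busy(err):
    """True only for an HTTP 503 response, not for other URL errors."""
    return isinstance(err, urllib.error.HTTPError) and err.code == 503


# Build two errors by hand so the check can be tried without a network.
busy = urllib.error.HTTPError(
    "http://example.com/file", 503, "Service Unavailable", None, None)
refused = urllib.error.URLError("connection refused")
print(is_server_busy(busy), is_server_busy(refused))
```

Because HTTPError derives from URLError, an except clause for HTTPError has to come before one for URLError, or the check has to happen inside the URLError handler as above.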
Below is where I initialize the thread pool; further down, I put the items onto the queue.
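A minimal sketch of one way a pool could shrink itself when the server answers 503. It is not the setup code from this post, and it rests on several assumptions: Python 3 module names, a FakeServer class that simulates a 2-connection limit instead of a real web server, and the hypothetical names PoolState, Worker and download_all. A worker that sees a 503 re-queues its block and returns from run(), unless it is the last worker left.

```python
import queue
import threading
import time
import urllib.error


class FakeServer:
    """Stand-in for a server that rejects more than `limit` connections."""

    def __init__(self, limit):
        self.limit = limit
        self.in_flight = 0
        self.lock = threading.Lock()

    def fetch(self, url, start, length):
        with self.lock:
            if self.in_flight >= self.limit:
                raise urllib.error.HTTPError(
                    url, 503, "Service Unavailable", None, None)
            self.in_flight += 1
        try:
            time.sleep(0.01)          # pretend to transfer data
            return b"x" * length
        finally:
            with self.lock:
                self.in_flight -= 1


class PoolState:
    """Shared count of running workers, guarded by a lock."""

    def __init__(self, size):
        self.active = size
        self.retired = 0
        self.lock = threading.Lock()

    def try_retire(self):
        """Return True if the caller may exit; the last worker never may."""
        with self.lock:
            if self.active <= 1:
                return False
            self.active -= 1
            self.retired += 1
            return True


class Worker(threading.Thread):
    def __init__(self, work_q, result_q, fetch, state):
        threading.Thread.__init__(self)
        self.work_q, self.result_q = work_q, result_q
        self.fetch, self.state = fetch, state

    def run(self):
        while True:
            job = self.work_q.get()
            try:
                if job is None:               # sentinel: no more work
                    return
                block_num, url, start, length = job
                self.result_q.put((start, self.fetch(url, start, length)))
            except urllib.error.HTTPError as err:
                if err.code != 503:
                    raise                     # other errors are not handled here
                self.work_q.put(job)          # re-queue the failed block
                if self.state.try_retire():
                    return                    # shrink the pool by one thread
                time.sleep(0.01)              # last worker: back off and retry
            finally:
                self.work_q.task_done()       # runs after the re-queue above


def download_all(blocks, fetch, pool_size=5):
    work_q, result_q = queue.Queue(), queue.Queue()
    state = PoolState(pool_size)
    workers = [Worker(work_q, result_q, fetch, state) for _ in range(pool_size)]
    for w in workers:
        w.start()
    for block in blocks:
        work_q.put(block)
    work_q.join()                 # returns once every block has been fetched
    for _ in workers:
        work_q.put(None)          # wake the surviving workers so they exit
    for w in workers:
        w.join()
    results = sorted(result_q.get() for _ in range(result_q.qsize()))
    return results, state.retired


server = FakeServer(limit=2)
blocks = [(n, "http://example.com/file", n * 100, 100) for n in range(20)]
results, retired = download_all(blocks, server.fetch, pool_size=5)
print(len(results), "blocks downloaded;", retired, "workers retired")
```

Re-queuing the block before task_done() is called keeps the queue's unfinished-task count above zero, so work_q.join() cannot return while a failed block is still waiting. How many workers retire depends on thread timing, so the count can differ from run to run.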