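I read online that Python 3.x no longer has urllib2.
In Python 3, urllib and urllib2 were merged into a single urllib package, and what urllib2 used to provide now lives in urllib.request and urllib.error.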
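As a rough illustration of where the old urllib2 pieces ended up, here is a minimal sketch that builds a request using only the Python 3 urllib submodules. The helper name `build_request`, the example.com URL and the User-Agent string are made up for the example, and nothing is sent over the network.

```python
import urllib.error    # URLError / HTTPError used to live in urllib2
import urllib.parse    # urlencode used to live in the old urllib
import urllib.request  # Request / urlopen used to live in urllib2


def build_request(url, params=None):
    """Hypothetical helper: attach query parameters and a User-Agent header."""
    if params:
        url = url + "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(url, headers={"User-Agent": "example-downloader"})


# Placeholder URL, only used to show the resulting request object.
req = build_request("http://example.com/file.xls", {"id": 123})
print(req.full_url)
```

Code written for Python 2 as `urllib2.urlopen(...)` generally becomes `urllib.request.urlopen(...)`, and `urllib2.URLError` becomes `urllib.error.URLError`.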
The input spreadsheet is laid out like this:
申请号 (application number) | .... | 附件地址 (attachment URL)
123 | .... | http://..................xls
456 | .... | http://..................xls
import os
import re
import urllib.request as request

import pandas as pd


def auto_save_file(path):
    # Return a path that is not taken yet, appending (0), (1), ... on collisions.
    directory, file_name = os.path.split(path)
    while os.path.isfile(path):
        pattern = r'\((\d+)\)\.'
        if re.search(pattern, file_name) is None:
            # First collision: insert "(0)" before the extension only.
            stem, ext = os.path.splitext(file_name)
            file_name = f'{stem}(0){ext}'
        else:
            # Later collisions: bump the existing counter by one.
            current_number = int(re.findall(pattern, file_name)[-1])
            new_number = current_number + 1
            file_name = file_name.replace(f'({current_number}).', f'({new_number}).')
        path = os.path.join(directory, file_name)
    return path


def downloadExcel():
    df = pd.read_excel(r'C:\Users\zx\PycharmProjects\pythonProject\testDownload.xls')
    res = df.loc[:, ['申请号', '附件地址']]  # application number, attachment URL
    for i in range(res.shape[0]):
        myurL = res['附件地址'][i]
        filepath = '{0}{1}.xls'.format("e:\\", res['申请号'][i])
        # Download the attachment and save it under a name that is still free.
        request.urlretrieve(myurL, auto_save_file(filepath))


if __name__ == '__main__':
    downloadExcel()
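The loop above stops at the first URL that fails. Below is a minimal sketch of one way to keep going and collect the failures instead. The function name `download_all` and the `(number, url)` row format are assumptions for the example, not part of the script. The demo uses local `file://` URLs so it runs without a network connection.

```python
import pathlib
import tempfile
import urllib.error
import urllib.request


def download_all(rows, target_dir):
    """Hypothetical helper: rows is an iterable of (application_number, url)."""
    saved, failed = [], []
    for number, url in rows:
        target = pathlib.Path(target_dir) / f"{number}.xls"
        try:
            urllib.request.urlretrieve(url, target)
            saved.append(target.name)
        except (urllib.error.URLError, ValueError) as exc:
            # URLError: unreachable or missing resource; ValueError: malformed URL.
            failed.append((number, str(exc)))
    return saved, failed


with tempfile.TemporaryDirectory() as tmp:
    tmp = pathlib.Path(tmp)
    source = tmp / "source.xls"
    source.write_bytes(b"dummy spreadsheet bytes")
    out_dir = tmp / "out"
    out_dir.mkdir()
    rows = [
        (123, source.as_uri()),                 # existing local file
        (456, (tmp / "missing.xls").as_uri()),  # file does not exist
        (789, "not-a-url"),                     # no URL scheme at all
    ]
    saved, failed = download_all(rows, out_dir)

print(saved)
print([number for number, _ in failed])
```

For brevity this sketch does not deduplicate file names. In the real script the target path would still go through `auto_save_file` first.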
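Downloading and naming the files, done in one step. Too easy. Because the files are renamed by application number, a repeated application number means that, starting from the second file with the same name, a suffix is added to the file name: (0), (1), (2), and so on.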
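To make the naming rule concrete, here is a small stand-alone sketch that saves the same application number three times into a temporary directory. It uses a simpler counter-based variant, hypothetically named `next_free_path`, rather than the regex-based function from the script. The number 123 is just sample data.

```python
import os
import tempfile


def next_free_path(path):
    """Return path if unused, otherwise the first free 'name(n).ext' variant."""
    if not os.path.exists(path):
        return path
    stem, ext = os.path.splitext(path)
    n = 0
    while os.path.exists(f"{stem}({n}){ext}"):
        n += 1
    return f"{stem}({n}){ext}"


names = []
with tempfile.TemporaryDirectory() as tmp:
    for _ in range(3):
        target = next_free_path(os.path.join(tmp, "123.xls"))
        open(target, "w").close()  # stand-in for the downloaded file
        names.append(os.path.basename(target))

print(names)  # ['123.xls', '123(0).xls', '123(1).xls']
```

The first file keeps the plain name, and only the duplicates get a counter, which matches the behaviour described above.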