Python 3: Automatically Download the chromedriver That Matches the Installed Chrome (Windows)

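Chrome is a common browser choice when scraping web pages, and its driver program, chromedriver, is indispensable. Sometimes, though, a program has to be handed to someone else, possibly someone who does not program, and it is hard to explain to them how to pick the right chromedriver version themselves. This post presents a Python 3.x solution that automatically downloads the chromedriver matching the Chrome installed on the machine.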

Full code

import os
import urllib
import urllib.request
import winreg
import re
import sys
import zipfile

DriverVersions = {
    '73':'2.46',
    '72':'2.46',
    '71':'2.46',
    '70':'2.45',
    '69':'2.44',
    '68':'2.42',
    '67':'2.41',
    '66':'2.40',
    '65':'2.38',
    '64':'2.37',
    '63':'2.36',
    '62':'2.35',
    '61':'2.34',
    '60':'2.33',
    '59':'2.32',
    '58':'2.31',
    '57':'2.29',
    '56':'2.29',
    '55':'2.28',
    '54':'2.27',
    '53':'2.26',
    '52':'2.24',
    '51':'2.23',
    '50':'2.22',
    '49':'2.22',
    '48':'2.21',
    '47':'2.21',
    '46':'2.21',
    '45':'2.20',
    '44':'2.20',
    '43':'2.20',
    '42':'2.16',
    '41':'2.15',
    '40':'2.15',
    '39':'2.14',
    '38':'2.13',
    '37':'2.12',
    '36':'2.12',
    '35':'2.10',
    '34':'2.10',
    '33':'2.10',
    '32':'2.9',
    '31':'2.9',
    '30':'2.8',
    '29':'2.7'
}

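def unzip_single(src_file, dest_dir, password=None):
    # Extract every member of src_file into dest_dir ('' means the current directory).
    if password:
        password = password.encode()
    # The with-block closes the archive even if extraction fails.
    with zipfile.ZipFile(src_file) as zf:
        try:
            zf.extractall(path=dest_dir, pwd=password)
        except RuntimeError as e:
            raise OSError('An exception occurred while extracting the zip file. ') from e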

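# Read Chrome's full version string (e.g. '83.0.4103.14') from its Uninstall registry entry.
ChromeKey = winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE,'SOFTWARE\\WOW6432Node\\Microsoft\\Windows\\CurrentVersion\\Uninstall\\Google Chrome')
FullChromeVersion = winreg.QueryValueEx(ChromeKey,'DisplayVersion')[0]
# Only the major version (the first field) is needed to pick a driver.
ChromeVersion = int(FullChromeVersion.split('.')[0])
print('Chrome version: '+FullChromeVersion)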
if ChromeVersion <= 73:
    if not str(ChromeVersion) in DriverVersions:
        raise KeyError('There isn\'t a chromedriver that supports your Chrome version. ')
    try:
        urllib.request.urlretrieve('https://npm.taobao.org/mirrors/chromedriver/'+DriverVersions[str(ChromeVersion)]+'/chromedriver_win32.zip','chromedriver_win32.zip')
    except:
        print('Can\'t connect to the server! ')
        raise ConnectionError('Can\'t connect to the server')
    else:
        print('Extracting file... ')
        unzip_single('chromedriver_win32.zip','')
        print('Downloaded successfully. ')
else:
    AvailableVersions = {}
    try:
        urlRead = urllib.request.urlopen(urllib.request.Request('https://npm.taobao.org/mirrors/chromedriver/')).read().decode()
    except:
        print('Can\'t connect to the server! ')
        raise ConnectionError('Can\'t connect to the server')
    else:
        for i in re.findall('<a href="/mirrors/chromedriver/(.*?)</a>',urlRead):
            if i[0] in 'qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM' or 'RELEASE' in i or int(i.split('.')[0]) <= 72:
                continue
            if not i.split('.')[0] in AvailableVersions:
                AvailableVersions[i.split('.')[0]] = i.split('/">')[0]
        if not str(ChromeVersion) in AvailableVersions:
            raise KeyError('There isn\'t a chromedriver that supports your Chrome version. ')
        try:
            print('Downloading... \nURL:https://npm.taobao.org/mirrors/chromedriver/'+AvailableVersions[str(ChromeVersion)]+'/chromedriver_win32.zip')
            urllib.request.urlretrieve('https://npm.taobao.org/mirrors/chromedriver/'+AvailableVersions[str(ChromeVersion)]+'/chromedriver_win32.zip','chromedriver_win32.zip')
        except:
            print('Download failed. ')
        else:
            print('Extracting file... ')
            unzip_single('chromedriver_win32.zip','')
            print('Downloaded successfully. ')

Analysis

First, DriverVersions is a dictionary that maps Chrome versions 29 through 73 to the matching chromedriver release.

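Next we need to get Chrome's version.
The registry key HKLM\SOFTWARE\WOW6432Node\Microsoft\Windows\CurrentVersion\App Paths stores the install locations of various programs, and unsurprisingly Chrome has an entry there, which is enough to find where Chrome is installed. For the version itself, the entry to read is HKLM\SOFTWARE\WOW6432Node\Microsoft\Windows\CurrentVersion\Uninstall\Google Chrome: fortunately Chrome is listed there too, and its DisplayVersion value holds Chrome's version string.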
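
As a rough sketch of the registry lookup described above, the read can be wrapped in a function that tries more than one location. The second sub-key and the HKEY_CURRENT_USER hive are assumptions added here for installs that are not registered under WOW6432Node; the post itself only uses the first path under HKLM. The names UNINSTALL_KEYS, read_chrome_version and major_version are made up for this illustration.

```python
import re

# Registry locations to try, in order. The first is the one used in this post;
# the second is an assumed fallback for installs registered without WOW6432Node.
UNINSTALL_KEYS = [
    r'SOFTWARE\WOW6432Node\Microsoft\Windows\CurrentVersion\Uninstall\Google Chrome',
    r'SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\Google Chrome',
]

def read_chrome_version():
    """Return Chrome's DisplayVersion string, or None if no key is found (Windows only)."""
    import winreg  # imported lazily so the rest of the module loads on other systems
    for hive in (winreg.HKEY_LOCAL_MACHINE, winreg.HKEY_CURRENT_USER):
        for sub_key in UNINSTALL_KEYS:
            try:
                with winreg.OpenKey(hive, sub_key) as key:
                    return winreg.QueryValueEx(key, 'DisplayVersion')[0]
            except OSError:
                continue  # key or value missing here; try the next location
    return None

def major_version(full_version):
    """Extract the leading integer of a dotted version string such as '83.0.4103.14'."""
    match = re.match(r'(\d+)\.', full_version)
    if match is None:
        raise ValueError('Unexpected version string: ' + repr(full_version))
    return int(match.group(1))
```

On a Windows machine, major_version(read_chrome_version()) would give the number used to pick a driver, provided read_chrome_version() did not return None.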

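With the version in hand, the next step is to find the matching chromedriver. The dictionary above already covers Chrome 29 through 73 (the number here is the first field of the version string), so for those versions the driver is read straight from the dict. For Chrome versions above 73, the chromedriver version has the same major number as Chrome itself; which exact build (the second, third and fourth fields) is available has to be looked up on the web.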
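
The two-branch rule above can be written as one small function. This is only a sketch: choose_driver_version is a hypothetical name, legacy_table stands in for DriverVersions, and available stands in for the dictionary that the full code builds from the mirror page.

```python
def choose_driver_version(chrome_major, legacy_table, available):
    """Return the chromedriver folder name for a Chrome major version.

    legacy_table maps majors up to 73 to 2.x releases (like DriverVersions);
    available maps majors from 74 up to a full driver version found on the mirror.
    """
    key = str(chrome_major)
    # Chrome 73 and below use the hand-written table; newer ones use the scraped list.
    table = legacy_table if chrome_major <= 73 else available
    if key not in table:
        raise KeyError('No chromedriver found for Chrome ' + key)
    return table[key]

# Sample entries copied from the post's dictionary and from the mirror listing.
legacy = {'73': '2.46', '70': '2.45'}
scraped = {'83': '83.0.4103.14'}
print(choose_driver_version(70, legacy, scraped))
print(choose_driver_version(83, legacy, scraped))
```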

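How to look it up: fetching the mirror site https://npm.taobao.org/mirrors/chromedriver/ shows that every folder is named after a version number, so the version numbers can be read directly from the page. They are extracted with a regular expression and then filtered in Python.
Here is an excerpt of what the site returns.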

<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>ChromeDriver Mirror</title>
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <!-- Bootstrap -->
    <link href="https://cdn.staticfile.org/twitter-bootstrap/3.3.7/css/bootstrap.min.css" rel="stylesheet" media="screen">
    <style>
      #fork{position:fixed;top:0;right:0;_position:absolute;z-index: 10000;}
      .bottom{margin: 20px auto; width: 100%; text-align: center;}
      .container{width: 1080px; margin: 50px auto;}
    </style>
  <head>
  <body>
    <a href="https://github.com/cnpm/cnpmjs.org" id="fork" target="_blank">
        <img alt="Fork me on GitHub" src="//s3.amazonaws.com/github/ribbons/forkme_right_darkblue_121621.png">
    </a>
    <div class="container">
      <h1>Mirror index of <a target="_blank" href="http://chromedriver.storage.googleapis.com/">http://chromedriver.storage.googleapis.com/</a></h1>
<hr>
<pre><a href="../">../</a>
<a href="/mirrors/chromedriver/2.0/">2.0/</a>                                              2013-09-25T22:57:39.349Z                          -
<a href="/mirrors/chromedriver/2.1/">2.1/</a>                                              2013-09-25T22:57:49.481Z                          -
<a href="/mirrors/chromedriver/2.10/">2.10/</a>                                             2014-05-01T20:46:22.843Z                          -
<a href="/mirrors/chromedriver/2.11/">2.11/</a>                                             2014-10-08T01:17:17.918Z                          -
<a href="/mirrors/chromedriver/2.12/">2.12/</a>                                             2014-10-27T09:27:24.626Z                          -
<a href="/mirrors/chromedriver/2.13/">2.13/</a>                                             2014-12-10T13:17:59.776Z                          -
<a href="/mirrors/chromedriver/2.14/">2.14/</a>                                             2015-01-28T09:29:27.341Z                          -
<a href="/mirrors/chromedriver/2.15/">2.15/</a>                                             2015-03-26T22:08:19.898Z                          -
<a href="/mirrors/chromedriver/2.16/">2.16/</a>                                             2015-06-08T12:30:55.879Z                          -
<a href="/mirrors/chromedriver/2.17/">2.17/</a>                                             2015-07-30T22:11:44.809Z                          -
<a href="/mirrors/chromedriver/2.18/">2.18/</a>                                             2015-08-19T03:48:06.740Z                          -
<a href="/mirrors/chromedriver/2.19/">2.19/</a>                                             2015-08-28T06:09:42.121Z                          -
<a href="/mirrors/chromedriver/2.2/">2.2/</a>                                              2013-09-25T22:57:58.374Z                          -
<a href="/mirrors/chromedriver/2.20/">2.20/</a>                                             2015-10-08T23:22:48.789Z                          -
<a href="/mirrors/chromedriver/2.21/">2.21/</a>                                             2016-01-26T06:47:39.216Z                          -
<a href="/mirrors/chromedriver/2.22/">2.22/</a>                                             2016-06-04T19:54:50.312Z                          -
<a href="/mirrors/chromedriver/2.23/">2.23/</a>                                             2016-08-04T19:02:02.309Z                          -
<a href="/mirrors/chromedriver/2.24/">2.24/</a>                                             2016-09-09T00:57:14.652Z                          -
<a href="/mirrors/chromedriver/2.25/">2.25/</a>                                             2016-10-22T02:16:44.584Z                          -
<a href="/mirrors/chromedriver/2.26/">2.26/</a>                                             2016-12-05T23:24:16.587Z                          -
<a href="/mirrors/chromedriver/2.27/">2.27/</a>                                             2016-12-21T23:07:03.291Z                          -
<a href="/mirrors/chromedriver/2.28/">2.28/</a>                                             2017-03-08T22:53:11.244Z                          -
<a href="/mirrors/chromedriver/2.29/">2.29/</a>                                             2017-04-04T01:21:21.907Z                          -
<a href="/mirrors/chromedriver/2.3/">2.3/</a>                                              2013-09-25T22:58:07.947Z                          -
<a href="/mirrors/chromedriver/2.30/">2.30/</a>                                             2017-06-07T22:53:24.655Z                          -
<a href="/mirrors/chromedriver/2.31/">2.31/</a>                                             2017-07-22T01:08:24.087Z                          -
<a href="/mirrors/chromedriver/2.32/">2.32/</a>                                             2017-08-30T20:07:04.354Z                          -
<a href="/mirrors/chromedriver/2.33/">2.33/</a>                                             2017-10-03T21:09:52.970Z                          -
<a href="/mirrors/chromedriver/2.34/">2.34/</a>                                             2017-12-10T03:28:46.062Z                          -
<a href="/mirrors/chromedriver/2.35/">2.35/</a>                                             2018-01-10T02:35:57.501Z                          -
<a href="/mirrors/chromedriver/2.36/">2.36/</a>                                             2018-03-02T09:17:32.016Z                          -
<a href="/mirrors/chromedriver/2.37/">2.37/</a>                                             2018-03-16T06:19:07.262Z                          -
<a href="/mirrors/chromedriver/2.38/">2.38/</a>                                             2018-04-17T20:19:14.328Z                          -
<a href="/mirrors/chromedriver/2.39/">2.39/</a>                                             2018-05-30T06:19:55.386Z                          -
<a href="/mirrors/chromedriver/2.4/">2.4/</a>                                              2013-10-01T05:42:36.371Z                          -
<a href="/mirrors/chromedriver/2.40/">2.40/</a>                                             2018-06-07T23:44:20.210Z                          -
<a href="/mirrors/chromedriver/2.41/">2.41/</a>                                             2018-07-27T19:25:01.951Z                          -
<a href="/mirrors/chromedriver/2.42/">2.42/</a>                                             2018-09-13T18:14:11.882Z                          -
<a href="/mirrors/chromedriver/2.43/">2.43/</a>                                             2018-10-17T02:46:13.125Z                          -
<a href="/mirrors/chromedriver/2.44/">2.44/</a>                                             2018-11-20T00:32:52.802Z                          -
<a href="/mirrors/chromedriver/2.45/">2.45/</a>                                             2018-12-10T23:20:22.017Z                          -
<a href="/mirrors/chromedriver/2.46/">2.46/</a>                                             2019-02-01T19:22:24.040Z                          -
<a href="/mirrors/chromedriver/2.5/">2.5/</a>                                              2013-11-01T18:01:58.116Z                          -
<a href="/mirrors/chromedriver/2.6/">2.6/</a>                                              2013-11-05T07:13:23.018Z                          -
<a href="/mirrors/chromedriver/2.7/">2.7/</a>                                              2013-11-22T23:02:01.944Z                          -
<a href="/mirrors/chromedriver/2.8/">2.8/</a>                                              2013-12-16T23:41:09.841Z                          -
<a href="/mirrors/chromedriver/2.9/">2.9/</a>                                              2014-02-03T09:11:50.536Z                          -
<a href="/mirrors/chromedriver/70.0.3538.16/">70.0.3538.16/</a>                                     2018-09-17T20:50:43.843Z                          -
<a href="/mirrors/chromedriver/70.0.3538.67/">70.0.3538.67/</a>                                     2018-10-17T16:02:03.103Z                          -
<a href="/mirrors/chromedriver/70.0.3538.97/">70.0.3538.97/</a>                                     2018-11-06T07:19:03.877Z                          -
<a href="/mirrors/chromedriver/71.0.3578.137/">71.0.3578.137/</a>                                    2019-01-21T19:35:39.578Z                          -
<a href="/mirrors/chromedriver/71.0.3578.30/">71.0.3578.30/</a>                                     2018-11-01T21:02:45.154Z                          -
<a href="/mirrors/chromedriver/71.0.3578.33/">71.0.3578.33/</a>                                     2018-11-02T15:53:57.452Z                          -
<a href="/mirrors/chromedriver/71.0.3578.80/">71.0.3578.80/</a>                                     2018-12-11T19:10:42.607Z                          -
<a href="/mirrors/chromedriver/72.0.3626.69/">72.0.3626.69/</a>                                     2019-01-22T07:21:41.137Z                          -
<a href="/mirrors/chromedriver/72.0.3626.7/">72.0.3626.7/</a>                                      2018-12-11T19:09:45.570Z                          -
<a href="/mirrors/chromedriver/73.0.3683.20/">73.0.3683.20/</a>                                     2019-02-06T19:24:05.478Z                          -
<a href="/mirrors/chromedriver/73.0.3683.68/">73.0.3683.68/</a>                                     2019-03-07T22:34:54.837Z                          -
<a href="/mirrors/chromedriver/74.0.3729.6/">74.0.3729.6/</a>                                      2019-03-12T19:25:26.063Z                          -
<a href="/mirrors/chromedriver/75.0.3770.140/">75.0.3770.140/</a>                                    2019-07-12T18:06:25.447Z                          -
<a href="/mirrors/chromedriver/75.0.3770.8/">75.0.3770.8/</a>                                      2019-04-30T00:02:57.641Z                          -
<a href="/mirrors/chromedriver/75.0.3770.90/">75.0.3770.90/</a>                                     2019-06-13T21:21:15.477Z                          -
<a href="/mirrors/chromedriver/76.0.3809.12/">76.0.3809.12/</a>                                     2019-06-07T16:19:42.400Z                          -
<a href="/mirrors/chromedriver/76.0.3809.126/">76.0.3809.126/</a>                                    2019-08-20T18:01:27.496Z                          -
<a href="/mirrors/chromedriver/76.0.3809.25/">76.0.3809.25/</a>                                     2019-06-13T21:24:59.874Z                          -
<a href="/mirrors/chromedriver/76.0.3809.68/">76.0.3809.68/</a>                                     2019-07-16T17:09:55.657Z                          -
<a href="/mirrors/chromedriver/77.0.3865.10/">77.0.3865.10/</a>                                     2019-08-06T18:45:26.553Z                          -
<a href="/mirrors/chromedriver/77.0.3865.40/">77.0.3865.40/</a>                                     2019-08-20T18:02:46.906Z                          -
<a href="/mirrors/chromedriver/78.0.3904.105/">78.0.3904.105/</a>                                    2019-11-18T18:20:40.686Z                          -
<a href="/mirrors/chromedriver/78.0.3904.11/">78.0.3904.11/</a>                                     2019-09-12T16:45:50.292Z                          -
<a href="/mirrors/chromedriver/78.0.3904.70/">78.0.3904.70/</a>                                     2019-10-21T20:40:07.509Z                          -
<a href="/mirrors/chromedriver/79.0.3945.16/">79.0.3945.16/</a>                                     2019-10-30T16:10:56.644Z                          -
<a href="/mirrors/chromedriver/79.0.3945.36/">79.0.3945.36/</a>                                     2019-11-18T18:20:03.409Z                          -
<a href="/mirrors/chromedriver/80.0.3987.106/">80.0.3987.106/</a>                                    2020-02-13T19:21:31.091Z                          -
<a href="/mirrors/chromedriver/80.0.3987.16/">80.0.3987.16/</a>                                     2019-12-19T17:39:26.425Z                          -
<a href="/mirrors/chromedriver/81.0.4044.20/">81.0.4044.20/</a>                                     2020-02-13T19:11:47.807Z                          -
<a href="/mirrors/chromedriver/81.0.4044.69/">81.0.4044.69/</a>                                     2020-03-17T16:16:51.579Z                          -
<a href="/mirrors/chromedriver/83.0.4103.14/">83.0.4103.14/</a>                                     2020-04-16T19:48:28.068Z                          -
<a href="/mirrors/chromedriver/icons/">icons/</a>                                            2013-09-25T17:42:04.972Z                          -
<a href="/mirrors/chromedriver/70.0.3538.LATEST_RELEASE">70.0.3538.LATEST_RELEASE</a>                          2018-09-19T22:24:28.963Z                          12(12B)
<a href="/mirrors/chromedriver/index.html">index.html</a>                                        2013-09-25T16:59:18.911Z                          10574(10.33kB)
<a href="/mirrors/chromedriver/LATEST_RELEASE">LATEST_RELEASE</a>                                    2020-04-08T15:52:54.589Z                          12(12B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_70">LATEST_RELEASE_70</a>                                 2019-02-21T05:37:43.183Z                          12(12B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_70.0.3538">LATEST_RELEASE_70.0.3538</a>                          2018-11-06T07:19:08.413Z                          12(12B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_71">LATEST_RELEASE_71</a>                                 2019-02-21T05:37:29.970Z                          13(13B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_71.0.3578">LATEST_RELEASE_71.0.3578</a>                          2019-01-21T19:35:43.788Z                          13(13B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_72">LATEST_RELEASE_72</a>                                 2019-02-21T05:37:17.996Z                          12(12B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_72.0.3626">LATEST_RELEASE_72.0.3626</a>                          2019-01-22T07:21:45.396Z                          12(12B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_73">LATEST_RELEASE_73</a>                                 2019-03-12T16:05:59.036Z                          12(12B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_73.0.3683">LATEST_RELEASE_73.0.3683</a>                          2019-03-07T22:34:59.301Z                          12(12B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_74">LATEST_RELEASE_74</a>                                 2019-03-12T19:25:31.583Z                          11(11B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_74.0.3729">LATEST_RELEASE_74.0.3729</a>                          2019-03-12T19:25:30.367Z                          11(11B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_75">LATEST_RELEASE_75</a>                                 2019-07-12T18:06:31.115Z                          13(13B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_75.0.3770">LATEST_RELEASE_75.0.3770</a>                          2019-07-12T18:06:29.734Z                          13(13B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_76">LATEST_RELEASE_76</a>                                 2019-08-20T18:01:32.838Z                          13(13B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_76.0.3809">LATEST_RELEASE_76.0.3809</a>                          2019-08-20T18:01:31.671Z                          13(13B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_77">LATEST_RELEASE_77</a>                                 2019-08-20T18:02:52.200Z                          12(12B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_77.0.3865">LATEST_RELEASE_77.0.3865</a>                          2019-08-20T18:02:50.947Z                          12(12B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_78">LATEST_RELEASE_78</a>                                 2019-11-18T18:20:46.724Z                          13(13B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_78.0.3904">LATEST_RELEASE_78.0.3904</a>                          2019-11-18T18:20:45.336Z                          13(13B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_79">LATEST_RELEASE_79</a>                                 2019-11-18T18:20:09.561Z                          12(12B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_79.0.3945">LATEST_RELEASE_79.0.3945</a>                          2019-11-18T18:20:08.321Z                          12(12B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_80">LATEST_RELEASE_80</a>                                 2020-02-13T19:34:11.419Z                          13(13B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_80.0.3987">LATEST_RELEASE_80.0.3987</a>                          2020-02-13T19:33:45.571Z                          13(13B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_81">LATEST_RELEASE_81</a>                                 2020-03-17T16:16:57.283Z                          12(12B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_81.0.4044">LATEST_RELEASE_81.0.4044</a>                          2020-03-17T16:16:55.944Z                          12(12B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_83">LATEST_RELEASE_83</a>                                 2020-04-16T19:48:33.848Z                          12(12B)
<a href="/mirrors/chromedriver/LATEST_RELEASE_83.0.4103">LATEST_RELEASE_83.0.4103</a>                          2020-04-16T19:48:32.557Z                          12(12B)
</pre>
<hr>
    </div>
      <hr/>
      <div class="bottom">
        Copyright &copy; <a href="https://github.com/cnpm" target="_blank">cnpm</a>
        <a href="/">Home</a>
    </div>
  </body>
</html>

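It is easy to see that the regular expression '<a href="/mirrors/chromedriver/(.*?)</a>' captures every link. A little Python processing then keeps only the links for version 74 and above, which gives every chromedriver release that follows Chrome's version numbering. If there is still no entry for the installed Chrome version, the only option is to raise an error.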
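
A possible variation on this extraction step is sketched below. It is not the code from the full listing: it uses a stricter pattern that only matches four-field folder names, and it keeps the numerically highest build per major version, whereas the full listing keeps the first one that appears on the page. The names LINK_PATTERN and newest_per_major are invented for this example, and the sample string is a handful of lines taken from the listing above.

```python
import re

# Match only folder links whose name is a four-field version such as 75.0.3770.140.
LINK_PATTERN = re.compile(r'<a href="/mirrors/chromedriver/(\d+(?:\.\d+){3})/">')

def newest_per_major(index_html, min_major=74):
    """Map each major version (as a string) to its highest build found in the page."""
    newest = {}
    for version in LINK_PATTERN.findall(index_html):
        parts = tuple(int(p) for p in version.split('.'))
        if parts[0] < min_major:
            continue  # older majors are covered by the hand-written table
        major = str(parts[0])
        # Compare as integer tuples so that 140 sorts above 90 and 8.
        if major not in newest or parts > newest[major][0]:
            newest[major] = (parts, version)
    return {major: version for major, (parts, version) in newest.items()}

sample = '''<a href="/mirrors/chromedriver/2.46/">2.46/</a>
<a href="/mirrors/chromedriver/73.0.3683.68/">73.0.3683.68/</a>
<a href="/mirrors/chromedriver/75.0.3770.140/">75.0.3770.140/</a>
<a href="/mirrors/chromedriver/75.0.3770.8/">75.0.3770.8/</a>
<a href="/mirrors/chromedriver/75.0.3770.90/">75.0.3770.90/</a>
<a href="/mirrors/chromedriver/icons/">icons/</a>
<a href="/mirrors/chromedriver/LATEST_RELEASE_75">LATEST_RELEASE_75</a>'''
print(newest_per_major(sample))  # {'75': '75.0.3770.140'}
```

Because the pattern requires four numeric fields followed by a slash, the 2.x folders, icons/ and the LATEST_RELEASE files never match, so no separate check on the first character is needed.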

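Once the matching version is known, it can be downloaded. The download uses the urllib library. The download location can be changed, but four places in the code have to be edited (every 'chromedriver_win32.zip' that is used as a local file name, not the ones inside the URLs).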
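
One way to avoid editing four places is to build the URL and the local path in a single helper. The sketch below assumes hypothetical names (MIRROR, ARCHIVE_NAME, download_plan) and only computes the two strings; it does not contact the mirror.

```python
import os

MIRROR = 'https://npm.taobao.org/mirrors/chromedriver/'
ARCHIVE_NAME = 'chromedriver_win32.zip'

def download_plan(driver_version, download_dir='.'):
    """Return (url, local_path) for a given chromedriver version folder."""
    url = MIRROR + driver_version + '/' + ARCHIVE_NAME
    local_path = os.path.join(download_dir, ARCHIVE_NAME)
    return url, local_path

url, local_path = download_plan('83.0.4103.14', 'drivers')
print(url)
print(local_path)
```

The pair can then be passed to urllib.request.urlretrieve(url, local_path), and the same local_path handed to the extraction step, so changing the download directory touches only the download_dir argument.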

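The download produces a .zip file that has to be extracted, which is done with the zipfile library. Once extraction finishes, the download is complete!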
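
The extraction step can be tried in isolation with a throwaway archive. In this sketch unzip_to is a hypothetical helper, and the archive contains a placeholder file rather than a real chromedriver, so nothing is downloaded.

```python
import os
import tempfile
import zipfile

def unzip_to(src_file, dest_dir):
    """Extract every member of src_file into dest_dir and return the member names."""
    with zipfile.ZipFile(src_file) as zf:
        zf.extractall(path=dest_dir)
        return zf.namelist()

# Demonstration using a throwaway archive rather than a real chromedriver download.
with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, 'demo.zip')
    with zipfile.ZipFile(archive, 'w') as zf:
        zf.writestr('chromedriver.exe', b'placeholder bytes')
    names = unzip_to(archive, os.path.join(tmp, 'out'))
    extracted_ok = os.path.exists(os.path.join(tmp, 'out', 'chromedriver.exe'))

print(names, extracted_ok)
```

The with-statement closes the archive even when extraction raises, and extractall creates the destination directory if it does not exist yet.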


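In case the formatting breaks, I have uploaded the source code to this post's resources. Feel free to download it and send suggestions (though not about my coding style).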
