Packaging Scrapy with Docker

Integrating Scrapy with Docker

I. Install Python

  1. Configure the yum source
    e.g. the repo files under /etc/yum.repos.d/: 163.repo, ali.repo, bak, epel.repo, local.repo
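    A minimal sketch of switching to a mirror repo (assuming CentOS 7 and the Aliyun mirror; adjust for your distribution):
    mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.bak
    curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
    yum clean all && yum makecache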
  2. Configure the pip index
    mkdir ~/.pip
    vim ~/.pip/pip.conf
    [global]
    index-url = https://pypi.tuna.tsinghua.edu.cn/simple
    For Windows, see: https://blog.csdn.net/kan2016/article/details/81203465
  3. Install the Python build dependencies
    yum install gcc openssl-devel libffi-devel sqlite* -y
  4. Install Python 3.7 from source
    tar -zxvf Python-3.7.0.tgz
    cd Python-3.7.0
    ./configure --prefix=/usr/local/python3/ --enable-loadable-sqlite-extensions --enable-shared
    make && make install
    ln -s /usr/local/python3/bin/python3 /usr/bin/python3
    ln -s /usr/local/python3/bin/pip3 /usr/bin/pip3
    Problem: error while loading shared libraries: libpython3.7m.so.1.0: cannot open shared object file: No such file or directory
    See: https://blog.csdn.net/Hearthougan/article/details/83091205
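    The usual fix (a sketch, assuming the /usr/local/python3 prefix used above) is to register the new lib directory with the dynamic linker:
    echo "/usr/local/python3/lib" > /etc/ld.so.conf.d/python3.conf
    ldconfig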
  5. Miscellaneous
    Install IPython: pip3 install ipython
    Remove Python 3 (if needed):
    rpm -qa|grep python3|xargs rpm -ev --allmatches --nodeps
    whereis python3 |xargs rm -frv
    whereis python

II. Install Scrapy

  1. Install the Scrapy dependencies
    yum install libxslt-devel pyOpenSSL python-lxml python-devel -y
    pip3 install twisted
  2. Install Scrapy
    pip3 install --upgrade pip
    pip3 install scrapy --default-timeout=100
    ln -s /usr/local/python3/bin/scrapy /usr/bin/scrapy
  3. Test
    scrapy shell www.baidu.com
  4. Common Scrapy commands
    scrapy shell url
    scrapy startproject project_name
    scrapy genspider name allowed_domains
    scrapy crawl name
    scrapy crawl name -o items.csv
    scrapy parse --spider=basic url   (debug a URL that raises exceptions)
    scrapy crawl aitaotu -o info.csv -s CLOSESPIDER_ITEMCOUNT=200
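    A minimal end-to-end sketch (project and spider names are hypothetical):
    scrapy startproject tutorial
    cd tutorial
    scrapy genspider basic example.com
    scrapy crawl basic -o items.csv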
  5. Launch Scrapy from an IDE (run the script from the project root, where scrapy.cfg lives):
    if __name__ == '__main__':
        from scrapy import cmdline
        cmdline.execute("scrapy crawl basic".split())
  6. Miscellaneous
    The rar command:
    wget http://www.rarlab.com/rar/rarlinux-x64-5.3.0.tar.gz
    tar -zxvf rarlinux-x64-5.3.0.tar.gz, cd into the extracted directory, make
    unrar x ScrapySplashTest.rar

III. Install Docker

  1. Reference: https://www.runoob.com/docker/centos-docker-install.html
    Install the prerequisite packages:
    yum install -y yum-utils device-mapper-persistent-data lvm2

    Set up the stable repository:
    yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo

    Install the latest version:
    yum install docker-ce -y
    # yum install docker-ce docker-ce-cli containerd.io -y   (dependencies already installed)
    Or install a specific version:
    yum list docker-ce --showduplicates | sort -r
    yum install docker-ce-<VERSION_STRING> -y

    Start / test:
    systemctl start docker
    systemctl enable docker
    docker version

    Problem: WARNING: bridge-nf-call-iptables is disabled
    See: https://yq.aliyun.com/articles/278801
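    The usual fix (a sketch) is to enable the bridge netfilter sysctls:
    modprobe br_netfilter
    echo "net.bridge.bridge-nf-call-iptables = 1" >> /etc/sysctl.conf
    echo "net.bridge.bridge-nf-call-ip6tables = 1" >> /etc/sysctl.conf
    sysctl -p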

  2. Configure a Docker registry mirror (accelerator)
    Mirror list: https://www.daocloud.io/mirror#accelerator-doc
    curl -sSL https://get.daocloud.io/daotools/set_mirror.sh | sh -s http://f1361db2.m.daocloud.io

    vim /etc/docker/daemon.json
    {"registry-mirrors": ["http://hub-mirror.c.163.com"]}

    systemctl daemon-reload
    systemctl restart docker
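    To check that the mirror is in effect:
    docker info | grep -A1 "Registry Mirrors"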

  3. Common Docker commands
    docker images
    docker pull ubuntu:latest
    docker rmi hello-world

    docker ps -a
    docker stop/start id
    docker rm -f 93c482e50098
    docker rm $(docker ps -aq)

    View a container's output (logs):
    docker logs -f bf08b7f2cd89

    Run a container from an image with a port mapping:
    docker run -p 0.0.0.0:5555:5555 proxypool:latest

    docker run -itd --name ubuntu-test ubuntu /bin/bash
    docker exec -it ubuntu-test /bin/bash
    docker search python

IV. Build the Docker image

  1. Reference: https://cuiqingcai.com/8448.html

  2. Create requirements.txt and a Dockerfile in the project root
    [root@localhost ScrapySplashTest]# cat requirements.txt
    scrapy
    pymongo
    scrapy-splash

    [root@localhost ScrapySplashTest]# cat Dockerfile
    FROM python
    ENV PATH /usr/local/bin:$PATH
    ADD . /code
    WORKDIR /code
    RUN pip3 install -r requirements.txt --default-timeout=100
    CMD scrapy crawl taobao
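
    Since ADD . /code copies the entire project directory, an optional .dockerignore (hypothetical entries below) keeps the build context small:
    [root@localhost ScrapySplashTest]# cat .dockerignore
    .git
    __pycache__/
    *.pyc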

  3. Build the image
    docker build -t name:version .
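
    A sketch of running and publishing the image (image and account names are placeholders; this particular spider also needs Splash and MongoDB reachable from the container, per requirements.txt):
    docker run --rm scrapysplashtest:latest
    docker tag scrapysplashtest:latest yourname/scrapysplashtest:latest
    docker login
    docker push yourname/scrapysplashtest:latest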
