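Selenium is a tool for automated functional testing in the browser. It can also run in a terminal-only environment, so it can serve as the JavaScript engine for a web crawler.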
Dockerfile:
FROM ubuntu:16.04
MAINTAINER tuweifeg "907391489@qq.com"
RUN apt update && \
    apt install -y bzip2 \
        unzip \
        vim \
        wget \
        libxss1 \
        libappindicator1 \
        xvfb \
        libindicator7
# gdebi
RUN mkdir -p /home/ubuntu/project && \
    mkdir -p /home/ubuntu/soft && \
    cd /home/ubuntu/soft && \
    wget -O google-chrome-stable_current_amd64.deb "https://www.slimjet.com/chrome/download-chrome.php?file=files%2F75.0.3770.80%2Fgoogle-chrome-stable_current_amd64.deb" && \
    apt install -y ./google-chrome-stable_current_amd64.deb && \
    rm google-chrome-stable_current_amd64.deb && \
    # gdebi google-chrome-stable_current_amd64.deb && \
    wget https://npm.taobao.org/mirrors/chromedriver/75.0.3770.90/chromedriver_linux64.zip && \
    unzip chromedriver_linux64.zip && \
    rm chromedriver_linux64.zip && \
    wget https://repo.continuum.io/archive/Anaconda3-5.0.1-Linux-x86_64.sh && \
    bash Anaconda3-5.0.1-Linux-x86_64.sh -b && \
    rm Anaconda3-5.0.1-Linux-x86_64.sh && \
    echo 'export PATH=/root/anaconda3/bin:$PATH' >> ~/.bashrc
RUN /root/anaconda3/bin/pip install -i https://pypi.tuna.tsinghua.edu.cn/simple jieba==0.39 \
    pymysql==0.9.3 \
    selenium==3.141.0
ENV LANG C.UTF-8
# EXPOSE only declares container ports; publish them to the host with docker run -p
EXPOSE 8000 8001 8002 8003 8004
WORKDIR /home/ubuntu/
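The Dockerfile above pairs Chrome 75.0.3770.80 with ChromeDriver 75.0.3770.90. As far as I know, ChromeDriver expects a Chrome build with the same major version, so a quick check after building the image can save some debugging. The following is only a minimal sketch: major_version and same_major are hypothetical helper names, and the banner strings are assumed to look like the output of google-chrome --version and chromedriver --version.

```shell
#!/bin/sh
# Hypothetical helper: extract the major version from a "--version" banner.
# Assumes the banner contains a dotted version such as 75.0.3770.80.
major_version() {
    printf '%s\n' "$1" | sed -n 's/^[^0-9]*\([0-9][0-9]*\)\..*$/\1/p'
}

# Hypothetical helper: succeed only when both banners yield the same,
# non-empty major version.
same_major() {
    a=$(major_version "$1")
    b=$(major_version "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Inside the container the banners would come from the real binaries, e.g.:
#   same_major "$(google-chrome --version)" "$(/home/ubuntu/soft/chromedriver --version)"
# The literal strings below are sample inputs taken from the versions in the Dockerfile.
if same_major "Google Chrome 75.0.3770.80" "ChromeDriver 75.0.3770.90"; then
    echo "major versions match"
else
    echo "major versions differ" >&2
fi
```

The check compares only the major version, which is a deliberate simplification. A stricter comparison of more version components may be appropriate depending on the ChromeDriver release.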
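There are three ways to install a *.deb package:
1. dpkg -i [package]. This often fails because dependencies are missing. Let the first attempt fail, run apt install -f to pull in the missing dependencies, then install again:
dpkg -i [package]
apt install -f
dpkg -i [package]
2. gdebi [package]. It installs dependencies automatically, but gdebi itself has to be installed first:
apt install gdebi
gdebi [package]
3. apt install [package]. This is only supported on Ubuntu 16.04 and later. Note that the file name must be prefixed with a path, ./ included, even when the file is in the current directory:
apt install ./[package]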
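The three methods above can be sketched as a small dry-run script. It only prints the commands instead of running them, so it is safe to try anywhere. The function names deb_path and deb_install_plan are hypothetical, and the -y and -n flags are my additions for non-interactive use (for example inside a Dockerfile), not part of the commands listed above.

```shell
#!/bin/sh
# Hypothetical helper: apt treats a bare name as a repository package,
# so give a local file an explicit path unless it already contains one.
deb_path() {
    case "$1" in
        */*) printf '%s\n' "$1" ;;
        *)   printf './%s\n' "$1" ;;
    esac
}

# Hypothetical helper: print (not run) the commands for one of the three methods.
deb_install_plan() {
    method="$1"
    file=$(deb_path "$2")
    case "$method" in
        # Method 1: first dpkg attempt may fail, apt fixes dependencies, then retry
        dpkg)  printf 'dpkg -i %s\napt install -f -y\ndpkg -i %s\n' "$file" "$file" ;;
        # Method 2: gdebi resolves dependencies but must be installed first
        gdebi) printf 'apt install -y gdebi\ngdebi -n %s\n' "$file" ;;
        # Method 3: apt with an explicit path (Ubuntu 16.04 and later)
        apt)   printf 'apt install -y %s\n' "$file" ;;
        *)     echo "unknown method: $method" >&2; return 1 ;;
    esac
}

deb_install_plan apt google-chrome-stable_current_amd64.deb
```

Printing the plan rather than executing it keeps the sketch independent of the host system. To actually run a plan, its output could be piped to sh as root.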