2018 Scrapy Environment Enhance (1) DEV on Ubuntu
On Ubuntu 16.04 - worked
First, set up the Python and Scrapy environment on Ubuntu 16.04
> sudo apt-get install -qy python python-dev python-distribute python-pip ipython
> sudo apt-get install -qy firefox xvfb
> sudo pip install selenium pyvirtualdisplay
> sudo pip install boto3
> sudo pip install beautifulsoup4 requests
> sudo apt-get install -qy libffi-dev libxml2-dev libxslt-dev lib32z1-dev libssl-dev
> sudo pip install lxml scrapy scrapyjs
> sudo pip install --upgrade pip
Try to run my Scrapy project directly on that machine, starting with a virtual environment
> python3 -m venv ./pythonenv
The virtual environment was not created successfully because ensurepip is not
available. On Debian/Ubuntu systems, you need to install the python3-venv
package using the following command.
apt-get install python3-venv
You may need to use sudo with that command. After installing the python3-venv
package, recreate your virtual environment.
Failing command: ['/home/carl/work/price-monitor/content-fetch2/pythonenv/bin/python3', '-Im', 'ensurepip', '--upgrade', '--default-pip']
Run the suggested command, then recreate and activate the virtual environment
> sudo apt-get install python3-venv
> python3 -m venv ./pythonenv
> source ./pythonenv/bin/activate
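To confirm that the activation took effect, a small check like the one below can be run with the environment's interpreter. This is only a minimal sketch; the function name in_virtualenv is my own and not part of any library.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-created environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the system-wide Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

If the first line prints False after sourcing the activate script, the shell is most likely still picking up the system python3.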
Exception on Ubuntu
WebDriverException: Message: 'chromedriver' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/chromedriver/home
Solution:
> wget https://chromedriver.storage.googleapis.com/2.37/chromedriver_linux64.zip
Unzip the file and make the driver executable
> unzip chromedriver_linux64.zip
> chmod a+x chromedriver
> sudo mv chromedriver /usr/local/bin/
Exception
> chromedriver
chromedriver: error while loading shared libraries: libnss3.so: cannot open shared object file: No such file or directory
Solution:
> sudo apt-get install libxi6 libnss3 libgconf-2-4
Verify the chromedriver
> chromedriver
Starting ChromeDriver 2.37.544315 (730aa6a5fdba159ac9f4c1e8cbc59bf1b5ce12b7) on port 9515
Only local connections are allowed.
> chromedriver -v
ChromeDriver 2.37.544315 (730aa6a5fdba159ac9f4c1e8cbc59bf1b5ce12b7)
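The same verification can be done from Python before the spider starts, which avoids hitting the WebDriverException above at crawl time. A minimal sketch, assuming the binary is named chromedriver; the helper name driver_version is hypothetical.

```python
import shutil
import subprocess


def driver_version(name="chromedriver"):
    """Return the driver's version output, or None if it is not on PATH."""
    path = shutil.which(name)
    if path is None:
        return None
    # stderr is merged into stdout so that a shared-library error
    # (such as the missing libnss3.so) is returned instead of being lost.
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    version = driver_version()
    if version is None:
        print("chromedriver is not on PATH")
    else:
        print(version)
```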
Install the Browser
> sudo apt install chromium-browser
Check Browser Version
> chromium-browser -version
Chromium 65.0.3325.181 Built on Ubuntu , running on Ubuntu 16.04
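Each ChromeDriver release only supports a range of browser versions, so it can help to compare the two version lines. The sketch below parses the major version from the output shown above. The 64 to 66 range is my assumption for ChromeDriver 2.37 and should be checked against the ChromeDriver release notes; the function names are hypothetical.

```python
import re


def major_version(version_line):
    """Extract the major version number from a version output line."""
    match = re.search(r"(\d+)\.\d+", version_line)
    if match is None:
        raise ValueError("no version number found in: %r" % version_line)
    return int(match.group(1))


def is_supported(browser_line, min_major=64, max_major=66):
    """Check the browser's major version against an assumed supported range."""
    # 64-66 is an assumed range for ChromeDriver 2.37, not taken from these notes.
    return min_major <= major_version(browser_line) <= max_major


if __name__ == "__main__":
    line = "Chromium 65.0.3325.181 Built on Ubuntu , running on Ubuntu 16.04"
    print(major_version(line), is_supported(line))
```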
CentOS on AWS - failed
> wget http://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
> sudo rpm -Uvh epel-release-latest-7.noarch.rpm
> yum repolist | grep epel
epel/x86_64 Extra Packages for Enterprise Linux 7 - x86_64 11,653+809
Check the available Chromium packages
http://install.linux.ncsu.edu/pub/yum/itecs/public/chromium/rhel7/noarch/
> wget http://install.linux.ncsu.edu/pub/yum/itecs/public/chromium/rhel7/noarch/chromium-release-2.2-1.noarch.rpm
Install the browser
> sudo yum localinstall chromium-release-2.2-1.noarch.rpm
> chromedriver -v
ChromeDriver 2.37.544315 (730aa6a5fdba159ac9f4c1e8cbc59bf1b5ce12b7)
CentOS 7 on VirtualBox - failed
> sudo wget -O /etc/yum.repos.d/chromium-el6.repo http://people.centos.org/hughesjr/chromium/6/chromium-el6.repo
Install browser
> sudo yum install chromium
Exception
Requires: libudev.so.0()(64bit)
Solution:
> sudo yum install libudev-devel
> cd /usr/lib64/
> sudo ln -sf libudev.so.1 libudev.so.0
https://pkgs.org/download/libudev.so.0()(64bit)
> wget http://mirror.yandex.ru/fedora/russianfedora/russianfedora/nonfree/el/releases/7/Everything/x86_64/os//opera-developer-24.0.1558.21-3.el7.R.x86_64.rpm
Install that package
> sudo yum localinstall opera-developer-24.0.1558.21-3.el7.R.x86_64.rpm
Install Chromium again
> sudo yum install chromium
> chromium-browser -version
Chromium 31.0.1650.63 Built from source for CentOS release 6.5 (Final)
Install ChromeDriver
> wget https://chromedriver.storage.googleapis.com/2.37/chromedriver_linux64.zip
Unzip the file and make the driver executable
> unzip chromedriver_linux64.zip
> chmod a+x chromedriver
> sudo mv chromedriver /usr/local/bin/
> chromedriver --version
ChromeDriver 2.37.544315 (730aa6a5fdba159ac9f4c1e8cbc59bf1b5ce12b7)
Install Python3 on CentOS
> sudo yum update
> sudo yum install yum-utils
> sudo yum groupinstall development
Install Python3.6
> sudo yum install https://centos7.iuscommunity.org/ius-release.rpm
> sudo yum install python36u
Check Version
> python3.6 -V
Python 3.6.4
> sudo yum install python36u-pip
> sudo yum install python36u-devel
References:
http://neuralfoundry.com/scrapy-in-a-container-docker-development-environment/
https://github.com/dataisbeautiful/scrapy-development-docker
https://github.com/scrapy-plugins/scrapy-splash
https://www.jianshu.com/p/4052926bc12c
https://www.cnblogs.com/jclian91/p/8590617.html
https://stackoverflow.com/questions/8255929/running-selenium-webdriver-python-bindings-in-chrome/24364290#24364290
https://www.tecmint.com/how-to-enable-epel-repository-for-rhel-centos-6-5/
https://linuxconfig.org/how-to-install-chromium-web-browser-on-rhel7-linux
https://www.digitalocean.com/community/tutorials/how-to-install-python-3-and-set-up-a-local-programming-environment-on-centos-7
https://janikarhunen.fi/how-to-install-python-3-6-1-on-centos-7.html