Using Selenium on Ubuntu

Chrome: download and install (driver + browser)

Official driver download guide
Selenium Java JAR: https://search.maven.org/artifact/org.seleniumhq.selenium/selenium-java
ChromeDriver downloads: https://registry.npmmirror.com/binary.html?path=chromedriver/ or https://sites.google.com/chromium.org/driver/home
Suitable ChromeDriver builds:
chromedriver_linux64_102.0.5005.61.zip
chromedriver_linux64_100.0.4896.60.zip
The driver version must match the installed Chrome version (at least the major version, such as 102).
Download Chrome: https://www.chromedownloads.net/chrome64linux/
Download and install Chrome:

root@ubuntu:~/apps/browser#  wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
root@ubuntu:~/apps/browser# dpkg -i google-chrome-stable_current_amd64.deb
Selecting previously unselected package google-chrome-stable.
(Reading database ... 243318 files and directories currently installed.)
Preparing to unpack google-chrome-stable_current_amd64.deb ...
Unpacking google-chrome-stable (102.0.5005.61-1) ...
Setting up google-chrome-stable (102.0.5005.61-1) ...
update-alternatives: using /usr/bin/google-chrome-stable to provide /usr/bin/x-www-browser (x-www-browser) in auto mode
update-alternatives: using /usr/bin/google-chrome-stable to provide /usr/bin/gnome-www-browser (gnome-www-browser) in auto mode
update-alternatives: using /usr/bin/google-chrome-stable to provide /usr/bin/google-chrome (google-chrome) in auto mode
Processing triggers for mailcap (3.70+nmu1ubuntu1) ...
Processing triggers for gnome-menus (3.36.0-1ubuntu3) ...
Processing triggers for desktop-file-utils (0.26-1ubuntu3) ...
Processing triggers for man-db (2.10.2-1) ...
root@ubuntu:~/apps/browser# google-chrome --version
Google Chrome 102.0.5005.61
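
Because the driver has to match the installed browser, it can help to compare the two version strings before launching Selenium. The following is a minimal sketch and not part of the original setup: the helper names `major_version`, `versions_match`, and `installed_versions_match` are hypothetical, and it assumes that `google-chrome` and `chromedriver` are both on the PATH and that comparing major versions is a sufficient check.

```python
import re
import subprocess


def major_version(version_output: str) -> int:
    """Extract the major version from text such as 'Google Chrome 102.0.5005.61'."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return int(match.group(1))


def versions_match(chrome_output: str, driver_output: str) -> bool:
    """Return True when browser and driver report the same major version."""
    return major_version(chrome_output) == major_version(driver_output)


def installed_versions_match() -> bool:
    """Run both binaries with --version (assumed to be on the PATH) and compare."""
    chrome = subprocess.run(["google-chrome", "--version"],
                            capture_output=True, text=True, check=True).stdout
    driver = subprocess.run(["chromedriver", "--version"],
                            capture_output=True, text=True, check=True).stdout
    return versions_match(chrome, driver)
```

With the versions listed in this article, a 102.0.5005.61 driver passes the check against Chrome 102.0.5005.61, while the 100.0.4896.60 driver does not.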

Firefox: download and install (driver + browser)

Firefox driver (geckodriver)
Firefox download
Installation steps:

root@ubuntu:~/apps/browser# wget https://download-installer.cdn.mozilla.net/pub/firefox/releases/101.0/linux-x86_64/en-US/firefox-101.0.tar.bz2
root@ubuntu:~/apps/browser# tar -jxvf firefox-101.0.tar.bz2
root@ubuntu:~/apps/browser# ls
chromedriver  chromedriver_linux64.zip  firefox  firefox-101.0.tar.bz2  google-chrome-stable_current_amd64.deb
root@ubuntu:~/apps/browser# mv firefox /opt/
root@ubuntu:~/apps/browser# ln -s /opt/firefox/firefox /usr/bin/firefox
root@ubuntu:~/apps/browser# firefox --version
Mozilla Firefox 101.0

When running Firefox, the following error can occur:

firefox Error: no DISPLAY environment variable specified

The fix:

sudo apt-get install xvfb
/usr/bin/Xvfb :99 -ac -screen 0 1024x768x8 & export DISPLAY=":99"

P.S. I tried the commands above on the off chance that they would work, and they solved the problem.
Source of the answer
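
The same fix can be applied from inside a Python script, so that the virtual display is started before the browser is launched. This is a minimal sketch under assumptions: the helper names `xvfb_command`, `display_env`, and `start_xvfb` are hypothetical, Xvfb is assumed to be installed at `/usr/bin/Xvfb`, and display `:99` is assumed to be free.

```python
import os
import subprocess
from typing import Dict, List, Mapping, Optional


def xvfb_command(display: int = 99, screen: str = "1024x768x8") -> List[str]:
    """Build the same Xvfb command line used in the shell fix above."""
    return ["/usr/bin/Xvfb", f":{display}", "-ac", "-screen", "0", screen]


def display_env(display: int = 99,
                base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with DISPLAY pointing at the virtual screen."""
    env = dict(os.environ if base is None else base)
    env["DISPLAY"] = f":{display}"
    return env


def start_xvfb(display: int = 99) -> subprocess.Popen:
    """Start Xvfb in the background and export DISPLAY for this process."""
    proc = subprocess.Popen(xvfb_command(display))
    os.environ["DISPLAY"] = f":{display}"
    return proc
```

Calling `start_xvfb()` before creating the Firefox driver mirrors the two shell commands; the returned process should be terminated with `proc.terminate()` when the script is done.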
Versions used:

root@ubuntu:~# uname -a
Linux ubuntu 5.13.0-40-generic #45~20.04.1-Ubuntu SMP Mon Apr 4 09:38:31 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
root@ubuntu:~# ll
-rwxr-xr-x  1 root    root    13867568 Mar 26 09:56 chromedriver*
-rw-r--r--  1 root    root     6986125 Apr  6 09:54 chromedriver_linux64.zip
-rw-r--r--  1 root    root    76889183 Apr 12 09:11 firefox-99.0.1.tar.bz2
-rwxr-xr-x  1 cyxinda cyxinda  8673104 Apr  6 23:54 geckodriver*
-rw-r--r--  1 root    root     2715370 Apr  7 19:59 geckodriver-v0.31.0-linux64.tar.gz
-rw-r--r--  1 root    root    87064480 Apr 14 05:10 google-chrome-stable_current_amd64.deb
https://chromedriver.storage.googleapis.com/102.0.5005.61/chromedriver_linux64.zip
root@ubuntu:~/selenium# wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
root@ubuntu:~/selenium# dpkg -i google-chrome*.deb
root@ubuntu:~/selenium# cd /opt/google/chrome
root@ubuntu:/opt/google/chrome# ls
chrome                   google-chrome                      nacl_helper            product_logo_48.png
chrome_100_percent.pak   icudtl.dat                         nacl_helper_bootstrap  product_logo_64.png
chrome_200_percent.pak   libEGL.so                          nacl_irt_x86_64.nexe   resources.pak
chrome_crashpad_handler  libGLESv2.so                       product_logo_128.png   swiftshader
chrome_debug.log         liboptimization_guide_internal.so  product_logo_16.png    v8_context_snapshot.bin
chrome-sandbox           libvk_swiftshader.so               product_logo_24.png    vk_swiftshader_icd.json
cron                     libvulkan.so.1                     product_logo_256.png   WidevineCdm
default-app-block        locales                            product_logo_32.png    xdg-mime
default_apps             MEIPreload                         product_logo_32.xpm    xdg-settings
root@ubuntu:/opt/google/chrome# ./google-chrome --version
Google Chrome 100.0.4896.127
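
The ChromeDriver link in the listing above embeds the driver version in its path. The sketch below builds such a link for a given version; it assumes that other driver versions are published under the same layout as that one URL, and the names `chromedriver_url` and `download_chromedriver` are hypothetical helpers, not part of any library.

```python
import urllib.request

# Base taken from the download link shown in the listing above.
CHROMEDRIVER_BASE = "https://chromedriver.storage.googleapis.com"


def chromedriver_url(version: str) -> str:
    """Return the linux64 ChromeDriver zip URL for a full driver version string."""
    return f"{CHROMEDRIVER_BASE}/{version}/chromedriver_linux64.zip"


def download_chromedriver(version: str, dest: str = "chromedriver_linux64.zip") -> str:
    """Download the zip to dest and return the local path (needs network access)."""
    urllib.request.urlretrieve(chromedriver_url(version), dest)
    return dest
```

For example, `chromedriver_url("102.0.5005.61")` reproduces the link shown in the listing.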
# --author-- 张俊杰@Nick
import datetime
import time

import pandas as pd
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options  # Chrome is the recommended browser
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

chrome_options = Options()
# Remove the next line if you want to watch the browser open and click on its own.
chrome_options.add_argument('--headless')
# Assumes Selenium 4: the driver path is passed through Service and the options through `options=`.
# After downloading chromedriver, copy it into the same folder as chrome.exe.
browser = webdriver.Chrome(
    service=Service('C:/Users/Administrator/AppData/Local/Google/Chrome/Application/chromedriver.exe'),
    options=chrome_options)
print("Opening the page...")
browser.get("http://106.37.208.243:8068/GJZ/Business/Publish/Main.html")
print("Waiting for the page to respond...")
# The script runs far faster than the browser loads pages, so wait explicitly (up to 20 s).
wait = WebDriverWait(browser, 20)
wait.until(EC.frame_to_be_available_and_switch_to_it((By.ID, "mainframe")))  # this step is essential
browser.find_element(By.ID, 'ddm_River').click()  # click "流域" (river basin)
browser.find_element(By.XPATH, "/html/body/div[1]/div[2]/div/ul/li[1]").click()  # click "所有流域" (all basins)
wait.until(EC.presence_of_element_located((By.CLASS_NAME, "grid")))  # wait for the data grid
print("Fetching page data...")
time.sleep(10)
soup = BeautifulSoup(browser.page_source, "lxml")
browser.quit()  # quit() also shuts down the chromedriver process
data_head = soup.select(".panel-heading")[0]
grid_data = soup.select(".grid")[0]
data_colhead = data_head.find_all("td")
data_rows = grid_data.find_all("tr")
water_df = pd.DataFrame(columns=[c.text for c in data_colhead])
print("Extracting page data...")
for i, data_row in enumerate(data_rows):
    # Read only the cell tags; iterating the <tr> directly can also yield whitespace text nodes.
    row_dat = [cell.text for cell in data_row.find_all(["td", "th"])]
    water_df.loc[i] = row_dat
# Current system date
data_str = datetime.datetime.now().strftime('%Y_%m_%d')
# Change the output path as needed
water_df.to_csv("E:/python/国家地表水爬虫/%s_国家地表水水质自动监测系统检测数据.csv" % (data_str), index=False, encoding="GB18030")
print("Data extraction finished!")
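
The script above writes to a Windows drive path, which does not exist on Ubuntu. As a hedged sketch, the date-stamped file name can be built with `pathlib` and pointed at any Linux directory; `output_csv_path` is a hypothetical helper and the directory used in the usage note is only an example.

```python
import datetime
from pathlib import Path


def output_csv_path(directory: str, when: datetime.date) -> Path:
    """Build '<directory>/<YYYY_MM_DD>_<name>.csv' with the same date stamp as the script."""
    stamp = when.strftime("%Y_%m_%d")
    return Path(directory) / f"{stamp}_国家地表水水质自动监测系统检测数据.csv"
```

In the script this could replace the hard-coded path, for instance `water_df.to_csv(output_csv_path("/root/data", datetime.date.today()), index=False, encoding="GB18030")`, provided the directory already exists.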

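Appendix: installing the JDK

Install OpenJDK 8
sudo apt-get install openjdk-8-jdk

Configure the Java environment variables
sudo vim /etc/profile
Append the following to the end of profile, then run source /etc/profile (or log in again) so the changes take effect: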

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
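
Once the variables are loaded, a quick check can confirm that JAVA_HOME points at a real JDK. This is a minimal sketch; `find_java` is a hypothetical helper, and it only assumes that a valid JDK has a `bin/java` file under JAVA_HOME.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_java(env: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Return $JAVA_HOME/bin/java if JAVA_HOME is set and the file exists, else None."""
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return None
    candidate = Path(java_home) / "bin" / "java"
    return candidate if candidate.is_file() else None
```

Calling `find_java()` in a new shell session returns the path to the `java` binary when the profile settings were applied, and `None` otherwise.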