#coding:gbk
from selenium import webdriver
from selenium.webdriver.common.by import By
import time

# Hide ChromeDriver's "enable-logging" console noise.
options = webdriver.ChromeOptions()
options.add_experimental_option('excludeSwitches', ['enable-logging'])
# chromedriver is looked up on PATH (executable_path= was removed in Selenium 4.10+).
driver = webdriver.Chrome(options=options)
driver.get("https://liquicity.com/artists/")
time.sleep(1)  # give the page a moment to render

num = -1
while True:
    try:
        num += 1
        # Each artist name is the text of an 'elementor-button-text' element.
        artists = driver.find_elements(By.CLASS_NAME, 'elementor-button-text')[num].text
        print(artists)
        # "a" appends, so earlier names are kept on every pass.
        with open("liquicity_artists.txt", "a") as f:
            if num == 22:
                # This entry contains a non-ANSI character; write a fixed spelling.
                f.write("Ella Noel\n")
            else:
                f.write(artists + "\n")
    except IndexError:
        # No more elements: every artist has been written.
        break

driver.quit()
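First, find out which line contains the Unicode character; it turns out to be line 22. Then add an if check so that when the loop reaches line 22, a manually corrected spelling is appended to the text file instead.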
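The first step, finding the offending line, can also be done in code rather than by eye. The following is only a minimal sketch: the function name find_unencodable and the sample list are hypothetical, and it assumes the problem is a character that the GBK (ANSI) encoding cannot represent.

```python
def find_unencodable(lines, encoding="gbk"):
    """Return (index, text) pairs for entries that cannot be encoded."""
    bad = []
    for index, text in enumerate(lines):
        try:
            text.encode(encoding)
        except UnicodeEncodeError:
            bad.append((index, text))
    return bad


# Hypothetical sample data; "\u00eb" is the letter e with a diaeresis.
sample = ["Artist One", "Ella No\u00ebl", "Artist Three"]
bad = find_unencodable(sample)  # only the entry at index 1 is reported
```

Running the same check over the scraped names would show which index needs special handling in the if branch.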
with open("liquicity_artists.txt", "a") as f:
    if num == 22:
        f.write("Ella Noel\n")
    else:
        f.write(artists + "\n")
The open call here uses the "a" mode, so each write is appended to the file instead of replacing what was already there.
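The difference between the "a" and "w" modes is easy to check in isolation. This small sketch uses a temporary file (the name demo.txt is made up for illustration) so it does not touch liquicity_artists.txt.

```python
import os
import tempfile

# Hypothetical scratch file inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(path, "w") as f:   # "w" creates the file (or empties an existing one)
    f.write("first\n")
with open(path, "a") as f:   # "a" keeps existing content and adds to the end
    f.write("second\n")
with open(path) as f:
    after_append = f.read()  # both lines are present

with open(path, "w") as f:   # opening with "w" again discards both lines
    f.write("third\n")
with open(path) as f:
    after_write = f.read()   # only the last write remains
```

Because the script reopens the file on every loop iteration, "w" would leave only the last artist in the file, which is why "a" is needed there.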
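The result is exported to the txt file correctly, and the file encoding is still ANSI.
That said, this approach only suits cases like mine, where very little data is scraped. An even simpler option is to copy the console output straight into a text file and fix it there by hand.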
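For larger scrapes, where hard-coding one index no longer scales, one possible alternative (not the method used above) is to strip accents automatically before writing. This is a rough sketch using the standard unicodedata module; the helper names strip_accents and to_gbk_safe are hypothetical, and it assumes that dropping diacritics is acceptable for the names involved.

```python
import unicodedata


def strip_accents(text):
    """Remove combining marks, e.g. e-with-diaeresis becomes plain e."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose so scripts that NFD splits apart (e.g. Hangul) are restored.
    return unicodedata.normalize("NFC", stripped)


def to_gbk_safe(text):
    """Return text that can always be written to a GBK (ANSI) file."""
    cleaned = strip_accents(text)
    # Anything still unencodable is replaced with "?" instead of raising.
    return cleaned.encode("gbk", errors="replace").decode("gbk")


fixed = to_gbk_safe("Ella No\u00ebl")  # "\u00eb" is e with a diaeresis
```

With such a helper, the write inside the loop could become f.write(to_gbk_safe(artists) + "\n") and the num == 22 branch would no longer be needed. Opening the file with encoding="utf-8" would avoid the problem entirely, but the output would then no longer be ANSI.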