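Python scraper error: AttributeError: 'NoneType' object has no attribute 'children'
Recently, many readers have asked me about an error they hit while scraping: AttributeError: 'NoneType' object has no attribute 'children'. Searching online did not turn up a good fix, so this post offers one.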
Example page to scrape
Site: 产业信息网 (chyxx.com)
Code
"""
Created on Fri Jan 20 11:30:31 2023
@author: 北辰远_code
I love python. Be happy every day!
"""
import requests
from bs4 import BeautifulSoup
import bs4
import csv
def getHTMLText(url):  # fetch the page source
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        return '爬取失败'  # sentinel string meaning "fetch failed"
def fillUnivlist(ulist, html):  # parse the page and collect the table rows
    soup = BeautifulSoup(html, "html.parser")
    for tr in soup.find('tbody').children:
        if isinstance(tr, bs4.element.Tag):  # skip whitespace text nodes
            tds = tr('td')
            ulist.append([tds[0].text, tds[1].text, tds[2].text, tds[3].text, tds[4].text, tds[5].text, tds[6].text, tds[7].text])
def writeUlistfile(ulist, dataname):  # write the rows to a CSV file
    with open(dataname, 'w', encoding='utf-8', newline='') as fout:
        writer = csv.writer(fout)
        for row in ulist:
            writer.writerow(row)
url1 = 'https://www.chyxx.com/industry/202105/953391.html'
html1 = getHTMLText(url1)
uinfo1 =[]
fillUnivlist(uinfo1,html1)
writeUlistfile(uinfo1,'各种油产量初.csv')
Error
Traceback (most recent call last):
  File "D:\Users\Qi520503\Desktop\shiyou\未命名3.py", line 45, in <module>
    fillUnivlist(uinfo1,html1)
  File "D:\Users\Qi520503\Desktop\shiyou\未命名3.py", line 29, in fillUnivlist
    for tr in soup.find('tbody').children:
AttributeError: 'NoneType' object has no attribute 'children'
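The traceback means that soup.find('tbody') returned None, that is, the text handed to BeautifulSoup contained no <tbody> tag at all. A minimal sketch of a guard that makes this visible is shown here; extract_rows is a hypothetical helper name, not part of the original script, and it assumes the same "html.parser" backend used above.

```python
import bs4
from bs4 import BeautifulSoup


def extract_rows(html):
    """Return the cell texts of each <tr> under the first <tbody>.

    Hypothetical helper for illustration; not part of the original script.
    """
    soup = BeautifulSoup(html, "html.parser")
    tbody = soup.find("tbody")
    if tbody is None:
        # find() returns None when no matching tag exists, so fail with a
        # message that shows what was actually parsed.
        raise ValueError(f"no <tbody> found; input starts with: {html[:80]!r}")
    rows = []
    for tr in tbody.children:
        if isinstance(tr, bs4.element.Tag):  # skip whitespace text nodes
            rows.append([td.get_text(strip=True) for td in tr("td")])
    return rows
```

Calling extract_rows on the string returned by getHTMLText would raise a ValueError that quotes the start of that string, which makes it easy to tell a failed download apart from a page whose layout changed.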
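Solution:
The request most likely fails SSL certificate verification. The except clause in getHTMLText hides that failure and returns the string '爬取失败', which contains no <tbody>, so soup.find('tbody') returns None. Pass verify=False to requests.get to skip certificate verification, and import urllib3 and call urllib3.disable_warnings() to silence the SSL warning that this triggers.
The code is as follows: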
import requests
from bs4 import BeautifulSoup
import bs4
import csv
import urllib3
urllib3.disable_warnings()
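def getHTMLText(url):  # fetch the page source
    try:
        # verify=False skips SSL certificate verification
        r = requests.get(url, timeout=30, verify=False)
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        return '爬取失败'  # sentinel string meaning "fetch failed"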
After this change the script runs without errors, and the scraped data is saved to the CSV file as expected.
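Because the except clause in getHTMLText returns a fixed string and discards the exception, the script never shows why a request failed. A minimal sketch of a variant that reports the underlying error follows; fetch_html is a hypothetical name introduced here for illustration, and returning None on failure is an assumption about how the caller wants to handle it.

```python
import requests


def fetch_html(url, timeout=30):
    """Return the page text, or None after printing why the request failed.

    Hypothetical variant of getHTMLText, for illustration only.
    """
    try:
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()
    except requests.exceptions.SSLError as exc:
        # Certificate problems are reported separately from other failures.
        print(f"SSL verification failed for {url}: {exc}")
        return None
    except requests.exceptions.RequestException as exc:
        # Covers timeouts, connection errors, bad URLs, and HTTP error codes.
        print(f"Request failed for {url}: {exc}")
        return None
    r.encoding = r.apparent_encoding
    return r.text
```

The caller can then check for None before parsing instead of passing an error string to BeautifulSoup. If the printed message is an SSL error, that would confirm why verify=False makes the script work; since verify=False turns off certificate checking entirely, it is best kept to sites you trust.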