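This afternoon I spent some time learning BeautifulSoup. I happened to be out of things to read, so I took the Biquge (笔趣阁) novel site as my test subject and wrote a scraper that downloads novels. I'm posting it here for reference, and corrections from more experienced readers are welcome.
Here is the code first: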
import urllib.request as ur
from bs4 import BeautifulSoup
import ssl
import re
def get_soup(address):
    '''Fetch a web page and create a BeautifulSoup object.'''
    # Skip TLS certificate verification (uses a private helper of the ssl module)
    context = ssl._create_unverified_context()
    headers = {
        'User-Agent': 'Chrome/68.0.3440.84'}
    request = ur.Request(address, headers=headers)
    response = ur.urlopen(request, timeout=20, context=context)
    content = response.read()