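Scraping paper information with find, find_all, and select
find returns only the first element that matches the condition, whereas find_all and select return every element that matches.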
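The difference is easiest to see on a small piece of HTML. The sketch below is only an illustration: the markup, the `author` class, and the author names are made-up placeholders, not the structure of the real paper page, and it uses the built-in `html.parser` so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a paper detail page (not the real page markup)
html = """
<div class="info">
  <div class="author"><a>Author A</a></div>
  <div class="author"><a>Author B</a></div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find: a single Tag for the first match, or None if nothing matches
first = soup.find("div", class_="author")

# find_all: a list-like result containing every match
all_found = soup.find_all("div", class_="author")

# select: every match for a CSS selector, also returned as a list
selected = soup.select("div.author a")

print(first.get_text(strip=True))                    # Author A
print([d.get_text(strip=True) for d in all_found])   # ['Author A', 'Author B']
print([a.get_text(strip=True) for a in selected])    # ['Author A', 'Author B']
```

Because find may return None, it is worth checking the result before calling methods on it. If only the first match of a CSS selector is needed, select_one plays the same role for select that find plays for find_all.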
Paper link
The find and find_all methods:
import requests
from bs4 import BeautifulSoup
from requests import RequestException


def get_html(url):
    """Fetch a page and return its decoded text, or None on failure."""
    try:
        # Send a browser-like User-Agent header with the request
        headers = {
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.162 Safari/537.36'
        }
        response = requests.get(url, headers=headers)
        if response.status_code == 200:
            # Decode with the encoding detected from the content itself
            response.encoding = response.apparent_encoding
            return response.text
        return None
    except RequestException as e:
        print(e)
        return None


if __name__ == "__main__":
    url = 'http://www.wanfangdata.com.cn/details/detail.do?_type=perio&id=zgszyx201807023'
h