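How to find the HTML markup for a Sina headline:
(1) Right-click a headline on the Sina page and choose Inspect.
(2) Click the pointer (element picker) icon in the upper right, then click the headline in the page to get the headline's class.
The code is as follows: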
import requests
from bs4 import BeautifulSoup

res = requests.get("http://news.sina.com.cn/china/")
res.encoding = "utf-8"  # specify the encoding
soup = BeautifulSoup(res.text, "html.parser")
for news in soup.select(".news-item"):
    # skip items that have no headline
    if len(news.select('h2')) > 0:
        head = news.select('h2')[0].text
        url = news.select('a')[0]['href']
        time = news.select('.time')[0].text
        print(time, head, url)
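The same selectors can be tried offline before hitting the live page. Below is a minimal sketch, assuming a made-up HTML snippet (SAMPLE_HTML) and a hypothetical helper named extract_news; neither comes from the Sina page itself, and the real markup may differ. It also skips items that are missing a link or a time instead of raising IndexError.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, invented for illustration only.
SAMPLE_HTML = """
<div class="news-item">
  <h2><a href="http://example.com/a.html">First headline</a></h2>
  <div class="time">10:00</div>
</div>
<div class="news-item"></div>
<div class="news-item">
  <h2><a href="http://example.com/b.html">Second headline</a></h2>
</div>
"""

def extract_news(html):
    """Return a list of (time, headline, url) tuples from .news-item blocks."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for item in soup.select(".news-item"):
        h2 = item.select_one("h2")
        link = item.select_one("a")
        if h2 is None or link is None:
            continue  # skip items without a headline or a link
        stamp = item.select_one(".time")
        rows.append((
            stamp.get_text(strip=True) if stamp else "",  # time may be absent
            h2.get_text(strip=True),
            link.get("href", ""),
        ))
    return rows

for row in extract_news(SAMPLE_HTML):
    print(*row)
```

To run it against the live page, pass res.text from the request above to extract_news instead of SAMPLE_HTML.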