Code that triggers the warning:
import requests
from bs4 import BeautifulSoup
import re

# Browser-like User-Agent sent with the request
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                         'AppleWebKit/537.36 (KHTML, like Gecko) '
                         'Chrome/99.0.4844.51 Safari/537.36'}

def getHTMLdocument(url):
    # Download the page and return its HTML as text
    res_news = requests.get(url, headers=headers)
    return res_news.text

url = "https://listado.mercadolibre.com.mx/security-camera"
print(url)
# Bug: the URL string itself is parsed as markup; getHTMLdocument() is never called
bs = BeautifulSoup(url, 'html.parser')
print(bs)
Result: a warning, not a crash. Beautiful Soup reports that the markup "looks like a URL. Beautiful Soup is not an HTTP client. You should probably use an HTTP client like requests to get the document behind the URL, and feed that document to Beautiful Soup".
In other words, the URL string was parsed as if it were the document, so print(bs) simply echoes the URL. Full console output:
D:\anaconda3\python.exe C:\Users\My\PycharmProjects\pythonProject1\test.12.3.py
D:\anaconda3\lib\site-packages\bs4\__init__.py:417: MarkupResemblesLocatorWarning: "https://listado.mercadolibre.com.mx/security-camera" looks like a URL. Beautiful Soup is not an HTTP client. You should probably use an HTTP client like requests to get the document behind the URL, and feed that document to Beautiful Soup.
  warnings.warn(
https://listado.mercadolibre.com.mx/security-camera
https://listado.mercadolibre.com.mx/security-camera
Process finished with exit code 0
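The warning depends only on the string handed to BeautifulSoup, so it can be reproduced without any network access. The following is a minimal sketch, assuming a bs4 version that exports MarkupResemblesLocatorWarning (the traceback above suggests it does); the helper name locator_warnings is hypothetical and used only for illustration.

```python
import warnings

from bs4 import BeautifulSoup, MarkupResemblesLocatorWarning


def locator_warnings(markup):
    """Parse markup and return the text of any MarkupResemblesLocatorWarning raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure repeated warnings are not suppressed
        BeautifulSoup(markup, "html.parser")
    return [
        str(w.message)
        for w in caught
        if issubclass(w.category, MarkupResemblesLocatorWarning)
    ]


url = "https://listado.mercadolibre.com.mx/security-camera"
print(locator_warnings(url))                       # the URL string triggers the warning
print(locator_warnings("<p>security camera</p>"))  # real markup does not
```

Passing real HTML, however it was obtained, is what makes the warning go away.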
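Fix: replace the line bs=BeautifulSoup(url,'html.parser') with bs=BeautifulSoup(getHTMLdocument(url),'html.parser'), so that Beautiful Soup receives the downloaded HTML rather than the URL string.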
Corrected code:
import requests
from bs4 import BeautifulSoup
import re

# Browser-like User-Agent sent with the request
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                         'AppleWebKit/537.36 (KHTML, like Gecko) '
                         'Chrome/99.0.4844.51 Safari/537.36'}

def getHTMLdocument(url):
    # Download the page and return its HTML as text
    res_news = requests.get(url, headers=headers)
    return res_news.text

url = "https://listado.mercadolibre.com.mx/security-camera"
print(url)
# Parse the downloaded HTML, not the URL string
bs = BeautifulSoup(getHTMLdocument(url), 'html.parser')
print(bs)
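Output: assuming the request succeeds, the URL followed by the page's HTML as parsed by Beautiful Soup, with no MarkupResemblesLocatorWarning.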
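A slightly more defensive variant of the same idea is sketched below. It is not part of the original fix: the helper name fetch_soup, the 10-second timeout, and the injectable get parameter are all assumptions made for illustration. The sketch adds a timeout and an HTTP status check so that a failed request raises an error instead of silently handing an error page to Beautiful Soup.

```python
import requests
from bs4 import BeautifulSoup

# Same browser-like User-Agent as in the code above
HEADERS = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                  "AppleWebKit/537.36 (KHTML, like Gecko) "
                  "Chrome/99.0.4844.51 Safari/537.36"
}


def fetch_soup(url, get=requests.get, timeout=10):
    """Download url and return the parsed document (hypothetical helper).

    get is injectable so the function can be exercised without network access.
    """
    response = get(url, headers=HEADERS, timeout=timeout)
    response.raise_for_status()  # raises requests.HTTPError on 4xx/5xx responses
    return BeautifulSoup(response.text, "html.parser")


# Usage (performs a real HTTP request):
# soup = fetch_soup("https://listado.mercadolibre.com.mx/security-camera")
# print(soup.title)
```

Whether the site returns useful HTML for this User-Agent is not something the sketch can guarantee; it only ensures that what reaches Beautiful Soup is a response body rather than a URL.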