Suppose the page contains an HTML fragment roughly like this:
<script>
var a= 10
var info= {'a':10,'b':20}
</script>
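It can be parsed as shown below. Note: printing script.text gave an empty string, so prettify() is used here instead to get the tag's content as a string.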
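import json
import re
import requests
from bs4 import BeautifulSoup as bs
# url must hold the address of the page being scraped; set it before running
session = requests.Session()
res = session.get(url, timeout=10)
soup = bs(res.text, 'html.parser')
# Capture the {...} literal assigned to info; the trailing semicolon is optional
pattern = re.compile(r"var info\s*=\s*(\{.*?\})\s*;?\s*$", re.MULTILINE | re.DOTALL)
# string= is the current name for the older text= argument
script = soup.find("script", string=pattern)
info = pattern.search(script.prettify()).group(1)
# json.loads only accepts double-quoted strings; this naive swap assumes no quotes inside the values
data = json.loads(info.replace("'", '"'))
for key, value in data.items():
    print(key, value)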
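
The extraction step can also be tried on its own, without a network request or BeautifulSoup. The following is a minimal sketch that uses only the standard library. The helper name extract_js_object and the sample string are assumptions made for illustration, and the quote replacement is a naive shortcut that only holds when the values themselves contain no quote characters.

```python
import json
import re


def extract_js_object(script_text, var_name):
    """Return the object literal assigned to `var_name` as a dict, or None."""
    # Match "var <name> = { ... }" with an optional trailing semicolon.
    pattern = re.compile(
        r"var\s+" + re.escape(var_name) + r"\s*=\s*(\{.*?\})\s*;?\s*$",
        re.MULTILINE | re.DOTALL,
    )
    match = pattern.search(script_text)
    if match is None:
        return None
    # Naive quote conversion: only safe when the values contain no quotes.
    return json.loads(match.group(1).replace("'", '"'))


# Script body copied from the HTML sample at the top.
sample = """
var a= 10
var info= {'a':10,'b':20}
"""

data = extract_js_object(sample, "info")
for key, value in data.items():
    print(key, value)
# prints:
# a 10
# b 20
```

Returning None when the variable is missing lets the caller decide how to handle a page whose script layout has changed, instead of failing with an AttributeError on .group(1).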