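Question
I want to scrape movie box-office data from http://www.cbooo.cn/movieweek. Specifically, I want the figures at the very bottom of the page: [box-office dates: 2016-11-14 to 2016-11-20, weekly box office: 57271万, weekly screenings: 1463995, weekly audience: 1781万] (万 = 10,000). My code is as follows: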
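from bs4 import BeautifulSoup
import urllib.request

z = input("Enter the URL: ")
# Download the raw HTML; content that JavaScript fills in later is not included
a = urllib.request.urlopen(z).read()
b = BeautifulSoup(a, "html.parser")
# Select the summary block at the bottom of the page
c = b.select("#content > div.alldate")
for i in c:
    print(i.get_text())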
The output is (labels translated; 万 means 10,000):
Box-office date:
Monthly box office: 万
Monthly screenings: 万场
Monthly audience: 万
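The key figures are missing. What is going on? Those numbers are exactly what I need, and nothing I try makes them show up. Any help would be greatly appreciated.
Thanks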
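Solution
The data you need is generated dynamically via AJAX, so it is not present in the HTML source. In general you need a tool that can execute the page's JavaScript, for example
selenium + PhantomJS, although that approach is comparatively slow.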
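For this particular site, however, inspecting the network traffic in the browser shows that the figures come from an AJAX request,
so you can simply call that endpoint directly: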
urllib.request.urlopen("http://www.cbooo.cn/BoxOffice/getWeekInfoData?sdate=2016-11-14").read()
There is no need for PhantomJS here. The returned JSON string contains the data you need, in the final data2 field:
{
"data1": [
{
"MovieRank": "1",
"MovieID": "640103",
"MovieName": "我不是潘金莲",
"WeekAmount": "20531",
"SumWeekAmount": "20553",
"People": "644",
"MovieDay": "3",
"AvgPrice": "32",
"AvgPeople": "27",
"Amount_Up": "0",
"Screen_Up": "0",
"People_Up": "0",
"DefaultImage": "http://www.cbooo.cn/moviepic/229639.jpg",
"Rank_Up": "0",
"WomIndex": "0.00"
},
{
"MovieRank": "2",
"MovieID": "325408",
"MovieName": "奇异博士",
"WeekAmount": "13324",
"SumWeekAmount": "70321",
"People": "380",
"MovieDay": "17",
"AvgPrice": "35",
"AvgPeople": "13",
"Amount_Up": "-51",
"Screen_Up": "-40",
"People_Up": "-51",
"DefaultImage": "http://www.cbooo.cn/moviepic/108737.jpg",
"Rank_Up": "-1",
"WomIndex": "8.32"
},
{
"MovieRank": "3",
"MovieID": "625158",
"MovieName": "比利·林恩的中场战事",
"WeekAmount": "5474",
"SumWeekAmount": "13561",
"People": "122",
"MovieDay": "10",
"AvgPrice": "45",
"AvgPeople": "7",
"Amount_Up": "-32",
"Screen_Up": "-1",
"People_Up": "-42",
"DefaultImage": "http://www.cbooo.cn/moviepic/217130.jpg",
"Rank_Up": "-1",
"WomIndex": "8.20"
},
{
"MovieRank": "4",
"MovieID": "656548",
"MovieName": "深海浩劫",
"WeekAmount": "5441",
"SumWeekAmount": "5441",
"People": "195",
"MovieDay": "6",
"AvgPrice": "28",
"AvgPeople": "12",
"Amount_Up": "0",
"Screen_Up": "0",
"People_Up": "0",
"DefaultImage": "http://www.cbooo.cn/moviepic/216485.jpg",
"Rank_Up": "0",
"WomIndex": "0.00"
},
{
"MovieRank": "5",
"MovieID": "653289",
"MovieName": "航海王之黄金城",
"WeekAmount": "3201",
"SumWeekAmount": "10185",
"People": "116",
"MovieDay": "10",
"AvgPrice": "27",
"AvgPeople": "7",
"Amount_Up": "-54",
"Screen_Up": "14",
"People_Up": "-55",
"DefaultImage": "http://www.cbooo.cn/moviepic/232344.jpg",
"Rank_Up": "-2",
"WomIndex": "8.70"
},
{
"MovieRank": "6",
"MovieID": "627541",
"MovieName": "外公芳龄38",
"WeekAmount": "2129",
"SumWeekAmount": "5635",
"People": "82",
"MovieDay": "10",
"AvgPrice": "26",
"AvgPeople": "7",
"Amount_Up": "-39",
"Screen_Up": "31",
"People_Up": "-39",
"DefaultImage": "http://www.cbooo.cn/moviepic/227040.jpg",
"Rank_Up": "-2",
"WomIndex": "8.03"
},
{
"MovieRank": "7",
"MovieID": "626571",
"MovieName": "勇士之门",
"WeekAmount": "1715",
"SumWeekAmount": "1715",
"People": "56",
"MovieDay": "3",
"AvgPrice": "31",
"AvgPeople": "6",
"Amount_Up": "0",
"Screen_Up": "0",
"People_Up": "0",
"DefaultImage": "http://www.cbooo.cn/moviepic/210856.jpg",
"Rank_Up": "0",
"WomIndex": "0.00"
},
{
"MovieRank": "8",
"MovieID": "633157",
"MovieName": "阿拉丁与神灯",
"WeekAmount": "1338",
"SumWeekAmount": "1338",
"People": "53",
"MovieDay": "3",
"AvgPrice": "25",
"AvgPeople": "9",
"Amount_Up": "0",
"Screen_Up": "0",
"People_Up": "0",
"DefaultImage": "http://www.cbooo.cn/moviepic/231914.jpg",
"Rank_Up": "0",
"WomIndex": "0.00"
},
{
"MovieRank": "9",
"MovieID": "628324",
"MovieName": "驴得水",
"WeekAmount": "818",
"SumWeekAmount": "17104",
"People": "26",
"MovieDay": "24",
"AvgPrice": "31",
"AvgPeople": "9",
"Amount_Up": "-72",
"Screen_Up": "-68",
"People_Up": "-72",
"DefaultImage": "http://www.cbooo.cn/moviepic/236741.jpg",
"Rank_Up": "-4",
"WomIndex": "8.16"
},
{
"MovieRank": "10",
"MovieID": "627597",
"MovieName": "夏有乔木 雅望天堂",
"WeekAmount": "437",
"SumWeekAmount": "15631",
"People": "11",
"MovieDay": "108",
"AvgPrice": "40",
"AvgPeople": "110",
"Amount_Up": "0",
"Screen_Up": "0",
"People_Up": "0",
"DefaultImage": "http://www.cbooo.cn/moviepic/216992.jpg",
"Rank_Up": "0",
"WomIndex": ""
}
],
"data2": [
{
"sDate": "2016-11-14至2016-11-20",
"BoxOffice": "57271",
"ShoCount": "1463995",
"AudienceCount": "1781"
}
] }
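Putting the steps above together, here is a minimal Python 3 sketch of requesting the endpoint and reading the weekly totals out of data2. The function names (build_week_url, parse_week_summary, fetch_week_summary) and the keys of the returned dictionary are made up for illustration. The sketch also assumes that the response is UTF-8 encoded, that the numeric fields always hold integer strings as in the sample above, and that the endpoint still responds, none of which is guaranteed.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Endpoint taken from the answer above; it may no longer be available.
BASE_URL = "http://www.cbooo.cn/BoxOffice/getWeekInfoData"


def build_week_url(start_date):
    """Return the AJAX URL for the week starting on start_date (YYYY-MM-DD)."""
    return BASE_URL + "?" + urlencode({"sdate": start_date})


def parse_week_summary(json_text):
    """Extract the weekly totals from the first entry of "data2"."""
    payload = json.loads(json_text)
    summary = payload["data2"][0]
    return {
        "dates": summary["sDate"],
        # Units follow the labels in the question: 万 (10,000) for
        # box office and audience, plain counts for screenings.
        "box_office_wan": int(summary["BoxOffice"]),
        "screenings": int(summary["ShoCount"]),
        "audience_wan": int(summary["AudienceCount"]),
    }


def fetch_week_summary(start_date, timeout=10):
    """Download the JSON for one week and parse it (needs network access)."""
    url = build_week_url(start_date)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        # Assumption: the body is UTF-8 encoded JSON.
        return parse_week_summary(resp.read().decode("utf-8"))
```

Calling fetch_week_summary("2016-11-14") would request the same URL as in the answer. Keeping the parsing in its own function means it can be checked against a saved response without touching the network.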