Web Crawler: First Check-In

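This post records the author's first hands-on experience with web crawlers: understanding the basic principle of a crawler, installing the required Python libraries (requests and BeautifulSoup), and writing simple code that scrapes data from web pages. The exercise gave the author a first working understanding of web crawlers.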
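import requests

# Everything after '#' is a URL fragment and is not sent to the server,
# so this request goes to https://movie.douban.com/explore itself.
url = "https://movie.douban.com/explore#!type=movie&tag=%E7%83%AD%E9%97%A8&sort=recommend&page_limit=20&page_start=0"
data = {
    'type': 'movie',
    'tag': '热门',  # "popular"
    'sort': 'time',
    'page_limit': '20',
    'page_start': '0',
}

headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.113 Safari/537.36',
}
response = requests.post(url, data=data, headers=headers)  # send the request
# .json() raises an exception if the response body is not JSON (for example, an HTML page)
json_data = response.json()
print(json_data)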
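
The address used above carries its query in the part after `#`. That part is a fragment, which an HTTP client keeps to itself, so those values reach the server only if they are passed separately, as the `data` dictionary does. The following is a minimal sketch, using only the standard library, of how the values could be read back out of the fragment. The helper name `fragment_params` is hypothetical and is not part of the original post.

```python
from urllib.parse import urlsplit, parse_qs


def fragment_params(url):
    """Return the key/value pairs encoded in a '#!key=value&...' fragment."""
    fragment = urlsplit(url).fragment  # text after '#', never sent to the server
    query = fragment.lstrip('!')       # drop the leading '!' of the hash-bang
    # parse_qs percent-decodes the values and returns lists; keep the first of each
    return {key: values[0] for key, values in parse_qs(query).items()}


url = "https://movie.douban.com/explore#!type=movie&tag=%E7%83%AD%E9%97%A8&sort=recommend&page_limit=20&page_start=0"
params = fragment_params(url)
print(params)
# {'type': 'movie', 'tag': '热门', 'sort': 'recommend', 'page_limit': '20', 'page_start': '0'}
```

Comparing the printed dictionary with a hand-written one is a quick way to catch a misspelled key or a value that differs from the address (here the fragment says `sort=recommend`).
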
import requests
from bs4 import BeautifulSoup

url = "https://movie.douban.com/top250?start=0&filter="
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36',
}
res = requests.get(url, headers=headers)
print(res.status_code)  # 200 means the request succeeded
soup = BeautifulSoup(res.text, 'html.parser')
# Print "alt:src" for every <img> tag on the page
for tag in soup.find_all('img'):
    # .get() avoids a KeyError when an <img> lacks the attribute
    result = tag.get('alt', '') + ":" + tag.get('src', '')
    print(result)


200
肖申克的救赎:https://img3.doubanio.com/view/photo/s_ratio_poster/public/p480747492.jpg
霸王别姬:https://img3.doubanio.com/view/photo/s_ratio_poster/public/p2561716440.jpg
阿甘正传:https://img9.doubanio.com/view/photo/s_ratio_poster/public/p1484728154.jpg
这个杀手不太冷:https://img3.doubanio.com/view/photo/s_ratio_poster/public/p511118051.jpg
美丽人生:https://img3.doubanio.com/view/photo/s_ratio_poster/public/p2578474613.jpg
泰坦尼克号:https://img9.doubanio.com/view/photo/s_ratio_poster/public/p457760035.jpg
千与千寻:https://img1.doubanio.com/view/photo/s_ratio_poster/public/p2557573348.jpg
辛德勒的名单:https://img3.doubanio.com/view/photo/s_ratio_poster/public/p492406163.jpg
盗梦空间:https://img9.doubanio.com/view/photo/s_ratio_poster/public/p513344864.jpg
忠犬八公的故事:https://img9.doubanio.com/view/photo/s_ratio_poster/public/p524964016.jpg
海上钢琴师:https://img9.doubanio.com/view/photo/s_ratio_poster/public/p2574551676.jpg
楚门的世界:https://img3.doubanio.com/view/photo/s_ratio_poster/public/p479682972.jpg
三傻大闹宝莱坞:https://img3.doubanio.com/view/photo/s_ratio_poster/public/p579729551.jpg
机器人总动员:https://img3.doubanio.com/view/photo/s_ratio_poster/public/p1461851991.jpg
放牛班的春天:https://img3.doubanio.com/view/photo/s_ratio_poster/public/p1910824951.jpg
星际穿越:https://img3.doubanio.com/view/photo/s_ratio_poster/public/p2206088801.jpg
大话西游之大圣娶亲:https://img9.doubanio.com/view/photo/s_ratio_poster/public/p2455050536.jpg
熔炉:https://img9.doubanio.com/view/photo/s_ratio_poster/public/p1363250216.jpg
疯狂动物城:https://img1.doubanio.com/view/photo/s_ratio_poster/public/p2315672647.jpg
无间道:https://img3.doubanio.com/view/photo/s_ratio_poster/public/p2564556863.jpg
龙猫:https://img9.doubanio.com/view/photo/s_ratio_poster/public/p2540924496.jpg
教父:https://img9.doubanio.com/view/photo/s_ratio_poster/public/p616779645.jpg
当幸福来敲门:https://img1.doubanio.com/view/photo/s_ratio_poster/public/p1312700628.jpg
怦然心动:https://img1.doubanio.com/view/photo/s_ratio_poster/public/p501177648.jpg
触不可及:https://img9.doubanio.com/view/photo/s_ratio_poster/public/p1454261925.jpg
扫码下载豆瓣 App:https://img3.doubanio.com/f/movie/a02f6ed325fc52e220f299d51e730c422e2bcd16/pics/movie/douban_app_ad/qrcode.png
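
The run above requests `start=0` and prints 25 films, which suggests that the remaining pages could be reached by stepping the `start` parameter. Below is a minimal sketch of building those page addresses. It assumes that later pages follow the same `start`/`filter` pattern and that the list holds 250 entries (inferred only from the name `top250`); the function name `page_urls` is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://movie.douban.com/top250"
PAGE_SIZE = 25  # the run above printed 25 films for start=0
TOTAL = 250     # assumption, taken from the "top250" name


def page_urls(total=TOTAL, page_size=PAGE_SIZE):
    """Build one URL per page by stepping the `start` parameter."""
    return [
        f"{BASE_URL}?{urlencode({'start': start, 'filter': ''})}"
        for start in range(0, total, page_size)
    ]


urls = page_urls()
print(len(urls))  # 10
print(urls[0])    # https://movie.douban.com/top250?start=0&filter=
print(urls[-1])   # https://movie.douban.com/top250?start=225&filter=
```

Each address could then be passed to `requests.get` in a loop, ideally with a short pause between requests.
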
import requests
from bs4 import BeautifulSoup

# Everything after '#' is a URL fragment and is not sent to the server
url = "https://movie.douban.com/explore#!type=movie&tag=%E7%83%AD%E9%97%A8&sort=recommend&page_limit=20&page_start=0"
data = {
    'type': 'movie',
    'tag': '热门',  # "popular"
    'sort': 'time',
    'page_limit': '20',
    'page_start': '0',
}

headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.113 Safari/537.36',
}
response = requests.post(url, data=data, headers=headers)  # send the request
# Parse the returned HTML and print "alt:src" for every <img> tag
soup = BeautifulSoup(response.text, 'html.parser')
for tag in soup.find_all('img'):
    # .get() avoids a KeyError when an <img> lacks the attribute
    result = tag.get('alt', '') + ":" + tag.get('src', '')
    print(result)
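
To see what the `alt:src` extraction amounts to, here is a minimal sketch of the same idea using the standard library's `html.parser.HTMLParser` in place of BeautifulSoup. It is a swapped-in technique, not the approach used in this post. The class name `ImgCollector` and the sample markup are made up for illustration and are not taken from the real page.

```python
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collect (alt, src) pairs from every <img> tag that has a src."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)  # attrs arrives as a list of (name, value) pairs
        src = attributes.get("src")
        if src:  # skip images without a usable src
            self.images.append((attributes.get("alt") or "", src))


# Made-up sample markup, not copied from the real page
sample_html = """
<div class="pic"><a href="#"><img alt="Film A" src="https://example.com/a.jpg"></a></div>
<div class="pic"><img src="https://example.com/no-alt.jpg"></div>
<img alt="No source">
"""

collector = ImgCollector()
collector.feed(sample_html)
for alt, src in collector.images:
    print(alt + ":" + src)
```

The second sample tag has no `alt` and the third has no `src`; the sketch keeps the former with an empty title and skips the latter, instead of raising an error.
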

