Python Web Scraping Example (2)


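Preface

This example uses Python to scrape several pages of book listings, collecting multiple fields for each book. Some of the string values are cleaned up, and the results are then saved as an Excel file with the pandas module.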
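
The preface mentions cleaning up certain strings but does not say which ones. As a rough illustration only, the sketch below assumes two typical cases on a book listing page: a price such as "¥35.80" and a comment count such as "12345条评论". The function names `parse_price`, `parse_count`, and `clean_text` are hypothetical and do not come from the original code.

```python
import re


def parse_price(text):
    """Return the first decimal number in a price string as a float, or None."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    return float(match.group()) if match else None


def parse_count(text):
    """Return the first integer found in the string, or None if there is none."""
    match = re.search(r"\d+", text)
    return int(match.group()) if match else None


def clean_text(text):
    """Collapse runs of whitespace into single spaces and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()
```

Returning `None` for strings without a number keeps a single malformed row from aborting the whole scrape; pandas will store such values as empty cells.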

1. Source code

The code is as follows:
import requests
from bs4 import BeautifulSoup
import pandas as pd

Cookie = '__permanent_id=20220513170904104321665604699593064; __ozlvd=1652433083; dest_area=country_id%3D9000%26province_id%3D111%26city_id%20%3D0%26district_id%3D0%26town_id%3D0; MDD_permanent_id=20220514110040183284142509428454779; MDD_province_str=%E5%8C%97%E4%BA%AC; MDD_province_id=111; MDD_city_str=%E5%8C%97%E4%BA%AC%E5%B8%82; MDD_city_id=1; MDD_area_str=%E4%B8%9C%E5%9F%8E%E5%8C%BA; MDD_area_id=1110101; secret_key=efcb6c415140cae4f4c0a166ed773e7d; ddscreen=2; pos_6_start=1653042630185; pos_6_end=1653042630438; bind_cust_third_id=ocil5uKns4SFMnjuJsmygDnUQfWQ; tx_open_id=oqh4kuCwpq6-aPSHG-lYsnfn-3DM; tx_nickname=UmV0cmVhdGluRw==; tx_figureurl=https://thirdwx.qlogo.cn/mmopen/vi_32/kq93oibR8fZeMg5EIjQiaUbaFVCAepVuiawVCrlvtYNSSSEfs88hTlNQnZVP4iaAJB5TiboTt9xz09zh4rRiaHEpvsBQ/132; bind_mobile=18584453761; USERNUM=S0BYivTI/XkrpOVp7wxu+Q==; login.dangdang.com=.AYH=&.ASPXAUTH=bglMFMAotILDL+d2saPYtHsgm8l4SjYxhnmav2+n5eLJmeDOKipaKA==; order_follow_source=-%7C-O-123%7C%2311%2C11%7C%23login_third_qq%2Clogin_third_weixin%7C%230%2C0%7C%23%2C; dangdang.com=email=MTg1ODQ0NTM3NjE1MjAyN0BkZG1vYmlscGhvbmVfX3VzZXIuY29t&nickname=UmV0cmVhdGluRw==&display_id=2430960574019&customerid=MMblqHaxKqSWrLrHb/ftbQ==&viptype=&show_name=185%2A%2A%2A%2A3761; ddoy=email=1858445376152027%40ddmobilphone__user.com&nickname=RetreatinG&validatedflag=0&agree_date=1; sessionID=pc_718ab82bae1d1b0a0e4092a232479cbbb4424b1ed467604ac95ee3cb10764fc3; bind_custid=771659450; __visit_id=20220521184656643343611246104875436; __out_refer=; LOGIN_TIME=1653130620425; __rpm=...1653048077521%7C...1653130641546; __trace_id=20220521185722198192659899464462250'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36',
    'Cookie': Cookie,
}


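def dangdang(page):
    # Build the URL for the requested page of the 2021 bestseller list.
    url = "http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2021-0-1-{}".format(page)
    # For example, page 2:
    #    http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2021-0-1-2
    res = requests.get(url, headers=headers)
    # Raise an exception if the server returned an HTTP error status.
    res.raise_for_status()
    # Decode the body with the encoding detected from its content.
    res.encoding = res.apparent_encoding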
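
The listing stops after the response has been fetched. The remaining steps described in the preface (parsing the page and saving the result with pandas) could look roughly like the sketch below. It is not the author's code: the HTML in `SAMPLE_HTML`, its CSS class names, and the helpers `parse_books` and `save_to_excel` are all hypothetical, since the real page structure is not shown in the article.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical markup standing in for one page of results.
# The real site's structure and class names will differ.
SAMPLE_HTML = """
<ul class="book-list">
  <li><div class="name"><a>Book A</a></div><span class="price">¥35.80</span></li>
  <li><div class="name"><a>Book B</a></div><span class="price">¥42.00</span></li>
</ul>
"""


def parse_books(html):
    """Extract one dict per book from the (assumed) list markup."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for item in soup.select("ul.book-list > li"):
        rows.append({
            "name": item.select_one("div.name a").get_text(strip=True),
            "price": item.select_one("span.price").get_text(strip=True),
        })
    return rows


def save_to_excel(rows, path):
    """Write the rows to an Excel file (requires an engine such as openpyxl)."""
    pd.DataFrame(rows).to_excel(path, index=False)


books = parse_books(SAMPLE_HTML)
```

To cover several pages, one option is to have `dangdang(page)` return `res.text`, call `parse_books` on each page's HTML, extend a single list with the rows, and pass that list to `save_to_excel` once at the end.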
    