How to crawl Amazon with Python: a simple Python crawler in 5 lines of code that fetches an Amazon page

```python
import requests

# Browser-like headers copied from a Chrome session. Without a plausible
# User-Agent (and often a valid cookie), Amazon tends to answer with a
# robot-check page instead of the product HTML.
headers = {
    'authority': 'www.amazon.com',
    'cache-control': 'max-age=0',
    'rtt': '100',
    'downlink': '7.8',
    'ect': '4g',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36',
    'sec-fetch-user': '?1',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
    'sec-fetch-site': 'cross-site',
    'sec-fetch-mode': 'navigate',
    'referer': 'https://www.amazon.com/',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'zh-CN,zh;q=0.9',
    # The cookie below is tied to one browser session and will expire;
    # replace it with a fresh value from your own browser's dev tools.
    'cookie': ('session-id=136-1846890-2675233; ubid-main=135-9186825-4739358; aws-priv=eyJ2IjoxLCJldSI6MCwic3QiOjB9; aws-target-static-id=1574328096260-555987; aws-target-data=%7B%22support%22%3A%221%22%7D; lc-main=en_US; s_vnum=2017617388388%26vn%3D2; aws-ubid-main=368-2368523-6774828; aws-session-id=135-4385250-1911763; sess-aws-at-main="dpiimGao6GmWACwx5D01BuPiYpXptZXrM5SGGD6L1fM="; aws-business-metrics-last-visit=1589961329571; s_fid=2F58B4996A2316F2-0B4C6B6523423B61; i18n-prefs=USD; x-wl-uid=1bE+RtlWJokDeV8gzp6KnwZEIYvdnc9EqUv/j5zbkewIWsQVY6xB4y8Y78aHHfQdgcALvAcGlFtNBGQ4jkJsvlr//g31Vfv3n9zrXJSIIIGeQwVtYe1hzrWSOGXJ3KCZAvcDFxXIiQuo=; regStatus=registered; _mkto_trk=id:112-TZM-766&token:_mch-amazon.com-1590742253839-60320; aws-session-id-time=2223357980l; aws-session-token=i52Y3W6OwFg6Cqbc2Pb1IaQWphrw8JAtXy+439cMrHnzL+H0ntL98dIHDT5iBTmF/8HM2x6yOPiZyBvALBicmTgEV3hT9tAAgs4rhdhlJpxUQbrmmD65SLRHsMCHtACZ0rVUNWPG2L8+Kh4BXbWq2sn68XpkUDDF2QovoP/YhzKwxOjiZECWnIf7mayj1uNvKNBvbr4kP1Udl2fwDngVmG0pEJ0/OL+l; __utma=194891197.1403979554.1589960891.1589960891.1592637982.2; __utmz=194891197.1592637982.2.2.utmccn=(referral)|utmcsr=us-east-1.signin.aws.amazon.com|utmcct=/oauth|utmcmd=referral; aws-account-alias=015741542882; aws-target-visitor-id=1574328096262-977650.38_0; aws-userInfo=%7B%22arn%22%3A%22arn%3Aaws%3Aiam%3A%3A015741542882%3Auser%2Fadmin%40amztracker.com%22%2C%22alias%22%3A%22015741542882%22%2C%22username%22%3A%22admin%2540amztracker.com%22%2C%22keybase%22%3A%22%22%2C%22issuer%22%3A%22http%3A%2F%2Fsignin.aws.amazon.com%2Fsignin%22%2C%22signinType%22%3A%22PUBLIC%22%7D; s_vn=1621497365238%26vn%3D24; s_dslv=1595499552869; s_nr=1595499552881-Repeat; x-main=6t8C3W27YeqeNRWEY3X2idQREa3SAV85UcjvcuGcpG2bWTRDA8UZvvSSwyB4IeMV; at-main=Atza|IwEBINv2TQX_ng5LMugpmVYRhgvpzTPtZZwy0vz7C9Mm8KU78FYg4FEhTYANsiWszzwCivXk2JpNvF5Ryg7opOSq2ThURm18cq7V510-x-Dbo5GcPt7macejE-ZA3GxTWGCuRcvLPCmg4FA40zVnfEWd_9zuD69QvDLOxCc0JpYlfQ_4sNXUsoNcgurIPOGzlyeulxaPS0nd84TaYvH3DoOMHe-G; '
               'sess-at-main="rahZ0ImAq1qrb+ZGCQeCKuNLHrIOPNskvmLcRYwhdO8="; sst-main=Sst1|PQEiWbRhK36yDCFhadvNoYBvCziTCT8qndUicnSU9ZDsdvx918vIq18IRXpHmWiKZ6VSUpUbPgCcgwSbDPdzVGegRmnbLmy_2nWXfJKYvNZdq18xuJ6D2UlFQXrWa9cH_4XJgRKu5R-4KpSXCCn9TB13ttIIzekiuMIJ6PlXs936b1TPVzmfDBusqcXACrHoSApA62Nc196xjRCyLv8Z2Stzi930Nbx66f_RK4Fg8b9wS-Xqhc1WK533i6lNHKKIRqcV-vyHH2Td89M30FBIBvynpWARa5bo63I6FD4tvK_ivaoMWh5VntLVE55j3yPjBxBZFlhBB4GA1cm7UqGgqR8VrA; session-id-time=2082787201l; skin=noskin; csm-hit=tb:XC9TWVR8WJSKSTZ9HBPT+s-XC9TWVR8WJSKSTZ9HBPT|1596593262092&t:1596593262093&adb:adblk_no; session-token=/SNiCTxUdqg4wCzTStps7AajfmbX8xyeeZVKJ/O9d/3prVMNR0MY5bfpPvZwqc/U4Im16iVy78SWdzzulwt+dvp/KJAjHogt3p0UE/xDoQ4W+URbnwimgQXJ2QxndVVqzDhS07v/IFXi1bsbWtuB49iIVI0Fv+2M66nEC637/ZfvXt5rZmtbh1qURAzLevyzG5jPR6CxTwuxDotfTagsr5DM4Aa6Zy6V5wyRx7BdI6JcezuKHXO2uqmhbqUx9+JsAvOfOf+WI31DFmt4Opm5zg=='),
}

response = requests.get(
    'https://www.amazon.com/FOXCESD-Exercise-Tangle-Free-Bearings-Skipping/dp/B088R6R7WR',
    headers=headers,
)

# `response.text` is already decoded. Re-encoding it with the guessed
# `response.encoding` (often ISO-8859-1) and decoding as UTF-8, as the
# original code did, raises UnicodeDecodeError on many pages; telling
# requests the page is UTF-8 up front is the robust fix.
response.encoding = 'utf-8'
html = response.text

print(html)
```
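Once the page is fetched, the HTML still has to be parsed. The post stops at `print(html)`, so here is a minimal standard-library sketch of the extraction step; the `productTitle` element id is the one Amazon product pages commonly use, but both it and the sample markup are assumptions here, not something taken from the fetched page:

```python
from html.parser import HTMLParser

class TitleExtractor(HTMLParser):
    """Collects the text inside the element whose id is 'productTitle'."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; flag the target element
        if dict(attrs).get("id") == "productTitle":
            self._in_title = True

    def handle_endtag(self, tag):
        if self._in_title:
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

# Hypothetical sample markup standing in for the real `html` variable
sample = '<html><body><span id="productTitle"> FOXCESD Jump Rope </span></body></html>'
parser = TitleExtractor()
parser.feed(sample)
print(parser.title.strip())  # → FOXCESD Jump Rope
```

In practice you would call `parser.feed(html)` on the response from the snippet above; a third-party parser such as BeautifulSoup makes this shorter, but the stdlib version keeps the example dependency-free.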

**Answer:** With a Python crawler you can scrape data from a chosen target site and work around obstacles that simpler approaches cannot handle, such as being blocked by the target site. [1] Here is a minimal example that requests the Amazon site with Python:

```python
import urllib.request

req = urllib.request.urlopen('https://www.amazon.com')
print(req.code)
```

This snippet uses Python's built-in urllib library to send an HTTP request and print the status code of Amazon's response. [2]

References:

1. [【Python爬虫】:使用「Requests」+「bs4」写亚马逊爬虫](https://blog.csdn.net/weixin_33655208/article/details/114446890)
2. [带你一步步破解亚马逊 淘宝 京东的反爬虫机制!](https://blog.csdn.net/weixin_52994140/article/details/117957969)
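The urllib snippet above sends the default `Python-urllib` User-Agent, which Amazon commonly rejects with a 503 robot check. A hedged sketch of a small helper that attaches a browser-like agent before the request is sent (the name `build_request` is illustrative, not from either referenced post):

```python
import urllib.request

def build_request(url: str) -> urllib.request.Request:
    """Return a Request carrying browser-like headers (hypothetical helper).

    The Chrome User-Agent string mirrors the one used in the headers dict
    earlier in this post; sending the request is left to the caller, e.g.:
        with urllib.request.urlopen(build_request(url), timeout=10) as resp:
            print(resp.status)
    """
    headers = {
        "User-Agent": ("Mozilla/5.0 (Windows NT 6.1; WOW64) "
                       "AppleWebKit/537.36 (KHTML, like Gecko) "
                       "Chrome/78.0.3904.108 Safari/537.36"),
        "Accept-Language": "en-US,en;q=0.9",
    }
    return urllib.request.Request(url, headers=headers)

req = build_request("https://www.amazon.com")
# urllib normalizes header names to capitalized form, hence "User-agent"
print(req.get_header("User-agent")[:11])  # → Mozilla/5.0
```

No network call happens until the Request is passed to `urlopen`, so the helper is cheap to construct and easy to unit-test.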