A Rookie's Python Crawler Notes, Part 1 (TF)

xiaoyuyulala

Category: Python Learning Notes. Tags: python_crawler

Article link: https://blog.csdn.net/qq_42192672/article/details/84981225

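These are notes from a beginner learning to write web crawlers.

The code below is very simple: it scrapes information about the popular lipsticks listed on the TF (Tom Ford) official website.

It uses only the most basic tools, the BeautifulSoup and requests libraries.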

# A simple script for crawling information about the popular TF lipsticks
import re

import requests
from bs4 import BeautifulSoup

url = 'https://www.tom-ford.cn/'
# Send a browser-like User-Agent header with the request
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) '
                  'Chrome/70.0.3538.77 Safari/537.36'
}

# timeout keeps the script from hanging if the server never answers
response = requests.get(url, headers=headers, timeout=10)
html_doc = response.content  # raw bytes of the TF home page
# print(response.status_code)  # status code
# print(response.content.decode("utf-8"))  # page content

soup = BeautifulSoup(
    html_doc,
    'html.parser',
    from_encoding='utf-8'  # encoding of the HTML document
)

# Collect every <a> tag whose href contains "goods-"
tf_links = soup.find_all('a', href=re.compile(r"goods-"))

for tf_link in tf_links:
    # print(tf_link.name, tf_link['href'], tf_link.get_text())
    print(tf_link.get_text())
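
The link-matching step can be tried without touching the network. The sketch below runs the same `find_all('a', href=re.compile(r"goods-"))` call on a small hand-written HTML string. Note that `SAMPLE_HTML`, the `extract_products` helper, and the product names are made up for illustration; the real page markup may well look different. The helper also strips whitespace, turns relative links into absolute ones with `urljoin`, and skips repeated links, which is an addition of this sketch rather than something the script above does.

```python
import re
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://www.tom-ford.cn/"

# Hypothetical markup, invented for this example; the real page may differ.
SAMPLE_HTML = """
<ul>
  <li><a href="/goods-101.html"> Lip Color A </a></li>
  <li><a href="/goods-102.html">Lip Color B</a></li>
  <li><a href="/goods-101.html">Lip Color A</a></li>
  <li><a href="/about.html">About</a></li>
</ul>
"""


def extract_products(html, base_url=BASE_URL):
    """Return (name, absolute_url) pairs for links whose href contains 'goods-'."""
    soup = BeautifulSoup(html, "html.parser")
    seen = set()
    products = []
    for link in soup.find_all("a", href=re.compile(r"goods-")):
        name = link.get_text(strip=True)       # drop surrounding whitespace
        url = urljoin(base_url, link["href"])  # make relative links absolute
        if not name or url in seen:            # skip empty or repeated links
            continue
        seen.add(url)
        products.append((name, url))
    return products


if __name__ == "__main__":
    for name, url in extract_products(SAMPLE_HTML):
        print(name, url)
```

Keeping the parsing in a function that takes an HTML string makes it easy to check the regular expression against saved pages before pointing the crawler at the live site.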
