python3 [Crawler in Practice] Scraping the 京东客服 (JD.com customer service) Sina Weibo account with requests

What gets scraped: the Weibo posts of 京东客服 (JD.com customer service) and the comments under them.

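Approach: call Sina Weibo's API through the mobile site, then filter the data that comes back.

The profile page looks like this: https://m.weibo.cn/u/5650743478?uid=5650743478&luicode=10000011&lfid=100103type%3D1%26q%3D%40京东客服&featurecode=20000320

This is the URL of the Weibo page once you are logged in.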


You can also log in to Sina Weibo at
https://passport.weibo.cn/signin/welcome?entry=mweibo&r=http%3A%2F%2Fm.weibo.cn%2F

The page you will see:

[screenshot]

The content scraped here is:

the posts, and the comments under each post

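It sounds simple, but I have to grumble a bit: the crawl was a bumpy ride, and I am still stuck on IP problems. From my own experiments, sleeping 30 seconds between requests did not work noticeably better than sleeping 10 seconds, so 10 seconds is enough. Your IP may still get banned either way, and an IP ban is most likely what I ran into.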
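
A minimal sketch of the "sleep between requests" idea, assuming the 10-second interval mentioned above. The `Throttle` class and its parameter names are made up for illustration; it only spaces requests out and does nothing to prevent an IP ban.

```python
import time


class Throttle:
    """Make sure at least `interval` seconds pass between two calls to wait()."""

    def __init__(self, interval=10.0, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock    # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None      # time of the previous request, None before the first one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()
```

Call `throttle.wait()` right before each request; the first call returns immediately, later calls sleep only for whatever part of the interval has not yet elapsed.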

The scraping itself works by fetching JSON from the mobile API and then extracting the fields we need.
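
A rough sketch of fetching JSON with requests. The User-Agent string, the 10-second timeout, and the helper names `make_session` and `fetch_json` are assumptions for illustration, not something the mobile site is known to require. Returning `None` for a non-JSON body is also just one possible choice.

```python
# Hypothetical mobile User-Agent; any realistic phone browser string could be used.
MOBILE_UA = (
    "Mozilla/5.0 (iPhone; CPU iPhone OS 10_3 like Mac OS X) "
    "AppleWebKit/603.1.30 (KHTML, like Gecko) Version/10.0 Mobile/14E277 Safari/602.1"
)


def make_session():
    """Create a requests session that sends mobile-looking headers."""
    import requests  # third-party; imported here so fetch_json works with any session-like object

    session = requests.Session()
    session.headers.update({"User-Agent": MOBILE_UA, "Accept": "application/json"})
    return session


def fetch_json(session, url, timeout=10.0):
    """GET `url` and return the decoded JSON, or None if the body is not JSON."""
    response = session.get(url, timeout=timeout)
    response.raise_for_status()      # raise on 4xx / 5xx responses
    try:
        return response.json()
    except ValueError:               # e.g. an HTML page instead of JSON
        return None
```

Usage would look like `data = fetch_json(make_session(), url)`, with `url` being one of the API addresses in this post.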

A Firefox add-on can be used here.

Main APIs:

Post list API:

Page 1 of the posts:
https://m.weibo.cn/api/container/getIndex?uid=5650743478&luicode=10000011&lfid=100103type%3D1%26q%3D京东客服&featurecode=20000320&type=uid&value=5650743478&containerid=1076035650743478

Page 2 of the posts:
https://m.weibo.cn/api/container/getIndex?uid=5650743478&luicode=10000011&lfid=100103type%3D1%26q%3D京东客服&featurecode=20000320&type=uid&value=5650743478&containerid=1076035650743478&page=2

and so on: later pages only change the page parameter at the end.
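
The URL pattern above can be sketched as a small helper. All parameter values are copied from the two sample URLs. The `"107603" + uid` rule for `containerid` is only an observation from those samples, not a documented rule, and `build_posts_url` is a made-up name.

```python
from urllib.parse import urlencode

API_BASE = "https://m.weibo.cn/api/container/getIndex"
UID = "5650743478"  # 京东客服, taken from the sample URLs


def build_posts_url(page=1, uid=UID):
    """Return the post-list URL for one page of the account's posts."""
    params = {
        "uid": uid,
        "luicode": "10000011",
        "lfid": "100103type=1&q=京东客服",  # urlencode percent-encodes this value
        "featurecode": "20000320",
        "type": "uid",
        "value": uid,
        # Observed in the samples: containerid is "107603" followed by the uid.
        "containerid": "107603" + uid,
    }
    if page > 1:
        params["page"] = page  # the page-1 sample URL carries no page parameter
    return API_BASE + "?" + urlencode(params)
```

For example, `build_posts_url(2)` gives the page-2 address, with the Chinese characters in `lfid` percent-encoded rather than written out literally.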

Comment detail API:

Each post carries an idstr, for example: 4137390568546147

The post's detail page is then:
https://m.weibo.cn/status/4137390568546147

The comment pages are built by appending the id and a page number:
https://m.weibo.cn/api/comments/show?id=4137390568546147&page=1
https://m.weibo.cn/api/comments/show?id=4137390568546147&page=2
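
Walking through the comment pages could look roughly like this. The `fetch` argument is any function that takes a URL and returns the decoded JSON (or `None`). Stopping on an empty result, the `max_pages` cap, and the 10-second delay are assumptions, since the shape of the comment response is not shown in this post.

```python
import time

COMMENTS_API = "https://m.weibo.cn/api/comments/show"


def comments_url(idstr, page):
    """Build the comment-list URL for one post and one page."""
    return f"{COMMENTS_API}?id={idstr}&page={page}"


def iter_comment_pages(idstr, fetch, max_pages=3, delay=10.0):
    """Yield the decoded payload of each comment page, pausing between requests."""
    for page in range(1, max_pages + 1):
        payload = fetch(comments_url(idstr, page))
        if not payload:          # assumed stop signal: nothing (or nothing usable) came back
            break
        yield payload
        if page < max_pages:
            time.sleep(delay)    # be gentle with the server between pages
```

Passing the fetch function in keeps the paging logic separate from the HTTP code, so it can be tried out with a stub before hitting the real site.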

Here is the data the post-list API returns (page 2 of the list above), in JSON format:

{
    "cardlistInfo": {
        "containerid": "1076035650743478",
        "v_p": 42,
        "show_style": 1,
        "total": 3264,
        "page": 2
    },
    "cards": [
        {
            "card_type": 9,
            "itemid": "1076035650743478_-_4137858652321796",
            "scheme": "https://m.weibo.cn/status/FfSSl9K0k?mblogid=FfSSl9K0k&luicode=10000011&lfid=1076035650743478&featurecode=20000320",
            "mblog": {
                "created_at": "2小时前",
                "id": "4137858652321796",
                "mid": "4137858652321796",
                "idstr": "4137858652321796",
                "text": "明天又要上班了,用四个字描述下你现在的心情吧<span class=\"url-icon\"><img src=\"//h5.sinaimg.cn/m/emoticon/icon/others/d_erha-0d2bea3a7d.png\" style=\"width:1em;height:1em;\" alt=\"[二哈]\"></span> ​​​",
                "textLength": 50,
                "source": "微博 weibo.com",
                "favorited": false,
                "thumbnail_pic": "http://wx4.sinaimg.cn/thumbnail/006apWvQgy1fi7tkjguy4j309q09qt8q.jpg",
                "bmiddle_pic": "http://wx4.sinaimg.cn/bmiddle/006apWvQgy1fi7tkjguy4j309q09qt8q.jpg",
                "original_pic": "http://wx4.sinaimg.cn/large/006apWvQgy1fi7tkjguy4j309q09qt8q.jpg",
                "user": {
                    "id": 5650743478,
                    "screen_name": "京东客服",
                    "profile_image_url": "https://tva4.sinaimg.cn/crop.38.7.206.206.180/006apWvQjw8f9dwuejt68j307y0630sz.jpg",
                    "profile_url": "https://m.weibo.cn/u/5650743478?uid=5650743478&luicode=10000011&lfid=1076035650743478&featurecode=20000320",
                    "statuses_count": 3245,
                    "verified": true,
                    "verified_type": 2,
                    "verified_type_ext": 0,
                    "verified_reason": "北京京东世纪贸易有限公司",
                    "description": "订单咨询、问题反馈、意见建议……获取专业贴心服务,尽在京东客服",
                    "gender": "f",
                    "mbtype": 2,
                    "urank": 29,
                    "mbrank": 2,
                    "follow_me": false,
                    "following": false,
                    "followers_count": 18427,
                    "follow_count": 235,
                    "cover_image_phone": "https://tva4.sinaimg.cn/crop.0.0.640.640.640/006apWvQjw1f2g20q03tbj30e80e8t93.jpg"
                },
                "reposts_count": 0,
                "comments_count": 4,
                "attitudes_count": 2,
                "isLongText": false,
                "visible": {
                    "type": 0,
                    "list_id": 0
                },
                "mblogtype": 0,
                "bid": "FfSSl9K0k",
                "pics": [
                    {
                        "pid": "006apWvQgy1fi7tkjguy4j309q09qt8q",
                        "url": "https://wx4.sinaimg.cn/orj360/006apWvQgy1fi7tkjguy4j309q09qt8q.jpg",
                        "size": "orj360",
                        "geo": {
                            "width": "350",
                            "height": "350",
                            "croped": false
                        },
                        "large": {
                            "size": "large",
                            "url": "https://wx4.sinaimg.cn/large/006apWvQgy1fi7tkjguy4j309q09qt8q.jpg",
                            "geo": {
                                "width": "350",
                                "height": "350",
                                "croped": false
                            }
                        }
                    }
                ]
            },
            "show_type": 0,
            "openurl": ""
        },
        {
            "card_type": 9,
            "itemid": "1076035650743478_-_4137692553365577",
            "scheme": "https://m.weibo.cn/status/FfOyre7xv?mblogid=FfOyre7xv&luicode=10000011&lfid=1076035650743478&featurecode=20000320",
            "mblog": {
                "created_at": "13小时前",
                "id": "4137692553365577",
                "mid": "4137692553365577",
                "idstr": "4137692553365577",
                "text": "你觉得举办哪种《中国有_____》比赛,你能进入决赛? ​​​",
                "textLength": 49,
                "source": "微博 weibo.com",
                "favorited": false,
                "thumbnail_pic": "http://wx2.sinaimg.cn/thumbnail/006apWvQgy1fi7ul9n9rfj30k00lsgnj.jpg",
                "bmiddle_pic": "http://wx2.sinaimg.cn/bmiddle/006apWvQgy1fi7ul9n9rfj30k00lsgnj.jpg",
                "original_pic": "http://wx2.sinaimg.cn/large/006apWvQgy1fi7ul9n9rfj30k00lsgnj.jpg",
                "user": {
                    "id": 5650743478,
                    "screen_name": "京东客服",
                    "profile_image_url": "https://tva4.sinaimg.cn/crop.38.7.206.206.180/006apWvQjw8f9dwuejt68j307y0630sz.jpg",
                    "profile_url": "https://m.weibo.cn/u/5650743478?uid=5650743478&luicode=10000011&lfid=1076035650743478&featurecode=20000320",
                    "statuses_count": 3245,
                    "verified": true,
                    "verified_type": 2,
                    "verified_type_ext": 0,
                    "verified_reason": "北京京东世纪贸易有限公司",
                    "description": "订单咨询、问题反馈、意见建议……获取专业贴心服务,尽在京东客服",
                    "gender": "f",
                    "mbtype": 2,
                    "urank": 29,
                    "mbrank": 2,
                    "follow_me": false,
                    "following": false,
                    "followers_count": 18427,
                    "follow_count": 235,
                    "cover_image_phone": "https://tva4.sinaimg.cn/crop.0.0.640.640.640/006apWvQjw1f2g20q03tbj30e80e8t93.jpg"
                },
                "reposts_count": 0,
                "comments_count": 13,
                "attitudes_count": 1,
                "isLongText": false,
                "visible": {
                    "type": 0,
                    "list_id": 0
                },
                "mblogtype": 0,
                "bid": "FfOyre7xv",
                "pics": [
                    {
                        "pid": "006apWvQgy1fi7ul9n9rfj30k00lsgnj",
                        "url": "https://wx2.sinaimg.cn/orj360/006apWvQgy1fi7ul9n9rfj30k00lsgnj.jpg",
                        "size": "orj360",
                        "geo": {
                            "width": 360,
                            "height": 392,
                            "croped": false
                        },
                        "large": {
                            "size": "large",
                            "url": "https://wx2.sinaimg.cn/large/006apWvQgy1fi7ul9n9rfj30k00lsgnj.jpg",
                            "geo": {
                                "width": "720",
                                "height": "784",
                                "croped": false
                            }
                        }
                    }
                ]
            },
            "show_type": 0,
            "openurl": ""
        },
        {
            "card_type": 9,
            "itemid": "1076035650743478_-_4137390568546147",
            "scheme": "https://m.weibo.cn/status/FfGHmzRf5?mblogid=FfGHmzRf5&luicode=10000011&lfid=1076035650743478&featurecode=20000320",
            "mblog": {
                "created_at": "昨天 14:24",
                "id": "4137390568546147",
                "mid": "4137390568546147",
                "idstr": "4137390568546147",
                "text": "周末就是买买买,吃吃吃<span class=\"url-icon\"><img src=\"//h5.sinaimg.cn/m/emoticon/icon/default/d_huaixiao-bb5966dcc6.png\" style=\"width:1em;height:1em;\" alt=\"[坏笑]\"></span> ​​​",
                "textLength": 28,
                "source": "微博 weibo.com",
                "favorited": false,
                "thumbnail_pic": "http://wx2.sinaimg.cn/thumbnail/006apWvQgy1fi7taijr9pg307e05kgvl.gif",
                "bmiddle_pic": "http://wx2.sinaimg.cn/bmiddle/006apWvQgy1fi7taijr9pg307e05kgvl.gif",
                "original_pic": "http://wx2.sinaimg.cn/large/006apWvQgy1fi7taijr9pg307e05kgvl.gif",
                "user": {
                    "id": 5650743478,
                    "screen_name": "京东客服",
                    "profile_image_url": "https://tva4.sinaimg.cn/crop.38.7.206.206.180/006apWvQjw8f9dwuejt68j307y0630sz.jpg",
                    "profile_url": "https://m.weibo.cn/u/5650743478?uid=5650743478&luicode=10000011&lfid=1076035650743478&featurecode=20000320",
                    "statuses_count": 3245,
                    "verified": true,
                    "verified_type": 2,
                    "verified_type_ext": 0,
                    "verified_reason": "北京京东世纪贸易有限公司",
                    "description": "订单咨询、问题反馈、意见建议……获取专业贴心服务,尽在京东客服",
                    "gender": "f",
                    "mbtype": 2,
                    "urank": 29,
                    "mbrank": 2,
                    "follow_me": false,
                    "following": false,
                    "followers_count": 18427,
                    "follow_count": 235,
                    "cover_image_phone": "https://tva4.sinaimg.cn/crop.0.0.640.640.640/006apWvQjw1f2g20q03tbj30e80e8t93.jpg"
                },
                "reposts_count": 0,
                "comments_count": 19,
                "attitudes_count": 1,
                "isLongText": false,
                "visible": {
        
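
Given a payload shaped like the sample above, the interesting fields can be pulled out roughly as follows. The field names (`cards`, `mblog`, `idstr`, `created_at`, `text`, `comments_count`) come from the sample; the helper names and the choice to keep the emoticon's `alt` text are my own assumptions.

```python
import html
import re

IMG_ALT_RE = re.compile(r'<img[^>]*\balt="([^"]*)"[^>]*>')  # captures alt text such as [二哈]
TAG_RE = re.compile(r"<[^>]+>")                             # any remaining HTML tag


def clean_text(raw):
    """Turn the HTML fragment in mblog["text"] into plain text."""
    text = IMG_ALT_RE.sub(r"\1", raw)      # replace emoticon images with their alt text
    text = TAG_RE.sub("", text)            # drop the surrounding <span> tags
    text = html.unescape(text).replace("\u200b", "")  # remove zero-width spaces
    return text.strip()


def extract_posts(payload):
    """Return one small dict per card that actually contains a post (mblog)."""
    posts = []
    for card in payload.get("cards", []):
        mblog = card.get("mblog")
        if not mblog:
            continue                       # skip cards that are not posts
        posts.append({
            "idstr": mblog["idstr"],
            "created_at": mblog.get("created_at"),
            "text": clean_text(mblog.get("text", "")),
            "comments_count": mblog.get("comments_count", 0),
            "status_url": "https://m.weibo.cn/status/" + mblog["idstr"],
        })
    return posts
```

Each returned `idstr` is what the comment API needs as its `id` parameter.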