Scraping Chinese Postal Codes with Python

远方的飞猪

Last modified 2023-07-30 14:46:27

Categories: Crawlers, Python. Tags: python, xpath

First published 2020-09-19 00:41:05

Article link: https://blog.csdn.net/tanjunchen/article/details/108675742

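This article shows how to write a web crawler in Python that scrapes postal codes for regions across China from a given website, and provides a link to the full source code.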

Chinese postal codes

http://www.yb21.cn

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import json
from lxml import etree
from multiprocessing import Manager, cpu_count, Pool
import requests
from urllib.parse import urljoin
import pandas as pd
from datetime import datetime
import time

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36'
}


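class PostSpider(object):
    # Index page whose links are the starting points of the crawl
    url = "http://www.yb21.cn"

    def index_page(self, url_queue):
        """Fetch the index page and put every link on url_queue as an absolute URL."""
        # A timeout keeps the crawler from hanging on an unresponsive server
        res = requests.get(self.url, headers=headers, timeout=10)
        # Decode the response as GBK before parsing
        res.encoding = "gbk"
        html = etree.HTML(res.text)
        # Collect the href of every <a> tag on the page
        city_href = html.xpath("//a/@href")
        for href in city_href:
            # Resolve relative links against the base URL
            url_queue.put(urljoin(self.url, href))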

    def spider(self, url_queue,

Source code download: https://github.com/tanjunchen/SpiderProject/tree/master/ZipCode
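
The link-collection step in `index_page` can be tried offline. The following is a minimal sketch, not the author's code: it swaps lxml's XPath for the standard library's `HTMLParser` so it runs without extra packages, and the sample markup, the `/example/...` paths, and the names `LinkCollector` and `extract_links` are made up for illustration. It shows how `urljoin` turns relative and absolute `href` values into full URLs against the base address.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE_URL = "http://www.yb21.cn"

# Made-up markup and paths for illustration; the real index page may differ.
SAMPLE_HTML = """
<html><body>
  <a href="/example/a.html">A</a>
  <a href="example/b.html">B</a>
  <a href="http://www.yb21.cn/example/c.html">C</a>
  <a name="no-href">skipped</a>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                # Skip anchors that have no href or an empty one
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(page_text, base_url=BASE_URL):
    """Return absolute URLs for all links found in page_text."""
    collector = LinkCollector()
    collector.feed(page_text)
    # urljoin leaves absolute URLs alone and resolves relative ones
    return [urljoin(base_url, href) for href in collector.hrefs]


links = extract_links(SAMPLE_HTML)
for link in links:
    print(link)
```

Like `//a/@href` in the original, this collects every link on the page, so a real crawl would probably want to filter out navigation links and duplicates before queueing them.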