Scraping Urban Road Network Data from OpenStreetMap with Python 3

Author: 江湖人称王某人的程序员

Last modified: 2022-06-05 16:51:20

First published: 2022-02-06 14:12:13

Article link: https://blog.csdn.net/qq_41445357/article/details/122797755

Preface
Approach
Code
Result
Conclusion

Preface

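My paper needed city road data. OpenStreetMap is an open, free, global map platform, although its coverage of China is not very complete. The road network data for a given city can be fetched through an API (the code in this post uses the Overpass API endpoint). The code runs on Python 3, and the development environment is Jupyter.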

Approach

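First look up the city's ID, then convert that ID into a 10-digit string (by adding 3600000000 to it), and finally pass the converted ID to a second query whose response is saved as an OSM file. Two requests are key here:
1. Get the city ID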

<osm-script>
    <query type="relation">
        <has-kv k="boundary" v="administrative"/>
        <has-kv k="name:zh" v="合肥市"/>
    </query>
    <print/>
</osm-script>
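
The conversion from the ID returned by this query to the 10-digit string can be sketched as follows. This is a minimal illustration, not the author's code: the helper names `first_relation_id` and `relation_to_area_id` are made up for this example, and the `sample` response is a hand-written fragment rather than real Overpass output.

```python
import xml.etree.ElementTree as ET

# Offset used in this article: 10-digit ID = 3600000000 + relation ID
AREA_ID_OFFSET = 3600000000


def first_relation_id(osm_xml):
    """Return the id of the first <relation> in an OSM XML response, or None."""
    root = ET.fromstring(osm_xml)
    relation = root.find("relation")
    return relation.get("id") if relation is not None else None


def relation_to_area_id(relation_id):
    """Convert a relation ID into the 10-digit ID string used by <area-query>."""
    return str(AREA_ID_OFFSET + int(relation_id))


# Hypothetical response fragment, for illustration only
sample = (
    '<osm version="0.6">'
    '<relation id="1234567"><tag k="name:zh" v="合肥市"/></relation>'
    '</osm>'
)

rel_id = first_relation_id(sample)
area_id = relation_to_area_id(rel_id)
print(area_id)
```

Parsing the XML instead of using a regular expression is only one option; it makes the "no relation found" case explicit, because `first_relation_id` returns `None` when the response contains no `<relation>` element.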

2. Get the road data

<osm-script timeout="1800" element-limit="100000000">
  <union>
    <area-query ref="10-digit city ID"/>
    <recurse type="node-relation" into="rels"/>
    <recurse type="node-way"/>
    <recurse type="way-relation"/>
  </union>
  <union>
    <item/>
    <recurse type="way-node"/>
  </union>
  <print mode="body"/>
</osm-script>
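
One way to fill the 10-digit ID into this query is a small template function. The sketch below is an assumption about how one might organize this step, not part of the original post; `ROAD_QUERY_TEMPLATE` and `build_area_query` are hypothetical names, and the validation rule (exactly 10 digits) simply mirrors the requirement stated above.

```python
# Same query as above, written on one line with a placeholder for the ID
ROAD_QUERY_TEMPLATE = (
    '<osm-script timeout="1800" element-limit="100000000">'
    '<union>'
    '<area-query ref="{area_id}"/>'
    '<recurse type="node-relation" into="rels"/>'
    '<recurse type="node-way"/>'
    '<recurse type="way-relation"/>'
    '</union>'
    '<union>'
    '<item/>'
    '<recurse type="way-node"/>'
    '</union>'
    '<print mode="body"/>'
    '</osm-script>'
)


def build_area_query(area_id):
    """Return the query XML for a 10-digit ID given as a string."""
    if not (area_id.isdigit() and len(area_id) == 10):
        raise ValueError("expected a 10-digit ID, got %r" % (area_id,))
    return ROAD_QUERY_TEMPLATE.format(area_id=area_id)


query = build_area_query("3601234567")
print(query)
```

Rejecting malformed IDs before sending the request avoids waiting on a long-running query (the timeout here is 1800 seconds) that could never match the intended area.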

Code

import requests
import re

def getCityRpadDataByOSM(cityName):
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36",
        "Content-Type": "application/x-www-form-urlencoded"
    }
    # Request 1: find the administrative boundary relation whose name:zh is cityName
    data = '<osm-script><query type="relation"><has-kv k="boundary" v="administrative"/><has-kv k="name:zh" v="'+cityName+'"/></query><print/></osm-script>'
    url = "http://www.overpass-api.de/api/interpreter"
    response = requests.post(url, data=data.encode(), headers=headers)
    # Extract the relation id with a regular expression
    match = re.search(r'<relation id="(\d+)">', response.text)
    if match is None:
        # No matching relation in the response (unknown city or failed request)
        return
    # The id passed to area-query must have 10 digits: 3600000000 + relation id
    area_id = str(3600000000 + int(match.group(1)))
    print(area_id)
    # Request 2: download the elements inside that area
    data2 = '<osm-script timeout="1800" element-limit="100000000"><union><area-query ref="'+area_id+'"/><recurse type="node-relation" into="rels"/><recurse type="node-way"/><recurse type="way-relation"/></union><union><item/><recurse type="way-node"/></union><print mode="body"/></osm-script>'
    response2 = requests.post(url, data=data2.encode(), headers=headers)
    # Threshold check: network problems can cause a timeout, which yields a
    # very short response; discard it instead of saving it
    if len(response2.text) > 1000:
        with open(cityName+".osm", "w", encoding='utf-8') as f:
            f.write(response2.text)

getCityRpadDataByOSM("合肥市")
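
Once the `.osm` file has been written, a quick sanity check is to count what it contains. The sketch below is an illustration under stated assumptions rather than part of the original post: `summarize_osm` is a hypothetical helper, the sample document is hand-written, and selecting ways by the `highway` tag is my assumption about how roads would be picked out, since the query above does not appear to filter by tag.

```python
import io
import xml.etree.ElementTree as ET


def summarize_osm(source):
    """Count nodes, ways, and ways tagged `highway` in an OSM XML document.

    `source` may be a file path or an open file object.
    """
    root = ET.parse(source).getroot()
    nodes = root.findall("node")
    ways = root.findall("way")
    # Keep only ways that have a child <tag k="highway" .../>
    highway_ways = [w for w in ways if w.find("tag[@k='highway']") is not None]
    return {
        "nodes": len(nodes),
        "ways": len(ways),
        "highway_ways": len(highway_ways),
    }


# Hand-written sample standing in for a downloaded file
sample = io.StringIO(
    '<osm version="0.6">'
    '<node id="1" lat="31.82" lon="117.22"/>'
    '<node id="2" lat="31.83" lon="117.23"/>'
    '<way id="10"><nd ref="1"/><nd ref="2"/><tag k="highway" v="primary"/></way>'
    '<way id="11"><nd ref="1"/><nd ref="2"/><tag k="building" v="yes"/></way>'
    '</osm>'
)

summary = summarize_osm(sample)
print(summary)
```

For a real city file, the same function could be called with the saved path, for example `summarize_osm("合肥市.osm")`. `ET.parse` loads the whole document into memory, so a streaming parser such as `ET.iterparse` may be a better fit for very large downloads.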