Python筛选并计算落在shp各个多边形中的格点数据

GUoranger

已于 2024-06-26 11:45:33 修改

阅读量189

点赞数 2

文章标签： python

于 2024-06-26 11:42:30 首次发布

本文链接：https://blog.csdn.net/GUoranger/article/details/139984921

版权

问题

一个含有一个及以上多边形的地理空间shp文件，一个含有经度、纬度、参数数据的csv文件。需要：

### 筛选落在shp各个多边形中的格点数据
### 可将落在shp各个多边形中的每个格点数据都显示出来
### 同时计算落在每个多边形中的所有格点数据的平均值
### 如果没有格点落在多边形中（即多边形无数据），那么采用距离多边形中心最近的有效格点数据替代。

思路

将csv文件中的经度、纬度信息处理成为地理空间点信息；
将csv的地理空间点与shp的地理空间多边形进行空间连接，筛选出落在shp各个多边形中csv点数据；
计算shp各个多边形的位置中心，从csv有效栅格数据中找到距离每个shp中心最近的栅格数据；
计算落在每个多边形中的所有格点数据的平均值；
判断由于没有格点落在多边形中而引起的多边形对应数据缺失的情况，采用距离多边形中心最近的有效格点数据替代；
将结果保存到新csv文件中。

代码

### 筛选 落在shp各个多边形中的 格点数据
### 可将 落在shp各个多边形中的 每个格点数据 都显示出来
### 同时 计算落在每个多边形中的 所有格点数据的 平均值
### 如果 没有格点落在多边形中（即多边形无数据），那么 采用距离多边形中心最近的 有效格点数据 替代

import geopandas as gpd
import pandas as pd
import numpy as np
from shapely.geometry import Point
from sklearn.neighbors import BallTree

# Load the shapefile
shapefile_path = "/home/shp/2015county.shp"
gdf_polygon = gpd.read_file(shapefile_path)
gdf_polygon_subset = gdf_polygon[['区划码', '地名']]  # 原来shp中信息非常多，这里就选取关注的子集
# Load the CSV file
csv_directory = "/home/O3/O3_D10K_20000.csv"
df_point = pd.read_csv(csv_directory)

# Convert the CSV dataframe to Point geometry
def create_point_geometry(row):
    longtitude = row['longitude']
    latitude = row['latitude']
    return Point(longtitude, latitude)
# Create Point geometries from latitude and longitude
geometry = df_point.apply(create_point_geometry, axis=1)
# Convert DataFrame to GeoDataFrame
gdf_points = gpd.GeoDataFrame(df_point, geometry=geometry)
# Set the coordinate reference system (CRS) if known
gdf_points.crs = {'init': 'epsg:4326'}
# Transform the CRS if needed (to match shapefile CRS)
gdf_points = gdf_points.to_crs(gdf_polygon.crs)

# Find the nearest valid point
# 计算多边形的中心
gdf_polygon['centroid'] = gdf_polygon.centroid
polygon_centers = list(zip(gdf_polygon['centroid'].x, gdf_polygon['centroid'].y))
# 获取点目标数据为非零、非空、非NaN的点的中心
valid_gdf_points = gdf_points.dropna(subset=['MDA8_O3'])
valid_gdf_points = valid_gdf_points[valid_gdf_points['MDA8_O3'] != 0]
valid_point_coordinates = list(zip(valid_gdf_points.geometry.x, valid_gdf_points.geometry.y))
# 使用BallTree来加速最邻近搜索
tree = BallTree(valid_point_coordinates, leaf_size=15)
# Query BallTree to find nearest valid point indices and distances for each polygon centroid
distances, indices = tree.query(np.array(polygon_centers), k=1)
# Flatten indices and distances
nearest_point_indices = indices.flatten()
nearest_point_distances = distances.flatten()
# Map nearest point indices back to gdf_polygon
gdf_polygon['nearest_point_index'] = nearest_point_indices
gdf_polygon['nearest_point_distance'] = nearest_point_distances
# Retrieve MDA8_O3 values corresponding to nearest points， and Assign MDA8_O3 values to gdf_polygon
nearest_point_MDA8_O3 = valid_gdf_points.iloc[nearest_point_indices]['MDA8_O3'].values
gdf_polygon['nearest_point_MDA8_O3'] = nearest_point_MDA8_O3

# Perform spatial join # 进行空间连接
mean_data=[]
valid_points_in_polygons = gpd.sjoin(valid_gdf_points, gdf_polygon, how="inner", op='within')
# Calculate mean of MDA8_O3 values grouped by '县级码' (assuming it's a column in gdf_polygon)
mean_data = valid_points_in_polygons.groupby('区划码')['MDA8_O3'].mean().reset_index()
# Merge mean data back to gdf based on '区划码'
merged_gdf = gdf_polygon_subset.merge(mean_data, on='区划码', how='left')
# Check if there are any NaN values in mean_data['MDA8_O3']
if merged_gdf['MDA8_O3'].isnull().any():
    # Handle NaN values here; for example, replace NaN with values from another column
    merged_gdf['MDA8_O3'].fillna(gdf_polygon['nearest_point_MDA8_O3'], inplace=True)

# Define the output directory for the new CSV files
output_file1 = "/home/O3_PointInCounty_2000.csv"
output_file2 = "/home/O3_County_2000.csv"

# Save the spatially joined data to the new CSV files
valid_points_in_polygons.to_csv(output_file1, index=False)
merged_gdf.to_csv(output_file2, index=False)

print('done')

GUoranger

关注

2
点赞
踩
1

收藏

觉得还不错? 一键收藏
1
评论
Python筛选并计算落在shp各个多边形中的格点数据

## 如果没有格点落在多边形中（即多边形无数据），那么采用距离多边形中心最近的有效格点数据替代。将csv的地理空间点与shp的地理空间多边形进行空间连接，筛选出落在shp各个多边形中csv点数据；一个含有一个及以上多边形的地理空间shp文件，一个含有经度、纬度、参数数据的csv文件。计算shp各个多边形的位置中心，从csv有效栅格数据中找到距离每个shp中心最近的栅格数据；### 同时计算落在每个多边形中的所有格点数据的平均值。没有格点落在多边形中而引起的多边形对应数据缺失的情况，
复制链接

扫一扫