Creating a Heatmap From Scratch in Python

Heatmap sample:

Introduction

A heatmap is frequently used to visualize event occurrence or density. Several GIS tools and Python libraries can create one, such as QGIS, ArcGIS, and Google Fusion Tables. This post does not cover those tools. Instead, we will write our own code to create a heatmap from scratch in Python 3, using only common Python libraries.

The algorithm we will use is Kernel Density Estimation (KDE). For more explanation of KDE, refer to the post QGIS Heatmap Using KDE Explained. Another post, Heatmap Calculation Tutorial, gives an example of how to calculate the intensity at a point from a reference point using KDE.

Importing Libraries

Several Python libraries, such as Scikit-learn and Seaborn, can be used to create a heatmap. Here we use only matplotlib, numpy, and the standard math module, so we start by importing those three.

import matplotlib.pyplot as plt
import numpy as np
import math

Heatmap Dataset

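To create a heatmap, we need a point dataset that consists of x,y coordinates. Here we create two lists, one for the x coordinates and one for the y coordinates; their values appear in the complete code at the end of this post. A plot of the dataset can be seen in figure 1.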

Figure 1. Point Dataset
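
As a minimal sketch, the dataset and a plot like figure 1 could be written as follows. The coordinate values are copied from the complete code at the end of this post. The axis labels and the choice to leave out plt.show() are assumptions made here for illustration.

```python
import matplotlib.pyplot as plt

# Point dataset: the same coordinates used in the complete code.
x = [20, 28, 15, 20, 18, 25, 15, 18, 18, 20, 25, 30, 25,
     22, 30, 22, 38, 40, 38, 30, 22, 20, 35, 33, 35]
y = [20, 14, 15, 20, 15, 20, 32, 33, 45, 50, 20, 20, 20,
     25, 30, 38, 20, 28, 33, 50, 48, 40, 30, 35, 36]

# Each (x[i], y[i]) pair is one point, so both lists must have the same length.
n_points = len(x)
print(n_points, "points")  # 25 points

# Draw the points as red circles.
fig, ax = plt.subplots()
ax.plot(x, y, "ro")
ax.set_xlabel("x")  # axis labels are an addition for readability
ax.set_ylabel("y")
# Call plt.show() here to display the figure interactively.
```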

Grid Size and Radius

To create a heatmap with KDE, we need to specify the bandwidth (the radius of the kernel) and the output grid size. In this case I use a radius of 10 m and a grid size of 1 m. You can change these parameters later to see how they affect the resulting heatmap.

#DEFINE GRID SIZE AND RADIUS(h)
grid_size=1
h=10

Getting X,Y Min/Max to Construct Grid

To construct the grid we use a mesh grid. We therefore need the minimum and maximum of x and y to generate a sequence of x values and a sequence of y values, which are then used to build the mesh grid. To cover the whole dataset with a little extra space, I subtract the radius from the x,y minimum and add it to the x,y maximum.

#GETTING X,Y MIN AND MAX
x_min=min(x)
x_max=max(x)
y_min=min(y)
y_max=max(y)

#CONSTRUCT GRID
x_grid=np.arange(x_min-h,x_max+h,grid_size)
y_grid=np.arange(y_min-h,y_max+h,grid_size)
x_mesh,y_mesh=np.meshgrid(x_grid,y_grid)
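
One detail worth checking is that np.arange excludes its stop value, so the last grid coordinate is one grid_size short of the maximum plus h. The sketch below assumes the extents of the dataset from the complete code (x from 15 to 40, y from 14 to 50) and prints the resulting grid range and mesh shape.

```python
import numpy as np

grid_size = 1
h = 10

# Extents of the point dataset used in this post.
x_min, x_max = 15, 40
y_min, y_max = 14, 50

# np.arange stops before the stop value, so x_max + h itself is not included.
x_grid = np.arange(x_min - h, x_max + h, grid_size)
y_grid = np.arange(y_min - h, y_max + h, grid_size)
x_mesh, y_mesh = np.meshgrid(x_grid, y_grid)

print(x_grid[0], x_grid[-1])  # 5 49
print(y_grid[0], y_grid[-1])  # 4 59
print(x_mesh.shape)           # (56, 45)
```

The mesh has one row per y value and one column per x value, so the first index of x_mesh and y_mesh runs along y and the second along x.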

Calculate Grid Center Point

After constructing the mesh grid, we calculate the center point of each grid cell by adding half of the grid size to the x mesh and y mesh coordinates. The center points are used later to calculate the distance from each grid cell to the dataset points.

#GRID CENTER POINT
xc=x_mesh+(grid_size/2)
yc=y_mesh+(grid_size/2)

Kernel Density Estimation Function

To calculate the density, or intensity, at a point we use a function called kde_quartic. It uses the quartic kernel shape, hence the "quartic" in its name. The function takes two arguments: the point distance (d) and the kernel radius (h).

#FUNCTION TO CALCULATE INTENSITY WITH QUARTIC KERNEL
def kde_quartic(d,h):
    dn=d/h
    P=(15/16)*(1-dn**2)**2
    return P
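
To get a feel for the kernel, the short sketch below evaluates the same function at a few distances, with h=10 as in this post. The sample distances are arbitrary choices for illustration.

```python
def kde_quartic(d, h):
    # Quartic kernel: (15/16) * (1 - (d/h)^2)^2
    dn = d / h
    return (15 / 16) * (1 - dn**2) ** 2

h = 10
print(kde_quartic(0, h))   # 0.9375      (peak value, at the data point)
print(kde_quartic(5, h))   # 0.52734375  (halfway to the radius)
print(kde_quartic(10, h))  # 0.0         (exactly at the radius)
print(kde_quartic(20, h))  # 8.4375      (beyond the radius the formula rises again)
```

The last value shows that the formula alone does not fall to zero outside the kernel radius, so any code that calls it needs an explicit cutoff for distances greater than h.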

Compute Density Value for Each Grid

This is the hardest part of this post: computing the density value for each grid cell. We do this with three nested loops. The first loop runs over the rows of the mesh grid, the second over the center points in each row, and the third over the dataset points, calculating the distance from the center point to each one. From each distance we compute a density value with the kde_quartic function defined earlier. Only points within the kernel radius are considered; for a point outside the radius the density value is set to 0. We then sum all the density values for a cell to get its total density, which is stored in a list called intensity_list.

#PROCESSING
intensity_list=[]
for j in range(len(xc)):
    intensity_row=[]
    for k in range(len(xc[0])):
        kde_value_list=[]
        for i in range(len(x)):
            #CALCULATE DISTANCE
            d=math.sqrt((xc[j][k]-x[i])**2+(yc[j][k]-y[i])**2) 
            if d<=h:
                p=kde_quartic(d,h)
            else:
                p=0
            kde_value_list.append(p)
        #SUM ALL INTENSITY VALUE
        p_total=sum(kde_value_list)
        intensity_row.append(p_total)
    intensity_list.append(intensity_row)
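
As an alternative to the three loops, the same sums can be computed with NumPy broadcasting. This is not the method used in this post, only a sketch of an equivalent calculation. The names px, py, weights and intensity_vec are introduced here for illustration.

```python
import numpy as np

# Point dataset from the complete code.
x = [20, 28, 15, 20, 18, 25, 15, 18, 18, 20, 25, 30, 25,
     22, 30, 22, 38, 40, 38, 30, 22, 20, 35, 33, 35]
y = [20, 14, 15, 20, 15, 20, 32, 33, 45, 50, 20, 20, 20,
     25, 30, 38, 20, 28, 33, 50, 48, 40, 30, 35, 36]
grid_size = 1
h = 10

# Grid and cell centers, built the same way as in the post.
x_grid = np.arange(min(x) - h, max(x) + h, grid_size)
y_grid = np.arange(min(y) - h, max(y) + h, grid_size)
x_mesh, y_mesh = np.meshgrid(x_grid, y_grid)
xc = x_mesh + grid_size / 2
yc = y_mesh + grid_size / 2

px = np.asarray(x, dtype=float)
py = np.asarray(y, dtype=float)

# Distance from every cell center to every point: shape (rows, cols, n_points).
d = np.sqrt((xc[..., None] - px) ** 2 + (yc[..., None] - py) ** 2)

# Quartic kernel inside the radius, 0 outside it.
weights = np.where(d <= h, (15 / 16) * (1 - (d / h) ** 2) ** 2, 0.0)

# Sum over the point axis to get one density value per cell.
intensity_vec = weights.sum(axis=2)
print(intensity_vec.shape)  # (56, 45)
```

The loop version is easier to follow step by step. The broadcast version trades that for speed, at the cost of holding one distance per cell and point in memory, which may matter for large grids or datasets.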

Visualize The Result

In the last part we visualize the result using matplotlib's pcolormesh and add a color bar to show the intensity values. The resulting heatmap can be seen in figure 2.

#HEATMAP OUTPUT    
intensity=np.array(intensity_list)
plt.pcolormesh(x_mesh,y_mesh,intensity)
plt.plot(x,y,'ro')
plt.colorbar()
plt.show()
Figure 2. Heatmap Output
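
One point to be aware of: in recent Matplotlib versions, when the coordinate arrays passed to pcolormesh have the same shape as the value array, the coordinates are treated as cell centers. Because the densities are evaluated at the center points xc, yc, passing those arrays with shading='nearest' draws each cell around the location where it was computed. The sketch below assumes Matplotlib 3.3 or newer (where shading='nearest' is available) and uses a small hypothetical three-point dataset to stay short.

```python
import numpy as np
import matplotlib.pyplot as plt

# Small hypothetical dataset, for illustration only.
x = [3, 5, 8]
y = [4, 7, 5]
grid_size = 1
h = 3

# Grid and cell centers, built the same way as in the post.
x_grid = np.arange(min(x) - h, max(x) + h, grid_size)
y_grid = np.arange(min(y) - h, max(y) + h, grid_size)
x_mesh, y_mesh = np.meshgrid(x_grid, y_grid)
xc = x_mesh + grid_size / 2
yc = y_mesh + grid_size / 2

# Quartic-kernel density at each cell center (0 outside the radius).
d = np.sqrt((xc[..., None] - np.array(x)) ** 2 + (yc[..., None] - np.array(y)) ** 2)
intensity = np.where(d <= h, (15 / 16) * (1 - (d / h) ** 2) ** 2, 0.0).sum(axis=2)

# Draw each cell centered on the point where its density was evaluated.
fig, ax = plt.subplots()
mesh = ax.pcolormesh(xc, yc, intensity, shading="nearest")
ax.plot(x, y, "ro")
fig.colorbar(mesh, ax=ax)
# Call plt.show() here to display the figure interactively.
```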

The complete code snippet can be found below:

import matplotlib.pyplot as plt
import numpy as np
import math

#POINT DATASET
x=[20,28,15,20,18,25,15,18,18,20,25,30,25,22,30,22,38,40,38,30,22,20,35,33,35]
y=[20,14,15,20,15,20,32,33,45,50,20,20,20,25,30,38,20,28,33,50,48,40,30,35,36]

#DEFINE GRID SIZE AND RADIUS(h)
grid_size=1
h=10

#GETTING X,Y MIN AND MAX
x_min=min(x)
x_max=max(x)
y_min=min(y)
y_max=max(y)

#CONSTRUCT GRID
x_grid=np.arange(x_min-h,x_max+h,grid_size)
y_grid=np.arange(y_min-h,y_max+h,grid_size)
x_mesh,y_mesh=np.meshgrid(x_grid,y_grid)

#GRID CENTER POINT
xc=x_mesh+(grid_size/2)
yc=y_mesh+(grid_size/2)

#FUNCTION TO CALCULATE INTENSITY WITH QUARTIC KERNEL
def kde_quartic(d,h):
    dn=d/h
    P=(15/16)*(1-dn**2)**2
    return P

#PROCESSING
intensity_list=[]
for j in range(len(xc)):
    intensity_row=[]
    for k in range(len(xc[0])):
        kde_value_list=[]
        for i in range(len(x)):
            #CALCULATE DISTANCE
            d=math.sqrt((xc[j][k]-x[i])**2+(yc[j][k]-y[i])**2) 
            if d<=h:
                p=kde_quartic(d,h)
            else:
                p=0
            kde_value_list.append(p)
        #SUM ALL INTENSITY VALUE
        p_total=sum(kde_value_list)
        intensity_row.append(p_total)
    intensity_list.append(intensity_row)

#HEATMAP OUTPUT    
intensity=np.array(intensity_list)
plt.pcolormesh(x_mesh,y_mesh,intensity)
plt.plot(x,y,'ro')
plt.colorbar()
plt.show()