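This article shows how pie charts and other charts can add value to a scientific research paper.
It demonstrates how to create these visualizations with simple tools and libraries.
It uses several Python visualization libraries to generate different chart types, drawing on geographic data such as rivers, population, cities, and pollution, to produce high-quality figures suitable for research papers. All of the code in this article is ready to use as-is.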
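1. Chord Diagram
I find the chord diagram to be one of the best ways to show how geographic features such as rivers, roads, and canals flow from one region to another. Here I use Indian rivers as an example to show how their water passes through, and is shared by, several states. Take the Godavari: it rises in Maharashtra and then flows through Telangana and Andhra Pradesh.
I collected the data from Wikipedia, loaded it into Python, and built the visualization with the Plotly library. The code below draws the flows with Plotly's Sankey trace, which gives a chord-style view of the links between states. If you are working on a project, a research paper, or any academic work, you can use it as a map-like visualization. Image quality need not be a concern: exporting at a higher pixel resolution improves sharpness, and distinct colors can highlight specific regions.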
# Install required libraries
!pip install plotly pandas
# Import necessary libraries
import plotly.graph_objects as go
import pandas as pd
# Define river water flow data (in million cubic meters)
data = {
"River": ["Ganga", "Yamuna", "Narmada", "Krishna", "Kaveri", "Godavari", "Brahmaputra", "Mahanadi", "Indus", "Tapi"],
"From": ["Uttarakhand", "Himachal Pradesh", "Madhya Pradesh", "Maharashtra", "Karnataka", "Maharashtra",
"Arunachal Pradesh", "Chhattisgarh", "Himachal Pradesh", "Madhya Pradesh"],
"To": ["Uttar Pradesh", "Haryana", "Gujarat", "Telangana", "Tamil Nadu", "Andhra Pradesh",
"Assam", "Odisha", "Punjab", "Maharashtra"],
"Flow": [500, 700, 650, 800, 900, 750, 1000, 600, 550, 620]
}
# Convert to DataFrame
df = pd.DataFrame(data)
# Create nodes (unique states), sorted so the node order is the same on every run
states = sorted(set(df["From"]).union(set(df["To"])))
# Assign an index to each state
state_indices = {state: i for i, state in enumerate(states)}
# Convert states to numerical indices
df["Source"] = df["From"].map(state_indices)
df["Target"] = df["To"].map(state_indices)
# Create the chord-style diagram (drawn with Plotly's Sankey trace)
fig = go.Figure(go.Sankey(
    node=dict(
        pad=15, thickness=20,
        label=states,
        # Blue for states where a river starts, green for all other states
        color=["blue" if state in df["From"].values else "green" for state in states]
    ),
    link=dict(
        source=df["Source"],
        target=df["Target"],
        value=df["Flow"],
        color="rgba(0, 128, 255, 0.5)"
    )
))
# Customize layout
fig.update_layout(title_text="Inter-State River Water Flow in India", font_size=12)
# Show plot
fig.show()
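
The indexing step above can also be sketched without pandas. This is a minimal illustration, assuming a hypothetical helper named `build_sankey_links` (it is not part of Plotly) and only three of the illustrative flows from the data above. It sorts the state names so that node indices stay the same from run to run, which iterating over a plain `set` of strings does not guarantee.

```python
# Hypothetical helper for illustration; it is not part of Plotly or pandas.
def build_sankey_links(flows):
    """Turn (source_state, target_state, flow) tuples into Sankey inputs."""
    # Sort the unique state names so every run assigns the same indices
    labels = sorted({state for src, dst, _ in flows for state in (src, dst)})
    index = {state: i for i, state in enumerate(labels)}
    source = [index[src] for src, _, _ in flows]
    target = [index[dst] for _, dst, _ in flows]
    value = [flow for _, _, flow in flows]
    return labels, source, target, value


# Three of the illustrative flows from the data above
flows = [
    ("Uttarakhand", "Uttar Pradesh", 500),
    ("Maharashtra", "Telangana", 800),
    ("Maharashtra", "Andhra Pradesh", 750),
]
labels, source, target, value = build_sankey_links(flows)
print(labels)
print(source, target, value)
```

The four lists map directly onto the arguments used above: `node=dict(label=labels)` and `link=dict(source=source, target=target, value=value)`.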

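2. 3D Pie Chart
This 3D pie chart on a sphere is built with Matplotlib, which lets us place each element at mathematically computed positions so that the data is represented precisely. That is the appeal of Matplotlib. Pandas and NumPy, both basic but powerful libraries, are used as well.
In this demonstration, vertical 3D bars on the sphere show wheat production across Indian states, with the height of each bar indicating that state's share of total production. The result is an attractive visualization that is well suited to research papers, and it takes only a short script to reproduce.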
# Install required libraries
!pip install numpy pandas matplotlib
# Import necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
# Define agricultural production data (in million metric tons)
data = {
"State": ["Uttar Pradesh", "Punjab", "Haryana", "Madhya Pradesh", "Maharashtra", "Bihar", "West Bengal"],
"Wheat": [30, 25, 18, 20, 10, 8, 5],
"Rice": [15, 12, 10, 8, 6, 20, 18],
"Sugarcane": [150, 40, 30, 25, 120, 35, 15]
}
# Convert to DataFrame
df = pd.DataFrame(data)
# Choose category to visualize (Wheat, Rice, or Sugarcane)
category = "Wheat"
# Extract values
labels = df["State"]
sizes = df[category]
# Normalize data to fit on sphere
sizes = sizes / np.sum(sizes) * 100
# Convert pie chart angles to spherical coordinates
theta = np.linspace(0, 2*np.pi, len(sizes) + 1)
phi = np.linspace(0, np.pi, len(sizes) + 1)
# Convert to Cartesian coordinates
x = np.outer(np.cos(theta), np.sin(phi))
y = np.outer(np.sin(theta), np.sin(phi))
z = np.outer(np.ones_like(theta), np.cos(phi))
# Create figure
fig = plt.figure(figsize=(10, 8))
ax = fig.add_subplot(111, projection='3d')
# Plot sphere
ax.plot_surface(x, y, z, color='lightblue', alpha=0.5)
# Define colors
cmap = plt.get_cmap("tab10")
colors = [cmap(i) for i in range(len(sizes))]
# Plot pie slices
for i in range(len(sizes)):
    ax.bar3d(
        x[i, :-1], y[i, :-1], z[i, :-1],
        0.05, 0.05, sizes[i] / 10, color=colors[i], alpha=0.8
    )
    # Add labels near bars
    ax.text(x[i, 0], y[i, 0], z[i, 0] + sizes[i] / 10 + 0.02, labels[i],
            color='black', fontsize=10, fontweight='bold')
# Labels
ax.set_title(f"State-wise {category} Production in India")
ax.set_xlabel("X Axis")
ax.set_ylabel("Y Axis")
ax.set_zlabel("Contribution (%)")
# Hide axis
ax.set_xticks([])
ax.set_yticks([])
ax.set_zticks([])
# Add legend
legend_patches = [plt.Line2D([0], [0], marker='o', color='w', markersize=10, markerfacecolor=colors[i], label=labels[i]) for i in range(len(labels))]
ax.legend(handles=legend_patches, loc='upper left', bbox_to_anchor=(1.1, 1))
# Show plot
plt.show()
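
The normalization step can be checked by hand. The sketch below is a pure-Python illustration using the wheat figures from the data above; the helper name `pie_wedges` is hypothetical. It converts raw values into percentage shares and into the start and end angle that each slice would occupy on a full circle.

```python
import math


# Hypothetical helper for illustration only.
def pie_wedges(values):
    """Return (share_percent, start_angle, end_angle) for each value."""
    total = sum(values)
    wedges = []
    start = 0.0
    for v in values:
        share = v / total * 100                 # percentage of the whole
        end = start + 2 * math.pi * v / total   # angle swept by this slice
        wedges.append((share, start, end))
        start = end
    return wedges


wheat = [30, 25, 18, 20, 10, 8, 5]  # million metric tons, from the data above
wedges = pie_wedges(wheat)
for share, start, end in wedges:
    print(f"{share:5.1f}%  {math.degrees(start):6.1f} to {math.degrees(end):6.1f} degrees")
```

The shares add up to 100 and the last slice ends at a full turn, which is a quick sanity check before plotting.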

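3. Kiviat Diagram (Radar Chart)
I find the radar chart to be one of the best ways to present air quality data when we need to show several pollutants for specific locations. It is built with Matplotlib, with NumPy for the calculations and Pandas for handling the data.
In the chart below, each polygon represents an Indian city and each corner represents a pollutant. For example, compared with cities such as Mumbai, Kolkata, and Bangalore, Delhi occupies the outermost ring and ranks worst in every pollutant category.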
# Install required libraries
!pip install numpy pandas matplotlib
# Import necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Define AQI data (in µg/m³)
data = {
"City": ["Delhi", "Mumbai", "Kolkata", "Bangalore", "Chennai"],
"PM2.5": [150, 90, 120, 80, 70],
"NO2": [60, 40, 50, 30, 35],
"CO": [2.1, 1.8, 1.5, 1.2, 1.3],
"SO2": [20, 15, 18, 10, 12]
}
# Convert to DataFrame
df = pd.DataFrame(data)
# Define AQI parameters
categories = ["PM2.5", "NO2", "CO", "SO2"]
num_vars = len(categories)
# Convert data to a range [0,1] for fair comparison
normalized_df = df.copy()
for col in categories:
    normalized_df[col] = df[col] / df[col].max()
# Convert to numpy array
values = normalized_df[categories].values
# Compute angles for radar chart
angles = np.linspace(0, 2 * np.pi, num_vars, endpoint=False).tolist()
# Close the radar chart (loop back to the first point)
angles += angles[:1]
# Create figure
fig, ax = plt.subplots(figsize=(8, 8), subplot_kw=dict(polar=True))
# Plot each city's AQI values
colors = ['red', 'blue', 'green', 'purple', 'orange']
for i, city in enumerate(df["City"]):
    city_values = values[i].tolist()
    city_values += city_values[:1]  # Close the shape
    ax.plot(angles, city_values, label=city, color=colors[i], linewidth=2)
    ax.fill(angles, city_values, color=colors[i], alpha=0.2)
# Add category labels
ax.set_xticks(angles[:-1])
ax.set_xticklabels(categories, fontsize=12)
# Set title
ax.set_title("Air Quality Index (AQI) Comparison Across Indian Cities", fontsize=14, fontweight='bold')
# Add legend
ax.legend(loc="upper right", bbox_to_anchor=(1.2, 1.1))
# Show plot
plt.show()
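
The two preparation steps in this example, scaling each pollutant and closing the polygon, can be sketched in plain Python. This is only an illustration under the same data as above; the helper name `closed_polygon` is hypothetical. Because every axis is divided by its own maximum, the city with the highest reading on an axis lands exactly on the outer ring, which is why Delhi traces the outermost shape.

```python
import math

cities = ["Delhi", "Mumbai", "Kolkata", "Bangalore", "Chennai"]
readings = {
    "PM2.5": [150, 90, 120, 80, 70],
    "NO2": [60, 40, 50, 30, 35],
    "CO": [2.1, 1.8, 1.5, 1.2, 1.3],
    "SO2": [20, 15, 18, 10, 12],
}

# Scale each pollutant by its own maximum so every axis runs from 0 to 1
normalized = {
    name: [v / max(values) for v in values]
    for name, values in readings.items()
}

# One evenly spaced angle per pollutant, then repeat the first to close the loop
n = len(readings)
angles = [2 * math.pi * k / n for k in range(n)]
angles.append(angles[0])


# Hypothetical helper for illustration only.
def closed_polygon(city_index):
    """Radii for one city, with the first value repeated at the end."""
    radii = [normalized[name][city_index] for name in readings]
    return radii + radii[:1]


delhi = closed_polygon(cities.index("Delhi"))
mumbai = closed_polygon(cities.index("Mumbai"))
print(delhi)
print(mumbai)
```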

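4. Treemap
We used Plotly to generate this treemap. Rather than drawing a tree with nodes and branches, it draws nested rectangles, one for each category and its parts. Compared with other chart types, it makes it easy to see what share of the whole each category takes up.
Simply stating figures such as "India has 80% tropical rainforest and 5% tundra" reads oddly, and we cannot always rely on tables in a research paper, since they can bore the reader. Instead, each box stands for a specific part of the whole, which makes the information more engaging. That is the main motivation for this chart, and we achieved it with Plotly.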
# Install required libraries
!pip install pandas plotly
# Import necessary libraries
import pandas as pd
import plotly.express as px
# Define forest cover data (in square kilometers)
data = {
"State": ["Madhya Pradesh", "Arunachal Pradesh", "Chhattisgarh", "Maharashtra", "Odisha", "Kerala", "West Bengal"],
"Forest Type": ["Deciduous", "Evergreen", "Deciduous", "Deciduous", "Deciduous", "Evergreen", "Mangroves"],
"Area": [77000, 9000, 54000, 50000, 45000, 29000, 4200]
}
# Convert to DataFrame
df = pd.DataFrame(data)
# Create Treemap
fig = px.treemap(df,
path=["State", "Forest Type"], # Hierarchical structure
values="Area",
color="Forest Type",
title="Forest Cover Distribution by State & Type in India",
color_discrete_map={"Deciduous": "green", "Evergreen": "darkgreen", "Mangroves": "blue"})
# Show plot
fig.show()
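
The share that each rectangle stands for can be computed directly. The following is a small pure-Python sketch using the same illustrative forest-cover figures as above; it groups the areas by forest type and converts them to percentages of the total.

```python
from collections import defaultdict

# (state, forest type, area in square kilometers), from the data above
rows = [
    ("Madhya Pradesh", "Deciduous", 77000),
    ("Arunachal Pradesh", "Evergreen", 9000),
    ("Chhattisgarh", "Deciduous", 54000),
    ("Maharashtra", "Deciduous", 50000),
    ("Odisha", "Deciduous", 45000),
    ("Kerala", "Evergreen", 29000),
    ("West Bengal", "Mangroves", 4200),
]

# Sum the area for each forest type
area_by_type = defaultdict(int)
for _state, forest_type, area in rows:
    area_by_type[forest_type] += area

# Convert each total into a percentage of the overall area
total_area = sum(area_by_type.values())
share_by_type = {t: a / total_area * 100 for t, a in area_by_type.items()}

for forest_type, pct in sorted(share_by_type.items(), key=lambda item: -item[1]):
    print(f"{forest_type}: {pct:.1f}%")
```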

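5. Waterfall Chart
Waterfall charts are quite popular in finance, where they are used to present a country's income, earnings, gross domestic product, trade deficit, and so on. This chart is a little harder to digest than an ordinary bar chart because it involves a running net value built up from additions and subtractions. That is also why it is widely used in research: readers can follow the calculation directly from the chart instead of working through each step themselves.
To make this concrete, look at the table below. The final total is 990 BCM of rainfall contribution, which is the sum over all of the states. Kerala contributes 100 of that total: subtracting the cumulative total through Tamil Nadu (890) from 990 leaves Kerala's contribution. The contribution of every other state can be worked out in the same way.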
Step  State             Rainfall Contribution (BCM)  Cumulative Total (BCM)
1     Uttarakhand       +120                         120
2     Uttar Pradesh     +180                         300 (120 + 180)
3     Himachal Pradesh  +90                          390 (300 + 90)
4     Maharashtra       +150                         540 (390 + 150)
5     Andhra Pradesh    +130                         670 (540 + 130)
6     Karnataka         +140                         810 (670 + 140)
7     Tamil Nadu        +80                          890 (810 + 80)
8     Kerala            +100                         990 (890 + 100)
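
The running totals in the table can be reproduced in a few lines. This is a minimal sketch in plain Python, using `itertools.accumulate` from the standard library and the same contribution figures; it also recovers Kerala's bar by subtracting the previous running total from the final one.

```python
from itertools import accumulate

states = ["Uttarakhand", "Uttar Pradesh", "Himachal Pradesh", "Maharashtra",
          "Andhra Pradesh", "Karnataka", "Tamil Nadu", "Kerala"]
contributions = [120, 180, 90, 150, 130, 140, 80, 100]  # BCM

# Running total after each state, as in the last column of the table
cumulative = list(accumulate(contributions))

# Kerala's contribution: final total minus the total through Tamil Nadu
kerala = cumulative[-1] - cumulative[-2]

for state, delta, total in zip(states, contributions, cumulative):
    print(f"{state:<17} +{delta:<4} {total}")
print("Kerala:", kerala)
```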
# Install required libraries
!pip install pandas plotly
# Import necessary libraries
import pandas as pd
import plotly.graph_objects as go
# Define rainfall contribution data (in BCM - Billion Cubic Meters)
data = {
"State": ["Uttarakhand", "Uttar Pradesh", "Himachal Pradesh", "Maharashtra",
"Andhra Pradesh", "Karnataka", "Tamil Nadu", "Kerala"],
"River Basin": ["Ganga", "Ganga", "Yamuna", "Godavari", "Godavari", "Krishna", "Krishna", "Western Ghats"],
"Rainfall Contribution": [120, 180, 90, 150, 130, 140, 80, 100]
}
# Convert to DataFrame
df = pd.DataFrame(data)
# Create Waterfall Chart
fig = go.Figure(go.Waterfall(
name="Rainfall Contribution",
orientation="v",
measure=["relative"] * len(df), # Makes it a cumulative plot
x=df["State"], # X-axis: States
text=df["River Basin"], # Shows River Basin names
y=df["Rainfall Contribution"], # Y-axis: Rainfall Contribution
textposition="outside",
connector={"line":{"color":"rgb(63, 63, 63)"}}
))
# Customize layout
fig.update_layout(title="State-wise Rainfall Contribution to River Basins in India",
xaxis_title="States",
yaxis_title="Rainfall Contribution (BCM)",
showlegend=False)
# Show plot
fig.show()

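6. Sunburst Chart
The sunburst chart is one of the most commonly used charts in scientific research documents. It shows percentages or shares within a hierarchy, for example how a country's population is distributed across its provinces.
Take the chart below as an example: India with its most populous states and cities. For instance, Pune and Mumbai are the most populous cities in Maharashtra. We built it with Plotly and Pandas.
Here, Pandas organizes the data so that we can identify the most populous cities and their respective states, while Plotly presents this information graphically.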
# Install required libraries
!pip install pandas plotly
# Import necessary libraries
import pandas as pd
import plotly.express as px
# Define hierarchical population data
data = {
"Country": ["India"] * 7,
"State": ["Uttar Pradesh", "Uttar Pradesh", "Maharashtra", "Maharashtra", "West Bengal", "Tamil Nadu", "Karnataka"],
"District": ["Lucknow", "Varanasi", "Mumbai", "Pune", "Kolkata", "Chennai", "Bangalore"],
"Population": [3.5, 2.3, 12.5, 7.1, 4.6, 5.3, 11.0] # Population in millions
}
# Convert to DataFrame
df = pd.DataFrame(data)
# Create Sunburst Chart (Fixed Version)
fig = px.sunburst(df,
path=["Country", "State", "District"], # Hierarchical structure
values="Population",
title="India's Population Breakdown (Country → State → District)")
# Show plot
fig.show()
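
In a sunburst built this way, each inner ring segment is sized by the sum of the segments outside it. The sketch below illustrates that roll-up in plain Python with the same illustrative population figures; it is not how Plotly computes the chart internally, only a way to check the numbers you expect to see.

```python
from collections import defaultdict

# (state, city, population in millions), from the data above
rows = [
    ("Uttar Pradesh", "Lucknow", 3.5),
    ("Uttar Pradesh", "Varanasi", 2.3),
    ("Maharashtra", "Mumbai", 12.5),
    ("Maharashtra", "Pune", 7.1),
    ("West Bengal", "Kolkata", 4.6),
    ("Tamil Nadu", "Chennai", 5.3),
    ("Karnataka", "Bangalore", 11.0),
]

# Roll city populations up to their state
state_totals = defaultdict(float)
for state, _city, population in rows:
    state_totals[state] += population

# Roll state totals up to the country (the center of the sunburst)
country_total = sum(state_totals.values())

for state, total in state_totals.items():
    print(f"{state}: {total:.1f} million ({total / country_total:.0%} of the chart)")
```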

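7. Line Chart
When we compare two different things to show how they progress over time, we usually use a line chart. In the chart below I plot two lines that show how each series changes over its time span, with the data placed on the X and Y axes.
In geographic research papers, this chart is used to show rainfall, temperature, and cyclone intensity. Below, I use data on the rising number of cyclones in the Bay of Bengal and the Arabian Sea.
We built this chart with only basic code. It uses Matplotlib, a Python library widely used for scientific plotting, and NumPy to store the years in ascending order alongside the cyclone counts.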
import numpy as np
import matplotlib.pyplot as plt
# Define cyclone frequency data (years vs. cyclone counts)
years = np.array([1950, 1960, 1970, 1980, 1990, 2000, 2010, 2020])
bay_of_bengal = np.array([4, 5, 8, 6, 7, 10, 12, 15])
arabian_sea = np.array([1, 2, 3, 3, 4, 5, 6, 8])
# Plot cyclone frequency trends
plt.figure(figsize=(10, 5))
plt.plot(years, bay_of_bengal, marker='o', linestyle='-', label='Bay of Bengal', color='red')
plt.plot(years, arabian_sea, marker='s', linestyle='-', label='Arabian Sea', color='blue')
# Customize labels and title
plt.xlabel("Year")
plt.ylabel("Cyclone Count")
plt.title("Cyclone Frequency Trends Over Decades")
plt.legend()
plt.grid(True)
# Show plot
plt.show()
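
To put a number on the upward trend that the two lines show, one option is a least-squares slope. The sketch below is an illustration in plain Python with a hypothetical helper `slope_per_decade`; it uses the same illustrative counts as the plot and is not part of the original analysis.

```python
# Hypothetical helper for illustration only.
def slope_per_decade(years, counts):
    """Least-squares slope of counts against years, scaled to one decade."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(counts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx * 10


years = [1950, 1960, 1970, 1980, 1990, 2000, 2010, 2020]
bay_of_bengal = [4, 5, 8, 6, 7, 10, 12, 15]
arabian_sea = [1, 2, 3, 3, 4, 5, 6, 8]

bay_trend = slope_per_decade(years, bay_of_bengal)
arabian_trend = slope_per_decade(years, arabian_sea)
print(f"Bay of Bengal: {bay_trend:.2f} more cyclones per decade")
print(f"Arabian Sea:   {arabian_trend:.2f} more cyclones per decade")
```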

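8. Radial Chart
Below we use a spiral or radial chart, which is generally used to show how quantities such as precipitation and temperature change over time. This kind of chart is very useful in weather forecasting.
Values near the center are the smallest, and they grow as we move toward the outer edge. Look at the chart below, which plots rainfall and temperature against the months. You can see how clearly it presents the data: there are four rainy months, while the temperature stays steady for 6-7 months.
We used Matplotlib, one of the most capable data visualization libraries and a standard choice for scientific data. We also used NumPy for the mathematical calculations and Pandas to filter the data and arrange it by month.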
# Install required libraries
!pip install numpy pandas matplotlib
# Import necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Define monthly temperature & rainfall data
months = np.array(["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"])
temperature = np.array([15, 18, 22, 28, 32, 35, 33, 31, 29, 26, 22, 18]) # in °C
rainfall = np.array([20, 25, 30, 40, 50, 150, 300, 280, 180, 100, 50, 30]) # in mm
# Convert months to angles; endpoint=False keeps December from landing on top of January
theta = np.linspace(0, 2*np.pi, len(months), endpoint=False)
# Create figure & polar plot
fig, ax = plt.subplots(figsize=(8, 8), subplot_kw={"projection": "polar"})
# Plot temperature in spiral form
ax.plot(theta, temperature, marker="o", color="red", label="Temperature (°C)", linewidth=2)
ax.fill(theta, temperature, color="red", alpha=0.2)
# Plot rainfall in spiral form (scaled for visualization)
ax.plot(theta, rainfall / 10, marker="o", color="blue", label="Rainfall (mm / 10)", linewidth=2)
ax.fill(theta, rainfall / 10, color="blue", alpha=0.2)
# Customize labels
ax.set_xticks(theta)
ax.set_xticklabels(months, fontsize=12)
ax.set_title("Spiral Chart of Monthly Temperature & Rainfall in India", fontsize=14, fontweight="bold")
# Add legend
ax.legend(loc="upper right")
# Show plot
plt.show()
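
On a polar axis, an angle of 0 and an angle of 2π point in the same direction, so twelve months need twelve distinct angles that stop one step short of a full turn. The sketch below shows that spacing in plain Python, and how repeating the first point closes the curve from December back to January. The variable names are illustrative.

```python
import math

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
temperature = [15, 18, 22, 28, 32, 35, 33, 31, 29, 26, 22, 18]  # in °C

# Twelve evenly spaced angles: 0, 30, 60, ... 330 degrees
angles = [2 * math.pi * k / len(months) for k in range(len(months))]

# Repeat the first point at the end so a plotted line joins Dec back to Jan
closed_angles = angles + angles[:1]
closed_temperature = temperature + temperature[:1]

for month, angle in zip(months, angles):
    print(f"{month}: {math.degrees(angle):5.1f} degrees")
```

Passing `closed_angles` and `closed_temperature` to `ax.plot` would draw a closed outline, while the twelve values in `angles` remain the right positions for the month tick labels.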

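9. Voronoi Diagram
A Voronoi diagram is effectively another way of drawing a map, so it is very useful when we need to present data distributed over a surface. Each site gets the region of points that are closer to it than to any other site. It is also widely used for mapping in astrophysics, so if you are working on an astrophysics or astronomy project, this technique is worth trying. It maps data by location, much as Google's location tracking does.
I also noticed that image quality improves simply by increasing the pixel count. We built this with Matplotlib, one of the best tools for scientific visualization. Voronoi diagrams are very effective for showing product distribution and spatial data.
Besides Matplotlib, we used Pandas to filter the values, NumPy for mathematical calculations such as distance measurements, and SciPy to compute the Voronoi cells. The result is a two-dimensional plot of the data.
In this example, we show the urban density of India. For example, Kolkata's cell covers a larger area with a population of about 15 million, while Mumbai's cell covers a smaller area yet holds about 20 million.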
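
The link between pixel count and image quality can be made concrete: the pixel dimensions of a raster figure are its size in inches multiplied by the DPI. The sketch below is plain arithmetic with a hypothetical helper `pixel_size`; in Matplotlib the corresponding setting is the `dpi` argument of `savefig`.

```python
# Hypothetical helper for illustration only.
def pixel_size(figsize_inches, dpi):
    """Pixel width and height of a figure saved at the given DPI."""
    width_in, height_in = figsize_inches
    return round(width_in * dpi), round(height_in * dpi)


screen = pixel_size((10, 8), 100)  # a typical on-screen resolution
print_ready = pixel_size((10, 8), 300)  # a common target for print
print(screen, print_ready)
```

For a figure created with `figsize=(10, 8)`, saving with something like `fig.savefig("voronoi.png", dpi=300)` (the file name is only an example) therefore yields three times as many pixels along each side as saving at 100 DPI.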
# Install required libraries
!pip install numpy pandas matplotlib scipy
# Import necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.spatial import Voronoi, voronoi_plot_2d
# Define city data (coordinates & population)
data = {
"City": ["Mumbai", "Delhi", "Bangalore", "Kolkata", "Chennai", "Hyderabad", "Pune", "Ahmedabad", "Jaipur", "Lucknow"],
"Population": [20.4, 18.9, 12.9, 14.9, 10.5, 10.0, 7.5, 8.0, 3.9, 3.5], # in millions
"Latitude": [19.0760, 28.7041, 12.9716, 22.5726, 13.0827, 17.3850, 18.5204, 23.0225, 26.9124, 26.8467],
"Longitude": [72.8777, 77.1025, 77.5946, 88.3639, 80.2707, 78.4867, 73.8567, 72.5714, 75.7873, 80.9462]
}
# Convert to DataFrame
df = pd.DataFrame(data)
# Extract coordinates
points = np.column_stack((df["Longitude"], df["Latitude"]))
# Compute Voronoi diagram
vor = Voronoi(points)
# Create figure
fig, ax = plt.subplots(figsize=(10, 8))
# Plot Voronoi diagram
voronoi_plot_2d(vor, ax=ax, show_vertices=False, line_colors='blue', line_width=1, line_alpha=0.6)
# Plot city points
ax.scatter(df["Longitude"], df["Latitude"], color="red", s=df["Population"] * 10, label="Cities (Scaled by Population)")
# Add city names
for i, city in enumerate(df["City"]):
    ax.text(df["Longitude"][i], df["Latitude"][i], city, fontsize=12, fontweight="bold", ha="right")
# Customize plot
ax.set_title("Voronoi Diagram of Indian Cities by Population Density", fontsize=14, fontweight="bold")
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
ax.legend()
plt.grid(True)
# Show plot
plt.show()
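
What a Voronoi cell means can be shown without any plotting: a location belongs to the cell of whichever city is nearest to it. The sketch below is an illustration in plain Python with a hypothetical helper `nearest_city` and five of the cities from the data above. Like the diagram above, it treats longitude and latitude as flat x and y coordinates, which is an approximation rather than a true distance on the globe.

```python
import math

# (longitude, latitude) for five of the cities in the data above
cities = {
    "Mumbai": (72.8777, 19.0760),
    "Delhi": (77.1025, 28.7041),
    "Bangalore": (77.5946, 12.9716),
    "Kolkata": (88.3639, 22.5726),
    "Chennai": (80.2707, 13.0827),
}


# Hypothetical helper for illustration only.
def nearest_city(lon, lat):
    """Name of the city whose Voronoi cell contains the point (lon, lat)."""
    def distance(name):
        city_lon, city_lat = cities[name]
        # Planar distance in degrees, matching the flat lon/lat plot above
        return math.hypot(lon - city_lon, lat - city_lat)
    return min(cities, key=distance)


print(nearest_city(73.0, 19.0))
print(nearest_city(77.0, 28.0))
print(nearest_city(80.0, 13.0))
```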
