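Data source
The data were obtained from a SUMO simulation with traffic lights configured and random traffic generated by SUMO's randomTrips tool. One lane of the network was selected, and E1 and E2 detectors were used to record "meanSpeed (m/s)", "flow (#/hour)", "meanHaltingDuration (s)", "nVehEntered (#)", and "meanOccupancy (%)" at successive times. The meaning of these values is given in the table below: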
name | definition |
---|---|
meanVehicleNumber(#vehicles) | The mean number of vehicles that were on the detector (averaged over the interval duration). |
flow (#vehicles/hour) | The number of contributing vehicles extrapolated to an hour |
meanSpeed (m/s) | The mean velocity over all collected data samples. |
meanOccupancy (%) | The percentage (0-100%) of the detector’s place that was occupied by vehicles, summed up for each time step and averaged by the interval duration. |
meanHaltingDuration (s) | The mean halting duration of vehicles that entered the area and are still inside or have left the area within the reported interval. |
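From these values we obtain the link density (meanOccupancy), traffic flow (flow), speed (meanSpeed), travel time (derived from meanSpeed and the lane length), the number of vehicles on the link (meanVehicleNumber), and the mean waiting time (meanHaltingDuration).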
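Travel time is not reported by the detectors directly; the CSV-writing code further down derives it as lane length divided by mean speed, with a small epsilon so the division stays defined at zero speed. A minimal sketch of that calculation follows. The function name `mean_travel_time` and the 100 m lane length are illustrative assumptions, not part of the original script.

```python
def mean_travel_time(lane_length_m, mean_speed_mps, epsilon=0.001):
    """Estimate travel time (s) as lane length / mean speed.

    epsilon keeps the result finite when the reported mean speed is 0,
    mirroring the epsilon used in the CSV export.
    """
    return lane_length_m / (mean_speed_mps + epsilon)


# Hypothetical 100 m lane at two different mean speeds.
free_flow = mean_travel_time(100.0, 10.0)
stopped = mean_travel_time(100.0, 0.0)
```

With a zero mean speed the estimate becomes very large rather than infinite, so intervals in which vehicles are stopped will dominate the scale of any travel-time plot.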
Code:
- Set the time range (the simulation time step is 1 s; data are collected every 50 s)
t_start = 200
t_end = 20000
sample_interval = 50
- Read the E1 detector output
import xml.etree.ElementTree as ET
from collections import defaultdict
AverageFlow = defaultdict(list)
tree = ET.parse("E1.output.xml")
root = tree.getroot()
for interval_elem in root.iter("interval"):
    # The id attribute of each interval is used as the lane key
    lane_id = interval_elem.get("id")
    interval_end = float(interval_elem.get("end"))
    averageflow = interval_elem.get("flow")
    # Keep only intervals that end within [t_start, t_end)
    if t_start <= interval_end < t_end:
        AverageFlow[lane_id].append(averageflow)
- Read the E2 detector output
meanHaltingDuration = defaultdict(list)
meanOccupancy = defaultdict(list)
meanVehicleNumber = defaultdict(list)
tree = ET.parse("E2.output.xml")
root = tree.getroot()
for interval_elem in root.iter("interval"):
    lane_id = interval_elem.get("id")
    interval_end = float(interval_elem.get("end"))
    waitingtime = interval_elem.get("meanHaltingDuration")
    density = interval_elem.get("meanOccupancy")
    vehicle_num = interval_elem.get("meanVehicleNumber")
    # Keep only intervals that end within [t_start, t_end)
    if t_start <= interval_end < t_end:
        meanHaltingDuration[lane_id].append(waitingtime)
        meanOccupancy[lane_id].append(density)
        meanVehicleNumber[lane_id].append(vehicle_num)
- Write the data to CSV files
import csv
# lane_net, time and AverageSpeed are not shown in this excerpt
epsilon = 0.001  # avoids division by zero when the mean speed is 0
for lane in lane_net.keys():
    filename = lane + '.csv'
    with open(filename, mode='w', newline='') as file:
        writer = csv.writer(file)
        writer.writerow([
            "Time(s)", "AverageSpeed(m/s)",
            "AverageFlow(#/hour)", "meanHaltingDuration(s)",
            "meanVehicleNumber(#)", "meanOccupancy(%)", "MeanTravelTime(s)"
        ])
        for i in range(len(time)):
            writer.writerow([
                time[i], AverageSpeed[lane][i], AverageFlow[lane][i],
                meanHaltingDuration[lane][i], meanVehicleNumber[lane][i],
                meanOccupancy[lane][i], lane_net[lane]['laneLength'] /
                (AverageSpeed[lane][i] + epsilon)
            ])
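The two detector-reading steps above share one pattern: iterate over the `interval` elements, keep those whose `end` falls in the sampling window, and group one attribute by the element's `id`. The sketch below shows that pattern on an inline XML string so it can be run without a simulation. The helper name `collect`, the root element name, the ids `lane_a` and `lane_b`, and all numeric values are made up for illustration. Unlike the original code, it converts the values to `float`.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Made-up sample in the general shape of a detector output file; the ids
# and values are illustrative only, not results from the simulation.
SAMPLE_XML = """
<detector>
    <interval begin="100.00" end="150.00" id="lane_a" flow="720.00"/>
    <interval begin="150.00" end="200.00" id="lane_a" flow="648.00"/>
    <interval begin="200.00" end="250.00" id="lane_a" flow="576.00"/>
    <interval begin="150.00" end="200.00" id="lane_b" flow="360.00"/>
</detector>
"""


def collect(root, attribute, t_start, t_end):
    """Group one numeric attribute by id for intervals ending in [t_start, t_end)."""
    values = defaultdict(list)
    for interval_elem in root.iter("interval"):
        interval_end = float(interval_elem.get("end"))
        if t_start <= interval_end < t_end:
            values[interval_elem.get("id")].append(
                float(interval_elem.get(attribute))
            )
    return values


root = ET.fromstring(SAMPLE_XML)
flow_by_id = collect(root, "flow", t_start=200, t_end=20000)
```

Converting to `float` at read time is a design choice made here so that later arithmetic (such as the travel-time division) works on numbers rather than strings.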
Data characteristics
The last time stamp is 19950 s, giving 396 time points in total, each with 6 features.
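These figures follow from the sampling settings and the filter `t_start <= interval_end < t_end`. A quick check, assuming every interval ends on a multiple of `sample_interval`:

```python
t_start = 200
t_end = 20000
sample_interval = 50

# Interval end times that pass the filter t_start <= end < t_end,
# assuming every interval ends on a multiple of sample_interval.
time_points = list(range(t_start, t_end, sample_interval))
n_points = len(time_points)
last_time = time_points[-1]
```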
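Data processing and analysis
We analyse how each quantity changes over time, the statistical distribution of each traffic quantity, and the correlations between them.
Line charts
Plot each traffic quantity against time. The code is as follows: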
import matplotlib.pyplot as plt
# Function to create a line chart
def plot_line_chart(df, x, y, y_label, title):
    plt.figure(figsize=(14, 7))
    plt.plot(df[x], df[y], marker='o')
    plt.title(title)
    plt.xlabel(x)
    plt.ylabel(y_label)
    plt.grid(True)
    plt.show()
# Create line charts for average speed, average flow, mean travel time, mean halting duration, mean vehicle number, and mean occupancy
plot_line_chart(traffic_data, 'Time(s)', 'AverageSpeed(m/s)', 'Speed (m/s)', 'Average Speed Over Time')
plot_line_chart(traffic_data, 'Time(s)', 'AverageFlow(#/hour)', 'Flow (#/hour)', 'Average Flow Over Time')
plot_line_chart(traffic_data, 'Time(s)', 'MeanTravelTime(s)', 'Travel Time (s)', 'Mean Travel Time Over Time')
plot_line_chart(traffic_data, 'Time(s)', 'meanHaltingDuration(s)', 'Halting Duration (s)', 'Mean Halting Duration Over Time')
plot_line_chart(traffic_data, 'Time(s)', 'meanVehicleNumber(#)', 'Vehicle Number (#)', 'Mean Vehicle Number Over Time')
plot_line_chart(traffic_data, 'Time(s)', 'meanOccupancy(%)', 'Occupancy (%)', 'Mean Occupancy Over Time')
The resulting line charts are shown below:






Boxplot
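Next we examine the statistical characteristics of each traffic quantity. Boxplots are used because they show the distribution, median, and quartiles of each feature, as well as possible outliers. The plotting code is as follows: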
# Set up the matplotlib figure and axes, with 2 rows and 3 columns
fig, axes = plt.subplots(2, 3, figsize=(15, 10))
# Flatten the axes array for easy iteration
axes = axes.flatten()
# Plot colored boxplots for each feature in its own subplot
for i, feature in enumerate(traffic_data.columns[1:]):
    traffic_data.boxplot(feature, ax=axes[i], patch_artist=True,
                         boxprops=dict(facecolor='#99FF99'))
    axes[i].set_title(f'Boxplot of {feature}', fontsize=10)
    axes[i].set_xlabel('')
    axes[i].set_ylabel('')
# Adjust layout to prevent overlapping
plt.tight_layout()
plt.show()
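As a numeric counterpart to the boxplots, the quantities a boxplot draws can be computed directly: the quartiles, the interquartile range (IQR), and the points beyond 1.5 × IQR from the box, which is the usual outlier convention. The sketch below uses only the standard library. The function name `box_stats` and the sample values are illustrative assumptions, not data from the simulation.

```python
import statistics


def box_stats(values, whisker=1.5):
    """Return quartiles, IQR, and points outside the whisker limits."""
    # method="inclusive" interpolates linearly between data points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    outliers = [v for v in values if v < low or v > high]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr,
            "outliers": outliers}


# Made-up halting durations (s) with one unusually long wait.
sample = [2.0, 4.0, 4.0, 5.0, 6.0, 7.0, 8.0, 30.0]
stats = box_stats(sample)
```

Listing the outliers by value makes it easier to trace them back to specific time intervals than reading them off the plot.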
The resulting boxplots are shown below:

Correlation Heatmap
Plot a correlation heatmap to examine the correlation coefficients between the traffic quantities. The code is as follows:
import seaborn as sns
import pandas as pd
from matplotlib import rcParams
# Set the global font to be Times New Roman
rcParams['font.family'] = 'serif'
rcParams['font.serif'] = 'Times New Roman'
df = pd.read_csv('0-4_0.csv')
# Calculate the correlation matrix
correlation_matrix = df.corr()
# Plot the heatmap
plt.figure(figsize=(12, 8))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt=".2f")
plt.title('Heatmap of Correlation Between Different Traffic Features', fontsize=20)
# plt.xticks(rotation=0)
plt.show()
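The heatmap shows how the quantities relate to one another. For example, the number of vehicles on the link is strongly positively correlated with the link density, while the mean travel time is negatively correlated with the mean speed, flow, vehicle number, and density.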
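`DataFrame.corr()` computes the Pearson correlation coefficient by default. The sketch below writes that coefficient out by hand and applies it to made-up speeds and the travel times derived from them, to illustrate why travel time and speed come out negatively correlated when travel time is computed as lane length over speed. The function name `pearson`, the speeds, and the 100 m lane length are illustrative assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up mean speeds (m/s) on a hypothetical 100 m lane.
speed = [12.0, 10.0, 8.0, 4.0, 2.0]
lane_length = 100.0
travel_time = [lane_length / v for v in speed]

r_speed_time = pearson(speed, travel_time)
```

Because travel time is a decreasing but nonlinear function of speed, the coefficient is negative without reaching -1; Pearson correlation measures only the linear part of the relationship.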