Preface
The example in this article, an SVR tuned with the whale optimization algorithm, is built on 11 independent variables (x) and 2 dependent variables (y).
Algorithm Principles
SVR
SVR (Support Vector Regression) is a widely used statistical learning method for regression problems. It finds a hyperplane in a (possibly kernel-induced) high-dimensional space that keeps the deviation between predicted and true values as small as possible. Its objective function can be written as
\min_{w,b}\; \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\max\bigl(0,\; \lvert y_i-(w^{\top}x_i+b)\rvert-\varepsilon\bigr)
where y_i is the target variable, x_i the input vector, and w and b are the model parameters. C is the regularization constant that penalizes deviations larger than ε, and ε is the width of the insensitive tube; for kernel SVR a kernel coefficient γ additionally controls the kernel shape. C, ε and γ are exactly the hyperparameters that the whale optimization algorithm tunes in this project.
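As a minimal sketch of how these hyperparameters are used in practice (assuming scikit-learn and randomly generated toy data in place of the project's 数据1.csv; an RBF kernel is chosen here so that γ actually takes effect, whereas the project code below uses a linear kernel):

import numpy as np
from sklearn.svm import SVR
from sklearn.metrics import r2_score

# Toy data: 20 samples, 11 features, one target (placeholder for the real dataset)
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(20, 11))
y_toy = X_toy @ rng.normal(size=11) + 0.1 * rng.normal(size=20)

# C, epsilon and gamma are the hyperparameters the WOA will later search over
model = SVR(kernel='rbf', C=10.0, epsilon=0.1, gamma=0.05)
model.fit(X_toy, y_toy)
print("R2 on the toy data:", r2_score(y_toy, model.predict(X_toy)))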
Whale Optimization Algorithm
The Whale Optimization Algorithm (WOA) is a heuristic optimization algorithm that mimics the hunting behaviour of whales; in this project it is used to find the best SVR hyperparameters. The algorithm searches for the optimum of a problem by simulating how whales encircle their prey. Its main steps are: initialize the positions of a population of whales, compute the fitness of each whale, update the whales' positions, and check whether the stopping condition is met. Here each whale corresponds to one candidate SVR model: a whale's position encodes the SVR hyperparameters, and its fitness is the goodness of fit (R²) of the SVR trained with those hyperparameters. By repeatedly updating the whales' positions, the algorithm drives R² toward a satisfactory value and thereby finds near-optimal SVR hyperparameters; the SVR models trained with these hyperparameters then predict stress and strain values with the smallest deviation from the true values.
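As an isolated sketch of the position update used later (the "encircling prey" equations from the WOA literature; all numbers below are illustrative only):

import numpy as np

rng = np.random.default_rng(1)
dim = 3                                      # one dimension per hyperparameter being tuned
best_solution = np.array([10.0, 0.1, 0.05])  # current best whale (illustrative values)
whale = np.array([50.0, 0.5, 0.5])           # whale whose position is being updated

a = 1.0                                  # in the full algorithm, a decreases linearly from 2 to 0
A = 2 * a * rng.random(dim) - a          # Eq. (2.3)
C = 2 * rng.random(dim)                  # Eq. (2.4)
D = np.abs(C * best_solution - whale)    # Eq. (2.5): distance to the current best
whale = best_solution - A * D            # Eq. (2.2): move toward / around the current best
print(whale)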
Algorithm Implementation Process
1. First, define the objective function of the optimization problem. The goal of this project is to find a set of SVR hyperparameters that minimizes the prediction error; concretely, the objective used below is the negated R² score of the trained SVR, so minimizing it maximizes the goodness of fit.
- Then, use the whale optimization algorithm to find the minimum of this objective. The WOA searches the solution space by simulating whale hunting behaviour; its basic steps are:
a. Initialize a population of whales (candidate solutions), each whale representing one combination of hyperparameters.
b. Compute the fitness of every whale (i.e. its objective value).
c. Update the whales' positions so that they move toward the current best solution.
d. Check whether the stopping condition is met (e.g. an iteration limit or a fitness-change threshold); if so, output the current best solution, otherwise return to step b.
- Next, plug the optimal solution into the SVR models to obtain the final predictions. Concretely, the optimal hyperparameters (C, ε and γ for each target) are passed to the SVR constructors, the models are trained on the training set, and their performance is evaluated on the test set. A minimal toy sketch of steps a–d follows below; the project-specific implementation is developed in the subsequent sections.
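A minimal, self-contained sketch of steps a–d on a toy objective (a sphere function, purely illustrative):

import numpy as np

def sphere(x):
    # Toy objective with its minimum (0) at the origin
    return float(np.sum(x ** 2))

def woa_sketch(objective, bounds, iterations=50, population_size=10):
    rng = np.random.default_rng(42)
    dim = len(bounds)
    # a. initialize a population of whales (candidate solutions) inside the bounds
    population = rng.uniform(bounds[:, 0], bounds[:, 1], size=(population_size, dim))
    for it in range(iterations):
        # b. evaluate the fitness of every whale
        fitness = np.array([objective(p) for p in population])
        best = population[np.argmin(fitness)].copy()
        a = 2 - it * (2 / iterations)  # a decreases linearly from 2 to 0
        # c. move every whale toward the current best solution ("encircling prey")
        for i in range(population_size):
            A = 2 * a * rng.random(dim) - a
            C = 2 * rng.random(dim)
            D = np.abs(C * best - population[i])
            population[i] = np.clip(best - A * D, bounds[:, 0], bounds[:, 1])
    # d. stop after the fixed number of iterations and return the best whale found
    fitness = np.array([objective(p) for p in population])
    return population[np.argmin(fitness)]

bounds = np.array([(-5.0, 5.0)] * 4)
print(woa_sketch(sphere, bounds))  # typically close to the origin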
Data Preparation
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from tqdm import tqdm
import matplotlib.pyplot as plt
plt.rcParams['font.family'] = 'SimHei'  # a font with CJK glyphs, only needed if the plot labels are kept in Chinese
# Read the data from the CSV file
data = pd.read_csv('数据1.csv')
# Split off the 11 feature columns and the 2 target columns (stress and strain)
X = data.iloc[:, :11].values
y = data.iloc[:, -2:].values
# Standardize features and targets
scaler_X = StandardScaler()
scaler_y = StandardScaler()
X_scaled = scaler_X.fit_transform(X)
y_scaled = scaler_y.fit_transform(y)
# Split into training and test sets (90% / 10%)
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y_scaled, test_size=0.1, random_state=42)
Whale Optimization Algorithm
# Whale optimization algorithm (simplified: only the "encircling prey" position update is used)
def whale_optimization_algorithm(objective_function, bounds, iterations):
    population_size = 10
    dim = len(bounds)
    population = np.random.uniform(low=bounds[:, 0], high=bounds[:, 1], size=(population_size, dim))
    fitness_values = np.zeros(population_size, dtype=object)  # object dtype: each entry holds a (-R2_1, -R2_2) tuple
    # Progress bar via tqdm
    progress_bar = tqdm(total=iterations, desc="Optimizing", position=0)
    # Record the best R2 values after each iteration
    r2_values_svr1 = []
    r2_values_svr2 = []
    for iteration in range(iterations):
        for i in range(population_size):
            fitness_values[i] = objective_function(population[i])
        # argmin compares the tuples lexicographically, so the first objective (stress model) takes priority
        best_index = np.argmin(fitness_values)
        best_solution = population[best_index].copy()  # copy so the in-place updates below do not alter the current best
        a = 2 - iteration * (2 / iterations)  # a decreases linearly from 2 to 0
        for i in range(population_size):
            r1 = np.random.random(dim)  # random values in [0, 1)
            r2 = np.random.random(dim)
            A = 2 * a * r1 - a  # Eq. (2.3)
            C = 2 * r2  # Eq. (2.4)
            D = np.abs(C * best_solution - population[i])  # Eq. (2.5)-Eq. (2.6)
            new_solution = best_solution - A * D  # Eq. (2.2)
            new_solution = np.clip(new_solution, bounds[:, 0], bounds[:, 1])  # keep within bounds
            population[i] = new_solution
        # Record the best R2 values of this iteration
        current_params = population[best_index]
        current_r2_svr1, current_r2_svr2 = -fitness_values[best_index][0], -fitness_values[best_index][1]
        r2_values_svr1.append(current_r2_svr1)
        r2_values_svr2.append(current_r2_svr2)
        # Optionally print the current parameters and R2 values
        # print(f"Iteration {iteration + 1}: Params: {current_params}, R2 SVR1: {current_r2_svr1}, R2 SVR2: {current_r2_svr2}")
        # Advance the progress bar
        progress_bar.update(1)
    # Close the progress bar
    progress_bar.close()
    return best_solution, r2_values_svr1, r2_values_svr2
Defining the Objective Function
# Objective function for the optimizer
def optimization_function(params):
    C1, epsilon1, gamma1, C2, epsilon2, gamma2 = params
    # Build the two SVR models (note: scikit-learn ignores gamma when kernel='linear';
    # it only takes effect for kernels such as 'rbf')
    svr1 = SVR(C=C1, epsilon=epsilon1, gamma=gamma1, kernel='linear')
    svr2 = SVR(C=C2, epsilon=epsilon2, gamma=gamma2, kernel='linear')
    # Train the SVR models
    svr1.fit(X_train, y_train[:, 0])
    svr2.fit(X_train, y_train[:, 1])
    # Predict on the test set
    predictions1_scaled = svr1.predict(X_test)
    predictions2_scaled = svr2.predict(X_test)
    # Compute the R2 scores
    r2_1 = r2_score(y_test[:, 0], predictions1_scaled)
    r2_2 = r2_score(y_test[:, 1], predictions2_scaled)
    # Return the negated R2 scores, because the WOA minimizes its objective
    return -r2_1, -r2_2
Running the Algorithm
# Search space for (C1, epsilon1, gamma1, C2, epsilon2, gamma2)
bounds = np.array([(0.1, 100), (0.01, 1), (0.001, 1), (0.1, 100), (0.01, 1), (0.001, 1)])
# Run the whale optimization algorithm
best_params, r2_values_svr1, r2_values_svr2 = whale_optimization_algorithm(optimization_function, bounds, iterations=100)
# Print the best parameters
print("Best Parameters:", best_params)
# Build the SVR models with the best parameters
best_svr1 = SVR(C=best_params[0], epsilon=best_params[1], gamma=best_params[2], kernel='linear')
best_svr2 = SVR(C=best_params[3], epsilon=best_params[4], gamma=best_params[5], kernel='linear')
# Train the tuned SVR models
best_svr1.fit(X_train, y_train[:, 0])
best_svr2.fit(X_train, y_train[:, 1])
# Predict on the test set
best_predictions1_scaled = best_svr1.predict(X_test)
best_predictions2_scaled = best_svr2.predict(X_test)
# Compute the final R2 scores
best_r2_1 = r2_score(y_test[:, 0], best_predictions1_scaled)
best_r2_2 = r2_score(y_test[:, 1], best_predictions2_scaled)
print("Best R2 Score for Target Variable 1 on Test Set:", best_r2_1)
print("Best R2 Score for Target Variable 2 on Test Set:", best_r2_2)
# Optional: inverse-transform the scaled values back to the original units
# y_test_original = scaler_y.inverse_transform(y_test)
# predictions_original = scaler_y.inverse_transform(
#     np.column_stack((best_predictions1_scaled, best_predictions2_scaled)))
# print(y_test_original[0:3, :])
# print(predictions_original[0:3, :])
# Compute the mean squared error of the tuned models on the test set
def mse(t1, t2):
    return np.mean((t1 - t2) ** 2)
best_mse_1 = mse(y_test[:, 0], best_predictions1_scaled)
best_mse_2 = mse(y_test[:, 1], best_predictions2_scaled)
print("MSE for Target Variable 1 on Test Set:", best_mse_1)
print("MSE for Target Variable 2 on Test Set:", best_mse_2)
Visualization
# Visualization: training curves (left column) and test-set predictions (right column)
plt.rcParams.update({'font.size': 20})
plt.subplot(221)
plt.plot(r2_values_svr1, label='WOA-SVR training progress (stress model)')
plt.legend()
plt.xlabel('Iterations', fontdict={'size': 30})
plt.ylabel('R2 Score', fontdict={'size': 30})
plt.subplot(223)
plt.plot(r2_values_svr2, label='WOA-SVR training progress (strain model)')
plt.legend(fontsize=20)
plt.xlabel('Iterations', fontdict={'size': 30})
plt.ylabel('R2 Score', fontdict={'size': 30})
plt.subplot(222)
plt.scatter(range(len(y_test[:, 0])), y_test[:, 0], label='True value', s=200)
plt.scatter(range(len(best_predictions1_scaled)), best_predictions1_scaled, label='Predicted value', s=200)
plt.legend(fontsize=20)
plt.xlabel('Stress test samples', fontdict={'size': 30})
plt.ylabel('Stress', fontdict={'size': 30})
plt.subplot(224)
plt.scatter(range(len(y_test[:, 1])), y_test[:, 1], label='True value', s=200)
plt.scatter(range(len(best_predictions2_scaled)), best_predictions2_scaled, label='Predicted value', s=200)
plt.legend(fontsize=20)
plt.xlabel('Strain test samples', fontdict={'size': 30})
plt.ylabel('Strain', fontdict={'size': 30})
plt.show()
Complete Code
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from tqdm import tqdm
import matplotlib.pyplot as plt
plt.rcParams['font.family'] = 'SimHei'  # a font with CJK glyphs, only needed if the plot labels are kept in Chinese
# Read the data from the CSV file
data = pd.read_csv('数据1.csv')
# Split off the 11 feature columns and the 2 target columns (stress and strain)
X = data.iloc[:, :11].values
y = data.iloc[:, -2:].values
# Standardize features and targets
scaler_X = StandardScaler()
scaler_y = StandardScaler()
X_scaled = scaler_X.fit_transform(X)
y_scaled = scaler_y.fit_transform(y)
# Split into training and test sets (90% / 10%)
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y_scaled, test_size=0.1, random_state=42)
# Whale optimization algorithm (simplified: only the "encircling prey" position update is used)
def whale_optimization_algorithm(objective_function, bounds, iterations):
    population_size = 10
    dim = len(bounds)
    population = np.random.uniform(low=bounds[:, 0], high=bounds[:, 1], size=(population_size, dim))
    fitness_values = np.zeros(population_size, dtype=object)  # object dtype: each entry holds a (-R2_1, -R2_2) tuple
    # Progress bar via tqdm
    progress_bar = tqdm(total=iterations, desc="Optimizing", position=0)
    # Record the best R2 values after each iteration
    r2_values_svr1 = []
    r2_values_svr2 = []
    for iteration in range(iterations):
        for i in range(population_size):
            fitness_values[i] = objective_function(population[i])
        # argmin compares the tuples lexicographically, so the first objective (stress model) takes priority
        best_index = np.argmin(fitness_values)
        best_solution = population[best_index].copy()  # copy so the in-place updates below do not alter the current best
        a = 2 - iteration * (2 / iterations)  # a decreases linearly from 2 to 0
        for i in range(population_size):
            r1 = np.random.random(dim)  # random values in [0, 1)
            r2 = np.random.random(dim)
            A = 2 * a * r1 - a  # Eq. (2.3)
            C = 2 * r2  # Eq. (2.4)
            D = np.abs(C * best_solution - population[i])  # Eq. (2.5)-Eq. (2.6)
            new_solution = best_solution - A * D  # Eq. (2.2)
            new_solution = np.clip(new_solution, bounds[:, 0], bounds[:, 1])  # keep within bounds
            population[i] = new_solution
        # Record the best R2 values of this iteration
        current_params = population[best_index]
        current_r2_svr1, current_r2_svr2 = -fitness_values[best_index][0], -fitness_values[best_index][1]
        r2_values_svr1.append(current_r2_svr1)
        r2_values_svr2.append(current_r2_svr2)
        # Optionally print the current parameters and R2 values
        # print(f"Iteration {iteration + 1}: Params: {current_params}, R2 SVR1: {current_r2_svr1}, R2 SVR2: {current_r2_svr2}")
        # Advance the progress bar
        progress_bar.update(1)
    # Close the progress bar
    progress_bar.close()
    return best_solution, r2_values_svr1, r2_values_svr2
# Objective function for the optimizer
def optimization_function(params):
    C1, epsilon1, gamma1, C2, epsilon2, gamma2 = params
    # Build the two SVR models (note: scikit-learn ignores gamma when kernel='linear';
    # it only takes effect for kernels such as 'rbf')
    svr1 = SVR(C=C1, epsilon=epsilon1, gamma=gamma1, kernel='linear')
    svr2 = SVR(C=C2, epsilon=epsilon2, gamma=gamma2, kernel='linear')
    # Train the SVR models
    svr1.fit(X_train, y_train[:, 0])
    svr2.fit(X_train, y_train[:, 1])
    # Predict on the test set
    predictions1_scaled = svr1.predict(X_test)
    predictions2_scaled = svr2.predict(X_test)
    # Compute the R2 scores
    r2_1 = r2_score(y_test[:, 0], predictions1_scaled)
    r2_2 = r2_score(y_test[:, 1], predictions2_scaled)
    # Return the negated R2 scores, because the WOA minimizes its objective
    return -r2_1, -r2_2
# Search space for (C1, epsilon1, gamma1, C2, epsilon2, gamma2)
bounds = np.array([(0.1, 100), (0.01, 1), (0.001, 1), (0.1, 100), (0.01, 1), (0.001, 1)])
# Run the whale optimization algorithm
best_params, r2_values_svr1, r2_values_svr2 = whale_optimization_algorithm(optimization_function, bounds, iterations=100)
# Print the best parameters
print("Best Parameters:", best_params)
# Build the SVR models with the best parameters
best_svr1 = SVR(C=best_params[0], epsilon=best_params[1], gamma=best_params[2], kernel='linear')
best_svr2 = SVR(C=best_params[3], epsilon=best_params[4], gamma=best_params[5], kernel='linear')
# Train the tuned SVR models
best_svr1.fit(X_train, y_train[:, 0])
best_svr2.fit(X_train, y_train[:, 1])
# Predict on the test set
best_predictions1_scaled = best_svr1.predict(X_test)
best_predictions2_scaled = best_svr2.predict(X_test)
# Compute the final R2 scores
best_r2_1 = r2_score(y_test[:, 0], best_predictions1_scaled)
best_r2_2 = r2_score(y_test[:, 1], best_predictions2_scaled)
print("Best R2 Score for Target Variable 1 on Test Set:", best_r2_1)
print("Best R2 Score for Target Variable 2 on Test Set:", best_r2_2)
# Optional: inverse-transform the scaled values back to the original units
# y_test_original = scaler_y.inverse_transform(y_test)
# predictions_original = scaler_y.inverse_transform(
#     np.column_stack((best_predictions1_scaled, best_predictions2_scaled)))
# print(y_test_original[0:3, :])
# print(predictions_original[0:3, :])
# Compute the mean squared error of the tuned models on the test set
def mse(t1, t2):
    return np.mean((t1 - t2) ** 2)
best_mse_1 = mse(y_test[:, 0], best_predictions1_scaled)
best_mse_2 = mse(y_test[:, 1], best_predictions2_scaled)
print("MSE for Target Variable 1 on Test Set:", best_mse_1)
print("MSE for Target Variable 2 on Test Set:", best_mse_2)
# Visualization: training curves (left column) and test-set predictions (right column)
plt.rcParams.update({'font.size': 20})
plt.subplot(221)
plt.plot(r2_values_svr1, label='WOA-SVR training progress (stress model)')
plt.legend()
plt.xlabel('Iterations', fontdict={'size': 30})
plt.ylabel('R2 Score', fontdict={'size': 30})
plt.subplot(223)
plt.plot(r2_values_svr2, label='WOA-SVR training progress (strain model)')
plt.legend(fontsize=20)
plt.xlabel('Iterations', fontdict={'size': 30})
plt.ylabel('R2 Score', fontdict={'size': 30})
plt.subplot(222)
plt.scatter(range(len(y_test[:, 0])), y_test[:, 0], label='True value', s=200)
plt.scatter(range(len(best_predictions1_scaled)), best_predictions1_scaled, label='Predicted value', s=200)
plt.legend(fontsize=20)
plt.xlabel('Stress test samples', fontdict={'size': 30})
plt.ylabel('Stress', fontdict={'size': 30})
plt.subplot(224)
plt.scatter(range(len(y_test[:, 1])), y_test[:, 1], label='True value', s=200)
plt.scatter(range(len(best_predictions2_scaled)), best_predictions2_scaled, label='Predicted value', s=200)
plt.legend(fontsize=20)
plt.xlabel('Strain test samples', fontdict={'size': 30})
plt.ylabel('Strain', fontdict={'size': 30})
plt.show()