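Immune algorithms are typically applied in machine learning to optimization problems such as feature selection, parameter tuning, and classifier design. Below is a simplified Python example that applies an immune algorithm to feature selection. It searches over feature subsets with the goal of improving the performance of a simple classifier (logistic regression here).
Note that this code is only an illustration; a real application would need to be adapted to the specific dataset and problem.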
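As a minimal sketch of the encoding the example relies on: each candidate feature subset can be represented as a binary mask with one entry per feature, where 1 keeps the column and 0 drops it. The toy matrix `X_demo` and the mask values are made up for illustration.

```python
import numpy as np

# Hypothetical toy data: 3 samples, 4 features.
X_demo = np.arange(12).reshape(3, 4)

# A candidate feature subset: keep features 0 and 2, drop features 1 and 3.
mask = np.array([1, 0, 1, 0])

# Boolean indexing keeps only the columns where the mask equals 1.
X_selected = X_demo[:, mask == 1]

print(X_selected.shape)  # (3, 2)
```

The algorithm then only has to search over such bit vectors; the classifier never sees the dropped columns.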
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Core components of the immune algorithm
class ImmuneAlgorithm:
    def __init__(self, n_features, population_size, iterations, mutation_rate):
        self.n_features = n_features
        self.population_size = population_size
        self.iterations = iterations
        self.mutation_rate = mutation_rate

    def initialize_population(self):
        """Initialize a random population of binary feature masks."""
        return np.random.randint(2, size=(self.population_size, self.n_features))

    def evaluate_fitness(self, X_train, y_train, feature_masks):
        """Score each feature subset by the training accuracy of a logistic regression."""
        accuracies = []
        for mask in feature_masks:
            if not mask.any():
                # An empty subset cannot be fitted, so give it the lowest score.
                accuracies.append(0.0)
                continue
            X_train_selected = X_train[:, mask == 1]
            model = LogisticRegression(max_iter=1000)
            model.fit(X_train_selected, y_train)
            y_pred = model.predict(X_train_selected)
            accuracies.append(accuracy_score(y_train, y_pred))
        # Return an array so that selection() can do vectorized arithmetic.
        return np.array(accuracies)

    def selection(self, fitness_scores):
        """Roulette-wheel selection: sample indices with probability proportional to fitness."""
        total = fitness_scores.sum()
        if total == 0:
            # Every mask was empty; fall back to uniform sampling.
            return np.random.choice(len(fitness_scores), size=self.population_size)
        probabilities = fitness_scores / total
        return np.random.choice(len(fitness_scores), size=self.population_size, p=probabilities)

    def mutation(self, population):
        """Flip each bit independently with probability mutation_rate."""
        flips = np.random.rand(*population.shape) < self.mutation_rate
        return np.where(flips, 1 - population, population)

    def run(self, X_train, y_train):
        """Run the immune algorithm and return the best mask in the final population."""
        population = self.initialize_population()
        for _ in range(self.iterations):
            fitness_scores = self.evaluate_fitness(X_train, y_train, population)
            selected_indices = self.selection(fitness_scores)
            population = population[selected_indices]
            population = self.mutation(population)
        # Return the best feature subset
        final_scores = self.evaluate_fitness(X_train, y_train, population)
        best_mask = population[np.argmax(final_scores)]
        return best_mask
# Use the Iris dataset as an example
data = load_iris()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Apply the immune algorithm for feature selection
ia = ImmuneAlgorithm(n_features=X.shape[1], population_size=50, iterations=100, mutation_rate=0.1)
best_feature_mask = ia.run(X_train, y_train)
# Train a model on the best feature subset
selected_X_train = X_train[:, best_feature_mask == 1]
model = LogisticRegression(max_iter=1000)  # higher iteration cap so the solver converges
model.fit(selected_X_train, y_train)
# Evaluate the model on the test set
selected_X_test = X_test[:, best_feature_mask == 1]
y_pred = model.predict(selected_X_test)
test_accuracy = accuracy_score(y_test, y_pred)
print(f"Test Accuracy with Selected Features: {test_accuracy}")
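This code shows how an immune algorithm can search for a high-performing feature subset to improve model performance. It first defines an immune algorithm class with the basic steps of population initialization, fitness evaluation, selection, and mutation. It then uses this class to select features on the Iris dataset and evaluates the chosen subset with a logistic regression model.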
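One limitation of the listing is that `evaluate_fitness` scores each subset on the same data the model was fitted on, which tends to favor larger subsets. A possible alternative, sketched here under assumptions, is to score a mask by cross-validated accuracy. The helper name `cv_fitness`, the choice of 5 folds, and `max_iter=1000` are illustrative choices and not part of the original example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def cv_fitness(X, y, mask, cv=5):
    """Mean cross-validated accuracy of logistic regression on the masked features.

    Hypothetical helper, not part of the original listing.
    """
    mask = np.asarray(mask)
    if not mask.any():
        # An empty feature subset cannot be fitted; treat it as the worst case.
        return 0.0
    model = LogisticRegression(max_iter=1000)
    scores = cross_val_score(model, X[:, mask == 1], y, cv=cv)
    return float(scores.mean())


X, y = load_iris(return_X_y=True)
print(cv_fitness(X, y, [1, 1, 1, 1]))  # all four features
print(cv_fitness(X, y, [0, 0, 1, 1]))  # petal length and petal width only
```

To use it inside the class, `evaluate_fitness` could call this helper for each mask, at the cost of fitting `cv` models per mask instead of one.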