Section I: Brief Introduction to Sequential Backward Selection (SBS)
The idea behind the SBS algorithm is quite simple: SBS sequentially removes features from the full feature subset until the new feature subspace contains the desired number of features. In order to determine which feature is to be removed at each stage, we need to define the criterion function J that we want to minimize. The criterion calculated by the criterion function can simply be the difference in performance of the classifier before and after the removal of a particular feature. Then, the feature to be removed at each stage can simply be defined as the feature that maximizes this criterion; or in more intuitive terms, at each stage we eliminate the feature that causes the least performance loss after removal.
Personal Views:
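- At each step, the candidate feature combinations are compared and the one whose trained model generalizes best is kept.
- The feature combination at each step is a subset of the previous step's feature space.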
From
Sebastian Raschka, Vahid Mirjalili. Python Machine Learning, Second Edition. Nanjing: Southeast University Press, 2018.
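The elimination loop described in this section can be sketched in a few lines. This is only a minimal illustration, not the book's implementation: `backward_select`, `toy_weights`, and `toy_score` are hypothetical names, and the toy score (a sum of made-up per-feature weights) stands in for the validation accuracy of a real classifier.

```python
from itertools import combinations


def backward_select(features, score, k):
    """Greedy backward elimination: drop one feature per step until k remain."""
    current = tuple(features)
    history = [(current, score(current))]
    while len(current) > k:
        # Each candidate is the current subset with exactly one feature removed.
        candidates = combinations(current, len(current) - 1)
        # Keep the best-scoring candidate, i.e. the removal that costs the least.
        current = max(candidates, key=score)
        history.append((current, score(current)))
    return current, history


# Hypothetical stand-in for a classifier's validation score:
# every feature contributes a fixed, made-up amount of "usefulness".
toy_weights = {"a": 0.30, "b": 0.05, "c": 0.20, "d": 0.01}


def toy_score(subset):
    return sum(toy_weights[f] for f in subset)


selected, history = backward_select(["a", "b", "c", "d"], toy_score, k=2)
print(selected)  # the two features with the largest toy weights
```

Each subset in `history` is contained in the one before it, which matches the second point under Personal Views. With a real model, `score` would train on the candidate columns and evaluate on held-out data.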
Section II: Code Implementation and Feature Selection
Part 1: Code Bundle of Sequential Backward Selection
from sklearn.base import clone
from itertools import combinations
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
class SBS:
    def __init__(self, estimator, k_features,
                 scoring=accuracy_score,
                 test_size=0.25, random_state=1):
        self.scoring = scoring
        self.estimator = clone(estimator)
        self.k_features = k_features
        self.test_size = test_size
        self.random_state = random_state

    def fit