The main code is in autofeat.autofeat.AutoFeatModel#fit_transform. Once feature generation is finished, it calls the autofeat.featsel.select_features function to perform feature selection.
Inside select_features, feature selection is run featsel_runs times:
selected_columns = []
for i in range(featsel_runs):
    selected_columns.extend(run_select_features(i))
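To make the repeated-run pattern concrete, here is a minimal, self-contained sketch. It is not autofeat's implementation: toy_select is a hypothetical stand-in for the real per-run selector, the 0.3 correlation threshold is an arbitrary assumption, and the sketch uses a per-run np.random.default_rng generator and positional indices rather than the library's own seeding. It only illustrates how each run draws a seeded random subset of the rows and appends its selected columns to one shared list.

```python
import numpy as np
import pandas as pd


def toy_select(df, target):
    """Hypothetical stand-in selector (an assumption, not autofeat's logic):
    keep columns whose absolute correlation with the target exceeds 0.3."""
    corr = df.apply(lambda col: np.corrcoef(col, target)[0, 1])
    return list(corr[corr.abs() > 0.3].index)


def repeated_selection(df, target, featsel_runs=3):
    """Run the selector on several random row subsets and pool the results."""
    selected_columns = []
    for i in range(featsel_runs):
        # Seed per run so every run is reproducible but sees different rows.
        rng = np.random.default_rng(i)
        # Keep 85% of the rows, but aim for at least 10.
        n_keep = max(10, int(0.85 * len(df)))
        rand_idx = rng.permutation(len(df))[:n_keep]
        selected_columns.extend(toy_select(df.iloc[rand_idx], target[rand_idx]))
    return selected_columns
```

Because the results are pooled with extend, a column chosen in several runs appears several times in selected_columns, so the list records how often each column was picked.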
run_select_features is a local function defined inside select_features:
def run_select_features(i):
    if verbose > 0:
        print("[featsel] Feature selection run %i/%i" % (i+1, featsel_runs))
    np.random.seed(i)  # todo rng
    rand_idx = np.random.permutation(df_scaled.index)[:max(10, int(0.85 * len(df_scaled)))]
    return _select_features_1run(df_scaled.iloc[rand_idx], target_scaled[rand_idx], problem_type,