1. Data preprocessing
- sklearn.feature_extraction
- sklearn.impute
- sklearn.preprocessing
- pandas
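import pandas as pd
from sklearn.preprocessing import StandardScaler

# Load the data, using the first column as the index.
pdata_frame = pd.read_csv("file_path.csv", index_col=0)
# info() prints the summary itself, so call it rather than printing the method.
pdata_frame.info()
# Fill missing values in col1 with the column median.
col_median = pdata_frame.loc[:, "col1"].median()
pdata_frame.loc[:, "col1"] = pdata_frame.loc[:, "col1"].fillna(col_median)
# Replace negative values in col2 with the column median.
cols_median = pdata_frame.loc[:, "col2"].median()
pdata_frame.loc[pdata_frame["col2"] < 0, "col2"] = cols_median
# Drop col3 and col5 before scaling.
data = pdata_frame.drop(["col3", "col5"], axis=1)
# Standardize the remaining columns to zero mean and unit variance.
scaler = StandardScaler()
scaler.fit(data)
x_train = scaler.transform(data)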
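The same fill-then-scale steps can also be expressed with sklearn.impute and sklearn.preprocessing from the list above. The following is only a minimal sketch: the toy DataFrame, its values, and the choice to chain the steps in a Pipeline are assumptions made for illustration, not part of the original notes.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "col1": [1.0, np.nan, 3.0, 5.0],
    "col2": [10.0, 20.0, 30.0, 40.0],
})

# Median imputation followed by standardization, applied to every column.
pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# fit_transform learns the medians, means, and standard deviations from df
# and returns a NumPy array with the same number of rows and columns.
x_train = pipeline.fit_transform(df)
print(x_train.shape)
```

Keeping both steps in one Pipeline means the medians and scaling statistics learned from the training data can later be reused on new data with pipeline.transform.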
2. Feature dimensionality reduction
- sklearn.feature_selection
- sklearn.decomposition
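As a rough illustration of the two modules listed above, the sketch below first drops a constant feature with VarianceThreshold and then projects the rest onto two principal components with PCA. The synthetic data, the threshold, and the number of components are all assumptions chosen for the example, not recommendations from the original notes.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_selection import VarianceThreshold

# Hypothetical synthetic data: 100 samples, 5 features, the last one constant.
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 5))
x[:, 4] = 1.0

# Feature selection: remove features whose variance is zero.
selector = VarianceThreshold(threshold=0.0)
x_selected = selector.fit_transform(x)

# Decomposition: project the remaining features onto 2 principal components.
pca = PCA(n_components=2)
x_reduced = pca.fit_transform(x_selected)

print(x_selected.shape, x_reduced.shape)
```

Feature selection keeps a subset of the original columns, while decomposition builds new columns as combinations of them, so the two are often applied in this order.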