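1. Scaling and normalization
- age, trestbps, chol, thalach, and oldpeak (the continuous features) have fairly concentrated values, so they can be rescaled to the [0, 1] range with min-max scaling (MinMaxScaler).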
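import pandas as pd
from sklearn import preprocessing
# Select the continuous columns to rescale
col = data[['age','trestbps','chol','thalach','oldpeak']]
min_max_scaler = preprocessing.MinMaxScaler()
# fit_transform returns a NumPy array, so wrap it back into a DataFrame
col_min_max = min_max_scaler.fit_transform(col)
# Keep the original column names and row index so later joins stay aligned
col_min_max = pd.DataFrame(col_min_max, columns=col.columns, index=col.index)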
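To make the transform concrete: min-max scaling maps each column with x' = (x - min) / (max - min). The sketch below checks that formula against MinMaxScaler on a small toy frame. The values in `toy` are made up for illustration and are not taken from the actual data set.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy values, chosen only to illustrate the transform.
toy = pd.DataFrame({
    "age": [29, 45, 61, 77],
    "chol": [126.0, 240.0, 310.0, 564.0],
})

# Manual min-max scaling, column by column: (x - min) / (max - min).
manual = (toy - toy.min()) / (toy.max() - toy.min())

# The same transform through scikit-learn, keeping column names and index.
scaled = pd.DataFrame(
    MinMaxScaler().fit_transform(toy),
    columns=toy.columns,
    index=toy.index,
)

# Both approaches should agree up to floating-point rounding.
print(np.allclose(manual.to_numpy(), scaled.to_numpy()))
```

Because the minimum and maximum are taken from the data being fitted, a single extreme value compresses every other value in that column, which is why this approach suits features whose values are fairly concentrated.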
- One-hot encode cp, slope, restecg, and thal
OneHot_cp = pd.get_dummies(data.cp,prefix='cp').astype('float')
OneHot_restecg = pd.get_dummies(data.restecg,prefix='restecg').astype('float')
OneHot_slope = pd.get_dummies(data.slope,prefix='slope').astype('float')
OneHot_ca = pd.get_d