Reference: CatBoost: A machine learning library to handle categorical (CAT) data automatically
Data types:
X = train.drop(['Item_Outlet_Sales'], axis=1)
y = train.Item_Outlet_Sales
from sklearn.model_selection import train_test_split
X_train, X_validation, y_train, y_validation = train_test_split(X, y, train_size=0.7, random_state=1234)
X.dtypes
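categorical_features_indices identifies which columns are categorical variables. We do not need to preprocess these columns ourselves (no manual one-hot or label encoding). This is one of CatBoost's advantages: once it is told which columns are categorical, typically through the cat_features argument, it handles them automatically.
So we need to add: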
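import numpy as np
# np.float was removed in NumPy 1.24; it was an alias for the builtin float.
# Every column whose dtype is not float (object, int, ...) is treated as categorical.
categorical_features_indices = np.where(X.dtypes != float)[0]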
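To see what this selection actually returns, here is a minimal sketch on a toy DataFrame. The column names and values are made up for illustration and are not taken from the dataset used in this post. Note that integer columns are flagged too, because the test is only "not float".

```python
import numpy as np
import pandas as pd

# Toy frame with hypothetical columns; not the article's dataset.
X_demo = pd.DataFrame({
    "weight": [9.3, 5.92, 17.5],               # float64, stays numeric
    "fat_content": ["Low", "Regular", "Low"],  # strings, categorical
    "visibility": [0.016, 0.019, 0.017],       # float64, stays numeric
    "outlet_year": [1999, 2009, 1999],         # int64, also flagged (not float)
})

# Positions of all columns whose dtype is not float.
cat_idx = np.where(X_demo.dtypes != float)[0]

# Map the positions back to column names for a quick sanity check.
cat_names = X_demo.columns[cat_idx].tolist()
print(cat_idx, cat_names)
```

If an integer column is really a numeric quantity rather than a code or label, it may be worth casting it to float before computing the indices so that it is not treated as categorical.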
Training
from catboost import CatBoostRegressor
model&#