AutoGluon's compatibility and extensibility make it easy to scale research algorithms up to large-scale experiments.
Three main application areas
- image (image classification, object detection)
- text (text classification)
- tabular data (tabular prediction)
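Two main features
- Automatic hyperparameter tuning
  - Supports PyTorch
  - Supported search strategies include random search, grid search, RL, and Bayesian optimization, among others
- NAS (image classification only; currently only ENAS is available)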
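To make the difference between two of the listed search strategies concrete, here is a minimal sketch of grid search versus random search in plain Python. It does not use AutoGluon's API; the search space, the `toy_objective` function, and all parameter names are hypothetical stand-ins chosen only for illustration.

```python
import itertools
import random

# Hypothetical search space; names and values are illustrative only.
SEARCH_SPACE = {
    "learning_rate": [0.001, 0.01, 0.1],
    "num_layers": [2, 3, 4],
}


def toy_objective(config):
    """Stand-in for a validation score (higher is better). Hypothetical."""
    lr_penalty = abs(config["learning_rate"] - 0.01)
    depth_penalty = 0.1 * abs(config["num_layers"] - 3)
    return -(lr_penalty + depth_penalty)


def grid_search(space, objective):
    """Evaluate every combination in the space and return the best one."""
    keys = list(space)
    candidates = [
        dict(zip(keys, values))
        for values in itertools.product(*(space[k] for k in keys))
    ]
    return max(candidates, key=objective)


def random_search(space, objective, num_trials, seed=0):
    """Evaluate num_trials randomly sampled configurations and return the best."""
    rng = random.Random(seed)
    candidates = [
        {k: rng.choice(v) for k, v in space.items()} for _ in range(num_trials)
    ]
    return max(candidates, key=objective)


best_grid = grid_search(SEARCH_SPACE, toy_objective)
best_random = random_search(SEARCH_SPACE, toy_objective, num_trials=5)
```

Grid search is exhaustive, so its cost grows with the product of the option counts. Random search caps the cost at `num_trials` but may miss the best configuration. RL-based and Bayesian strategies instead use the results of earlier trials to decide what to try next.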
import numpy as np
from autogluon.tabular import TabularDataset, TabularPredictor

# Training
train_data = TabularDataset('/cloud_disk/users/huh/projects/csdn_send_data/autogluon/house_price_kaggle/train.csv')
id_col, label = 'Id', 'SalePrice'

# Data preprocessing: log-transform the large-valued columns and the label
large_val_cols = ['LotFrontage', 'LotArea', 'OverallQual', 'OverallCond', 'YearBuilt',
                  'YearRemodAdd', 'WoodDeckSF', 'OpenPorchSF', 'EnclosedPorch',
                  '3SsnPorch', 'MiscVal', 'MoSold', 'YrSold']
for c in large_val_cols + [label]:
    train_data[c] = np.log1p(train_data[c])  # same as log(x + 1)

# AutoGluon can extract features on its own, but some manual preprocessing still helps.
# The multimodal option uses a transformer to extract features and fuses multiple models;
# multi-layer model ensembling on top of that can then improve accuracy further.
# The call below does not use the multimodal option.
predictor = TabularPredictor(label=label).fit(train_data.drop(columns=[id_col]))
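Because the script log-transforms the label `SalePrice` before training, the model's predictions come out in log space and need the inverse transform before they can be read as prices. The sketch below shows that round trip using only the standard library; the helper names and the sample prices are illustrative assumptions, not values from the dataset's results.

```python
import math


def to_log_space(price):
    """Forward transform applied to the label before training: log(x + 1)."""
    return math.log1p(price)


def from_log_space(log_price):
    """Inverse transform to apply to predictions: exp(y) - 1."""
    return math.expm1(log_price)


# Illustrative prices only; not taken from any model output.
prices = [208500.0, 181500.0, 223500.0]
log_prices = [to_log_space(p) for p in prices]
recovered = [from_log_space(v) for v in log_prices]
```

In the script above, this would correspond to applying `np.expm1` to the output of `predictor.predict(...)`, after giving the columns in `large_val_cols` of the test data the same log transform that was used for training.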
Dataset download:
Link: https://pan.baidu.com/s/1XsKH3k4DbZ72MpU3G9WNDQ Password: 1nf7