References:
1. Python package introduction
2. Python API reference
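To summarize the usual workflow:
import numpy as np
import xgboost as xgb

dtrain = xgb.DMatrix('train.svm.txt')  # load data from a LibSVM-format text file
data = np.random.rand(5, 10)  # illustrative data: 5 rows, 10 features
label = np.random.randint(2, size=5)  # illustrative binary labels
dtrain = xgb.DMatrix(data, label=label, missing=-999.0)  # treat -999.0 as the missing-value marker
w = np.random.rand(5)  # one weight per row
dtrain = xgb.DMatrix(data, label=label, missing=-999.0, weight=w)  # set instance weights when needed
dtrain.save_binary("train.buffer")  # save the DMatrix in binary format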
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
To summarize:
The configured parameters are stored as a list, and that list can be nested, for example as a list of (name, value) pairs.
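As a small sketch of what such a nested list can look like: the parameter names below are standard XGBoost parameters, but the specific values are assumptions chosen only for illustration. The block needs no XGBoost import, since it only builds the list.

```python
# Hypothetical parameter values, chosen only for illustration.
param = {'max_depth': 2, 'eta': 1, 'objective': 'binary:logistic'}

# Convert the dict into a list of (name, value) pairs.
plst = list(param.items())

# Unlike a dict, a list can hold the same key more than once,
# for example to request two evaluation metrics.
plst += [('eval_metric', 'auc'), ('eval_metric', 'logloss')]

for name, value in plst:
    print(name, value)
```

A list like this can be passed to `xgb.train` as its first argument in place of a dict.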
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
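To summarize the usual workflow:
num_round = 10
# train the model; plst, dtrain and evallist are the parameter list, training DMatrix and watch list prepared beforehand
bst = xgb.train(plst, dtrain, num_round, evals=evallist)
# save the model
bst.save_model('0001.model')
# dump the model to a text file
bst.dump_model('dump.raw.txt')
# dump the model together with a feature map
bst.dump_model('dump.raw.txt', 'featmap.txt')
# load a model
bst = xgb.Booster({'nthread': 4})  # initialize an empty model
bst.load_model("model.bin")  # load a saved model file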
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%