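1. Basic Principles of Random Forest
For the basic principles of random forest, together with a worked mathematical example, see the earlier blog post: 【机器学习】【随机森林-1】Random Forest算法讲解 + 示例展示数学求解过程 ([Machine Learning] [Random Forest-1] Random Forest Algorithm Explained + Worked Example of the Math).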
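As a quick reminder of the two ideas a random forest classifier rests on, here is a minimal sketch: each tree is trained on a bootstrap sample (rows drawn with replacement), and the forest predicts by majority vote over the trees. The function names `bootstrapIndices` and `majorityVote` are hypothetical helpers written for this illustration; they are not taken from the GitHub implementation shown in section 2, which may organize these steps differently.

```cpp
#include <cstddef>
#include <map>
#include <random>
#include <vector>

// Hypothetical helper (assumption, not from the referenced code):
// draw a bootstrap sample of n row indices, chosen uniformly
// with replacement from [0, n). One such sample would feed one tree.
std::vector<std::size_t> bootstrapIndices(std::size_t n, std::mt19937& rng) {
    std::vector<std::size_t> idx;
    if (n == 0) return idx;  // nothing to sample from
    std::uniform_int_distribution<std::size_t> pick(0, n - 1);
    idx.reserve(n);
    for (std::size_t i = 0; i < n; ++i) idx.push_back(pick(rng));
    return idx;
}

// Hypothetical helper (assumption, not from the referenced code):
// combine per-tree class predictions by majority vote.
// Ties resolve to the smallest class label; returns -1 if there are no votes.
int majorityVote(const std::vector<int>& votes) {
    std::map<int, int> counts;  // class label -> number of trees voting for it
    for (int v : votes) ++counts[v];
    int best = -1;
    int bestCount = 0;
    for (const auto& kv : counts) {  // std::map iterates in ascending label order
        if (kv.second > bestCount) {
            best = kv.first;
            bestCount = kv.second;
        }
    }
    return best;
}
```

In a full forest, you would call `bootstrapIndices` once per tree to pick its training rows, then pass the labels predicted by all trees for one sample to `majorityVote`. For regression, the vote would typically be replaced by an average of the trees' outputs.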
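2. C++ Implementation
You can implement a random forest yourself. Below is one implementation of the algorithm found on GitHub, which is worth reading through.
2.1 Code
2.1.1 RandomForest.h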
/************************************************
*Random Forest Program
*Function: train & test Random Forest Model
*Author: handspeaker@163.com
*CreateTime: 2014.7.10
*Version: V0.1
*************************************************/
#ifndef RANDOM_FOREST_H
#define RANDOM_FOREST_H
#include<math.h>
#include<stdio.h>
#include<stdlib.h>
#include<time.h>
#include"Tree.h"
#include"Sample.h"
class RandomForest
{
public:
/*************************************************************
*treeNum: the number of trees in this forest
*maxDepth: the max depth of a single tree
*minLeafSample: termination criterion, the min number of samples in a leaf
*minInfoGain: termination criterion, the min information
* gain required for a node to be split
**************************************************************/
RandomForest(int treeNum,int maxDepth,int minLeafSample,float minInfoGain);
RandomForest(const char*modelPath);
~RandomForest();
/*************************************************************
*trainset: the training set; every row is a sample, every column is
*a feature; the total size is SampleNum*featureNum
*labels: the labels or regression values of the training set;
*the total size is SampleNum
*SampleNum: the total number of training samples
*featureNum: the number of features
*classNum: the number of classes; 1 for regression
*isRegression: whether the problem is regression (true) or classification (false)
*trainFeatureNumPerNode: the number of features used at every node while training
*************************************************/
void train(float**trainset,float*labels,int SampleNum,int featureNum,
int classNum,bool isRegressi