Section I: Brief Introduction on Random Forest Regression
The random forest algorithm is an ensemble technique that combines multiple decision trees. A random forest usually has a better generalization performance than an individual tree due to randomness, which helps to decrease the model’s varaiance. Other advantages of random forests are that they are less sensitive to outliers in the dataset and don’t require much parameter tuning. The only parameter in random forests that we typically need to experiment with is the number of trees in the ensemble. The only difference is that we use the MSE criterion to grow the individual decision trees, and the predicted target variable is calculated as the average prediction over all decision trees.
FROM
Sebastian Raschka, Vahid Mirjalili. Python机器学习第二版. 南京:东南大学出版社,2018.
代码
from sklearn import datasets
import matplotlib.pyplot as plt
import numpy as np
import warnings
warnings.filterwarnings("ignore")
plt.rcParams['figure.dpi']=200
plt.rcParams['savefig.dpi']=200
font = {
'weight': 'light'}
plt.rc("font"