Python random forest regression: Python scikit random forest regressor error

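An error occurred while making predictions with scikit-learn's random forest regressor. The code tried to take the second element of each item in the prediction result, but a random forest regressor's predict returns plain floats, not structures with multiple elements, so indexing a float raised the error. The fix is to understand what the regressor returns and handle that output correctly.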

I am trying to load training and test data from a csv, run the random forest regressor in scikit/sklearn, and then predict the output from the test file.

The TrainLoanData.csv file contains 5 columns; the first column is the output and the next 4 columns are the features. The TestLoanData.csv contains 4 columns - the features.

When I run the code, I get this error:

predicted_probs = ["%f" % x[1] for x in predicted_probs]

IndexError: invalid index to scalar variable.

What does this mean?

Here is my code:

import numpy, scipy, sklearn, csv_io  # csv_io from https://raw.github.com/benhamner/BioResponse/master/Benchmarks/csv_io.py
from sklearn import datasets
from sklearn.ensemble import RandomForestRegressor

def main():
    # read in the training file
    train = csv_io.read_data("TrainLoanData.csv")
    # set the training responses
    target = [x[0] for x in train]
    # set the training features
    train = [x[1:] for x in train]
    # read in the test file
    realtest = csv_io.read_data("TestLoanData.csv")

    # random forest code
    rf = RandomForestRegressor(n_estimators=10, min_samples_split=2, n_jobs=-1)
    # fit the training data
    print('fitting the model')
    rf.fit(train, target)
    # run model against test data
    predicted_probs = rf.predict(realtest)
    print(predicted_probs)
    # this is the line that raises the IndexError
    predicted_probs = ["%f" % x[1] for x in predicted_probs]
    csv_io.write_delimited_file("random_forest_solution.csv", predicted_probs)

main()

Solution

The return value from a RandomForestRegressor is an array of floats:

In [3]: rf = RandomForestRegressor(n_estimators=10, min_samples_split=2, n_jobs=-1)

In [4]: rf.fit([[1,2,3],[4,5,6]],[-1,1])
Out[4]:
RandomForestRegressor(bootstrap=True, compute_importances=False,
                      criterion='mse', max_depth=None, max_features='auto',
                      min_density=0.1, min_samples_leaf=1, min_samples_split=2,
                      n_estimators=10, n_jobs=-1, oob_score=False,
                      random_state=,
                      verbose=0)

In [5]: rf.predict([1,2,3])
Out[5]: array([-0.6])

In [6]: rf.predict([[1,2,3],[4,5,6]])
Out[6]: array([-0.6, 0.4])

So each x in your list comprehension is a single float, and x[1] tries to index it like (-0.6)[1], which is not possible.
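
A minimal sketch of the fix, assuming a recent scikit-learn where predict expects a 2-D array of test rows. The toy lists are hypothetical stand-ins for the contents of TrainLoanData.csv and TestLoanData.csv, and the variable names are illustrative only. Each prediction is formatted directly instead of being indexed:

```python
from sklearn.ensemble import RandomForestRegressor

# Hypothetical toy data standing in for the two CSV files.
train_features = [[1, 2, 3], [4, 5, 6]]
train_target = [-1, 1]
test_features = [[1, 2, 3], [4, 5, 6]]

# random_state is fixed only to make the sketch repeatable.
rf = RandomForestRegressor(n_estimators=10, random_state=0)
rf.fit(train_features, train_target)

# predict() returns a 1-D array holding one float per test row.
predictions = rf.predict(test_features)

# Each x is already a scalar, so format it as-is; do not index it.
formatted = ["%f" % x for x in predictions]
print(formatted)
```

The resulting list of strings can then be written out the same way the question's code does.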

As a side note, the model does not return probabilities: a regressor's predict gives predicted target values, so predicted_probs is a misleading name.
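
For contrast, here is a sketch of the case where indexing with [1] does make sense. It assumes the task were binary classification, which the question does not say, and uses hypothetical toy data. RandomForestClassifier.predict_proba returns one row per sample and one column per class, so row[1] is the estimated probability of the second class:

```python
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy data with binary class labels 0 and 1.
features = [[1, 2, 3], [4, 5, 6], [1, 2, 4], [4, 5, 7]]
labels = [0, 1, 0, 1]

clf = RandomForestClassifier(n_estimators=10, random_state=0)
clf.fit(features, labels)

# predict_proba() returns an array of shape (n_samples, n_classes);
# columns follow the order of clf.classes_.
proba = clf.predict_proba([[1, 2, 3], [4, 5, 6]])

# Here each row has two entries, so row[1] is a valid index.
positive_probs = ["%f" % row[1] for row in proba]
print(positive_probs)
```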
