Root Mean Square Error (RMSE) | Machine Learning

cumt30111

Tags: machine learning, python, deep learning, artificial intelligence, neural networks

Original link: https://www.includehelp.com/ml-ai/root-mean-square error-rmse.aspx

Hello learners, welcome to yet another article on machine learning. Today we will look at one of the methods for measuring how accurately our model predicts the target values. You have probably heard the term RMS, i.e. Root Mean Square, and may have used RMS values in statistics. In machine learning, to assess the accuracy of a model we take the root mean square of the errors between the actual test values and the predicted values. Mathematically:

For a single value:

    Let a = (predicted value - actual value)^2
    Let b = mean of a = a (for a single value)
    Then RMSE = square root of b
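The three steps above can be checked with a few lines of Python. This is only a minimal sketch: the values 105.0 and 100.0 are made up for illustration. For a single value, the RMSE reduces to the absolute error.

```python
import math

predicted = 105.0  # hypothetical predicted value
actual = 100.0     # hypothetical actual value

a = (predicted - actual) ** 2  # squared error
b = a                          # the mean of a single value is the value itself
rmse_single = math.sqrt(b)     # equals abs(predicted - actual)

print(rmse_single)  # 5.0
```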

For a set of values, RMSE is defined as follows:
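Written out, the standard definition over N values extends the single-value steps: square each error, average the squares, then take the root. The symbols are our own notation and may differ from the source figure: \hat{y}_i is the i-th predicted value, y_i the i-th actual value, and N the number of values.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^{2}}
```

With N = 1 this reduces to the single-value case, where RMSE is simply the absolute error.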

Source: https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/05/rmse.png

Graphically:

As you can see in this scatter plot, the red dots are the actual values and the blue line is the set of values predicted by our model. Here X represents the distance between an actual value and the predicted line; this distance is the error. Similarly, we can draw a straight line from each red dot to the blue line. Squaring each of those distances, taking the mean of the squares, and finally taking the square root gives us the RMSE of our model.
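The order of the operations matters: square each distance first, then average, then take the root. A minimal sketch with made-up distances (the list `errors` is hypothetical and not taken from the data set):

```python
import math

# Hypothetical vertical distances between the actual points and the predicted line
errors = [3.0, -4.0, 0.0, 5.0]

squared = [e ** 2 for e in errors]          # 9, 16, 0, 25
mean_squared = sum(squared) / len(squared)  # 50 / 4 = 12.5
rmse = math.sqrt(mean_squared)

# Averaging first and squaring afterwards lets positive and negative
# errors cancel, which understates the error.
mean_first = math.sqrt((sum(errors) / len(errors)) ** 2)

print(round(rmse, 4), mean_first)  # 3.5355 1.0
```

Because the squares are averaged before the root is taken, large individual errors weigh more heavily than small ones.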

Let us write Python code to find the RMSE of our model. We will predict brain weight from head size, using linear regression to train the model. The data set used in my code can be downloaded from here: headbrain6

Python code:

# -*- coding: utf-8 -*-
"""
Created on Sun Jul 29 22:21:12 2018

@author: Raunak Goswami
"""
import time
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

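# Read the data. The script and headbrain6.csv are assumed to be in the
# same folder; otherwise pass the full path to read_csv.
data = pd.read_csv('headbrain6.csv')
print(data.head())
# Column 2 is the feature (head size) and column 3 the target (brain weight).
# Slicing with 2:3 and 3:4 keeps both as 2-D arrays, as scikit-learn expects.
x = data.iloc[:, 2:3].values
y = data.iloc[:, 3:4].values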

# Split the data into training and test sets.
# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection.
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=1/4, random_state=0)

#fitting simple linear regression to the training set
from sklearn.linear_model import LinearRegression
regressor=LinearRegression()
regressor.fit(x_train,y_train)

#predict the test result
y_pred=regressor.predict(x_test)

# Scatter plot of the training data
plt.scatter(x_train, y_train, c='red')
plt.show()

# Predicted brain weight (line) against the actual test values (red dots)
plt.plot(x_test, y_pred)
plt.scatter(x_test, y_test, c='red')
plt.xlabel('headsize')
plt.ylabel('brain weight')
plt.show()

# Error in each test value
for i in range(len(y_test)):
    print("Error in value number", i, (y_test[i] - y_pred[i]))
    time.sleep(1)  # pause so the output can be read line by line

# Combined RMSE value: square the errors, average them, then take the root
mse = np.mean((y_test - y_pred) ** 2)
rmse = np.sqrt(mse)
print("Final rmse value is =", rmse)
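The same calculation can be wrapped in a small reusable function. The sketch below is an illustration and not part of the original script: the helper name `rmse` and the sample arrays are hypothetical. It flattens both inputs first, because subtracting a flat array of shape (n,) from a column vector of shape (n, 1) would broadcast to an (n, n) matrix and silently give a wrong result.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between two equally sized arrays (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must contain the same number of values")
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Made-up values: a column vector, like the one data.iloc[:, 3:4].values returns,
# and a flat array of predictions
y_true = np.array([[1300.0], [1250.0], [1400.0]])
y_pred = np.array([1290.0, 1270.0, 1370.0])

print(rmse(y_true, y_pred))
```

In the script above both `y_test` and `y_pred` are column vectors, so the direct subtraction is safe there; the flattening only guards against mixing the two shapes.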

Outputs:

The RMSE of our model comes out to approximately 73, which is not bad. For this data set, a good model should have an RMSE below 180. RMSE is expressed in the units of the target, so this threshold does not carry over to other problems. If your RMSE is higher, you probably need to change your features or tweak your hyperparameters. If you want to know how the model predicted the values, have a look at my previous article on linear regression.
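Whether a value such as 73 or 180 counts as good depends on the scale of the target, because RMSE is expressed in the target's units. A minimal sketch with made-up numbers (not from the headbrain6 data set) shows the same predictions scoring 1000 times worse after a change of units:

```python
import math

def rmse(actual, predicted):
    """Root mean square error of two equal-length sequences."""
    squared = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical weights in kilograms
actual_kg = [1.0, 2.0, 3.0]
predicted_kg = [1.5, 2.5, 2.5]

# The same values expressed in grams
actual_g = [v * 1000 for v in actual_kg]
predicted_g = [v * 1000 for v in predicted_kg]

print(rmse(actual_kg, predicted_kg))  # 0.5
print(rmse(actual_g, predicted_g))    # 500.0
```

An RMSE is therefore best judged against the typical size and spread of the target values rather than against a fixed cut-off.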

Translated from: https://www.includehelp.com/ml-ai/root-mean-square error-rmse.aspx
