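This is the code I wrote for a simple linear regression. The code is below, and I have a few questions I am looking for answers to.
How can I detect and remove outliers from X and Y? Maybe a code example would help.
What do you think about the quality of the training and evaluation part of the model?
Is the cross-validation done correctly? What about the train/test split?
How should I interpret the RMSE value? Are large values a good sign or a bad sign?
import pandas as pd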
import numpy as np
import warnings
import matplotlib.pyplot as plt
import statsmodels.api as sm
from scipy import stats
# Import the Excel file
data = pd.read_excel("C:\\Users\\AchourAh\\Desktop\\Simple_Linear_Regression\\SP Level Simple Linear Regression\\PL32_PMM_03_09_2018_SP_Level.xlsx", sheet_name='Sheet1')
# Replace null values of the whole dataset with 0
data1 = data.fillna(0)
print(data1)
# Extract the independent and dependent variables
X = data1.iloc[:, 1].values.reshape(-1, 1)  # COPCOR SP column, whose impact we want to check
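
For the outlier question above, here is a minimal sketch of one common approach: z-score filtering with `scipy.stats.zscore`. It is not the only option, and it may not be the best fit for this data. The helper name `remove_outliers_zscore`, the threshold of 3 standard deviations, and the synthetic demo data are all assumptions made for illustration; none of them come from the original script.

```python
import numpy as np
from scipy import stats


def remove_outliers_zscore(X, Y, threshold=3.0):
    """Drop rows where X or Y lies `threshold` or more standard
    deviations from its own column mean (hypothetical helper)."""
    X = np.asarray(X, dtype=float).reshape(-1, 1)
    Y = np.asarray(Y, dtype=float).reshape(-1, 1)
    # Absolute z-score of every value, computed per column
    z = np.abs(stats.zscore(np.hstack([X, Y]), axis=0))
    # Keep a row only if both its X and its Y value are within the threshold
    keep = (z < threshold).all(axis=1)
    return X[keep], Y[keep], keep


# Synthetic demo data (assumed, not from the Excel file)
rng = np.random.default_rng(0)
X_demo = rng.normal(50, 5, size=100)
Y_demo = 2 * X_demo + rng.normal(0, 1, size=100)
X_demo[0] = 500  # plant one obvious outlier in X

X_clean, Y_clean, keep = remove_outliers_zscore(X_demo, Y_demo)
print(len(X_demo), "rows before,", len(X_clean), "rows after")
```

X and Y are filtered with the same mask so the pairs stay aligned. One thing to watch in the script above: because missing values are replaced with 0 before this step, those zeros may shift the mean and standard deviation, or may themselves be flagged as outliers.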