I'm trying to understand and implement these algorithms in a simple linear regression example. I understand that full-batch gradient descent uses all of the data to compute the gradient, while stochastic gradient descent uses only one sample.

Full-batch gradient descent:

import pandas as pd
from math import sqrt
df = pd.read_csv("data.csv")
df = df.sample(frac=1)  # shuffle the rows
X = df['X'].values
y = df['y'].values
m_current = 0
b_current = 0
epochs = 100000
learning_rate = 0.0001
N = float(len(y))
for i in range(epochs):
    y_current = (m_current * X) + b_current
    cost = sum([data**2 for data in (y - y_current)]) / N  # mean squared error
    rmse = sqrt(cost)
    # Gradients of the MSE with respect to slope and intercept
    m_gradient = -(2/N) * sum(X * (y - y_current))
    b_gradient = -(2/N) * sum(y - y_current)
    m_current = m_current - (learning_rate * m_gradient)
    b_current = b_current - (learning_rate * b_gradient)
print("RMSE: ", rmse)
Full-batch gradient descent outputs RMSE: 10.5
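For comparison, here is one way the stochastic variant could look for the same model y = m*x + b: instead of averaging the gradient over all N points, each update uses the gradient from a single sample. This is a minimal sketch, not the original code; the synthetic data below stands in for data.csv (which isn't shown), and the learning rate and epoch count are illustrative choices.

```python
import numpy as np

# Synthetic stand-in for data.csv: y ≈ 3x + 2 with some noise (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, 100)
y = 3.0 * X + 2.0 + rng.normal(0, 0.5, 100)

m, b = 0.0, 0.0
learning_rate = 0.001
epochs = 50
N = len(y)

for epoch in range(epochs):
    # Visit the samples in a fresh random order each epoch.
    for i in rng.permutation(N):
        error = y[i] - (m * X[i] + b)
        # Gradient from a single sample: no 1/N averaging here.
        m += learning_rate * 2 * error * X[i]
        b += learning_rate * 2 * error

rmse = np.sqrt(np.mean((y - (m * X + b)) ** 2))
print("RMSE:", rmse)
```

Note that with per-sample updates the effective step size is larger (no 1/N factor), so SGD typically needs a smaller learning rate and far fewer epochs than the full-batch loop above to reach a comparable RMSE.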