Hung-yi Lee 2020 Machine Learning - Homework 1: Regression
0 References
- 2020 Machine Learning course homepage: https://speech.ee.ntu.edu.tw/~hylee/ml/2020-spring.html
(contains the videos, slides, and assignments with requirements and reference code)
- Videos: https://www.bilibili.com/video/BV1JE411g7XF
- Blogs:
(1) https://mrsuncodes.github.io/2020/03/15/%E6%9D%8E%E5%AE%8F%E6%AF%85%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0-%E7%AC%AC%E4%B8%80%E8%AF%BE%E4%BD%9C%E4%B8%9A/
(2) https://www.cnblogs.com/HL-space/p/10676637.html
1 Import the required packages
(I forget which blog this package summary was taken from; if anyone recognizes it, let me know.)
- sys: provides access to variables used or maintained by the interpreter, and to functions that interact closely with the interpreter
- pandas: a powerful toolkit for analyzing structured data
- numpy: a Python extension library supporting large multi-dimensional arrays and matrix operations
- math: standard mathematical functions
import sys
import pandas as pd
import numpy as np
2 Data preprocessing
2.1 train.csv
Read the data with pandas' read_csv and store it in data; the first row of the table is consumed automatically as the header.
data = pd.read_csv('D:/MyProgram/Python/forJUPYTER/data/train.csv', encoding='big5')
The first three columns of each row are the date, the station, and the measured item, so each day's 24 hourly values start from the fourth column; extract them with iloc.
data = data.iloc[:, 3:] # iloc: integer-position indexing; keep columns 3 onward
print(data)
0 1 2 3 4 5 6 7 8 9 ... 14 \
0 14 14 14 13 12 12 12 12 15 17 ... 22
1 1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8 ... 1.8
2 0.51 0.41 0.39 0.37 0.35 0.3 0.37 0.47 0.78 0.74 ... 0.37
3 0.2 0.15 0.13 0.12 0.11 0.06 0.1 0.13 0.26 0.23 ... 0.1
4 0.9 0.6 0.5 1.7 1.8 1.5 1.9 2.2 6.6 7.9 ... 2.5
... ... ... ... ... ... ... ... ... ... ... ... ...
4315 1.8 1.8 1.8 1.8 1.8 1.7 1.7 1.8 1.8 1.8 ... 1.8
4316 46 13 61 44 55 68 66 70 66 85 ... 59
4317 36 55 72 327 74 52 59 83 106 105 ... 18
4318 1.9 2.4 1.9 2.8 2.3 1.9 2.1 3.7 2.8 3.8 ... 2.3
4319 0.7 0.8 1.8 1 1.9 1.7 2.1 2 2 1.7 ... 1.3
15 16 17 18 19 20 21 22 23
0 22 21 19 17 16 15 15 15 15
1 1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8
2 0.37 0.47 0.69 0.56 0.45 0.38 0.35 0.36 0.32
3 0.13 0.14 0.23 0.18 0.12 0.1 0.09 0.1 0.08
4 2.2 2.5 2.3 2.1 1.9 1.5 1.6 1.8 1.5
... ... ... ... ... ... ... ... ... ...
4315 1.8 2 2.1 2 1.9 1.9 1.9 2 2
4316 308 327 21 100 109 108 114 108 109
4317 311 52 54 121 97 107 118 100 105
4318 2.6 1.3 1 1.5 1 1.7 1.5 2 2
4319 1.7 0.7 0.4 1.1 1.4 1.3 1.6 1.8 2
[4320 rows x 24 columns]
A RAINFALL value of NR means no rainfall, so replace NR with 0 to simplify later processing.
data[data == 'NR'] = 0
print(data)
0 1 2 3 4 5 6 7 8 9 ... 14 \
0 14 14 14 13 12 12 12 12 15 17 ... 22
1 1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8 ... 1.8
2 0.51 0.41 0.39 0.37 0.35 0.3 0.37 0.47 0.78 0.74 ... 0.37
3 0.2 0.15 0.13 0.12 0.11 0.06 0.1 0.13 0.26 0.23 ... 0.1
4 0.9 0.6 0.5 1.7 1.8 1.5 1.9 2.2 6.6 7.9 ... 2.5
... ... ... ... ... ... ... ... ... ... ... ... ...
4315 1.8 1.8 1.8 1.8 1.8 1.7 1.7 1.8 1.8 1.8 ... 1.8
4316 46 13 61 44 55 68 66 70 66 85 ... 59
4317 36 55 72 327 74 52 59 83 106 105 ... 18
4318 1.9 2.4 1.9 2.8 2.3 1.9 2.1 3.7 2.8 3.8 ... 2.3
4319 0.7 0.8 1.8 1 1.9 1.7 2.1 2 2 1.7 ... 1.3
15 16 17 18 19 20 21 22 23
0 22 21 19 17 16 15 15 15 15
1 1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8
2 0.37 0.47 0.69 0.56 0.45 0.38 0.35 0.36 0.32
3 0.13 0.14 0.23 0.18 0.12 0.1 0.09 0.1 0.08
4 2.2 2.5 2.3 2.1 1.9 1.5 1.6 1.8 1.5
... ... ... ... ... ... ... ... ... ...
4315 1.8 2 2.1 2 1.9 1.9 1.9 2 2
4316 308 327 21 100 109 108 114 108 109
4317 311 52 54 121 97 107 118 100 105
4318 2.6 1.3 1 1.5 1 1.7 1.5 2 2
4319 1.7 0.7 0.4 1.1 1.4 1.3 1.6 1.8 2
[4320 rows x 24 columns]
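As an aside, the same NR cleanup can be done without boolean masking, via pandas' `replace` plus a numeric conversion; this is a sketch on a tiny frame mimicking the layout:

```python
import pandas as pd

# Tiny frame mimicking the layout: string readings with 'NR' in the RAINFALL row
df = pd.DataFrame([['1.8', '1.8'], ['NR', 'NR'], ['0.51', '0.41']])

# Replace the 'NR' marker with 0, then coerce every column to float
df = df.replace('NR', 0).apply(pd.to_numeric)
```

This also gives you numeric dtypes right away, so the later `to_numpy()` yields floats instead of strings.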
Then convert the DataFrame into a 2-D NumPy matrix and store it in raw_data.
raw_data = data.to_numpy()
print('raw_data:', raw_data)
print('raw_data shape:', raw_data.shape)
raw_data: [['14' '14' '14' ... '15' '15' '15']
['1.8' '1.8' '1.8' ... '1.8' '1.8' '1.8']
['0.51' '0.41' '0.39' ... '0.35' '0.36' '0.32']
...
['36' '55' '72' ... '118' '100' '105']
['1.9' '2.4' '1.9' ... '1.5' '2' '2']
['0.7' '0.8' '1.8' ... '1.6' '1.8' '2']]
raw_data shape: (4320, 24)
Group the data by month: each month's 20 days are packed into one $18\times(24\times20)$ matrix (18 features by 480 hours), giving 12 such matrices in total.
month_data = {}
for month in range(12):
    sample = np.empty([18, 24*20])
    for day in range(20):
        sample[:, day * 24 : (day+1) * 24] = raw_data[18*(20*month+day):18*(20*month+day+1), :]
    month_data[month] = sample
print('month_data:', month_data)
month_data: {0: array([[14. , 14. , 14. , ..., 14. , 13. , 13. ],
[ 1.8 , 1.8 , 1.8 , ..., 1.8 , 1.8 , 1.8 ],
[ 0.51, 0.41, 0.39, ..., 0.34, 0.41, 0.43],
...,
[35. , 79. , 2.4 , ..., 48. , 63. , 53. ],
[ 1.4 , 1.8 , 1. , ..., 1.1 , 1.9 , 1.9 ],
[ 0.5 , 0.9 , 0.6 , ..., 1.2 , 1.2 , 1.3 ]]), 1: array([[ 15. , 14. , 14. , ..., 8.4 , 8. , 7.6 ],
[ 1.8 , 1.8 , 1.7 , ..., 1.7 , 1.7 , 1.7 ],
[ 0.27, 0.26, 0.25, ..., 0.36, 0.35, 0.32],
...,
[113. , 109. , 104. , ..., 72. , 65. , 69. ],
[ 2.3 , 2.2 , 2.6 , ..., 1.9 , 2.9 , 1.5 ],
[ 2.5 , 2.2 , 2.2 , ..., 0.9 , 1.6 , 1.1 ]]), 2: array([[ 18. , 18. , 18. , ..., 14. , 13. , 13. ],
[ 1.8 , 1.8 , 1.8 , ..., 1.8 , 1.8 , 1.8 ],
[ 0.39, 0.36, 0.4 , ..., 0.42, 0.47, 0.49],
...,
[103. , 128. , 115. , ..., 60. , 94. , 53. ],
[ 1.7 , 1.4 , 1.8 , ..., 4.2 , 3.5 , 4.3 ],
[ 1.9 , 0.8 , 1.5 , ..., 3.1 , 2.4 , 2.4 ]]), 3: array([[ 19. , 18. , 17. , ..., 24. , 24. , 23. ],
[ 1.7 , 1.7 , 1.7 , ..., 1.8 , 1.8 , 1.9 ],
[ 0.42, 0.42, 0.42, ..., 0.41, 0.46, 0.42],
...,
[308. , 308. , 320. , ..., 331. , 261. , 273. ],
[ 1.7 , 2.2 , 2. , ..., 1. , 1. , 0.8 ],
[ 1.5 , 1.5 , 1.2 , ..., 0.6 , 1.1 , 0.9 ]]), 4: array([[1.90e+01, 1.90e+01, 2.00e+01, ..., 2.60e+01, 2.60e+01, 2.50e+01],
[1.80e+00, 1.80e+00, 1.80e+00, ..., 1.60e+00, 1.60e+00, 1.60e+00],
[4.80e-01, 4.70e-01, 4.50e-01, ..., 1.50e-01, 1.50e-01, 1.30e-01],
...,
[2.90e+02, 6.90e+01, 2.50e+02, ..., 1.74e+02, 1.95e+02, 1.69e+02],
[1.50e+00, 1.90e+00, 1.70e+00, ..., 3.10e+00, 3.10e+00, 2.90e+00],
[4.00e-01, 5.00e-01, 1.00e+00, ..., 2.90e+00, 2.40e+00, 3.10e+00]]), 5: array([[2.60e+01, 2.50e+01, 2.50e+01, ..., 2.70e+01, 2.70e+01, 2.80e+01],
[1.70e+00, 1.70e+00, 1.70e+00, ..., 1.60e+00, 1.60e+00, 1.60e+00],
[3.50e-01, 3.40e-01, 3.40e-01, ..., 2.60e-01, 1.90e-01, 1.60e-01],
...,
[1.18e+02, 1.22e+02, 1.19e+02, ..., 1.16e+02, 1.59e+02, 1.62e+02],
[1.60e+00, 1.40e+00, 1.30e+00, ..., 1.70e+00, 1.00e+00, 2.40e+00],
[1.50e+00, 1.50e+00, 1.30e+00, ..., 1.30e+00, 1.30e+00, 1.70e+00]]), 6: array([[2.60e+01, 2.50e+01, 2.60e+01, ..., 2.80e+01, 2.80e+01, 2.80e+01],
[1.60e+00, 1.60e+00, 1.60e+00, ..., 1.60e+00, 1.60e+00, 1.70e+00],
[1.40e-01, 1.30e-01, 1.30e-01, ..., 3.10e-01, 3.00e-01, 2.70e-01],
...,
[1.06e+02, 1.24e+02, 1.17e+02, ..., 1.27e+02, 1.33e+02, 1.72e+02],
[1.60e+00, 1.80e+00, 1.20e+00, ..., 1.60e+00, 1.40e+00, 1.70e+00],
[2.00e+00, 2.20e+00, 1.70e+00, ..., 1.70e+00, 1.30e+00, 1.60e+00]]), 7: array([[2.80e+01, 2.80e+01, 2.80e+01, ..., 2.60e+01, 2.60e+01, 2.60e+01],
[1.60e+00, 1.60e+00, 1.60e+00, ..., 1.70e+00, 1.70e+00, 1.70e+00],
[2.60e-01, 2.00e-01, 1.60e-01, ..., 1.60e-01, 1.40e-01, 1.30e-01],
...,
[2.04e+02, 1.77e+02, 1.72e+02, ..., 1.68e+02, 1.80e+02, 1.62e+02],
[2.90e+00, 2.80e+00, 2.70e+00, ..., 2.90e+00, 2.80e+00, 2.50e+00],
[3.00e+00, 2.80e+00, 2.70e+00, ..., 3.10e+00, 2.90e+00, 2.50e+00]]), 8: array([[ 25. , 25. , 25. , ..., 26. , 26. , 26. ],
[ 1.7 , 1.7 , 1.7 , ..., 1.6 , 1.6 , 1.7 ],
[ 0.28, 0.27, 0.26, ..., 0.28, 0.24, 0.23],
...,
[ 98. , 109. , 108. , ..., 163. , 71. , 55. ],
[ 1.8 , 1.9 , 1.1 , ..., 1.2 , 1.1 , 0.7 ],
[ 1.4 , 1.9 , 1.7 , ..., 3.4 , 1. , 0.7 ]]), 9: array([[ 25. , 25. , 25. , ..., 23. , 22. , 22. ],
[ 1.7 , 1.7 , 1.7 , ..., 1.8 , 1.7 , 1.7 ],
[ 0.24, 0.26, 0.27, ..., 0.42, 0.35, 0.26],
...,
[ 72. , 100. , 68. , ..., 109. , 110. , 107. ],
[ 1.1 , 1.4 , 1.1 , ..., 2.2 , 2.4 , 2.5 ],
[ 1.8 , 1.2 , 0.9 , ..., 2.1 , 2.2 , 2.3 ]]), 10: array([[ 22. , 21. , 21. , ..., 19. , 18. , 18. ],
[ 1.9 , 1.9 , 1.9 , ..., 1.7 , 1.7 , 1.7 ],
[ 0.79, 0.71, 0.61, ..., 0.36, 0.36, 0.37],
...,
[100. , 117. , 110. , ..., 117. , 117. , 114. ],
[ 1.1 , 1.9 , 1.7 , ..., 2.1 , 2.2 , 1.9 ],
[ 0.7 , 1.1 , 1.2 , ..., 1.8 , 2.1 , 1.9 ]]), 11: array([[ 23. , 23. , 23. , ..., 13. , 13. , 13. ],
[ 1.6 , 1.7 , 1.7 , ..., 1.8 , 1.8 , 1.8 ],
[ 0.22, 0.2 , 0.18, ..., 0.51, 0.57, 0.56],
...,
[ 93. , 50. , 99. , ..., 118. , 100. , 105. ],
[ 1.8 , 2.1 , 3.2 , ..., 1.5 , 2. , 2. ],
[ 1.3 , 0.9 , 1. , ..., 1.6 , 1.8 , 2. ]])}
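Incidentally, the nested loop that builds month_data can be collapsed into a single vectorized reshape, since the rows of raw_data are ordered month → day → feature; a sketch on synthetic data:

```python
import numpy as np

# Synthetic stand-in for raw_data: 12 months x 20 days x 18 features, 24 hours each
raw = np.arange(4320 * 24, dtype=float).reshape(4320, 24)

# Split out (month, day, feature), then glue each month's 20 days
# back together along the hour axis: (12, 18, 480)
stacked = raw.reshape(12, 20, 18, 24).transpose(0, 2, 1, 3).reshape(12, 18, 480)

# Cross-check against the original loop for month 0
sample = np.empty([18, 480])
for day in range(20):
    sample[:, day * 24:(day + 1) * 24] = raw[18 * day:18 * (day + 1), :]
assert np.array_equal(stacked[0], sample)
```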
The task is to predict the 10th hour's PM2.5 from the previous 9 hours, so raw_data is processed further: every 9 consecutive hours form one sample, with the PM2.5 value of the 10th hour as its target.
Since the 20 days within a month are contiguous, each month yields $20\times24-9=471$ samples, each of dimension $18\times9$.
Note that the processed x is a 2-D matrix, not a 3-D one: its shape is $\left(12\times(20\times24-9),\,18\times9\right)$, where each $18\times9$ window is flattened into a row with reshape(1, -1).
x = np.empty([12*(24*20-9), 18*9], dtype=float)
y = np.empty([12*(24*20-9), 1], dtype=float)
for month in range(12):
    for day in range(20):
        for hour in range(24):
            if day == 19 and hour > 14:
                continue  # last day: hours 15-23 have no complete 9h window + target within the month
            x[month*(24*20-9) + day*24 + hour, :] = month_data[month][:, day*24+hour : day*24+hour+9].reshape(1, -1)
            y[month*(24*20-9) + day*24 + hour, 0] = month_data[month][9, day*24+hour+9]  # row 9 is PM2.5
print('x:', x)
print('x shape:', x.shape)
print('y:', y)
print('y shape:', y.shape)
x: [[14. 14. 14. ... 2. 2. 0.5]
[14. 14. 13. ... 2. 0.5 0.3]
[14. 13. 12. ... 0.5 0.3 0.8]
...
[17. 18. 19. ... 1.1 1.4 1.3]
[18. 19. 18. ... 1.4 1.3 1.6]
[19. 18. 17. ... 1.3 1.6 1.8]]
x shape: (5652, 162)
y: [[30.]
[41.]
[44.]
...
[17.]
[24.]
[29.]]
y shape: (5652, 1)
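The triple loop that slices out the 9-hour windows can be cross-checked (or replaced) with NumPy's `sliding_window_view` (available since NumPy 1.20); a sketch on one synthetic month:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# One synthetic month: 18 features x 480 hours
sample = np.arange(18 * 480, dtype=float).reshape(18, 480)

# All 9-hour windows along the hour axis: shape (18, 472, 9);
# only the first 471 starts leave a 10th hour to predict
windows = sliding_window_view(sample, 9, axis=1)
x_month = windows[:, :471, :].transpose(1, 0, 2).reshape(471, 18 * 9)
y_month = sample[9, 9:480]  # row 9 is PM2.5; target is the 10th hour

# Cross-check one window against direct slicing
assert np.array_equal(x_month[5], sample[:, 5:14].reshape(-1))
```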
- Data standardization
The most common method is z-score standardization (standardization by the standard deviation): the transformed data has mean 0 and standard deviation 1 (note that this alone does not make it normally distributed).
Note: strictly speaking, the z-score is standardization rather than normalization; normalization is just one special case of standardization. On normalization vs. standardization:
Standardization scales data proportionally so that it falls into a small, specified range. It is common when comparing or weighting indicators: it removes the units, turning values into dimensionless numbers so that indicators with different units or magnitudes can be compared.
Normalization maps data into the interval (0, 1), likewise turning dimensioned expressions into dimensionless ones. It can speed up model convergence, improve accuracy, and help prevent exploding gradients.
The z-score transform is
$$x^*=\frac{x-\mu}{\sigma}$$
where $\mu$ is the mean over all samples and $\sigma$ their standard deviation.
mean_x = np.mean(x, axis=0)
std_x = np.std(x, axis=0)
for i in range(len(x)):
    for j in range(len(x[0])):
        if std_x[j] != 0:
            x[i][j] = (x[i][j] - mean_x[j]) / std_x[j]
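The element-wise loop is correct but slow in pure Python; the same z-score can be applied in one shot with broadcasting, guarding the zero-variance columns; a sketch:

```python
import numpy as np

x_demo = np.array([[1.0, 5.0, 2.0],
                   [3.0, 5.0, 4.0],
                   [5.0, 5.0, 6.0]])  # middle column is constant

mean = x_demo.mean(axis=0)
std = x_demo.std(axis=0)
# Replace zero std with 1 so constant columns pass through unchanged
x_norm = (x_demo - mean) / np.where(std == 0, 1, std)
```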
Split the training data into a train_set and a validation_set: train on the train_set, then check the model's performance on the validation_set.
train_set : validation_set = 8 : 2
import math
x_train_set = x[: math.floor(len(x) * 0.8), :]
y_train_set = y[: math.floor(len(y) * 0.8), :]
x_validation = x[math.floor(len(x) * 0.8):, :]
y_validation = y[math.floor(len(y) * 0.8):, :]
print('x_train_set:',x_train_set)
print('len(x_train_set):',len(x_train_set))
print('y_train_set:',y_train_set)
print('len(y_train_set):',len(y_train_set))
print('x_validation',x_validation)
print('len(x_validation):',len(x_validation))
print('y_validation:', y_validation)
print('len(y_validation):',len(y_validation))
x_train_set: [[-1.35825331 -1.35883937 -1.359222 ... 0.26650729 0.2656797
-1.14082131]
[-1.35825331 -1.35883937 -1.51819928 ... 0.26650729 -1.13963133
-1.32832904]
[-1.35825331 -1.51789368 -1.67717656 ... -1.13923451 -1.32700613
-0.85955971]
...
[ 0.86929969 0.70886668 0.38952809 ... 1.39110073 0.2656797
-0.39079039]
[ 0.71018876 0.39075806 0.07157353 ... 0.26650729 -0.39013211
-0.39079039]
[ 0.3919669 0.07264944 0.07157353 ... -0.38950555 -0.39013211
-0.85955971]]
len(x_train_set): 4521
y_train_set: [[30.]
[41.]
[44.]
...
[ 7.]
[ 5.]
[14.]]
len(y_train_set): 4521
x_validation [[ 0.07374504 0.07264944 0.07157353 ... -0.38950555 -0.85856912
-0.57829812]
[ 0.07374504 0.07264944 0.23055081 ... -0.85808615 -0.57750692
0.54674825]
[ 0.07374504 0.23170375 0.23055081 ... -0.57693779 0.54674191
-0.1095288 ]
...
[-0.88092053 -0.72262212 -0.56433559 ... -0.57693779 -0.29644471
-0.39079039]
[-0.7218096 -0.56356781 -0.72331287 ... -0.29578943 -0.39013211
-0.1095288 ]
[-0.56269867 -0.72262212 -0.88229015 ... -0.38950555 -0.10906991
0.07797893]]
len(x_validation): 1131
y_validation: [[13.]
[24.]
[22.]
...
[17.]
[24.]
[29.]]
len(y_validation): 1131
2.2 test.csv
header = None tells read_csv that the file has no header row.
testdata = pd.read_csv('D:/MyProgram/Python/forJUPYTER/data/test.csv', header = None, encoding='big5')
As with the training set, drop the descriptor columns at the start of each row, replace NR with 0, and convert to a NumPy matrix.
testdata = testdata.iloc[:,2:]
testdata[testdata == 'NR'] = 0
test_data = testdata.to_numpy()
print('test_data:', test_data)
print('test_data shape:', test_data.shape)
test_data: [['21' '21' '20' ... '19' '18' '17']
['1.7' '1.7' '1.7' ... '1.7' '1.7' '1.8']
['0.39' '0.36' '0.36' ... '0.34' '0.31' '0.23']
...
['76' '99' '93' ... '98' '97' '65']
['2.2' '3.2' '2.5' ... '5.7' '4.9' '3.6']
['1.7' '2.8' '2.6' ... '4.9' '5.2' '3.6']]
test_data shape: (4320, 9)
The test set contains 240 days of data (240 samples, each 18 features × 9 hours).
test_x = np.empty([240, 18*9], dtype=float)
for i in range(240):
    test_x[i, :] = test_data[18*i:18*(i+1), :].reshape(1, -1)
Standardize the data, reusing the training set's mean_x and std_x.
for i in range(len(test_x)):
    for j in range(len(test_x[0])):
        if std_x[j] != 0:
            test_x[i][j] = (test_x[i][j] - mean_x[j]) / std_x[j]
test_x = np.concatenate((np.ones([240, 1]), test_x), axis=1).astype(float)
print(test_x)
[[ 1. -0.24447681 -0.24545919 ... -0.67065391 -1.04594393
0.07797893]
[ 1. -1.35825331 -1.51789368 ... 0.17279117 -0.10906991
-0.48454426]
[ 1. 1.5057434 1.34508393 ... -1.32666675 -1.04594393
-0.57829812]
...
[ 1. 0.3919669 0.54981237 ... 0.26650729 -0.20275731
1.20302531]
[ 1. -1.8355861 -1.8360023 ... -1.04551839 -1.13963133
-1.14082131]
[ 1. -1.35825331 -1.35883937 ... 2.98427476 3.26367657
1.76554849]]
3 Model training
Add a bias column.
dim = 18*9+1
w = np.zeros([dim, 1])
x_train_set = np.concatenate((np.ones([len(x_train_set), 1]), x_train_set), axis=1).astype(float)
Define the hyperparameters and variables.
learning_rate = 10 # learning rate
iter_time = 50000 # number of iterations
adagrad = np.zeros([dim, 1]) # gradient accumulator for the Adagrad algorithm
eps = 0.0000000001 # the effective learning rate is learning_rate/sqrt(sum of squared past gradients); that sum sits in the denominator and can be 0 early on, so a tiny eps is added to avoid division by zero
- Loss function: root mean square error (RMSE)
$$\begin{aligned} L(w)&=\sqrt{\frac{1}{n}\sum_{i=0}^{n-1}\left(y_i-\hat{y}_i\right)^2}\\ y_i&=\sum_{j=0}^{m}w^jx_i^j+b=\theta\cdot x_i+b \end{aligned}$$
where $n=12\times(20\times24-9)$, $m=18\times9$, $y_i$ is the $i$-th PM2.5 prediction, and $x_i^j$ is the $j$-th feature of the $i$-th sample.
- Adagrad update:
$$\begin{aligned} w_{t+1}&=w_{t}-\frac{\eta}{\sqrt{\sum_{i=0}^{t}\left(g_i\right)^2}}\,g_{t}\\ g_t&=\frac{\partial L(w_t)}{\partial w_t} \end{aligned}$$
Here,
$$g_t=\frac{\partial}{\partial w_t}\sqrt{\frac{1}{n}\sum_{i=0}^{n-1}\left(w_tx_i+b-\hat{y}_i\right)^2}=\frac{\sum_{i=0}^{n-1}x_i\left(w_tx_i+b-\hat{y}_i\right)}{n\,L(w_t)}$$
for t in range(iter_time):
    loss = np.sqrt(np.sum(np.power(np.dot(x_train_set, w) - y_train_set, 2)) / len(x_train_set))
    if t % 100 == 0:
        print('iteration: %i, loss: %f' % (t, loss))
    gradient = (np.dot(x_train_set.transpose(), np.dot(x_train_set, w) - y_train_set)) / (loss * len(x_train_set))
    adagrad += gradient ** 2
    w = w - learning_rate * gradient / np.sqrt(adagrad + eps)
np.save('weight.npy', w)
iteration: 0, loss: 27.239592
iteration: 100, loss: 598.991742
iteration: 200, loss: 96.973083
iteration: 300, loss: 240.807182
iteration: 400, loss: 71.607934
iteration: 500, loss: 212.116933
iteration: 600, loss: 117.461546
……
iteration: 49400, loss: 15.226941
iteration: 49500, loss: 15.212356
iteration: 49600, loss: 15.197824
iteration: 49700, loss: 15.183339
iteration: 49800, loss: 15.168902
iteration: 49900, loss: 15.154507
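As a sanity check on gradient descent: plain linear regression has a closed-form optimum, so `np.linalg.lstsq` directly gives the lowest RMSE any number of iterations could reach on the training set. A sketch on synthetic data (the real x_train_set/y_train_set would be passed the same way):

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.concatenate([np.ones([200, 1]), rng.normal(size=(200, 5))], axis=1)
true_w = rng.normal(size=(6, 1))
y = X @ true_w + 0.1 * rng.normal(size=(200, 1))

# Least-squares solution: the loss any gradient method converges toward
w_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
rmse = np.sqrt(np.mean((X @ w_ls - y) ** 2))
```

If the Adagrad loss stays well above this closed-form RMSE, the learning rate or iteration count is the bottleneck, not the model.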
4 Evaluation on the validation set
w = np.load('weight.npy')
x_validation = np.concatenate((np.ones([len(x_validation), 1]),x_validation), axis=1).astype(float)
ans_y = np.dot(x_validation, w)
loss = np.sqrt(np.sum(np.power(ans_y-y_validation, 2))/len(y_validation))
print('loss on the validation set:', loss)
loss on the validation set: 13.145704209176895
5 Prediction on the test set
w = np.load('weight.npy')
ans_y = np.dot(test_x,w)
print('PM2.5 predictions on the test set:', ans_y)
PM2.5 predictions on the test set: [[ 4.18940895]
[21.15516933]
[ 3.36186293]
[ 5.46244159]
[28.63633728]
……
[37.49079383]
[23.43782156]
[10.07970318]
[29.31665506]]
Save the predictions.
import csv
with open('submit.csv', mode='w', newline='') as submit_file:
    csv_writer = csv.writer(submit_file)
    header = ['id', 'value']
    print(header)
    csv_writer.writerow(header)
    for i in range(240):
        row = ['id_' + str(i), ans_y[i][0]]
        csv_writer.writerow(row)
        print(row)
['id', 'value']
['id_0', 4.18940895110925]
['id_1', 21.155169332936023]
['id_2', 3.361862929935379]
['id_3', 5.46244158523443]
……
['id_237', 23.437821562479797]
['id_238', 10.079703183805389]
['id_239', 29.316655061609495]
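For reference, the same submission file can be written in a couple of lines with pandas (ans_demo below is a stand-in for ans_y):

```python
import numpy as np
import pandas as pd

ans_demo = np.array([[4.19], [21.16], [3.36]])  # stand-in for ans_y

# Build the id/value table and write it without the index column
submit = pd.DataFrame({'id': ['id_' + str(i) for i in range(len(ans_demo))],
                       'value': ans_demo[:, 0]})
submit.to_csv('submit.csv', index=False)
```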
Honestly the result isn't that great; I'm not sure how others managed to get the loss to converge to around five…
Jupyter notebook 👉 extraction code: bul3