AI Day 05 (2020-08-04)

Linear Regression

The hypothesis (fitting) function

(figure: the linear-regression hypothesis function)

The error term

Assumed to follow a Gaussian (normal) distribution.

(figure: the Gaussian distribution of the error term)

The true value

(figure: true value = prediction + error term)

The likelihood function

1. Used to infer the parameters from the observed samples.
2. A function of the parameters that measures how plausible they are given the samples.
3. The larger its value, the better.

(figures: steps of the likelihood-function derivation; a sketch follows below)
Recommended reference for the likelihood-function derivation: [link]
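A minimal sketch of the standard derivation (the figures above are assumed to walk through the same steps). Assume each true value is the prediction plus Gaussian noise, $y_i = \theta^T x_i + \varepsilon_i$ with $\varepsilon_i \sim N(0,\sigma^2)$. The likelihood of the parameters over $n$ samples is then

$L(\theta) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(y_i-\theta^{T}x_i)^2}{2\sigma^2}\right)$

$\log L(\theta) = n\log\frac{1}{\sqrt{2\pi}\,\sigma} - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i-\theta^{T}x_i)^2$

so maximizing the log-likelihood is equivalent to minimizing the least-squares objective $J(\theta)=\frac{1}{2}\sum_{i=1}^{n}(y_i-\theta^{T}x_i)^2$.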

Implementing Logistic Regression with Gradient Descent

Gradient descent variants:

1. Batch gradient descent
2. Stochastic gradient descent
3. Mini-batch gradient descent (the most commonly used)

Learning rate (step size):

The hyperparameter that is tuned most often; it appears as $\alpha$ in the update rule below.
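
All three variants above share the same update rule; they differ only in how many samples are used to estimate the gradient at each step (all $n$ for batch, one for stochastic, a small subset such as 16 for mini-batch):

$\theta_j := \theta_j - \alpha\,\frac{\partial J(\theta)}{\partial \theta_j}$

where $\alpha$ is the learning rate (step size).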

Logistic Regression

  • A binary-classification algorithm
  • The dataset may be non-linear
  • It can also be extended to multi-class problems
  • For a classification problem it is common to try logistic regression first as a baseline, then move on to other models

The Sigmoid function

  • maps a value to a probability:
    $g(z) = \frac{1}{1+e^{-z}}$

Softmax (multi-class)
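
For reference, softmax generalizes the sigmoid to $K$ classes by turning a vector of scores $z$ into a probability distribution: $\mathrm{softmax}(z)_k = \frac{e^{z_k}}{\sum_{j=1}^{K} e^{z_j}}$. A minimal NumPy sketch (not used in the notebook below; assumes numpy is imported as np, as in the code that follows):

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()         # probabilities that sum to 1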

Code example:

Logistic Regression

The data

We will build a logistic regression model to predict whether a student is admitted to a university. Suppose you are the administrator of a university department and want to determine each applicant's chance of admission based on their results on two exams. You have historical data from previous applicants that you can use as a training set for logistic regression. For each training example you have the applicant's scores on the two exams and the admission decision. To do this, we will build a classification model that estimates the probability of admission from the exam scores.

# The three essentials: NumPy, pandas, matplotlib
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import os
print(os.sep)
path = 'data' + os.sep +'data48267/' +'LogiReg_data.txt'
pdData = pd.read_csv(path, header=None, names=['Exam 1', 'Exam 2', 'Admitted'])
pdData.head()
/
      Exam 1     Exam 2  Admitted
0  34.623660  78.024693         0
1  30.286711  43.894998         0
2  35.847409  72.902198         0
3  60.182599  86.308552         1
4  79.032736  75.344376         1
pdData.shape
(100, 3)
positive = pdData[pdData['Admitted'] == 1] # the subset of rows where Admitted == 1, i.e. the set of *positive* examples
negative = pdData[pdData['Admitted'] == 0] # the subset of rows where Admitted == 0, i.e. the set of *negative* examples
positive
       Exam 1     Exam 2  Admitted
3   60.182599  86.308552         1
4   79.032736  75.344376         1
6   61.106665  96.511426         1
7   75.024746  46.554014         1
8   76.098787  87.420570         1
9   84.432820  43.533393         1
12  82.307053  76.481963         1
13  69.364589  97.718692         1
15  53.971052  89.207350         1
16  69.070144  52.740470         1
18  70.661510  92.927138         1
19  76.978784  47.575964         1
21  89.676776  65.799366         1
24  77.924091  68.972360         1
25  62.271014  69.954458         1
26  80.190181  44.821629         1
30  61.379289  72.807887         1
31  85.404519  57.051984         1
33  52.045405  69.432860         1
37  64.176989  80.908061         1
40  83.902394  56.308046         1
42  94.443368  65.568922         1
46  77.193035  70.458200         1
47  97.771599  86.727822         1
48  62.073064  96.768824         1
49  91.564974  88.696293         1
50  79.944818  74.163119         1
51  99.272527  60.999031         1
52  90.546714  43.390602         1
56  97.645634  68.861573         1
58  74.248691  69.824571         1
59  71.796462  78.453562         1
60  75.395611  85.759937         1
66  40.457551  97.535185         1
68  80.279574  92.116061         1
69  66.746719  60.991394         1
71  64.039320  78.031688         1
72  72.346494  96.227593         1
73  60.457886  73.094998         1
74  58.840956  75.858448         1
75  99.827858  72.369252         1
76  47.264269  88.475865         1
77  50.458160  75.809860         1
80  88.913896  69.803789         1
81  94.834507  45.694307         1
82  67.319257  66.589353         1
83  57.238706  59.514282         1
84  80.366756  90.960148         1
85  68.468522  85.594307         1
87  75.477702  90.424539         1
88  78.635424  96.647427         1
90  94.094331  77.159105         1
91  90.448551  87.508792         1
93  74.492692  84.845137         1
94  89.845807  45.358284         1
95  83.489163  48.380286         1
96  42.261701  87.103851         1
97  99.315009  68.775409         1
98  55.340018  64.931938         1
99  74.775893  89.529813         1
negative
       Exam 1     Exam 2  Admitted
0   34.623660  78.024693         0
1   30.286711  43.894998         0
2   35.847409  72.902198         0
5   45.083277  56.316372         0
10  95.861555  38.225278         0
11  75.013658  30.603263         0
14  39.538339  76.036811         0
17  67.946855  46.678574         0
20  67.372028  42.838438         0
22  50.534788  48.855812         0
23  34.212061  44.209529         0
27  93.114389  38.800670         0
28  61.830206  50.256108         0
29  38.785804  64.995681         0
32  52.107980  63.127624         0
34  40.236894  71.167748         0
35  54.635106  52.213886         0
36  33.915500  98.869436         0
38  74.789253  41.573415         0
39  34.183640  75.237720         0
41  51.547720  46.856290         0
43  82.368754  40.618255         0
44  51.047752  45.822701         0
45  62.222676  52.060992         0
53  34.524514  60.396342         0
54  50.286496  49.804539         0
55  49.586677  59.808951         0
57  32.577200  95.598548         0
61  35.286113  47.020514         0
62  56.253817  39.261473         0
63  30.058822  49.592974         0
64  44.668262  66.450086         0
65  66.560894  41.092098         0
67  49.072563  51.883212         0
70  32.722833  43.307173         0
78  60.455556  42.508409         0
79  82.226662  42.719879         0
86  42.075455  78.844786         0
89  52.348004  60.769505         0
92  55.482161  35.570703         0
fig, ax = plt.subplots(figsize=(10,5))
ax.scatter(positive['Exam 1'], positive['Exam 2'], s=30, c='b', marker='o', label='Admitted')
ax.scatter(negative['Exam 1'], negative['Exam 2'], s=30, c='r', marker='x', label='Not Admitted')
ax.legend()
ax.set_xlabel('Exam 1 Score')
ax.set_ylabel('Exam 2 Score')
Text(0,0.5,'Exam 2 Score')

(figure: scatter plot of Exam 1 vs. Exam 2 scores, admitted vs. not admitted)

The logistic regression

Goal: build the classifier, i.e. solve for the three parameters $\theta_0, \theta_1, \theta_2$.

Then set a threshold and use it to decide the admission result.

Modules to implement

  • sigmoid : maps values to probabilities

  • model : returns the predicted value

  • cost : computes the loss for the given parameters

  • gradient : computes the gradient direction for each parameter

  • descent : performs the parameter updates

  • accuracy : computes the accuracy

The sigmoid function

$g(z) = \frac{1}{1+e^{-z}}$

def sigmoid(z):
    return 1 / (1 + np.exp(-z))
nums = np.arange(-10, 10, step=1) # creates a vector of 20 integer values from -10 to 9
fig, ax = plt.subplots(figsize=(12,4))
ax.plot(nums, sigmoid(nums), 'r')
[<matplotlib.lines.Line2D at 0x7f1064254c10>]

(figure: the sigmoid curve)

Sigmoid properties

  • $g:\mathbb{R} \to (0,1)$
  • $g(0)=0.5$
  • $g(-\infty)=0$
  • $g(+\infty)=1$

def model(X, theta):
    
    return sigmoid(np.dot(X, theta.T))

$\begin{pmatrix}\theta_{0} & \theta_{1} & \theta_{2}\end{pmatrix} \times \begin{pmatrix}1\\ x_{1}\\ x_{2}\end{pmatrix} = \theta_{0}+\theta_{1}x_{1}+\theta_{2}x_{2}$

print(pdData.head())
      Exam 1     Exam 2  Admitted
0  34.623660  78.024693         0
1  30.286711  43.894998         0
2  35.847409  72.902198         0
3  60.182599  86.308552         1
4  79.032736  75.344376         1
pdData.insert(0, 'Ones', 1) # add a column of ones for the intercept term (running this cell twice raises an error because the column already exists)

print(pdData.head())
   Ones     Exam 1     Exam 2  Admitted
0     1  34.623660  78.024693         0
1     1  30.286711  43.894998         0
2     1  35.847409  72.902198         0
3     1  60.182599  86.308552         1
4     1  79.032736  75.344376         1
# set X (training data) and y (target variable)
# convert the DataFrame to a NumPy array
orig_data = pdData.values # convert the pandas representation of the data to an array useful for further computations
orig_data
array([[ 1.        , 34.62365962, 78.02469282,  0.        ],
       [ 1.        , 30.28671077, 43.89499752,  0.        ],
       [ 1.        , 35.84740877, 72.90219803,  0.        ],
       [ 1.        , 60.18259939, 86.3085521 ,  1.        ],
       [ 1.        , 79.03273605, 75.34437644,  1.        ],
       [ 1.        , 45.08327748, 56.31637178,  0.        ],
       [ 1.        , 61.10666454, 96.51142588,  1.        ],
       [ 1.        , 75.02474557, 46.55401354,  1.        ],
       [ 1.        , 76.0987867 , 87.42056972,  1.        ],
       [ 1.        , 84.43281996, 43.53339331,  1.        ],
       [ 1.        , 95.86155507, 38.22527806,  0.        ],
       [ 1.        , 75.01365839, 30.60326323,  0.        ],
       [ 1.        , 82.30705337, 76.4819633 ,  1.        ],
       [ 1.        , 69.36458876, 97.71869196,  1.        ],
       [ 1.        , 39.53833914, 76.03681085,  0.        ],
       [ 1.        , 53.97105215, 89.20735014,  1.        ],
       [ 1.        , 69.07014406, 52.74046973,  1.        ],
       [ 1.        , 67.94685548, 46.67857411,  0.        ],
       [ 1.        , 70.66150955, 92.92713789,  1.        ],
       [ 1.        , 76.97878373, 47.57596365,  1.        ],
       [ 1.        , 67.37202755, 42.83843832,  0.        ],
       [ 1.        , 89.67677575, 65.79936593,  1.        ],
       [ 1.        , 50.53478829, 48.85581153,  0.        ],
       [ 1.        , 34.21206098, 44.2095286 ,  0.        ],
       [ 1.        , 77.92409145, 68.97235999,  1.        ],
       [ 1.        , 62.27101367, 69.95445795,  1.        ],
       [ 1.        , 80.19018075, 44.82162893,  1.        ],
       [ 1.        , 93.1143888 , 38.80067034,  0.        ],
       [ 1.        , 61.83020602, 50.25610789,  0.        ],
       [ 1.        , 38.7858038 , 64.99568096,  0.        ],
       [ 1.        , 61.37928945, 72.80788731,  1.        ],
       [ 1.        , 85.40451939, 57.05198398,  1.        ],
       [ 1.        , 52.10797973, 63.12762377,  0.        ],
       [ 1.        , 52.04540477, 69.43286012,  1.        ],
       [ 1.        , 40.23689374, 71.16774802,  0.        ],
       [ 1.        , 54.63510555, 52.21388588,  0.        ],
       [ 1.        , 33.91550011, 98.86943574,  0.        ],
       [ 1.        , 64.17698887, 80.90806059,  1.        ],
       [ 1.        , 74.78925296, 41.57341523,  0.        ],
       [ 1.        , 34.18364003, 75.23772034,  0.        ],
       [ 1.        , 83.90239366, 56.30804622,  1.        ],
       [ 1.        , 51.54772027, 46.85629026,  0.        ],
       [ 1.        , 94.44336777, 65.56892161,  1.        ],
       [ 1.        , 82.36875376, 40.61825516,  0.        ],
       [ 1.        , 51.04775177, 45.82270146,  0.        ],
       [ 1.        , 62.22267576, 52.06099195,  0.        ],
       [ 1.        , 77.19303493, 70.4582    ,  1.        ],
       [ 1.        , 97.77159928, 86.72782233,  1.        ],
       [ 1.        , 62.0730638 , 96.76882412,  1.        ],
       [ 1.        , 91.5649745 , 88.69629255,  1.        ],
       [ 1.        , 79.94481794, 74.16311935,  1.        ],
       [ 1.        , 99.27252693, 60.999031  ,  1.        ],
       [ 1.        , 90.54671411, 43.39060181,  1.        ],
       [ 1.        , 34.52451385, 60.39634246,  0.        ],
       [ 1.        , 50.28649612, 49.80453881,  0.        ],
       [ 1.        , 49.58667722, 59.80895099,  0.        ],
       [ 1.        , 97.64563396, 68.86157272,  1.        ],
       [ 1.        , 32.57720017, 95.59854761,  0.        ],
       [ 1.        , 74.24869137, 69.82457123,  1.        ],
       [ 1.        , 71.79646206, 78.45356225,  1.        ],
       [ 1.        , 75.39561147, 85.75993667,  1.        ],
       [ 1.        , 35.28611282, 47.02051395,  0.        ],
       [ 1.        , 56.2538175 , 39.26147251,  0.        ],
       [ 1.        , 30.05882245, 49.59297387,  0.        ],
       [ 1.        , 44.66826172, 66.45008615,  0.        ],
       [ 1.        , 66.56089447, 41.09209808,  0.        ],
       [ 1.        , 40.45755098, 97.53518549,  1.        ],
       [ 1.        , 49.07256322, 51.88321182,  0.        ],
       [ 1.        , 80.27957401, 92.11606081,  1.        ],
       [ 1.        , 66.74671857, 60.99139403,  1.        ],
       [ 1.        , 32.72283304, 43.30717306,  0.        ],
       [ 1.        , 64.03932042, 78.03168802,  1.        ],
       [ 1.        , 72.34649423, 96.22759297,  1.        ],
       [ 1.        , 60.45788574, 73.0949981 ,  1.        ],
       [ 1.        , 58.84095622, 75.85844831,  1.        ],
       [ 1.        , 99.8278578 , 72.36925193,  1.        ],
       [ 1.        , 47.26426911, 88.475865  ,  1.        ],
       [ 1.        , 50.4581598 , 75.80985953,  1.        ],
       [ 1.        , 60.45555629, 42.50840944,  0.        ],
       [ 1.        , 82.22666158, 42.71987854,  0.        ],
       [ 1.        , 88.91389642, 69.8037889 ,  1.        ],
       [ 1.        , 94.83450672, 45.6943068 ,  1.        ],
       [ 1.        , 67.31925747, 66.58935318,  1.        ],
       [ 1.        , 57.23870632, 59.51428198,  1.        ],
       [ 1.        , 80.366756  , 90.9601479 ,  1.        ],
       [ 1.        , 68.46852179, 85.5943071 ,  1.        ],
       [ 1.        , 42.07545454, 78.844786  ,  0.        ],
       [ 1.        , 75.47770201, 90.424539  ,  1.        ],
       [ 1.        , 78.63542435, 96.64742717,  1.        ],
       [ 1.        , 52.34800399, 60.76950526,  0.        ],
       [ 1.        , 94.09433113, 77.15910509,  1.        ],
       [ 1.        , 90.44855097, 87.50879176,  1.        ],
       [ 1.        , 55.48216114, 35.57070347,  0.        ],
       [ 1.        , 74.49269242, 84.84513685,  1.        ],
       [ 1.        , 89.84580671, 45.35828361,  1.        ],
       [ 1.        , 83.48916274, 48.3802858 ,  1.        ],
       [ 1.        , 42.26170081, 87.10385094,  1.        ],
       [ 1.        , 99.31500881, 68.77540947,  1.        ],
       [ 1.        , 55.34001756, 64.93193801,  1.        ],
       [ 1.        , 74.775893  , 89.5298129 ,  1.        ]])
cols = orig_data.shape[1]
print(cols)
X = orig_data[:,0:cols-1]
y = orig_data[:,cols-1:cols]
print(X)
print('*'*15)
print(y)
# convert to numpy arrays and initalize the parameter array theta
#X = np.matrix(X.values)
#y = np.matrix(data.iloc[:,3:4].values) #np.array(y.values)
theta = np.zeros([1, 3])
theta
4
[[ 1.         34.62365962 78.02469282]
 [ 1.         30.28671077 43.89499752]
 [ 1.         35.84740877 72.90219803]
 [ 1.         60.18259939 86.3085521 ]
 [ 1.         79.03273605 75.34437644]
 [ 1.         45.08327748 56.31637178]
 [ 1.         61.10666454 96.51142588]
 [ 1.         75.02474557 46.55401354]
 [ 1.         76.0987867  87.42056972]
 [ 1.         84.43281996 43.53339331]
 [ 1.         95.86155507 38.22527806]
 [ 1.         75.01365839 30.60326323]
 [ 1.         82.30705337 76.4819633 ]
 [ 1.         69.36458876 97.71869196]
 [ 1.         39.53833914 76.03681085]
 [ 1.         53.97105215 89.20735014]
 [ 1.         69.07014406 52.74046973]
 [ 1.         67.94685548 46.67857411]
 [ 1.         70.66150955 92.92713789]
 [ 1.         76.97878373 47.57596365]
 [ 1.         67.37202755 42.83843832]
 [ 1.         89.67677575 65.79936593]
 [ 1.         50.53478829 48.85581153]
 [ 1.         34.21206098 44.2095286 ]
 [ 1.         77.92409145 68.97235999]
 [ 1.         62.27101367 69.95445795]
 [ 1.         80.19018075 44.82162893]
 [ 1.         93.1143888  38.80067034]
 [ 1.         61.83020602 50.25610789]
 [ 1.         38.7858038  64.99568096]
 [ 1.         61.37928945 72.80788731]
 [ 1.         85.40451939 57.05198398]
 [ 1.         52.10797973 63.12762377]
 [ 1.         52.04540477 69.43286012]
 [ 1.         40.23689374 71.16774802]
 [ 1.         54.63510555 52.21388588]
 [ 1.         33.91550011 98.86943574]
 [ 1.         64.17698887 80.90806059]
 [ 1.         74.78925296 41.57341523]
 [ 1.         34.18364003 75.23772034]
 [ 1.         83.90239366 56.30804622]
 [ 1.         51.54772027 46.85629026]
 [ 1.         94.44336777 65.56892161]
 [ 1.         82.36875376 40.61825516]
 [ 1.         51.04775177 45.82270146]
 [ 1.         62.22267576 52.06099195]
 [ 1.         77.19303493 70.4582    ]
 [ 1.         97.77159928 86.72782233]
 [ 1.         62.0730638  96.76882412]
 [ 1.         91.5649745  88.69629255]
 [ 1.         79.94481794 74.16311935]
 [ 1.         99.27252693 60.999031  ]
 [ 1.         90.54671411 43.39060181]
 [ 1.         34.52451385 60.39634246]
 [ 1.         50.28649612 49.80453881]
 [ 1.         49.58667722 59.80895099]
 [ 1.         97.64563396 68.86157272]
 [ 1.         32.57720017 95.59854761]
 [ 1.         74.24869137 69.82457123]
 [ 1.         71.79646206 78.45356225]
 [ 1.         75.39561147 85.75993667]
 [ 1.         35.28611282 47.02051395]
 [ 1.         56.2538175  39.26147251]
 [ 1.         30.05882245 49.59297387]
 [ 1.         44.66826172 66.45008615]
 [ 1.         66.56089447 41.09209808]
 [ 1.         40.45755098 97.53518549]
 [ 1.         49.07256322 51.88321182]
 [ 1.         80.27957401 92.11606081]
 [ 1.         66.74671857 60.99139403]
 [ 1.         32.72283304 43.30717306]
 [ 1.         64.03932042 78.03168802]
 [ 1.         72.34649423 96.22759297]
 [ 1.         60.45788574 73.0949981 ]
 [ 1.         58.84095622 75.85844831]
 [ 1.         99.8278578  72.36925193]
 [ 1.         47.26426911 88.475865  ]
 [ 1.         50.4581598  75.80985953]
 [ 1.         60.45555629 42.50840944]
 [ 1.         82.22666158 42.71987854]
 [ 1.         88.91389642 69.8037889 ]
 [ 1.         94.83450672 45.6943068 ]
 [ 1.         67.31925747 66.58935318]
 [ 1.         57.23870632 59.51428198]
 [ 1.         80.366756   90.9601479 ]
 [ 1.         68.46852179 85.5943071 ]
 [ 1.         42.07545454 78.844786  ]
 [ 1.         75.47770201 90.424539  ]
 [ 1.         78.63542435 96.64742717]
 [ 1.         52.34800399 60.76950526]
 [ 1.         94.09433113 77.15910509]
 [ 1.         90.44855097 87.50879176]
 [ 1.         55.48216114 35.57070347]
 [ 1.         74.49269242 84.84513685]
 [ 1.         89.84580671 45.35828361]
 [ 1.         83.48916274 48.3802858 ]
 [ 1.         42.26170081 87.10385094]
 [ 1.         99.31500881 68.77540947]
 [ 1.         55.34001756 64.93193801]
 [ 1.         74.775893   89.5298129 ]]
***************
[[0.]
 [0.]
 [0.]
 [1.]
 [1.]
 [0.]
 [1.]
 [1.]
 [1.]
 [1.]
 [0.]
 [0.]
 [1.]
 [1.]
 [0.]
 [1.]
 [1.]
 [0.]
 [1.]
 [1.]
 [0.]
 [1.]
 [0.]
 [0.]
 [1.]
 [1.]
 [1.]
 [0.]
 [0.]
 [0.]
 [1.]
 [1.]
 [0.]
 [1.]
 [0.]
 [0.]
 [0.]
 [1.]
 [0.]
 [0.]
 [1.]
 [0.]
 [1.]
 [0.]
 [0.]
 [0.]
 [1.]
 [1.]
 [1.]
 [1.]
 [1.]
 [1.]
 [1.]
 [0.]
 [0.]
 [0.]
 [1.]
 [0.]
 [1.]
 [1.]
 [1.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [1.]
 [0.]
 [1.]
 [1.]
 [0.]
 [1.]
 [1.]
 [1.]
 [1.]
 [1.]
 [1.]
 [1.]
 [0.]
 [0.]
 [1.]
 [1.]
 [1.]
 [1.]
 [1.]
 [1.]
 [0.]
 [1.]
 [1.]
 [0.]
 [1.]
 [1.]
 [0.]
 [1.]
 [1.]
 [1.]
 [1.]
 [1.]
 [1.]
 [1.]]





array([[0., 0., 0.]])
X[:5]
array([[ 1.        , 34.62365962, 78.02469282],
       [ 1.        , 30.28671077, 43.89499752],
       [ 1.        , 35.84740877, 72.90219803],
       [ 1.        , 60.18259939, 86.3085521 ],
       [ 1.        , 79.03273605, 75.34437644]])
y[:5]
array([[0.],
       [0.],
       [0.],
       [1.],
       [1.]])
theta
array([[0., 0., 0.]])
X.shape, y.shape, theta.shape
((100, 3), (100, 1), (1, 3))

The cost function

Take the negative of the log-likelihood:

$D(h_\theta(x), y) = -y\log(h_\theta(x)) - (1-y)\log(1-h_\theta(x))$

and average it over the samples:

$J(\theta)=\frac{1}{n}\sum_{i=1}^{n} D(h_\theta(x_i), y_i)$

def cost(X, y, theta):
    # np.multiply() multiplies arrays/matrices element-wise
    left = np.multiply(-y, np.log(model(X, theta)))
    print(left)   # debug print (produces the long output below)
    right = np.multiply(1 - y, np.log(1 - model(X, theta)))
    print(right)  # debug print
    return np.sum(left - right) / (len(X))
cost(X, y, theta)
[[0.        ]
 [0.        ]
 [0.        ]
 [0.69314718]
 [0.69314718]
 [0.        ]
 [0.69314718]
 [0.69314718]
 [0.69314718]
 [0.69314718]
 [0.        ]
 [0.        ]
 [0.69314718]
 [0.69314718]
 [0.        ]
 [0.69314718]
 [0.69314718]
 [0.        ]
 [0.69314718]
 [0.69314718]
 [0.        ]
 [0.69314718]
 [0.        ]
 [0.        ]
 [0.69314718]
 [0.69314718]
 [0.69314718]
 [0.        ]
 [0.        ]
 [0.        ]
 [0.69314718]
 [0.69314718]
 [0.        ]
 [0.69314718]
 [0.        ]
 [0.        ]
 [0.        ]
 [0.69314718]
 [0.        ]
 [0.        ]
 [0.69314718]
 [0.        ]
 [0.69314718]
 [0.        ]
 [0.        ]
 [0.        ]
 [0.69314718]
 [0.69314718]
 [0.69314718]
 [0.69314718]
 [0.69314718]
 [0.69314718]
 [0.69314718]
 [0.        ]
 [0.        ]
 [0.        ]
 [0.69314718]
 [0.        ]
 [0.69314718]
 [0.69314718]
 [0.69314718]
 [0.        ]
 [0.        ]
 [0.        ]
 [0.        ]
 [0.        ]
 [0.69314718]
 [0.        ]
 [0.69314718]
 [0.69314718]
 [0.        ]
 [0.69314718]
 [0.69314718]
 [0.69314718]
 [0.69314718]
 [0.69314718]
 [0.69314718]
 [0.69314718]
 [0.        ]
 [0.        ]
 [0.69314718]
 [0.69314718]
 [0.69314718]
 [0.69314718]
 [0.69314718]
 [0.69314718]
 [0.        ]
 [0.69314718]
 [0.69314718]
 [0.        ]
 [0.69314718]
 [0.69314718]
 [0.        ]
 [0.69314718]
 [0.69314718]
 [0.69314718]
 [0.69314718]
 [0.69314718]
 [0.69314718]
 [0.69314718]]
[[-0.69314718]
 [-0.69314718]
 [-0.69314718]
 [-0.        ]
 [-0.        ]
 [-0.69314718]
 [-0.        ]
 [-0.        ]
 [-0.        ]
 [-0.        ]
 [-0.69314718]
 [-0.69314718]
 [-0.        ]
 [-0.        ]
 [-0.69314718]
 [-0.        ]
 [-0.        ]
 [-0.69314718]
 [-0.        ]
 [-0.        ]
 [-0.69314718]
 [-0.        ]
 [-0.69314718]
 [-0.69314718]
 [-0.        ]
 [-0.        ]
 [-0.        ]
 [-0.69314718]
 [-0.69314718]
 [-0.69314718]
 [-0.        ]
 [-0.        ]
 [-0.69314718]
 [-0.        ]
 [-0.69314718]
 [-0.69314718]
 [-0.69314718]
 [-0.        ]
 [-0.69314718]
 [-0.69314718]
 [-0.        ]
 [-0.69314718]
 [-0.        ]
 [-0.69314718]
 [-0.69314718]
 [-0.69314718]
 [-0.        ]
 [-0.        ]
 [-0.        ]
 [-0.        ]
 [-0.        ]
 [-0.        ]
 [-0.        ]
 [-0.69314718]
 [-0.69314718]
 [-0.69314718]
 [-0.        ]
 [-0.69314718]
 [-0.        ]
 [-0.        ]
 [-0.        ]
 [-0.69314718]
 [-0.69314718]
 [-0.69314718]
 [-0.69314718]
 [-0.69314718]
 [-0.        ]
 [-0.69314718]
 [-0.        ]
 [-0.        ]
 [-0.69314718]
 [-0.        ]
 [-0.        ]
 [-0.        ]
 [-0.        ]
 [-0.        ]
 [-0.        ]
 [-0.        ]
 [-0.69314718]
 [-0.69314718]
 [-0.        ]
 [-0.        ]
 [-0.        ]
 [-0.        ]
 [-0.        ]
 [-0.        ]
 [-0.69314718]
 [-0.        ]
 [-0.        ]
 [-0.69314718]
 [-0.        ]
 [-0.        ]
 [-0.69314718]
 [-0.        ]
 [-0.        ]
 [-0.        ]
 [-0.        ]
 [-0.        ]
 [-0.        ]
 [-0.        ]]





0.6931471805599453
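
Sanity check: with $\theta = (0, 0, 0)$ the model predicts probability 0.5 for every sample, so each term equals $\ln 2 \approx 0.6931$, which matches the initial cost shown above.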

Computing the gradient

$\frac{\partial J}{\partial \theta_j}=-\frac{1}{n}\sum_{i=1}^{n} (y_i - h_\theta(x_i))\,x_{ij}$

def gradient(X, y, theta):
    grad = np.zeros(theta.shape)
    print('start')   # debug prints; together with those in cost() they flood the notebook output later
    print(grad.shape)
    print(grad)
    # ravel() flattens the array
    # grad holds one partial derivative per parameter (here theta0, theta1, theta2)
    error = (model(X, theta) - y).ravel()
    for j in range(len(theta.ravel())): # for each parameter
        term = np.multiply(error, X[:,j])
        grad[0, j] = np.sum(term) / len(X)
        print('for grad')
        print(grad)
    print('grad:')
    print(grad)
    print('grad end'+'-'*20)
    return grad
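
For reference, the same gradient can be written as a single vectorized expression; this is a sketch equivalent to the loop above and is not used by the rest of the notebook (it assumes model and np from the earlier cells):

def gradient_vectorized(X, y, theta):
    # (1/n) * X^T (h_theta(X) - y), transposed to match theta's (1, 3) shape
    error = model(X, theta) - y       # shape (n, 1)
    return (X.T @ error).T / len(X)   # shape (1, 3)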

Gradient descent

Compare the three different gradient-descent methods.

STOP_ITER = 0
STOP_COST = 1
STOP_GRAD = 2

def stopCriterion(type, value, threshold):
    # three different stopping strategies
    if type == STOP_ITER:        return value > threshold
    elif type == STOP_COST:      return abs(value[-1]-value[-2]) < threshold
    elif type == STOP_GRAD:      return np.linalg.norm(value) < threshold
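A few hypothetical calls to illustrate the three criteria (the values here are made up):

stopCriterion(STOP_ITER, 5001, 5000)                                # True: more than 5000 iterations done
stopCriterion(STOP_COST, [0.69315, 0.69305], 1e-6)                  # False: the cost still changed by 1e-4
stopCriterion(STOP_GRAD, np.array([[0.001, 0.002, 0.003]]), 0.05)   # True: gradient norm ~ 0.0037 < 0.05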
import numpy.random
# shuffle the rows so batches are drawn in random order
def shuffleData(data):
    np.random.shuffle(data)
    cols = data.shape[1]
    X = data[:, 0:cols-1]
    y = data[:, cols-1:]
    return X, y
import time

def descent(data, theta, batchSize, stopType, thresh, alpha):
    # solve for theta by gradient descent
    init_time = time.time()
    i = 0 # iteration counter
    k = 0 # position of the current batch within the data
    X, y = shuffleData(data)
    grad = np.zeros(theta.shape) # computed gradient
    costs = [cost(X, y, theta)]  # loss history

    while True:
        grad = gradient(X[k:k+batchSize], y[k:k+batchSize], theta)
        k += batchSize # advance by batchSize samples
        if k >= n:     # n = total number of samples, defined globally before runExpe is called
            k = 0
            X, y = shuffleData(data) # reshuffle
        theta = theta - alpha*grad # parameter update
        costs.append(cost(X, y, theta)) # compute the new loss
        i += 1

        if stopType == STOP_ITER:       value = i
        elif stopType == STOP_COST:     value = costs
        elif stopType == STOP_GRAD:     value = grad
        if stopCriterion(stopType, value, thresh): break

    return theta, i-1, costs, grad, time.time() - init_time
def runExpe(data, theta, batchSize, stopType, thresh, alpha):
    #import pdb; pdb.set_trace();
    theta, iter, costs, grad, dur = descent(data, theta, batchSize, stopType, thresh, alpha)
    name = "Original" if (data[:,1]>2).sum() > 1 else "Scaled"
    name += " data - learning rate: {} - ".format(alpha)
    if batchSize==n: strDescType = "Gradient"
    elif batchSize==1:  strDescType = "Stochastic"
    else: strDescType = "Mini-batch ({})".format(batchSize)
    name += strDescType + " descent - Stop: "
    if stopType == STOP_ITER: strStop = "{} iterations".format(thresh)
    elif stopType == STOP_COST: strStop = "costs change < {}".format(thresh)
    else: strStop = "gradient norm < {}".format(thresh)
    name += strStop
    print ("***{}\nTheta: {} - Iter: {} - Last cost: {:03.2f} - Duration: {:03.2f}s".format(
        name, theta, iter, costs[-1], dur))
    fig, ax = plt.subplots(figsize=(12,4))
    ax.plot(np.arange(len(costs)), costs, 'r')
    ax.set_xlabel('Iterations')
    ax.set_ylabel('Cost')
    ax.set_title(name.upper() + ' - Error vs. Iteration')
    return theta

Different stopping strategies

Stopping after a fixed number of iterations
# this run uses all samples per update (full-batch gradient descent)
n=100
runExpe(orig_data, theta, n, STOP_ITER, thresh=5000, alpha=0.000001)
(debug output from the print statements in cost() and gradient(), truncated)

IOPub data rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--NotebookApp.iopub_data_rate_limit`.

Current values:
NotebookApp.iopub_data_rate_limit=1000000.0 (bytes/sec)
NotebookApp.rate_limit_window=3.0 (secs)

The debug prints inside cost() and gradient() produce so much output that Jupyter's IOPub rate limit is hit, so the textual result of this run is not shown; only the figure survives.

(figure: cost vs. iteration — output_40_2.png)

Stopping on the change in cost

Set the threshold to 1e-6; this takes roughly 110,000 iterations.

runExpe(orig_data, theta, n, STOP_COST, thresh=0.000001, alpha=0.001)
***Original data - learning rate: 0.001 - Gradient descent - Stop: costs change < 1e-06
Theta: [[-5.13364014  0.04771429  0.04072397]] - Iter: 109901 - Last cost: 0.38 - Duration: 21.67s





array([[-5.13364014,  0.04771429,  0.04072397]])

(figure: cost vs. iteration)

Stopping on the gradient norm

Set the threshold to 0.05; this takes roughly 40,000 iterations.

runExpe(orig_data, theta, n, STOP_GRAD, thresh=0.05, alpha=0.001)
***Original data - learning rate: 0.001 - Gradient descent - Stop: gradient norm < 0.05
Theta: [[-2.37033409  0.02721692  0.01899456]] - Iter: 40045 - Last cost: 0.49 - Duration: 8.06s





array([[-2.37033409,  0.02721692,  0.01899456]])

(figure: cost vs. iteration)

Comparing the different gradient-descent methods

Stochastic descent
runExpe(orig_data, theta, 1, STOP_ITER, thresh=5000, alpha=0.001)
***Original data - learning rate: 0.001 - Stochastic descent - Stop: 5000 iterations
Theta: [[-0.38651143  0.06743607 -0.07215581]] - Iter: 5000 - Last cost: 1.13 - Duration: 0.34s





array([[-0.38651143,  0.06743607, -0.07215581]])

(figure: cost vs. iteration)

The curve rather explodes... very unstable. Let's try again with a much smaller learning rate.

runExpe(orig_data, theta, 1, STOP_ITER, thresh=15000, alpha=0.000002)
***Original data - learning rate: 2e-06 - Stochastic descent - Stop: 15000 iterations
Theta: [[-0.0020209   0.01004422  0.00097837]] - Iter: 15000 - Last cost: 0.63 - Duration: 1.00s





array([[-0.0020209 ,  0.01004422,  0.00097837]])

(figure: cost vs. iteration)

Fast per iteration, but unstable; it requires a very small learning rate.

Mini-batch descent
runExpe(orig_data, theta, 16, STOP_ITER, thresh=15000, alpha=0.001)
***Original data - learning rate: 0.001 - Mini-batch (16) descent - Stop: 15000 iterations
Theta: [[-1.03594432  0.02756836  0.00581629]] - Iter: 15000 - Last cost: 0.59 - Duration: 1.30s





array([[-1.03594432,  0.02756836,  0.00581629]])

(figure: cost vs. iteration)

The cost still fluctuates quite a bit, so let's try standardizing the data:
for each feature (column), subtract its mean and divide by its standard deviation, so that every column ends up centered around 0 with unit variance (a manual equivalent of the scaling call is sketched below).
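
For reference, the pp.scale call in the next cell is equivalent to this manual computation on the two feature columns (a sketch, assuming orig_data from the cells above):

features = orig_data[:, 1:3]
manual_scaled = (features - features.mean(axis=0)) / features.std(axis=0)  # zero mean, unit variance per column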

from sklearn import preprocessing as pp

scaled_data = orig_data.copy()
scaled_data[:, 1:3] = pp.scale(orig_data[:, 1:3])

runExpe(scaled_data, theta, n, STOP_ITER, thresh=5000, alpha=0.001)
***Scaled data - learning rate: 0.001 - Gradient descent - Stop: 5000 iterations
Theta: [[0.3080807  0.86494967 0.77367651]] - Iter: 5000 - Last cost: 0.38 - Duration: 0.83s





array([[0.3080807 , 0.86494967, 0.77367651]])

(figure: cost vs. iteration)

Much better! On the original data the cost only got down to about 0.61, whereas here it reaches 0.38.
Preprocessing the data is therefore very important.

runExpe(scaled_data, theta, n, STOP_GRAD, thresh=0.02, alpha=0.001)
***Scaled data - learning rate: 0.001 - Gradient descent - Stop: gradient norm < 0.02
Theta: [[1.0707921  2.63030842 2.41079787]] - Iter: 59422 - Last cost: 0.22 - Duration: 10.50s





array([[1.0707921 , 2.63030842, 2.41079787]])

(figure: cost vs. iteration)

More iterations drive the loss down even further!

theta = runExpe(scaled_data, theta, 1, STOP_GRAD, thresh=0.002/5, alpha=0.001)
***Scaled data - learning rate: 0.001 - Stochastic descent - Stop: gradient norm < 0.0004
Theta: [[1.14794829 2.79256769 2.56686015]] - Iter: 72622 - Last cost: 0.22 - Duration: 4.87s

(figure: cost vs. iteration)

Stochastic gradient descent is faster per update, but it needs many more iterations to converge, so working with (mini-)batches is usually the better choice!

runExpe(scaled_data, theta, 16, STOP_GRAD, thresh=0.002*2, alpha=0.001)
***Scaled data - learning rate: 0.001 - Mini-batch (16) descent - Stop: gradient norm < 0.004
Theta: [[1.1731359  2.84100721 2.60671868]] - Iter: 4185 - Last cost: 0.21 - Duration: 0.35s





array([[1.1731359 , 2.84100721, 2.60671868]])

(figure: cost vs. iteration)

Accuracy

# classification threshold: predict 1 when the estimated probability is >= 0.5
def predict(X, theta):
    return [1 if x >= 0.5 else 0 for x in model(X, theta)]
scaled_X = scaled_data[:, :3]
y = scaled_data[:, 3]
predictions = predict(scaled_X, theta)
correct = [1 if ((a == 1 and b == 1) or (a == 0 and b == 0)) else 0 for (a, b) in zip(predictions, y)]
accuracy = sum(correct) * 100 // len(correct)  # percentage of correct predictions
print ('accuracy = {0}%'.format(accuracy))
accuracy = 89%
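
As an optional cross-check (not part of the original notebook; assumes scikit-learn is available, as it was already imported above for preprocessing), one could fit sklearn's LogisticRegression on the same standardized features; its training accuracy should land in a similar range:

from sklearn.linear_model import LogisticRegression

labels = scaled_data[:, 3]
clf = LogisticRegression()
clf.fit(scaled_data[:, 1:3], labels)   # two standardized exam scores vs. the admission label
print('sklearn training accuracy = {:.0f}%'.format(100 * clf.score(scaled_data[:, 1:3], labels)))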