The Implementation of Logistic Regression

1 The Implementation of Logistic Regression

1.1 Mathematical formulas

[Figure 1.1: the recognition process]
$$\mathrm{sigmoid}(w^Tx+b)=\frac{1}{1+e^{-(w^Tx+b)}}$$

$$L(a^{(i)},y^{(i)})=-y^{(i)}\log(a^{(i)})-(1-y^{(i)})\log(1-a^{(i)})$$

$$J=\frac{1}{m}\sum_{i=1}^{m}L(a^{(i)},y^{(i)})$$
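
As a quick sanity check of these formulas (a minimal sketch, not part of the original post): for a single example with $w^Tx+b=0$ the sigmoid output is 0.5, and with label $y=1$ the loss is $-\log(0.5)\approx 0.693$.

import numpy as np

z = 0.0                                   # w^T x + b for one toy example
a = 1 / (1 + np.exp(-z))                  # sigmoid activation = 0.5
y = 1                                     # toy label
loss = -y * np.log(a) - (1 - y) * np.log(1 - a)
print(a, loss)                            # 0.5 0.6931...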

1.2 Build steps

data pre-processing

define the model structure (to receive and process the data)

initialize the parameters

iterative update procedure:

  1. compute the loss function L (forward propagation)

  2. compute the gradient grad (backward propagation)

  3. update the parameters (W, b)

1.3 Algorithm Code

1. Data pre-processing

The shape of train_set_x_orig (the image data) is (m_train, num_px, num_px, 3).

m_train = train_set_x_orig.shape[0]
m_test = test_set_x_orig.shape[0]
num_px = train_set_x_orig.shape[1]

m_train is the number of pictures

num_px * num_px * 3 is the number of features (dimensions) per picture

Then flatten the data: reshape each image so that the array of shape (a, b, c, d) becomes (b * c * d, a).

X_flatten = X.reshape(X.shape[0], -1).T    # -1 lets NumPy infer that dimension automatically; .T transposes to (b*c*d, a)

Finally, standardize the data by dividing by 255 (the maximum pixel value):

train_set_x = train_set_x_flatten/255.
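
Putting the pre-processing together (a minimal sketch; train_set_x_orig and test_set_x_orig are assumed to be the image arrays loaded from the dataset):

# flatten each (num_px, num_px, 3) image into one column, then normalize to [0, 1]
train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0], -1).T
test_set_x_flatten = test_set_x_orig.reshape(test_set_x_orig.shape[0], -1).T
train_set_x = train_set_x_flatten / 255.
test_set_x = test_set_x_flatten / 255.
print(train_set_x.shape)                  # (num_px*num_px*3, m_train)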

2. Initialize parameters

We need the weight matrix w and the scalar bias b to compute the sigmoid activation.

First, define the sigmoid function:

def sigmoid(x):
    s = 1 / (1 + np.exp(-x))
    return s
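
A quick check (assuming numpy has been imported as np): sigmoid(0) should be 0.5.

print(sigmoid(np.array([0, 2])))          # [0.5       0.88079708]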

Then initialize the parameters w and b:

def initialize_with_zeros(dim):    # dim is the length of w; it must match the input size num_px*num_px*3
    w = np.zeros((dim, 1))
    b = 0
    return w, b
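
For example, a quick shape check:

w, b = initialize_with_zeros(2)
print(w.shape, b)                         # (2, 1) 0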

3. Forward and backward propagation

Forward propagation computes the activations and the loss function; backward propagation computes the gradients dw and db.

The gradients dw and db are then used to update the parameters.

Build the loss function:

$$\mathrm{sigmoid}(w^Tx+b)=\frac{1}{1+e^{-(w^Tx+b)}}$$

$$L(a^{(i)},y^{(i)})=-y^{(i)}\log(a^{(i)})-(1-y^{(i)})\log(1-a^{(i)})$$

$$J=\frac{1}{m}\sum_{i=1}^{m}L(a^{(i)},y^{(i)})$$

The gradients dw and db are:

$$\frac{\partial J}{\partial w}=\frac{1}{m}X(A-Y)^T$$

$$\frac{\partial J}{\partial b}=\frac{1}{m}\sum_{i=1}^{m}(a^{(i)}-y^{(i)})$$
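
As a brief sketch of why these expressions hold (a derivation not spelled out in the original post): for a single example, write $z = w^Tx+b$ and $a = \mathrm{sigmoid}(z)$. The chain rule then gives

$$\frac{\partial L}{\partial z}=a-y,\qquad \frac{\partial L}{\partial w}=(a-y)\,x,\qquad \frac{\partial L}{\partial b}=a-y$$

and averaging over the m examples in vectorized form produces the two formulas above.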

def propagate(w, b, X, Y):
    m = X.shape[1]                                              # number of training examples
    A = sigmoid(np.dot(w.T, X) + b)                             # forward propagation: activations
    cost = -1/m * np.sum(Y*np.log(A) + (1-Y)*np.log(1-A))       # loss (cost) function
    dw = 1/m * np.dot(X, (A-Y).T)                               # backward propagation: gradients
    db = 1/m * np.sum(A-Y)
    grads = {"dw": dw, "db": db}                                # pack into a dictionary for convenient use
    cost = np.squeeze(cost)                                     # remove redundant dimensions
    return grads, cost
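
A quick check on toy data (values chosen only for illustration):

w = np.array([[1.], [2.]])
b = 2.
X = np.array([[1., 2.], [3., 4.]])
Y = np.array([[1, 0]])
grads, cost = propagate(w, b, X, Y)
print(grads["dw"].shape, cost)            # (2, 1) and a scalar cost of roughly 6.0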

Use dw and db to optimize the parameters:

def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost=False):
    costs = []                                  # record the cost every 100 iterations
    for i in range(num_iterations):
        grads, cost = propagate(w, b, X, Y)
        dw = grads["dw"]
        db = grads["db"]
        w = w - learning_rate * dw              # gradient descent update
        b = b - learning_rate * db
        if i % 100 == 0:
            costs.append(cost)
            if print_cost:
                print("Cost after iteration %i: %f" % (i, cost))

    params = {"w": w, "b": b}
    grads = {"dw": dw, "db": db}
    return params, grads, costs

4. Predict the data (w, b, X)

Use the trained parameters w and b to predict labels for new data:

def predict(w, b, X):
    m = X.shape[1]
    Y_prediction = np.zeros((1, m))        # matrix to store the predictions
    A = sigmoid(np.dot(w.T, X) + b)
    for i in range(A.shape[1]):
        if A[0, i] <= 0.5:                 # threshold the activation at 0.5
            Y_prediction[0, i] = 0
        else:
            Y_prediction[0, i] = 1
    return Y_prediction
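
Continuing the toy example, the output is a (1, m) matrix of 0/1 labels:

print(predict(w, b, X))                   # [[1. 1.]] for this toy w, b, X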

5. Combine everything into one model

def model(X_train, Y_train, X_test, Y_test, num_iterations=2000, learning_rate=0.5, print_cost=False):
    w, b = initialize_with_zeros(X_train.shape[0])
    parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost)
    w = parameters["w"]
    b = parameters["b"]
    Y_prediction_test = predict(w, b, X_test)
    Y_prediction_train = predict(w, b, X_train)
    print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))
    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test,
         "Y_prediction_train": Y_prediction_train,
         "w": w,
         "b": b,
         "learning_rate": learning_rate,
         "num_iterations": num_iterations}

    return d
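
For example (learning_rate = 0.005 is only a typical choice for this exercise, not something fixed by the text above), the dictionary d used in the next section can be produced with:

d = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations=2000, learning_rate=0.005, print_cost=True)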

1.4 Result analysis

1. Plot the loss (cost) curve

costs = np.squeeze(d['costs'])
plt.plot(costs)
plt.ylabel('cost')
plt.xlabel('iterations (per hundreds)')
plt.title("Learning rate =" + str(d["learning_rate"]))
plt.show()

[Figure: the cost curve produced by the code above]

The decreasing loss shows that the program is learning the parameters. If we increase the number of iterations, the training-set accuracy will probably keep improving while the test-set accuracy drops; this is overfitting.

  2. Choose the learning rate
learning_rates = [0.01, 0.001, 0.0001]
models = {}
for i in learning_rates:
    print ("learning rate is: " + str(i))
    models[str(i)] = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 1500, learning_rate = i, print_cost = False)
    print ('\n' + "-------------------------------------------------------" + '\n')

for i in learning_rates:
    plt.plot(np.squeeze(models[str(i)]["costs"]), label= str(models[str(i)]["learning_rate"]))

plt.ylabel('cost')
plt.xlabel('iterations')

legend = plt.legend(loc='upper center', shadow=True)
frame = legend.get_frame()
frame.set_facecolor('0.90')
plt.show()
learning rate is: 0.01
train accuracy: 99.52153110047847 %
test accuracy: 68.0 %

-------------------------------------------------------

learning rate is: 0.001
train accuracy: 88.99521531100478 %
test accuracy: 64.0 %

-------------------------------------------------------

learning rate is: 0.0001
train accuracy: 68.42105263157895 %
test accuracy: 36.0 %

-------------------------------------------------------

[Figure: cost curves for the three learning rates]
Explanation

  • Different learning rates give different losses and therefore different predictions.
  • If the learning rate is too large (0.01), the cost may oscillate up and down. It may even diverge (although in this example, using 0.01 still ends up at a reasonable loss value in the end).
  • A lower loss does not necessarily mean a better model. Overfitting happens when the training accuracy is much higher than the test accuracy.
  • In deep learning, we usually recommend that you choose the learning rate that better minimizes the cost, and if the model overfits, use other techniques to reduce overfitting.