1 The implementation of logistic regression
1.1 Mathematical formulas
Figure 1.1 The recognition process
sigmoid(w^Tx+b)=\frac{1}{1+e^{-(w^Tx+b)}}

L(a^{(i)},y^{(i)})=-y^{(i)}\log(a^{(i)})-(1-y^{(i)})\log(1-a^{(i)})

J=\frac{1}{m}\sum_{i=1}^{m}L(a^{(i)},y^{(i)})
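As a quick numeric check of these formulas, here is a minimal numpy sketch; the values of w, b, x and y below are made up purely for illustration:

import numpy as np

# made-up toy values, only to illustrate the formulas above
w = np.array([[0.5], [-0.3]])                    # weights, shape (2, 1)
b = 0.1
x = np.array([[1.0], [2.0]])                     # one example with 2 features
y = 1                                            # its label

a = 1 / (1 + np.exp(-(np.dot(w.T, x) + b)))      # sigmoid(w^T x + b)
loss = -y * np.log(a) - (1 - y) * np.log(1 - a)  # per-example loss L(a, y)
print(a, loss)                                   # here a = 0.5, loss = log(2) ≈ 0.693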
1.2 Build steps
data pre-processing
define the model structure (to receive and process the data)
initialize the parameters
iterative update loop (a compact sketch follows this list):
- compute the loss function L (forward propagation)
- compute the gradient grad (backward propagation)
- update the parameters (w, b)
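An end-to-end sketch of these steps on a tiny made-up dataset (the numbers are purely illustrative; each piece is implemented properly in section 1.3):

import numpy as np

# tiny made-up dataset: 2 features, 4 examples (illustrative only)
X = np.array([[0.0, 1.0, 2.0, 3.0],
              [1.0, 0.5, 2.5, 3.5]])                  # shape (2, 4)
Y = np.array([[0, 0, 1, 1]])                          # labels, shape (1, 4)

w = np.zeros((2, 1))                                  # initialize parameters
b = 0.0
learning_rate = 0.1

for i in range(1000):                                 # iterative updates
    A = 1 / (1 + np.exp(-(np.dot(w.T, X) + b)))       # forward propagation
    cost = -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))
    dw = np.dot(X, (A - Y).T) / X.shape[1]            # backward propagation
    db = np.mean(A - Y)
    w = w - learning_rate * dw                        # update parameters
    b = b - learning_rate * db

print(cost)                                           # the cost should have decreased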
1.3 Algorithm Code
1. Data pre-processing
The shape of the train_set_x_orig data (pictures) is (m_train, num_px, num_px, 3).
import numpy as np

m_train = train_set_x_orig.shape[0]   # m_train: the number of training pictures
m_test = test_set_x_orig.shape[0]     # m_test: the number of test pictures
num_px = train_set_x_orig.shape[1]    # num_px: height/width of each picture
Each picture therefore has num_px * num_px * 3 features (width * height * 3 color channels).
Then flatten the data: reshape each picture from (a, b, c, d) into a matrix of shape (b * c * d, a), one column per picture.
X_flatten = X.reshape(X.shape[0], -1).T   # -1 lets numpy infer that dimension automatically
Finally, standardize the data by dividing by 255 (the maximum pixel value):
train_set_x = train_set_x_flatten/255.
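To convince yourself of the reshape, here is a quick check on random data; the picture size used below is hypothetical, not the real dataset:

X = np.random.rand(10, 64, 64, 3)         # 10 pictures of 64 x 64 x 3 (made-up size)
X_flatten = X.reshape(X.shape[0], -1).T   # flatten each picture into one column
print(X_flatten.shape)                    # (12288, 10), i.e. (64 * 64 * 3, 10)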
2. Initialize parameters
We need a weight vector w and a scalar bias b to compute the sigmoid activation.
First, define the sigmoid function:
def sigmoid(x):
    s = 1 / (1 + np.exp(-x))
    return s
Then initialize the parameters w and b:
def initialize_with_zeros(dim):   # dim is the length of w; it must match the number of features (num_px * num_px * 3)
    w = np.zeros((dim, 1))
    b = 0
    return w, b
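A quick usage check of the two helpers (toy input values):

print(sigmoid(np.array([0, 2])))   # approximately [0.5, 0.88]
w, b = initialize_with_zeros(3)
print(w.shape, b)                  # (3, 1) 0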
3. Forward and backward propagation
Forward propagation computes the activations and the value of the loss function; backward propagation computes the gradients dw and db, which are then used to update the parameters.
Build the loss function:
sigmoid(w^Tx+b)=\frac{1}{1+e^{-(w^Tx+b)}}

L(a^{(i)},y^{(i)})=-y^{(i)}\log(a^{(i)})-(1-y^{(i)})\log(1-a^{(i)})

J=\frac{1}{m}\sum_{i=1}^{m}L(a^{(i)},y^{(i)})
The gradients dw and db are:
\frac{\partial J}{\partial w}=\frac{1}{m}X(A-Y)^T

\frac{\partial J}{\partial b}=\frac{1}{m}\sum_{i=1}^m(a^{(i)}-y^{(i)})
def propagate(w, b, X, Y):
    m = X.shape[1]                      # number of training examples
    A = sigmoid(np.dot(w.T, X) + b)     # forward propagation: activations
    cost = -1/m * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))   # cross-entropy loss
    dw = 1/m * np.dot(X, (A - Y).T)     # gradient of the loss with respect to w
    db = 1/m * np.sum(A - Y)            # gradient of the loss with respect to b
    cost = np.squeeze(cost)             # remove redundant dimensions
    grads = {"dw": dw, "db": db}        # store gradients in a dictionary for convenience
    return grads, cost
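A small sanity check of propagate with made-up values:

w = np.array([[1.0], [2.0]])
b = 2.0
X = np.array([[1.0, 2.0, -1.0],
              [3.0, 4.0, -3.2]])
Y = np.array([[1, 0, 1]])
grads, cost = propagate(w, b, X, Y)
print(grads["dw"], grads["db"], cost)   # dw ≈ [[0.998], [2.395]], db ≈ 0.0015, cost ≈ 5.80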
Use dw and db to optimize the parameters:
def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost=False):
    costs = []                                  # record the cost every 100 iterations
    for i in range(num_iterations):
        grads, cost = propagate(w, b, X, Y)     # forward and backward propagation
        dw = grads["dw"]
        db = grads["db"]
        w = w - learning_rate * dw              # gradient descent update
        b = b - learning_rate * db
        if i % 100 == 0:
            costs.append(cost)
            if print_cost:
                print("cost after iteration %i: %f" % (i, cost))
    params = {"w": w, "b": b}
    grads = {"dw": dw, "db": db}
    return params, grads, costs
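Continuing with the toy w, b, X, Y from the propagate check above, a few optimization steps look like this (the iteration count and learning rate are arbitrary example values):

params, grads, costs = optimize(w, b, X, Y, num_iterations=100, learning_rate=0.009)
print(params["w"])   # updated weights after 100 gradient descent steps
print(params["b"])   # updated bias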
4. Predict the data (w, b, X)
Use the trained parameters w and b to predict labels for the data X:
def predict(w, b, X):
    m = X.shape[1]
    Y_prediction = np.zeros((1, m))      # matrix to store the predictions
    A = sigmoid(np.dot(w.T, X) + b)      # compute the activations
    for i in range(A.shape[1]):
        if A[0, i] <= 0.5:               # threshold the probability at 0.5
            Y_prediction[0, i] = 0
        else:
            Y_prediction[0, i] = 1
    return Y_prediction
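A quick check of predict with some made-up parameters and inputs:

w = np.array([[0.1124579], [0.23106775]])
b = -0.3
X = np.array([[1.0, -1.1, -3.2],
              [1.2, 2.0, 0.1]])
print(predict(w, b, X))   # a (1, 3) array of 0/1 predictions: [[1. 1. 0.]]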
5. Combine everything into one model
def model(X_train, Y_train, X_test, Y_test, num_iterations=2000, learning_rate=0.5, print_cost=False):
    w, b = initialize_with_zeros(X_train.shape[0])   # initialize parameters to zeros
    parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost)
    w = parameters["w"]
    b = parameters["b"]
    Y_prediction_test = predict(w, b, X_test)
    Y_prediction_train = predict(w, b, X_train)
    print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))
    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test,
         "Y_prediction_train": Y_prediction_train,
         "w": w,
         "b": b,
         "learning_rate": learning_rate,
         "num_iterations": num_iterations}
    return d
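With the pre-processed data from step 1 (train_set_x, train_set_y, test_set_x, test_set_y), the whole pipeline can then be trained in one call; the learning rate of 0.005 below is just an example choice:

d = model(train_set_x, train_set_y, test_set_x, test_set_y,
          num_iterations=2000, learning_rate=0.005, print_cost=True)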
1.4 Result analysis
1. Plot the loss function (cost) over iterations
import matplotlib.pyplot as plt

costs = np.squeeze(d['costs'])
plt.plot(costs)
plt.ylabel('cost')
plt.xlabel('iterations (per hundreds)')
plt.title("Learning rate =" + str(d["learning_rate"]))
plt.show()
The decreasing loss shows that the model is learning its parameters. If we increase the number of iterations, the training-set accuracy will probably keep improving while the test-set accuracy starts to drop; this is overfitting.
2. Choose the learning rate
learning_rates = [0.01, 0.001, 0.0001]
models = {}
for i in learning_rates:
    print("learning rate is: " + str(i))
    models[str(i)] = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations=1500, learning_rate=i, print_cost=False)
    print('\n' + "-------------------------------------------------------" + '\n')

for i in learning_rates:
    plt.plot(np.squeeze(models[str(i)]["costs"]), label=str(models[str(i)]["learning_rate"]))
plt.ylabel('cost')
plt.xlabel('iterations')
legend = plt.legend(loc='upper center', shadow=True)
frame = legend.get_frame()
frame.set_facecolor('0.90')
plt.show()
learning rate is: 0.01
train accuracy: 99.52153110047847 %
test accuracy: 68.0 %
-------------------------------------------------------
learning rate is: 0.001
train accuracy: 88.99521531100478 %
test accuracy: 64.0 %
-------------------------------------------------------
learning rate is: 0.0001
train accuracy: 68.42105263157895 %
test accuracy: 36.0 %
-------------------------------------------------------
Explanation:
- Different learning rates give different costs and therefore different prediction results.
- If the learning rate is too large (0.01), the cost may oscillate up and down. It may even diverge (although in this example, using 0.01 still ends up at a reasonable cost value in the end).
- A lower cost does not necessarily mean a better model. Overfitting occurs when the training accuracy is much higher than the test accuracy.
- In deep learning, we usually recommend that you:
  - choose a learning rate that better minimizes the cost function;
  - if the model overfits, use other techniques to reduce overfitting (discussed in later tutorials).
Reference: https://www.kesci.com/mw/project/5dd7a246f41512002ceb3d6b