Build Your Own Deep Learning Framework in 3 Days (1)


Understanding loss functions, gradient descent, and curve fitting by building linear regression

Load Dataset

from sklearn.datasets import load_boston  # removed in scikit-learn 1.2; see the note below

data = load_boston()
X, y = data['data'], data['target']

# inspect one sample, one target, and the overall shape
X[1]
y[1]
X.shape
len(y)

%matplotlib inline
import matplotlib.pyplot as plt

# column 5 is RM, the average number of rooms per dwelling
plt.scatter(X[:, 5], y)
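Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2. If the import above fails, one workaround, adapted from scikit-learn's own deprecation notice, is to fetch the original data from the CMU StatLib archive (this sketch assumes network access):

import numpy as np
import pandas as pd

# each sample spans two rows in the raw file
data_url = "http://lib.stat.cmu.edu/datasets/boston"
raw_df = pd.read_csv(data_url, sep=r"\s+", skiprows=22, header=None)
X = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])  # 13 features
y = raw_df.values[1::2, 2]                                       # median house price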

Goal: find the "best" straight line that fits the relationship between the number of rooms and the house price.

import random

# pick a random slope k and intercept b
k, b = random.randint(-100, 100), random.randint(-100, 100)

def func(x):
    return k*x + b

X_rm = X[:, 5]  # the RM (rooms) feature
y_hat = [func(x) for x in X_rm]
plt.scatter(X_rm, y)
plt.plot(X_rm, y_hat)

We drew a random line, and it turned out to be way off. 🙁

def draw_room_and_price():
    plt.scatter(X[:, 5], y)

def price(x, k, b):
    return k*x + b

k, b = random.randint(-100, 100), random.randint(-100, 100)

price_by_random_k_and_b = [price(r, k, b) for r in X_rm]
print('the random k : {}, b: {}'.format(k, b))
draw_room_and_price()
plt.scatter(X_rm, price_by_random_k_and_b)

the random k : 2, b: 23

The goal is to find the "best" k and b. For that we need a criterion that measures how good a candidate line actually is: compare the true values y_true with the predictions ŷ. A function that quantifies the discrepancy between y_true and ŷ is called a loss function.

y_true = [1, 4, 1, 4, 1, 4, 1, 4]
y_hat = [2, 3, 1, 4, 1, 41, 31, 3]

L1-Loss

$$\mathrm{loss} = \frac{1}{n}\sum_{i}^{n} \left| y_{\mathrm{true},i} - \hat{y}_i \right|$$

y_true = [3, 4, 4]
y_hat_1 = [1, 1, 4]
y_hat_2 = [3, 4, 0]

What is the L1-Loss sum for ŷ₁? |3 - 1| + |4 - 1| + |4 - 4| = 2 + 3 + 0 = 5
For ŷ₂ the L1-Loss sum is |3 - 3| + |4 - 4| + |4 - 0| = 0 + 0 + 4 = 4
Under L1, ŷ₂ comes out slightly better even though all of its error sits in one large miss; a quick implementation sketch follows.
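A minimal L1-Loss helper (hypothetical, mirroring the L2 loss implemented below):

def l1_loss(y, y_hat):
    # mean absolute error between true and predicted values
    return sum(abs(y_i - y_hat_i) for y_i, y_hat_i in zip(y, y_hat)) / len(y)

print(l1_loss([3, 4, 4], [1, 1, 4]))  # 5/3 ≈ 1.667
print(l1_loss([3, 4, 4], [3, 4, 0]))  # 4/3 ≈ 1.333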
Squaring the errors instead penalizes large deviations more heavily, which gives the L2 loss (mean squared error):

$$\mathrm{loss} = \frac{1}{n}\sum_{i}^{n} (y_i - \hat{y}_i)^2$$

def loss(y, y_hat):
    # L2 loss: mean squared error
    sum_ = sum([(y_i - y_hat_i) ** 2 for y_i, y_hat_i in zip(y, y_hat)])
    return sum_ / len(y)

y_true = [3, 4, 4]
y_hat_1 = [1, 1, 4]
y_hat_2 = [3, 4, 0]

print(loss(y_true, y_hat_1))  # 13/3 ≈ 4.333
print(loss(y_true, y_hat_2))  # 16/3 ≈ 5.333 -- note the ranking flips relative to L1
def price(x, k, b):
    return k*x + b

k, b = random.randint(-100, 100), random.randint(-100, 100)

price_by_random_k_and_b = [price(r, k, b) for r in X_rm]
print('the random k : {}, b: {}'.format(k, b))
draw_room_and_price()
plt.scatter(X_rm, price_by_random_k_and_b)

cost = loss(list(y), price_by_random_k_and_b)
print('The Loss of k: {}, b: {} is {}'.format(k, b, cost))

the random k : -48, b: 53
The Loss of k: -48, b: 53 is 75196.97500135966

Once you know how to measure whether something is good or bad, you have basically done half the work. The simplest method: randomly generate a number of (k, b) pairs and keep the best one.



def price(x, k, b):
    return k*x + b

trying_times = 5000

best_k, best_b = None, None

min_cost = float('inf')

losses = []

for i in range(trying_times):
    # sample k and b uniformly from [-200, -100)
    k = random.random() * 100 - 200
    b = random.random() * 100 - 200
    price_by_random_k_and_b = [price(r, k, b) for r in X_rm]
    #draw_room_and_price()
    #plt.scatter(X_rm, price_by_random_k_and_b)

    cost = loss(list(y), price_by_random_k_and_b)
    if cost < min_cost:
        min_cost = cost
        best_k, best_b = k, b
        print('k and b updated at iteration {}'.format(i))
        losses.append(min_cost)

We could add a visualization of how the loss falls over the trials.
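For instance, a minimal sketch that plots the record of improving losses collected above:

plt.plot(losses)
plt.xlabel('number of improvements found')
plt.ylabel('min cost so far')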

min_cost
best_k, best_b
def plot_by_k_and_b(k, b):
    price_by_random_k_and_b = [price(r, k, b) for r in X_rm]
    draw_room_and_price()
    plt.scatter(X_rm, price_by_random_k_and_b)
plot_by_k_and_b(best_k, best_b)

2nd method: adjust the direction
k can change in two ways: increase or decrease.
b can also change in two ways: increase or decrease.
Changing the pair (k, b) therefore gives 4 possible direction combinations:

When k and b change along some direction $d_n$ and the loss goes down, k and b keep moving along $d_n$; otherwise, we switch to a different direction.

directions = [
    (+1, -1),
    (+1, +1),
    (-1, -1),
    (-1, +1)
]

def price(x, k, b):
    return k*x + b

trying_times = 10000

best_k = random.random() * 100 - 200
best_b = random.random() * 100 - 200

next_direction = random.choice(directions)

min_cost = float('inf')

losses = []

scala = 0.3  # step size

for i in range(trying_times):
    current_direction = next_direction
    k_direction, b_direction = current_direction

    current_k = best_k + k_direction * scala
    current_b = best_b + b_direction * scala

    price_by_random_k_and_b = [price(r, current_k, current_b) for r in X_rm]

    cost = loss(list(y), price_by_random_k_and_b)

    if cost < min_cost:
        min_cost = cost
        best_k, best_b = current_k, current_b
        print('k and b updated at iteration {}'.format(i))
        losses.append((i, min_cost))
        next_direction = current_direction  # the direction worked: keep it
    else:
        # the direction failed: pick a different one at random
        next_direction = random.choice(list(set(directions) - {current_direction}))
len(losses)
min_cost

3rd method: gradient descent
Can we, at every step, move in a direction that is guaranteed to decrease the loss? Calculus tells us we can always find such a direction: the negative gradient.

$$\mathrm{loss} = \frac{1}{n}\sum_{i}^{n} (y_i - \hat{y}_i)^2$$

$$\mathrm{loss} = \frac{1}{n}\sum_{i}^{n} \bigl(y_i - (k x_i + b)\bigr)^2$$

$$\frac{\partial\,\mathrm{loss}}{\partial k} = -\frac{2}{n}\sum \bigl(y_i - (k x_i + b)\bigr)\,x_i$$

$$\frac{\partial\,\mathrm{loss}}{\partial b} = -\frac{2}{n}\sum \bigl(y_i - (k x_i + b)\bigr)$$

or, written in terms of the predictions $\hat{y}_i$:

$$\frac{\partial\,\mathrm{loss}}{\partial k} = -\frac{2}{n}\sum (y_i - \hat{y}_i)\,x_i$$

$$\frac{\partial\,\mathrm{loss}}{\partial b} = -\frac{2}{n}\sum (y_i - \hat{y}_i)$$

def partial_k(x, y, y_hat):
    # ∂loss/∂k = -2/n * Σ (y_i - ŷ_i) * x_i
    gradient = 0

    for x_i, y_i, y_hat_i in zip(list(x), list(y), list(y_hat)):
        gradient += (y_i - y_hat_i) * x_i

    return -2 / len(y) * gradient

def partial_b(y, y_hat):
    # ∂loss/∂b = -2/n * Σ (y_i - ŷ_i)
    gradient = 0

    for y_i, y_hat_i in zip(list(y), list(y_hat)):
        gradient += (y_i - y_hat_i)

    return -2 / len(y) * gradient
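The same partial derivatives can also be computed in vectorized form; a sketch, assuming x, y, and y_hat are NumPy arrays (function names here are illustrative):

import numpy as np

def partial_k_vec(x, y, y_hat):
    # -2/n * Σ (y_i - ŷ_i) * x_i, as a single dot product
    return -2 / len(y) * np.dot(y - y_hat, x)

def partial_b_vec(y, y_hat):
    return -2 / len(y) * np.sum(y - y_hat)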
def price(x, k, b):
    # Operation: CNN, RNN, LSTM, Attention are mappings more complex than k*x + b
    return k*x + b

trying_times = 50000

min_cost = float('inf')

losses = []

scala = 0.3  # unused here; left over from the direction search

k, b = random.random() * 100 - 200, random.random() * 100 - 200
# where we start matters: the weight initialization problem!

best_k, best_b = None, None

learning_rate = 1e-3  # learning rate (step size)

for i in range(trying_times):
    price_by_random_k_and_b = [price(r, k, b) for r in X_rm]

    cost = loss(list(y), price_by_random_k_and_b)

    if cost < min_cost:
        # print('k and b updated at iteration {}'.format(i))
        min_cost = cost
        best_k, best_b = k, b
        losses.append((i, min_cost))

    k_gradient = partial_k(X_rm, y, price_by_random_k_and_b)  # the direction of change
    b_gradient = partial_b(y, price_by_random_k_and_b)

    # step against the gradient; how this step is taken is the optimizer's job
    # (e.g. momentum, Adam)
    k = k + (-1 * k_gradient) * learning_rate
    b = b + (-1 * b_gradient) * learning_rate
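The comments above mention optimizers such as momentum and Adam. A minimal sketch of the momentum idea, continuing from the variables defined above (the coefficient and loop count here are illustrative):

beta = 0.9           # momentum coefficient
v_k, v_b = 0.0, 0.0  # running velocity for each parameter

for _ in range(10000):
    y_hat = [price(r, k, b) for r in X_rm]
    # blend the new gradient into the previous velocity, then step
    v_k = beta * v_k + (1 - beta) * partial_k(X_rm, y, y_hat)
    v_b = beta * v_b + (1 - beta) * partial_b(y, y_hat)
    k = k - learning_rate * v_k
    b = b - learning_rate * v_b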

The next step is to package these pieces into reusable building blocks, so that the next person does not have to write everything again from scratch.
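As a taste of where this series is headed, a hypothetical sketch of what such a reusable block might look like (the class name and interface are illustrative, not the final framework's API):

class LinearModel:
    """y = k*x + b, trained by gradient descent on the L2 loss."""

    def __init__(self, learning_rate=1e-3):
        self.k = random.random() * 100 - 200
        self.b = random.random() * 100 - 200
        self.learning_rate = learning_rate

    def predict(self, xs):
        return [self.k * x + self.b for x in xs]

    def fit(self, xs, ys, epochs=10000):
        for _ in range(epochs):
            y_hat = self.predict(xs)
            self.k -= self.learning_rate * partial_k(xs, ys, y_hat)
            self.b -= self.learning_rate * partial_b(ys, y_hat)
        return self

model = LinearModel().fit(X_rm, y)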

len(losses)

print(min_cost)

best_k, best_b

def square(x):
    # a simple convex function: the same descent idea applies to any
    # differentiable function, not just k*x + b
    return 10 * x**2 + 5 * x + 5

import numpy as np
_X = np.linspace(-100, 100)
_y = [square(x) for x in _X]
plt.plot(_X, _y)
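The same descent recipe minimizes this one-dimensional function as well; a minimal sketch using the hand-computed derivative 20x + 5 (learning rate and step count are illustrative):

x = random.random() * 200 - 100  # random start in [-100, 100)
for _ in range(1000):
    gradient = 20 * x + 5        # derivative of 10*x**2 + 5*x + 5
    x = x - 1e-2 * gradient      # step against the gradient
print(x, square(x))              # converges toward the minimum at x = -0.25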


plot_by_k_and_b(k=best_k, b=best_b)

