Machine Learning: Gradient Descent Lab

苏苏芒

Last modified 2024-04-29 17:33:22

Category: Machine Learning. Tags: machine learning, artificial intelligence, deep learning

First published 2024-04-26 22:49:24

Article link: https://blog.csdn.net/sususu_2001/article/details/138226976

Contents

I. Andrew Ng Machine Learning: Gradient Descent Lab
II. Problems and Solutions
III. Summary

I. Andrew Ng Machine Learning: Gradient Descent Lab

plt_gradients(x_train,y_train, compute_cost, compute_gradient)
plt.show()
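
The helpers compute_cost and compute_gradient passed to plt_gradients come from the course lab and are not reproduced in this post. As a rough sketch of what they compute for single-variable linear regression, assuming the usual squared-error cost J(w, b) = 1/(2m) * sum((w*x_i + b - y_i)^2), something like the following would work. The vectorized bodies and the two-point data set are assumptions for illustration, not the lab's own code.

```python
import numpy as np

# Hypothetical training data (assumed for illustration; the post does not show its data).
x_demo = np.array([1.0, 2.0])
y_demo = np.array([300.0, 500.0])

def compute_cost(x, y, w, b):
    """Squared-error cost J(w, b) = 1/(2m) * sum((w*x + b - y)**2)."""
    m = x.shape[0]
    errors = w * x + b - y
    return float(np.sum(errors ** 2) / (2 * m))

def compute_gradient(x, y, w, b):
    """Partial derivatives of J with respect to w and b."""
    m = x.shape[0]
    errors = w * x + b - y
    dj_dw = float(np.sum(errors * x) / m)
    dj_db = float(np.sum(errors) / m)
    return dj_dw, dj_db

cost_at_origin = compute_cost(x_demo, y_demo, 0.0, 0.0)
grad_at_origin = compute_gradient(x_demo, y_demo, 0.0, 0.0)
print(cost_at_origin, grad_at_origin)
```

Both gradients are negative at w = 0, b = 0 for this data, so a gradient descent step increases w and b, which is the direction the gradient plot's arrows point in that region.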

# plot cost versus iteration  
fig, (ax1, ax2) = plt.subplots(1, 2, constrained_layout=True, figsize=(12,4))
ax1.plot(J_hist[:100])
ax2.plot(1000 + np.arange(len(J_hist[1000:])), J_hist[1000:])
ax1.set_title("Cost vs. iteration(start)");  ax2.set_title("Cost vs. iteration (end)")
ax1.set_ylabel('Cost')            ;  ax2.set_ylabel('Cost') 
ax1.set_xlabel('iteration step')  ;  ax2.set_xlabel('iteration step') 
plt.show()

fig, ax = plt.subplots(1,1, figsize=(12, 6))
plt_contour_wgrad(x_train, y_train, p_hist, ax)

fig, ax = plt.subplots(1,1, figsize=(12, 4))
plt_contour_wgrad(x_train, y_train, p_hist, ax, w_range=[180, 220, 0.5], b_range=[80, 120, 0.5],
            contours=[1,5,10,20],resolution=0.5)

# initialize parameters
w_init = 0
b_init = 0
# set alpha to a large value
iterations = 10
tmp_alpha = 8.0e-1
# run gradient descent
w_final, b_final, J_hist, p_hist = gradient_descent(x_train ,y_train, w_init, b_init, tmp_alpha, 
                                                    iterations, compute_cost, compute_gradient)

plt_divergence(p_hist, J_hist,x_train, y_train)
plt.show()
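
To see numerically what the divergence plot illustrates, here is a small self-contained sketch. It assumes a two-point data set and a simplified update loop (run_gradient_descent is a hypothetical stand-in, not the lab's gradient_descent), and compares a learning rate of 0.8 with a much smaller one.

```python
import numpy as np

# Hypothetical data, assumed for illustration only.
x_demo = np.array([1.0, 2.0])
y_demo = np.array([300.0, 500.0])

def run_gradient_descent(x, y, alpha, iterations, w=0.0, b=0.0):
    """Plain batch gradient descent on the squared-error cost.

    Returns the final (w, b) and the cost recorded after every update.
    """
    m = x.shape[0]
    cost_history = []
    for _ in range(iterations):
        errors = w * x + b - y
        dj_dw = np.sum(errors * x) / m
        dj_db = np.sum(errors) / m
        # Update both parameters simultaneously from the same gradients.
        w, b = w - alpha * dj_dw, b - alpha * dj_db
        new_errors = w * x + b - y
        cost_history.append(float(np.sum(new_errors ** 2) / (2 * m)))
    return float(w), float(b), cost_history

# Large step: each update overshoots the minimum by more than the previous one.
_, _, cost_large = run_gradient_descent(x_demo, y_demo, alpha=0.8, iterations=10)
# Small step: the cost shrinks toward the minimum.
w_small, b_small, cost_small = run_gradient_descent(x_demo, y_demo, alpha=0.01, iterations=10000)

print("alpha=0.8 :", [f"{c:.3g}" for c in cost_large[:5]])
print("alpha=0.01:", f"final cost {cost_small[-1]:.3g}, w={w_small:.2f}, b={b_small:.2f}")
```

With the large learning rate the recorded cost rises at every iteration, while the small one settles near the parameters that fit this assumed data exactly.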

II. Problems and Solutions

1. Overflow error

plt_divergence(p_hist, J_hist,x_train, y_train)
plt.show()

---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
Cell In[62], line 1
----> 1 plt_divergence(p_hist, J_hist,x_train, y_train)
      2 plt.show()

File D:\wenjian\baiduyundownload\week1\work\lab_utils_uni.py:305, in plt_divergence(p_hist, J_hist, x_train, y_train)
    303 for i in range(len(w_array)):
    304     tmp_w = w_array[i]
--> 305     cost[i] = compute_cost(x_train, y_train, tmp_w, fix_b)
    307 ax.plot(w_array, cost)
    308 ax.plot(x,v, c=dlmagenta)

OverflowError: Python int too large to convert to C long

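Cause of the error

At first, running the notebook produced a lot of "not found" errors, which went away once I put all the lab files in the same folder. This error is different: it is an overflow error.
In Python, the built-in int type has arbitrary precision and can represent integers of any size. A C long, however, has a fixed range, so code written in C that has not been adapted for big integers fails when it receives a value above that limit. Here the likely culprit is NumPy: on Windows, older NumPy versions use a 32-bit default integer, np.arange with integer arguments creates an array of that dtype, np.zeros_like(w_array) copies it, and a cost computed near w = -70000 or w = 70000 can exceed the 32-bit maximum of 2147483647 when it is stored into cost[i].
Here is the function that raised the error: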

def plt_divergence(p_hist, J_hist, x_train,y_train):

    x=np.zeros(len(p_hist))
    y=np.zeros(len(p_hist))
    v=np.zeros(len(p_hist))
    for i in range(len(p_hist)):
        x[i] = p_hist[i][0]
        y[i] = p_hist[i][1]
        v[i] = J_hist[i]

    fig = plt.figure(figsize=(12,5))
    plt.subplots_adjust( wspace=0 )
    gs = fig.add_gridspec(1, 5)
    fig.suptitle(f"Cost escalates when learning rate is too large")
    #===============
    #  First subplot
    #===============
    ax = fig.add_subplot(gs[:2], )

    # Print w vs cost to see minimum
    fix_b = 100
    w_array = np.arange(-70000, 70000, 1000)
    cost = np.zeros_like(w_array)

    for i in range(len(w_array)):
        tmp_w = w_array[i]
        cost[i] = compute_cost(x_train, y_train, tmp_w, fix_b)

    ax.plot(w_array, cost)
    ax.plot(x,v, c=dlmagenta)
    ax.set_title("Cost vs w, b set to 100")
    ax.set_ylabel('Cost')
    ax.set_xlabel('w')
    ax.xaxis.set_major_locator(MaxNLocator(2))

    #===============
    # Second Subplot
    #===============

    tmp_b,tmp_w = np.meshgrid(np.arange(-35000, 35000, 500),np.arange(-70000, 70000, 500))
    z=np.zeros_like(tmp_b)
    for i in range(tmp_w.shape[0]):
        for j in range(tmp_w.shape[1]):
            z[i][j] = compute_cost(x_train, y_train, tmp_w[i][j], tmp_b[i][j] )

    ax = fig.add_subplot(gs[2:], projection='3d')
    ax.plot_surface(tmp_w, tmp_b, z,  alpha=0.3, color=dlblue)
    ax.xaxis.set_major_locator(MaxNLocator(2))
    ax.yaxis.set_major_locator(MaxNLocator(2))

    ax.set_xlabel('w', fontsize=16)
    ax.set_ylabel('b', fontsize=16)
    ax.set_zlabel('\ncost', fontsize=16)
    plt.title('Cost vs (b, w)')
    # Customize the view angle
    ax.view_init(elev=20., azim=-65)
    ax.plot(x, y, v,c=dlmagenta)

    return
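
The overflow can be illustrated without the lab files. The sketch below assumes a hypothetical two-point data set and a hand-written cost formula (squared_error_cost and can_store are illustrative helpers, not lab functions), then compares the cost at w = -70000 with the largest value each integer dtype can hold. Whether storing the value into an int32 array raises OverflowError depends on the NumPy version and platform, so the helper only reports what happens.

```python
import numpy as np

# Hypothetical data, assumed for illustration only.
x_demo = np.array([1.0, 2.0])
y_demo = np.array([300.0, 500.0])

def squared_error_cost(x, y, w, b):
    """J(w, b) = 1/(2m) * sum((w*x + b - y)**2)."""
    errors = w * x + b - y
    return float(np.sum(errors ** 2) / (2 * x.shape[0]))

def can_store(value, dtype):
    """Report whether assigning value into an array of the given dtype avoids OverflowError."""
    target = np.zeros(1, dtype=dtype)
    try:
        target[0] = value
    except OverflowError:
        return False
    return True

edge_cost = squared_error_cost(x_demo, y_demo, -70000, 100)
int32_max = np.iinfo(np.int32).max
int64_max = np.iinfo(np.int64).max

# zeros_like copies the dtype of its argument unless dtype= is given.
w_array_32 = np.arange(-70000, 70000, 1000, dtype=np.int32)
cost_same_dtype = np.zeros_like(w_array_32)
cost_float = np.zeros_like(w_array_32, dtype=np.float64)

print("cost at w=-70000:", edge_cost)
print("fits in int32:", edge_cost <= int32_max, "| fits in int64:", edge_cost <= int64_max)
print("dtypes:", cost_same_dtype.dtype, cost_float.dtype)
print("int32 assignment succeeded:", can_store(int(edge_cost), np.int32))
print("int64 assignment succeeded:", can_store(int(edge_cost), np.int64))
```

The w values themselves fit comfortably in 32 bits. It is the cost array, which inherits the integer dtype through zeros_like, that cannot hold the result.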

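Solution

I modified the function in lab_utils_uni.py and saved the file, but rerunning the cell still failed. Only after closing Jupyter and opening it again did it work.
As a beginner I was confused about why a restart was needed, and my first guess was the browser cache. The more likely explanation is that the running kernel keeps an already-imported module in memory, so edits to the .py file are not picked up until the kernel is restarted or the module is reloaded. In any case the problem is solved, and it really was caused by overflow.
I also added an import for 3D plotting. I am not sure it is needed (recent Matplotlib versions register the 3d projection without it). The changes are as follows: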

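# Probably unnecessary on Matplotlib 3.2 or later, where projection='3d' works without this import
from mpl_toolkits.mplot3d import Axes3D

# In plt_divergence, give the three arrays an explicit 64-bit dtype
# (np.float64 for cost and z would also avoid the overflow and keep fractional costs)
w_array = np.arange(-70000, 70000, 1000, dtype=np.int64) 
cost = np.zeros_like(w_array, dtype=np.int64)
z=np.zeros_like(tmp_b, dtype=np.int64)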
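
Why the edit only took effect after a restart can be reproduced in a few lines. The sketch below writes a throwaway module to a temporary folder (the module name and its VERSION variable are hypothetical), imports it, edits the file, and shows that a second import still returns the old contents until importlib.reload is called.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module name and contents, used only for this demonstration.
MODULE_NAME = "demo_utils_hypothetical"

sys.dont_write_bytecode = True          # avoid stale .pyc files in this demo
tmp_dir = tempfile.mkdtemp()
sys.path.insert(0, tmp_dir)
module_file = Path(tmp_dir) / f"{MODULE_NAME}.py"

# First version of the module on disk.
module_file.write_text("VERSION = 'original'\n")
importlib.invalidate_caches()
demo = importlib.import_module(MODULE_NAME)
before_edit = demo.VERSION

# Edit the file, as if the function had been fixed in an editor.
module_file.write_text("VERSION = 'edited after import'\n")

# Importing again returns the cached module object; the edit is not visible.
demo = importlib.import_module(MODULE_NAME)
after_reimport = demo.VERSION

# reload re-executes the source file inside the running interpreter.
demo = importlib.reload(demo)
after_reload = demo.VERSION

print(before_edit, "|", after_reimport, "|", after_reload)
```

In a notebook, reloading the edited module with importlib.reload and then re-running the cell that imports from it should have the same effect as the restart that fixed it here. IPython's `%load_ext autoreload` followed by `%autoreload 2` is another option.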

2. 'ls' is not recognized as an internal or external command

!ls -al

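'ls' is not recognized as an internal or external command,
operable program or batch file.

Cause of the error

ls is a Linux command that Windows does not provide. The Windows equivalent is dir.

Solution

Run dir instead (in a notebook cell: !dir), which lists the directory contents just as ls does.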
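
If the notebook has to run on both Windows and Linux, listing the directory from Python avoids the shell command altogether. This is a minimal sketch using pathlib; list_directory and the file names in the demonstration are hypothetical, and the output is far simpler than what ls -al prints.

```python
import tempfile
from pathlib import Path

def list_directory(path="."):
    """Return (kind, size_in_bytes, name) for each entry, sorted by name."""
    rows = []
    for entry in sorted(Path(path).iterdir(), key=lambda p: p.name):
        kind = "dir" if entry.is_dir() else "file"
        rows.append((kind, entry.stat().st_size, entry.name))
    return rows

# Demonstration on a throwaway directory with hypothetical file names.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "notes.txt").write_text("hello")
(demo_dir / "data").mkdir()
listing = list_directory(demo_dir)

for kind, size, name in listing:
    print(f"{kind:<4} {size:>8} {name}")
```

Calling list_directory(".") in a notebook cell lists the notebook's working directory in the same way on either operating system.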

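III. Summary

I have not yet understood all the code details, but the course says it is fine to keep going for now.
So what is a meaningful life? What kind of life do I actually enjoy? While debugging, for a moment I wondered why I chose to take the graduate entrance exam instead of the civil service exam. Do I really like staring at things I cannot understand at first glance?
After the code finally ran I felt much better and quietly cheered myself on: what I cannot understand now I will understand eventually. However high the mountains and however long the road, never say you will stop moving forward.
"The flowers of idealism will eventually bloom in the soil of romanticism. If one day you find that I have bowed my head before mediocrity, open fire on me."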