Reading Notes on "The Matrix Calculus You Need For Deep Learning"

About the Tutorial

"The Matrix Calculus You Need For Deep Learning" is a free tutorial written jointly by Terence Parr, a professor at the University of San Francisco and the creator of ANTLR, and Jeremy Howard, co-founder of fast.ai. It helps you get up to speed quickly on the matrix calculus used in deep learning. The tutorial is concise and easy to follow; a little calculus plus some basic knowledge of neural networks is all you need to dive right in!

What the Tutorial Covers

The tutorial starts with a quick review of the scalar derivative rules and the concepts of vector calculus and partial derivatives, then introduces how to compute matrix derivatives starting from the generalization of the Jacobian matrix, and finally derives the gradient of a single neuron's output as well as the gradient of a neural network's loss function.

Summary of Contents

1. Introduction

Derivatives are a key ingredient of machine learning, and of deep learning in particular, where neural networks are trained by optimizing a loss function. What this requires, however, is not the scalar calculus you learned before, but so-called matrix calculus: the marriage of linear algebra and multivariable calculus.
We are already familiar with scalar differentiation, with the power rule, the product rule, and the chain rule being the most commonly used. Note that at this point we can already introduce the notion of an operator: $\frac{d}{dx}$ can be viewed as the differentiation operator that maps a function to its derivative, which means that $\frac{d}{dx}f(x)$ and $\frac{df(x)}{dx}$ denote the same thing.
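As a quick sanity check of these rules, here is a minimal sketch using SymPy (an assumption on my part; the tutorial itself contains no code) that differentiates a concrete product and a concrete composition symbolically:

```python
# Minimal sketch (assumes SymPy is installed): check the product and chain rules
# on concrete functions by symbolic differentiation.
import sympy as sp

x = sp.symbols('x')

# Product rule: d/dx [x^2 * sin(x)] = 2x*sin(x) + x^2*cos(x)
print(sp.diff(x**2 * sp.sin(x), x))

# Chain rule: d/dx [sin(x^2)] = 2x*cos(x^2)
print(sp.diff(sp.sin(x**2), x))
```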
Going one step further, consider the multivariable case. Differentiating a multivariable function with respect to a single variable gives a partial derivative (written with $\frac{\partial}{\partial x}$). Collecting all of the partial derivatives in a row vector gives what is called the gradient of the function $f(x,y)$:
$$\nabla f(x,y)=\left[\frac{\partial f(x,y)}{\partial x},\ \frac{\partial f(x,y)}{\partial y}\right].$$
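For a concrete example, take $f(x,y)=3x^2y$. Differentiating with respect to each variable in turn and collecting the results gives

$$\nabla f(x,y)=\left[\frac{\partial}{\partial x}3x^2y,\ \frac{\partial}{\partial y}3x^2y\right]=\left[6xy,\ 3x^2\right].$$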
Going further still, consider multiple functions of multiple variables. Besides $f(x,y)$, add a second function $g(x,y)$. We can stack the gradients of these two functions into a matrix, called the Jacobian matrix, in which each row is the gradient of one function:
$$J=\begin{bmatrix}\nabla f(x,y)\\ \nabla g(x,y)\end{bmatrix}=\begin{bmatrix}\frac{\partial f(x,y)}{\partial x} & \frac{\partial f(x,y)}{\partial y}\\ \frac{\partial g(x,y)}{\partial x} & \frac{\partial g(x,y)}{\partial y}\end{bmatrix}.$$
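Continuing the example, with $f(x,y)=3x^2y$ and, say, $g(x,y)=2x+y^8$ (an arbitrary illustrative choice), the two gradients stack into

$$J=\begin{bmatrix}\nabla f(x,y)\\ \nabla g(x,y)\end{bmatrix}=\begin{bmatrix}6xy & 3x^2\\ 2 & 8y^7\end{bmatrix}.$$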
And with that we arrive at the core topic of this tutorial: matrix calculus!

2. Generalization of the Jacobian

Write the parameters as a vector: $\mathbf{x}=[x_1, x_2, \ldots, x_n]^T$.
Likewise write the functions as a vector: $\mathbf{y}=\mathbf{f}(\mathbf{x})=[f_1(\mathbf{x}), f_2(\mathbf{x}), \ldots, f_m(\mathbf{x})]^T$, a vector made up of $m$ scalar functions.
In general, the Jacobian matrix is the collection of all $m \times n$ partial derivatives, that is, the stack of the $m$ gradients with respect to $\mathbf{x}$:
$$\frac{\partial \mathbf{y}}{\partial \mathbf{x}}=\begin{bmatrix} \nabla f_1(\mathbf{x}) \\ \nabla f_2(\mathbf{x}) \\ \vdots \\ \nabla f_m(\mathbf{x}) \end{bmatrix}=\begin{bmatrix} \frac{\partial}{\partial \mathbf{x}}f_1(\mathbf{x}) \\ \frac{\partial}{\partial \mathbf{x}}f_2(\mathbf{x}) \\ \vdots \\ \frac{\partial}{\partial \mathbf{x}}f_m(\mathbf{x})\end{bmatrix}=\begin{bmatrix} \frac{\partial}{\partial x_1}f_1(\mathbf{x}) & \frac{\partial}{\partial x_2}f_1(\mathbf{x}) & \cdots & \frac{\partial}{\partial x_n}f_1(\mathbf{x})\\ \frac{\partial}{\partial x_1}f_2(\mathbf{x}) & \frac{\partial}{\partial x_2}f_2(\mathbf{x}) & \cdots & \frac{\partial}{\partial x_n}f_2(\mathbf{x})\\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial}{\partial x_1}f_m(\mathbf{x}) & \frac{\partial}{\partial x_2}f_m(\mathbf{x}) & \cdots & \frac{\partial}{\partial x_n}f_m(\mathbf{x})\end{bmatrix}.$$
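To make this concrete in code, here is a minimal sketch (assuming PyTorch is available; the library and the example function are my own additions, not part of the tutorial) that computes the $m \times n$ Jacobian of a small vector function automatically so it can be compared against the formula above:

```python
# Minimal sketch (assumes PyTorch is installed): compute the m x n Jacobian of a
# vector function y = f(x) and compare it row by row with the stacked gradients.
import torch

def f(x):
    # A made-up example with n = 3 inputs and m = 2 outputs:
    #   f1(x) = x1 * x2 * x3
    #   f2(x) = x1^2 + x2^2
    return torch.stack([x[0] * x[1] * x[2],
                        x[0] ** 2 + x[1] ** 2])

x = torch.tensor([1.0, 2.0, 3.0])
J = torch.autograd.functional.jacobian(f, x)
print(J)  # shape (2, 3): row i is the gradient of f_i with respect to x
# Analytic Jacobian for comparison:
# [[x2*x3, x1*x3, x1*x2],   -> [6., 3., 2.]
#  [2*x1,  2*x2,  0    ]]   -> [2., 4., 0.]
```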
