Summaries -- Lesson 3

  • Introduction and survey
  • “Lesson 0”: How to fast.ai
    • Where is the Lesson 0 video?
    • What does it have to do with the book ‘Meta Learning’ and the fastai course?
  • How to do a fastai lesson?
    • Watch while taking notes
    • Run the notebook and experiment
    • Reproduce the notes from the code
    • Repeat with a different dataset
  • How to avoid studying alone?
    • Physical and virtual study groups
    • Study with people on the forum
    • Learning with social interaction is better than studying alone
  • Highest voted student work
    • Many interesting projects to check out
  • Jeremy’s Pets breeds detector
    • Jeremy’s Pets repository
    • What should you do with this app example?
  • Paperspace: your DL workstation in the cloud!
    • Does Jeremy speak highly of it, and why?
  • JupyterLab: really beginner friendly
    • Why is JupyterLab so good for beginners to take advantage of?
  • Make a better pet detector
    • After training, we should think about how to improve it
  • Comparison of all (image) models
    • Has anyone compared most of the image models and shared the findings?
    • Where to find the notebook for comparison?
    • Which 3 criteria are used for comparison?
  • Try out new models
    • How to select and try out models with high scores
    • Where is the train.ipynb file?
    • How to try out models from TIMM? (see the sketch below)
    • How to compare them by loss?
    • Why is this model actually impressive?
    • What can the name of a model tell us?
    • Why does Jeremy train for only 3 epochs? 18:58
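A minimal sketch of browsing and trying TIMM models with fastai; the model family, the variant name, and the `dls` DataLoaders are assumptions, not necessarily the exact ones used in the lecture:

```python
import timm
from fastai.vision.all import *

# list pretrained TIMM models matching a pattern (family chosen for illustration)
print(timm.list_models('convnext*', pretrained=True))

# fastai accepts a TIMM model name as a string; dls is assumed to be an
# existing DataLoaders, e.g. from the pets notebook
learn = vision_learner(dls, 'convnext_tiny_in22k', metrics=error_rate).to_fp16()
learn.fine_tune(3)   # a quick 3-epoch run, as discussed at 18:58
```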
  • Get the categories of a model
    • How to get the labels or categories from the model? (sketch below)
    • The rest is what we learned in the last lecture.
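A minimal sketch, assuming a learner was exported to a pickle file as in the last lecture (the path is hypothetical):

```python
from fastai.vision.all import load_learner

learn = load_learner('model.pkl')   # hypothetical path to an exported learner
print(learn.dls.vocab)              # the category labels the model predicts
```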
  • What’s in the model
    • What two things are stored in the model?
  • What does the model architecture look like?
  • Parameters of a model
    • How to zoom in on a layer of a model?
    • How to check out the parameters of a layer?
    • What do a layer’s parameters look like? (sketch below)
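A minimal sketch in plain PyTorch; the toy architecture is illustrative, not the lecture's model:

```python
import torch.nn as nn

model = nn.Sequential(nn.Linear(3, 4), nn.ReLU(), nn.Linear(4, 1))
layer = model[0]               # zoom in on the first layer
for p in layer.parameters():
    print(p.shape)             # a 4x3 weight matrix and a length-4 bias vector
    print(p)
```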
  • The investigative questions
    • What are these weights/numbers?
    • How can they figure out something important?
    • Where is the notebook on how neural nets work?
  • Create a general quadratic function
    • How to create a general function that outputs any specific quadratic function by fixing 3 parameters?
    • How to get results from a specific quadratic function by changing only its 1 remaining input?
    • Why create such a general quadratic function with unknown parameters rather than directly writing a particular quadratic with specific coefficients? (sketch below)
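A minimal sketch with functools.partial, in the spirit of the lesson notebook (the function names are assumptions):

```python
from functools import partial

def quad(a, b, c, x):
    return a * x**2 + b * x + c

def mk_quad(a, b, c):
    # fix the 3 coefficients, returning a function of x alone
    return partial(quad, a, b, c)

f = mk_quad(3, 2, 1)   # the specific quadratic 3x^2 + 2x + 1
f(1.5)                 # evaluate it by changing only the input x
```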
  • Fit a function by hand and eye
    • What does it mean to fit a function? (search for better parameters based on a dataset)
    • How to create a random dataset? (sketch below)
    • How to fit a general quadratic function to the dataset by adjusting its 3 parameters by hand with Jupyter widgets?
    • What is the limitation of this manual/visual approach?
    • Where is this notebook?
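A minimal sketch of a random dataset scattered around a “true” quadratic, reusing mk_quad from the sketch above (the noise scale is arbitrary):

```python
import torch

torch.manual_seed(42)
x = torch.linspace(-2, 2, steps=20)[:, None]
# noisy samples around the "true" quadratic 3x^2 + 2x + 1
y = mk_quad(3, 2, 1)(x) + torch.randn_like(x) * 1.5
```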
  • Loss: fit a function better without good eyes
    • Why do we need loss or loss function?
    • What is mean squared error?
    • How does a loss make the hand/visual approach more accurate and robust? (sketch below)
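Mean squared error in one line, matching the definition asked about above:

```python
def mse(preds, acts):
    # the average of the squared differences between predictions and actuals
    return ((preds - acts) ** 2).mean()
```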
  • Automate the search of parameters for better loss
    • How do we know which way and by how much to update parameters in order to improve on loss?
    • Can you find enough material on derivatives on Khan Academy?
    • What exactly do you need to know about derivatives for now, according to Jeremy? 34:26
    • What is the slope or gradient?
    • Does PyTorch calculate the derivatives (slopes/gradients) for us?
    • How to create a function that outputs the mse loss of a general quadratic function? 35:02
    • What do you need to know about tensors and derivatives for now, according to Jeremy? 36:02
    • How to create a rank-1 tensor (a list of numbers) to store the parameters of the quadratic function? 36:49
    • How to ask PyTorch to prepare the calculation of gradients for these parameters? 37:10
    • How to actually calculate the gradients for each parameter, based on the loss this specific function (with 3 specific parameters) achieves on the whole dataset? 37:38
    • In other words, when we calculate the loss this time, we can easily get the gradient for each parameter as well.
    • What does the gradient value mean for each parameter? 38:34
    • How to update the parameters with the gradients produced by the loss? 39:18
    • How to automate the process above to find better parameters and achieve a better loss? 41:05
    • Why is this automation called gradient descent? (see the sketch below)
    • Where is the notebook?
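A minimal gradient-descent sketch answering the questions above, reusing x, y, and mse from earlier sketches; the starting values, learning rate, and step count are arbitrary choices:

```python
import torch

abc = torch.tensor([1.1, 1.1, 1.1])   # a rank-1 tensor holding the parameters a, b, c
abc.requires_grad_()                   # ask PyTorch to prepare gradient calculation

def quad_mse(params):
    a, b, c = params
    return mse(a * x**2 + b * x + c, y)

for step in range(10):
    loss = quad_mse(abc)
    loss.backward()                    # gradients of the loss w.r.t. a, b, c land in abc.grad
    with torch.no_grad():
        abc -= abc.grad * 0.01         # step each parameter against its gradient
        abc.grad.zero_()               # clear the gradients; they accumulate otherwise
    print(f'step={step} loss={loss.item():.2f}')
```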
  • The mathematical functions
    • Besides the dataset, the loss function, and derivatives, what else is crucial in finding/calculating those parameters?
    • Why can’t we simply use quadratic functions for it?
  • ReLU: rectified linear function
    • Powerful real-world models demand huge numbers of parameters and very complex functions; how complex a function can we come up with?
    • Is it possible to build an infinitely complex function simply by adding extremely simple functions together?
    • What could such an extremely simple function look like?
    • What is a rectified linear function? How simple is it? Which part is linear, and which part is rectified? (sketch below)
    • What does a rectified linear function look like when plotted?
    • How to adjust the function’s 2 parameters by hand with a widget?
    • What could the function look like under different parameters? 44:46
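The rectified linear function, roughly as defined in the lesson notebook:

```python
import torch

def rectified_linear(m, b, x):
    # the linear part is m*x + b; the rectified part clips negatives to zero
    return torch.clip(m * x + b, 0.)
```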
  • Infinitely complex function
    • How powerful can the addition of extremely simple functions be?
    • How to create a double rectified linear function (double relu) and adjust its 4 parameters by hand with a widget? (sketch below)
    • How much more flexible does this double relu look compared to a single rectified linear function?
    • Can you imagine how complex a function can be when millions of rectified linear functions are added together?
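A double relu is just two of the functions above, summed (a minimal sketch):

```python
def double_relu(m1, b1, m2, b2, x):
    # two rectified linear functions added together already give a much more
    # flexible shape; millions of them can approximate arbitrarily wiggly functions
    return rectified_linear(m1, b1, x) + rectified_linear(m2, b2, x)
```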
  • 2 circles to an owl
    • A very concise summary of how the fundamental ideas of deep learning are stitched together
  • A chart of all image models compared
    • Can it be done with brute-force computation and simple code?
    • Does Jeremy use the comparison chart to hunt for the single best model?
    • What is the wrong way for students to use the comparison chart? 50:45
    • How does Jeremy use the chart?
    • How does Jeremy decide, step by step, which models to try out?
  • Do I have enough data?
    • Have you already built a model and trained it on your own dataset?
    • Is the result good enough for you?
    • What mistake does the DL industry often make on this issue? 52:55
    • What is Jeremy’s suggestion?
    • How could semi-supervised learning and data augmentation be helpful?
    • What about labeled and unlabeled data?
  • Interpreting gradients in units
    • How much does the loss go down when parameter a increases by 1? 55:24
  • Learning rate
    • Why don’t we update parameter values in large steps?
    • Why does Jeremy draw a quadratic to stand in for the model when zooming in very close on the complex function?
    • What would happen if we updated parameters by large values? 57:19
    • Does a large drop in loss necessarily demand a large increase in a parameter’s value, given the quadratic shape?
    • What is the learning rate? Why do we need it to be small? How to pick a good value for it? 58:07
    • What would happen if your learning rate is too big?
    • What if it is too small?
  • Break
  • Matrix multiplication
    • When a model requires millions of rectified linear functions, how can we compute fast enough?
    • What do we actually need from linear algebra to do DL? 1:01:33
    • How easy is it to do matrix multiplication? 1:01:51
    • What are the dataset and the parameters in the matrix multiplication?
    • Does matrix multiplication do the rectified part for you? (sketch below)
    • What are GPUs good at? 1:03:49
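A minimal sketch of the idea; the shapes are illustrative (the 891 rows echo the Titanic training set used next):

```python
import torch

X = torch.randn(891, 10)        # dataset: one row per sample, one column per feature
w = torch.randn(10, 1)          # parameters: one coefficient per feature
preds = X @ w                   # every prediction computed in one matrix multiplication
preds = torch.clip(preds, 0.)   # mmult does NOT rectify; the relu is a separate step
```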
  • Build a regression model in a spreadsheet
    • Intro to Titanic Competition on Kaggle 1:05:01
    • What is the dataset? 1:05:18
    • What to do with the train.csv file?
    • How to clean the dataset a little bit?
    • How to transform the dataset for matrix multiplication? 1:07:17
    • How to prepare parameters for matrix multiplication? 1:08:50
    • What’s wrong with the much larger values in the ‘Fare’ column compared to other columns? 1:09:35
    • What to do with the values of ‘Fare’ and, similarly, the values of ‘Age’?
    • What is normalizing the data?
    • Does fastai do all these normalizations for us? Will we learn how fastai does it later in the course?
    • Why apply log to the values of ‘Fare’? 1:10:59
    • Why do we need values to be evenly distributed?
    • How to do mmult on the dataset and parameters in a spreadsheet? 1:11:56
    • How to use mmult instead of addition to add a constant?
    • What does the result of our model look like? 1:13:41
    • Does Jeremy simply use linear regression for the model, without even a relu?
    • Can we solve regression with gradient descent? How do we do it? (a pandas sketch of the preprocessing follows below)
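The lecture does these steps in a spreadsheet; here is a rough pandas equivalent, where the log1p form and the max-scaling are assumptions about the normalization, not a transcript of the spreadsheet:

```python
import numpy as np
import pandas as pd

df = pd.read_csv('train.csv')             # the Titanic training data from Kaggle

df['Fare'] = np.log1p(df['Fare'])         # log(1 + Fare) tames the long right tail
df['Age'] = df['Age'] / df['Age'].max()   # scale Age into [0, 1] so no column dominates
# note: missing Age values would still need filling before training
```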
  • Build a neural net by adding two regression models
    • What does it take to turn a regression model into a neural net?
    • Why don’t we just add up the results of two linear functions?
    • Why do we only add the results together after they are rectified? (sketch below)
    • What do the model’s predictions look like?
    • Now we need to update the parameters of two linear functions, not just one.
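A minimal sketch of the idea in PyTorch (the names and shapes are illustrative):

```python
import torch

def neural_net(X, w1, w2):
    # two linear models, each rectified before summing; summing unrectified
    # linear functions would just collapse back into a single linear function
    return torch.clip(X @ w1, 0.) + torch.clip(X @ w2, 0.)
```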
  • Matrix multiplication makes training faster
    • How to make the training use mmult rather than a sum of element-wise multiplications in the spreadsheet?
  • Watch out! It’s chapter 4
    • Please do try the Titanic competition
    • Why did chapter 4 drive away most people?
    • Ways to work through the spreadsheet yourself
  • Create dummy variables for 3 classes
    • Do we only need 2 columns for a dummy variable with 3 classes? (sketch below)
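A minimal pandas sketch (the column and its values are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'Embarked': ['S', 'C', 'Q', 'S']})
# one 0/1 column per class; drop_first=True keeps only 2 of the 3 columns,
# since the third class is implied when both are 0
print(pd.get_dummies(df, columns=['Embarked'], drop_first=True))
```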
  • A taste of NLP
    • What do Natural Language Processing models do?
    • What project opportunities do students who speak languages other than English have?
    • What tasks can NLP do? 1:25:57
  • fastai NLP library vs Hugging Face library
    • How do these two libraries differ?
    • Why do we use the transformers library in this lecture?
  • Homework to prepare you for the next lesson