MIT Press's Latest Deep Learning Overview Handbook (2023)

This handbook covers deep learning in depth, from the fundamentals of supervised learning, unsupervised learning, and reinforcement learning to training deep learning models, measuring their performance, and improving them. It includes detailed discussions of specific architectures such as convolutional networks, residual networks, and Transformers, as well as modern deep generative models including generative adversarial networks, variational autoencoders, normalizing flows, and diffusion models. It also examines the ethics of deep learning, highlighting the potential risks and responsibilities that come with technological progress.
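
To give a flavor of how the book proceeds, here is a minimal sketch of its opening thread: fitting a linear regression model (the Chapter 2 example) by gradient descent (Chapter 6). The data, learning rate, and step count below are illustrative assumptions, not taken from the book.

    import numpy as np

    # Illustrative data (assumed, not from the book): y ≈ 2x + 1 plus noise
    rng = np.random.default_rng(0)
    x = rng.uniform(0.0, 1.0, size=50)
    y = 2.0 * x + 1.0 + 0.1 * rng.normal(size=50)

    # Model: y_hat = phi0 + phi1 * x, fitted by minimizing mean squared error
    phi0, phi1 = 0.0, 0.0
    alpha = 0.1  # learning rate (a training hyperparameter, cf. Chapter 6)

    for step in range(1000):
        y_hat = phi0 + phi1 * x
        # Gradients of the least-squares loss with respect to the two parameters
        d_phi0 = 2.0 * np.mean(y_hat - y)
        d_phi1 = 2.0 * np.mean((y_hat - y) * x)
        # Gradient descent update
        phi0 -= alpha * d_phi0
        phi1 -= alpha * d_phi1

    print(f"fitted intercept {phi0:.2f}, slope {phi1:.2f}")  # expect roughly 1.0 and 2.0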

Table of Contents

        1.Introduction

  • 1.1 Supervised learning

  • 1.2 Unsupervised learning

  • 1.3 Reinforcement learning

  • 1.4 Ethics

  • 1.5 Structure of book

  • 1.6 Other books

  • 1.7 How to read this book

        2.Supervised learning

  • 2.1 Supervised learning overview

  • 2.2 Linear regression example

  • 2.3 Summary

        3.Shallow neural networks

  • 3.1 Neural network example

  • 3.2 Universal approximation theorem

  • 3.3 Multivariate inputs and outputs

  • 3.4 Shallow neural networks: general case

  • 3.5 Terminology

  • 3.6 Summary

        4.Deep neural networks

  • 4.1 Composing neural networks

  • 4.2 From composing networks to deep networks

  • 4.3 Deep neural networks

  • 4.4 Matrix notation

  • 4.5 Shallow vs. deep neural networks

  • 4.6 Summary

        5.Loss functions

  • 5.1 Maximum likelihood

  • 5.2 Recipe for constructing loss functions

  • 5.3 Example 1: univariate regression

  • 5.4 Example 2: binary classification

  • 5.5 Example 3: multiclass classification

  • 5.6 Multiple outputs

  • 5.7 Cross-entropy loss

  • 5.8 Summary

        6.Fitting models

  • 6.1 Gradient descent

  • 6.2 Stochastic gradient descent

  • 6.3 Momentum

  • 6.4 Adam

  • 6.5 Training algorithm hyperparameters

  • 6.6 Summary

        7.Gradients and initialization

  • 7.1 Problem definitions

  • 7.2 Computing derivatives

  • 7.3 Toy example

  • 7.4 Backpropagation algorithm

  • 7.5 Parameter initialization

  • 7.6 Example training code

  • 7.7 Summary

        8.Measuring performance

  • 8.1 Training a simple model

  • 8.2 Sources of error

  • 8.3 Reducing error

  • 8.4 Double descent

  • 8.5 Choosing hyperparameters

  • 8.6 Summary

        9.Regularization

  • 9.1 Explicit regularization

  • 9.2 Implicit regularization

  • 9.3 Heuristics to improve performance

  • 9.4 Summary

        10.Convolutional networks

  • 10.1 Invariance and equivariance

  • 10.2 Convolutional networks for 1D inputs

  • 10.3 Convolutional networks for 2D inputs

  • 10.4 Downsampling and upsampling

  • 10.5 Applications

  • 10.6 Summary

        11.Residual networks

  • 11.1 Sequential processing

  • 11.2 Residual connections and residual blocks

  • 11.3 Exploding gradients in residual networks

  • 11.4 Batch normalization

  • 11.5 Common residual architectures

  • 11.6 Why do nets with residual connections perform so well?

  • 11.7 Summary

        12.Transformers

  • 12.1 Processing text data

  • 12.2 Dot-product self-attention

  • 12.3 Extensions to dot-product self-attention

  • 12.4 Transformers

  • 12.5 Transformers for natural language processing

  • 12.6 Encoder model example: BERT

  • 12.7 Decoder model example: GPT3

  • 12.8 Encoder-decoder model example: machine translation

  • 12.9 Transformers for long sequences

  • 12.10 Transformers for images

  • 12.11 Summary

        13.Graph neural networks

  • 13.1 What is a graph?

  • 13.2 Graph representation

  • 13.3 Graph neural networks, tasks, and loss functions

  • 13.4 Graph convolutional networks

  • 13.5 Example: graph classification

  • 13.6 Inductive vs. transductive models

  • 13.7 Example: node classification

  • 13.8 Layers for graph convolutional networks

  • 13.9 Edge graphs

  • 13.10 Summary

        14.Unsupervised learning

  • 14.1 Taxonomy of unsupervised learning models

  • 14.2 What makes a good generative model?

  • 14.3 Quantifying performance

  • 14.4 Summary

        15.Generative Adversarial Networks

  • 15.1 Discrimination as a signal

  • 15.2 Improving stability

  • 15.3 Progressive growing, minibatch discrimination, and truncation

  • 15.4 Conditional generation

  • 15.5 Image translation

  • 15.6 StyleGAN

  • 15.7 Summary

        16.Normalizing flows

  • 16.1 1D example

  • 16.2 General case

  • 16.3 Invertible network layers

  • 16.4 Multi-scale flows

  • 16.5 Applications

  • 16.6 Summary

        17.Variational autoencoders

  • 17.1 Latent variable models

  • 17.2 Nonlinear latent variable model

  • 17.3 Training

  • 17.4 ELBO properties

  • 17.5 Variational approximation

  • 17.6 The variational autoencoder

  • 17.7 The reparameterization trick

  • 17.8 Applications

  • 17.9 Summary

        18.Diffusion models

  • 18.1 Overview

  • 18.2 Encoder (forward process)

  • 18.3 Decoder model (reverse process)

  • 18.4 Training

  • 18.5 Reparameterization of loss function

  • 18.6 Implementation

  • 18.7 Summary

        19.Reinforcement learning

  • 19.1 Markov decision processes, returns, and policies

  • 19.2 Expected return

  • 19.3 Tabular reinforcement learning

  • 19.4 Fitted Q-learning

  • 19.5 Policy gradient methods

  • 19.6 Actor-critic methods

  • 19.7 Offline reinforcement learning

  • 19.8 Summary

        20.Why does deep learning work?

  • 20.1 The case against deep learning

  • 20.2 Factors that influence fitting performance

  • 20.3 Properties of loss functions

  • 20.4 Factors that determine generalization

  • 20.5 Do we need so many parameters?

  • 20.6 Do networks have to be deep?

  • 20.7 Summary

        21.Deep learning and ethics

  • 21.1 Value alignment

  • 21.2 Intentional misuse

  • 21.3 Other social, ethical, and professional issues

  • 21.4 Case study

  • 21.5 The value-free ideal of science

  • 21.6 Responsible AI research as a collective action problem

  • 21.7 Ways forward

  • 21.8 Summary

  • Appendix A. Notation
  • B. Mathematics
  • C. Probability
  • Bibliography
  • Index
  • 19
    点赞
  • 11
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值