Stanford CS231n: Convolutional Neural Networks for Visual Recognition

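This course takes a deep dive into applications of deep learning in computer vision, focusing on end-to-end training of convolutional neural network models for tasks such as image classification.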



Course Description

Computer Vision has become ubiquitous in our society, with applications in search, image understanding, apps, mapping, medicine, drones, and self-driving cars. Core to many of these applications are visual recognition tasks such as image classification, localization and detection. Recent developments in neural network (aka “deep learning”) approaches have greatly advanced the performance of these state-of-the-art visual recognition systems. This course is a deep dive into the details of deep learning architectures, with a focus on learning end-to-end models for these tasks, particularly image classification. During the 10-week course, students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. The final assignment will involve training a multi-million parameter convolutional neural network and applying it to the largest image classification dataset (ImageNet). We will focus on teaching how to set up the problem of image recognition, the learning algorithms (e.g. backpropagation), and practical engineering tricks for training and fine-tuning the networks, and we will guide students through hands-on assignments and a final course project. Much of the background and materials of this course will be drawn from the ImageNet Challenge.

Course Instructors

Fei-Fei Li
Andrej Karpathy
Justin Johnson

Teaching Assistants

Serena Yeung
Subhasis Das
Song Han
Albert Haque
Bharath Ramsundar
Hieu Pham
Irwan Bello
Namrata Anand
Lane McIntosh
Catherine Dong
Kyle Griswold

Class Time and Location

Winter quarter (January - March, 2016).
Lecture: Monday, Wednesday 3:00-4:20
Bishop Auditorium in Lathrop Building (map)

Office Hours

Mon 9-11am in Gates 392 with Albert
Mon 1-3pm in Fairchild D202 with Lane
Mon 6-7pm in Gates 260 with Andrej
Tue 10:25-11:25 in Huang (basement) with Song
Tue 10:30-12:30 in Huang (basement) with Kyle
Tue 5-7pm in Gates B24A with Namrata
Wed 10-12pm in Gates 498 with Serena
Wed 12-2pm in Gates 359 with Subhasis
Wed 7-8pm in Gates 259 with Justin
Thu 10:25-11:25am in Huang (basement) with Song
Thu 3:30-5:30pm in Huang B007 with Irwan
Thu 6-8pm in Gates 260 with Catherine
Thu 1-3pm in Clark S361 with Bharath
Fri 2-4pm in Gates B24A with Hieu

Grading Policy

Assignment #1: 15%
Assignment #2: 15%
Assignment #3: 15%
Midterm: 15%
Final Project: 40%

Course Discussions

Stanford students: Piazza
Online discussions for non-Stanford students: Reddit on r/cs231n
Our Twitter account: @cs231n

Assignment Details

See the Assignment Page for more details on how to hand in your assignments.

Course Project Details

See the Project Page for more details on the course project.

Prerequisites

  • Proficiency in Python, high-level familiarity with C/C++
    All class assignments will be in Python (using numpy); we provide a tutorial for those who aren't as familiar with Python. Some of the deep learning libraries we may look at later in the class are written in C++. If you have a lot of programming experience but in a different language (e.g. C/C++/Matlab/Javascript) you will probably be fine.
  • College Calculus, Linear Algebra (e.g. MATH 19 or 41, MATH 51)
    You should be comfortable taking derivatives and understanding matrix vector operations and notation.
  • Basic Probability and Statistics (e.g. CS 109 or other stats course)
    You should know the basics of probability, Gaussian distributions, mean, standard deviation, etc.
  • Equivalent knowledge of CS229 (Machine Learning)
    We will be formulating cost functions, taking derivatives and performing optimization with gradient descent.
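
As a rough illustration of what the last prerequisite means in practice, the sketch below formulates a cost function, takes its derivatives by hand, and minimizes it with gradient descent. It is not taken from the course materials: the function names, the toy data, the learning rate, and the step count are all assumptions chosen for the example, and it uses plain Python rather than numpy to stay self-contained.

```python
# Minimal sketch (illustrative, not from the course): fit y = w*x + b by
# batch gradient descent on a mean-squared-error cost.

def mse_cost(w, b, xs, ys):
    """Cost function: mean of the squared residuals."""
    n = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n

def mse_gradients(w, b, xs, ys):
    """Analytic derivatives of the cost with respect to w and b."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db

def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Repeatedly step both parameters against their gradients."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        dw, db = mse_gradients(w, b, xs, ys)
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b

# Toy data generated from y = 2x + 1 (an assumption for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = gradient_descent(xs, ys)
print(w, b, mse_cost(w, b, xs, ys))
```

With these settings the fitted parameters should land close to w = 2 and b = 1. A learning rate that is too large for the data would make the updates diverge instead, which is one reason the course spends time on choosing it.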

FAQ

Is this the first time this class is offered?
This class was first offered in Winter 2015, and has been slightly tweaked for the current Winter 2016 offering. The class is designed to introduce students to deep learning in the context of Computer Vision. We will place a particular emphasis on Convolutional Neural Networks, which are a class of deep learning models that have recently given dramatic improvements in various visual recognition tasks. You can read more about it in this recent New York Times article.
Can I follow along from the outside?
We'd be happy if you join us! We plan to make the course materials widely available: The assignments, course notes, lecture videos and slides will be available online. We won't be able to give you course credit.
Can I take this course on a credit/no credit basis?
Yes. Credit will be given to those who would have otherwise earned a C- or above.
Can I audit or sit in?
In general we are very open to sitting-in guests if you are a member of the Stanford community (registered student, staff, and/or faculty). Out of courtesy, we would appreciate that you first email us or talk to the instructor after the first class you attend. If the class is too full and we're running out of space, we would ask that you please allow registered students to attend.
Can I work in groups for the Final Project?
Yes, in groups of up to two people.
I have a question about the class. What is the best way to reach the course staff?
Stanford students, please use the internal class forum on Piazza so that other students may benefit from your questions and our answers. If you have a personal matter, email us at the class mailing list cs231n-winter1516-staff@lists.stanford.edu.
Can I combine the Final Project with another course?
Yes, you may. There are a couple of courses concurrently offered with CS231n that are natural choices, such as CS231a (Computer Vision, by Prof. Silvio Savarese) and CS228 (Graphical Models, by Prof. Stefano Ermon). If you are taking some combination of these classes, please speak to the instructors to receive permission to combine the Final Project assignments.


These notes accompany the Stanford CS class CS231n: Convolutional Neural Networks for Visual Recognition.
For questions, concerns, or bug reports, contact Justin Johnson regarding the assignments, or Andrej Karpathy regarding the course notes. You can also submit a pull request directly to our git repo.
We encourage the use of the hypothes.is extension to annotate comments and discuss these notes inline.
Winter 2016 Assignments
Module 0: Preparation
Module 1: Neural Networks
  • Image Classification: Data-driven Approach, k-Nearest Neighbor, train/val/test splits
    L1/L2 distances, hyperparameter search, cross-validation
  • Linear classification: Support Vector Machine, Softmax
    parametric approach, bias trick, hinge loss, cross-entropy loss, L2 regularization, web demo
  • Optimization: Stochastic Gradient Descent
    optimization landscapes, local search, learning rate, analytic/numerical gradient
  • Backpropagation, Intuitions
    chain rule interpretation, real-valued circuits, patterns in gradient flow
  • Neural Networks Part 1: Setting up the Architecture
    model of a biological neuron, activation functions, neural net architecture, representational power
  • Neural Networks Part 2: Setting up the Data and the Loss
    preprocessing, weight initialization, batch normalization, regularization (L2/dropout), loss functions
  • Neural Networks Part 3: Learning and Evaluation
    gradient checks, sanity checks, babysitting the learning process, momentum (+Nesterov), second-order methods, Adagrad/RMSprop, hyperparameter optimization, model ensembles
Module 2: Convolutional Neural Networks
  • Convolutional Neural Networks: Architectures, Convolution / Pooling Layers
    layers, spatial arrangement, layer patterns, layer sizing patterns, AlexNet/ZFNet/VGGNet case studies, computational considerations
  • Understanding and Visualizing Convolutional Neural Networks
    t-SNE embeddings, deconvnets, data gradients, fooling ConvNets, human comparisons



from: http://vision.stanford.edu/teaching/cs231n/index.html
http://cs231n.github.io/
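Deep Learning: Convolutional Neural Networks for Visual Recognition, a 2016 Stanford University open course. Course introduction: Computer vision has become increasingly widespread in society, with broad applications in search and retrieval, image understanding, mobile apps, mapping and navigation, medicine, drones, and self-driving cars. Core to these applications are visual recognition tasks such as image classification, localization, and detection. Recent advances in neural network (aka "deep learning") approaches have greatly improved the performance of these state-of-the-art visual recognition systems. This course covers the details of deep learning architectures in depth, focusing on end-to-end learning models for visual recognition tasks, particularly image classification. During the 10-week course, students will learn to implement, train, and debug their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. The final assignment involves training a multi-million-parameter convolutional neural network and applying it to the largest image classification dataset (ImageNet). We will focus on teaching how to set up the image recognition problem, the learning algorithms (e.g. backpropagation), and practical engineering tricks for training and fine-tuning the networks, and will guide students through hands-on assignments and a final course project. Much of the background and material for this course is drawn from the ImageNet Challenge. Instructor: Fei-Fei Li, Associate Professor in the Computer Science Department at Stanford University. She directs the Stanford Artificial Intelligence Lab and the Stanford Vision Lab; her main research areas are machine learning, computer vision, and cognitive and computational neuroscience. She has given a TED talk on how to teach computers to understand pictures.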