Deep Learning in Computer Vision


In recent years, Deep Learning has become a dominant Machine Learning tool for a wide variety of domains. One of its biggest successes has been in Computer Vision, where performance on problems such as object and action recognition has improved dramatically. In this course, we will read up on various Computer Vision problems, study the state-of-the-art techniques involving different neural architectures, and brainstorm about promising new directions.

Please sign up here at the beginning of class.

This class is a graduate seminar course in computer vision. It will cover a diverse set of topics in Computer Vision and various neural network architectures. It will be an interactive course: we will add interesting topics on demand and discuss the latest research buzz. The goal of the class is to learn about different domains of vision; to understand, identify, and analyze the main challenges and what works and what doesn't; and to identify interesting new directions for future research.

Prerequisites: Courses in computer vision and/or machine learning (e.g., CSC320, CSC420, CSC411) are highly recommended (otherwise you will need some additional reading), and basic programming skills are required for projects.


  • Time and Location

    Winter 2016

    Day: Tuesday
    Time: 9am-11am
    Room: ES B149 (Earth Science Building at 5 Bancroft Avenue)

    Instructor

    Sanja Fidler

    Email: fidler@cs dot toronto dot edu
    Homepage: http://www.cs.toronto.edu/~fidler
    Office hours: by appointment (send email)
When emailing me, please put CSC2523 in the subject line.

Forum

This class uses Piazza. On this webpage, we will post announcements and assignments. The students will also be able to post questions and discussions in a forum-style manner, either to their instructors or to their peers.


We will have an invited speaker for this course:


  • Raquel Urtasun
    Assistant Professor, University of Toronto
    Talk title:  Deep Structured Models

as well as several invited lectures / tutorials:

  • Yuri Burda, Postdoctoral Fellow, University of Toronto:    Lecture on Variational Autoencoders
  • Ryan Kiros, PhD student, University of Toronto:    Lecture on Recurrent Neural Networks and Neural Language Models
  • Jimmy Ba, PhD student, University of Toronto:    Lecture on Neural Programming
  • Yukun Zhu, MSc student, University of Toronto:    Lecture on Convolutional Neural Networks
  • Elman Mansimov, Research Assistant, University of Toronto:    Lecture on Image Generation with Neural Networks
  • Emilio Parisotto, MSc student, University of Toronto:    Lecture on Deep Reinforcement Learning
  • Renjie Liao, PhD student, University of Toronto:    Lecture on Highway and Residual Networks

Each student will need to write two paper reviews each week, present once or twice in class (depending on enrollment), participate in class discussions, and complete a project (done individually or in pairs).


The final grade will consist of the following:
  • Participation (attendance, participation in discussions, reviews): 15%
  • Presentation (presentation of papers in class): 25%
  • Project (proposal, final report): 60%


The first class will present a short overview of neural network architectures; however, the details will be covered when reading on particular topics. Readings will touch on a diverse set of topics in Computer Vision. The course will be interactive -- we will add interesting topics on demand and cover the latest research buzz.



Date Topic   Reading / Material   Speaker Slides
Jan 12 Admin & Introduction(s)       Sanja Fidler admin
Convolutional Neural Networks
Jan 19 Convolutional Neural Nets (tutorial)   Resources: Stanford's CS231n class, VGG's Practical CNN Tutorial
Code: CNN Tutorial for TensorFlow, Tutorial for Caffe, CNN Tutorial for Theano
  Yukun Zhu
(invited)
[pdf]
[code]
  Image Segmentation   Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs   [PDF] [code]
L-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, A. L Yuille
  Shenlong Wang [pdf]
[code]
Jan 26 Very Deep Networks   Highway Networks  [PDF] [code]
Rupesh Kumar Srivastava, Klaus Greff, Jurgen Schmidhuber

Deep Residual Learning for Image Recognition  [PDF]
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
  Renjie Liao
(invited)
[pdf]
  Object Detection   Rich feature hierarchies for accurate object detection and semantic segmentation   [PDF] [code]
Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks   [PDF] [code (Matlab)] [code (Python)]
Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun
  Kaustav Kundu [pdf]
Feb 2 Stereo
Siamese Networks
  Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches  [PDF] [code]
Jure Žbontar, Yann LeCun

Learning to Compare Image Patches via Convolutional Neural Networks  [PDF] [code]
Sergey Zagoruyko, Nikos Komodakis
  Wenjie Luo [pdf]
  Depth from Single Image   Designing Deep Networks for Surface Normal Estimation   [PDF]
Xiaolong Wang, David Fouhey, Abhinav Gupta
  Mian Wei [pptx]  [pdf]
Feb 9 Image Generation   Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks   [PDF]
Alec Radford, Luke Metz, Soumith Chintala

Generating Images from Captions with Attention   [PDF]
Elman Mansimov, Emilio Parisotto, Jimmy Lei Ba, Ruslan Salakhutdinov
  Elman Mansimov
(invited)
[pdf]
  Domain Adaptation, Zero-shot Learning   Simultaneous Deep Transfer Across Domains and Tasks   [PDF]
Eric Tzeng, Judy Hoffman, Trevor Darrell

Predicting Deep Zero-Shot Convolutional Neural Networks using Textual Descriptions   [PDF]
Jimmy Ba, Kevin Swersky, Sanja Fidler, Ruslan Salakhutdinov
  Lluis Castrejon [pdf]
Recurrent Neural Networks
Feb 23 RNNs and Neural Language Models   Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models   [PDF] [code]
Ryan Kiros, Ruslan Salakhutdinov, Richard Zemel

Skip-Thought Vectors   [PDF] [code]
Ryan Kiros, Yukun Zhu, Ruslan Salakhutdinov, Richard S. Zemel, Antonio Torralba, Raquel Urtasun, Sanja Fidler
  Jamie Kiros
(invited)
 
Mar 1 Modeling Words   Efficient Estimation of Word Representations in Vector Space  [PDF] [code]
Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean
  Eleni Triantafillou
[pdf]
  Describing Videos   Sequence to Sequence -- Video to Text   [PDF]
Subhashini Venugopalan, Marcus Rohrbach, Jeff Donahue, Raymond Mooney, Trevor Darrell, Kate Saenko
  Erin Grant
[pdf]
  Image-based QA   Ask Your Neurons: A Neural-based Approach to Answering Questions about Images   [PDF]
Mateusz Malinowski, Marcus Rohrbach, Mario Fritz
  Yunpeng Li
[pdf]
Mar 8 Variational Autoencoders   Auto-Encoding Variational Bayes   [PDF]
Diederik P Kingma, Max Welling

Tutorial: Bayesian Reasoning and Deep Learning   [PDF]
Shakir Mohamed
  Yura Burda
(invited)
[pdf]
  Text-based QA   End-To-End Memory Networks   [PDF]
Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston, Rob Fergus
  Marina Samuel
[pdf]
  Neural Reasoning   Recursive Neural Networks Can Learn Logical Semantics   [PDF]
Samuel R. Bowman, Christopher Potts, Christopher D. Manning
  Rodrigo Toro Icarte
 
Mar 15 Neural Programming     Jimmy Ba
(invited)
 
  Conversation Models   A Neural Conversational Model   [PDF]
Oriol Vinyals, Quoc Le
  Caner Berkay Antmen
 
  Sentiment Analysis   Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank   [PDF]
Richard Socher, Alex Perelygin, Jean Y. Wu, Jason Chuang, Christopher D. Manning, Andrew Y. Ng and Christopher Potts
  Zhicong Lu
 
  Video Representations   Unsupervised Learning of Video Representations using LSTMs  [PDF]
Nitish Srivastava, Elman Mansimov, Ruslan Salakhutdinov
  Kamyar Ghasemipour
 
  Visual Attention   Recurrent Models of Visual Attention   [PDF]
Volodymyr Mnih, Nicolas Heess, Alex Graves, Koray Kavukcuoglu
  Matthew Shepherd
 
  Direction Following (Robotics)   Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences   [PDF]
Hongyuan Mei, Mohit Bansal, Matthew R. Walter
  Alan Yusheng Wu
 


Tutorials, related courses:
  •   Introduction to Neural Networks, CSC321 course at University of Toronto
  •   Course on Convolutional Neural Networks, CS231n course at Stanford University
  •   Course on Probabilistic Graphical Models, CSC412 course at University of Toronto, advanced machine learning course

Software:
  •   Caffe: Deep learning for image classification
  •   TensorFlow: Open Source Software Library for Machine Intelligence (good software for deep learning)
  •   Theano: Deep learning library
  •   mxnet: Deep Learning library
  •   Torch: Scientific computing framework with wide support for machine learning algorithms
  •   LIBSVM: A Library for Support Vector Machines (Matlab, Python)
  •   scikit: Machine learning in Python

Popular datasets:

Online demos:

Main conferences:
  •   NIPS (Neural Information Processing Systems)
  •   ICML (International Conference on Machine Learning)
  •   ICLR (International Conference on Learning Representations)
  •   AISTATS (International Conference on Artificial Intelligence and Statistics)
  •   CVPR (IEEE Conference on Computer Vision and Pattern Recognition)
  •   ICCV (International Conference on Computer Vision)
  •   ECCV (European Conference on Computer Vision)
  •   ACL (Association for Computational Linguistics)
  •   EMNLP (Conference on Empirical Methods in Natural Language Processing)

