Deep Learning Papers

1. ImageNet Evolution

The following five papers broke the ice for deep learning: they trace how convolutional networks grew ever deeper and more accurate, with ResNet in particular marking a new structural breakthrough over plain networks.

[Nature15] Deep Learning: the survey co-authored by Yann LeCun, Yoshua Bengio, and Geoffrey Hinton, published in Nature in 2015.

[NeurIPS12] ImageNet Classification with Deep Convolutional Neural Networks (AlexNet): the paper from Alex Krizhevsky and Geoffrey Hinton (University of Toronto), published after winning ILSVRC 2012.

[ICLR15] Very Deep Convolutional Networks for Large-Scale Image Recognition (VGGNet): the paper from Karen Simonyan and Andrew Zisserman (Oxford), winner of the ILSVRC 2014 localization task and runner-up in classification.

[CVPR15] Going Deeper with Convolutions (GoogLeNet): the paper from Google researchers, published after winning ILSVRC 2014.

[CVPR16] Deep Residual Learning for Image Recognition (ResNet): the paper from Kaiming He and colleagues at MSRA, published after winning ILSVRC 2015.
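
The residual idea is simple enough to sketch: a block adds its input back onto its output, so the stacked layers only have to learn a residual correction F(x) rather than the whole mapping. Below is a minimal, hypothetical NumPy sketch of one fully connected residual block (the real paper uses convolutional blocks; the sizes and weights here are illustrative).

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, W1, W2):
    """y = ReLU(F(x) + x), where F is two linear layers.

    A toy, fully connected stand-in for the paper's convolutional block;
    W1 and W2 are square so the identity shortcut needs no projection.
    """
    out = relu(x @ W1)      # first transformation
    out = out @ W2          # second transformation (no activation yet)
    return relu(out + x)    # identity shortcut, then nonlinearity

# Illustrative usage: a 4-dimensional feature vector through one block.
rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4))
W1, W2 = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
print(residual_block(x, W1, W2))
```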

 

2. Speech Recognition Evolution (RNN, Deep RNN)

[IEEESignal12] Deep Neural Networks for Acoustic Modeling in Speech Recognition: Geoffrey Hinton and colleagues' 2012 overview of deep neural network acoustic models for speech recognition (IEEE Signal Processing Magazine).

[ICASSP13] Speech Recognition with Deep Recurrent Neural Networks: Alex Graves and Geoffrey Hinton's paper on end-to-end deep RNNs for speech recognition.

[ICML14] Towards End-to-End Speech Recognition with Recurrent Neural Networks: Alex Graves's follow-up on end-to-end deep RNNs for speech recognition.

[arXiv15] Fast and Accurate Recurrent Neural Network Acoustic Models for Speech Recognition: Haşim Sak and colleagues at Google on RNN acoustic models for speech recognition.

 

3. Models and Methods (Dropout, BatchNorm)

[JMLR14] Dropout: A Simple Way to Prevent Neural Networks from Overfitting: Geoffrey Hinton, Alex Krizhevsky, and colleagues introduce Dropout, a regularization method that randomly drops units during training.
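
The mechanism is easy to state: during training, each unit is zeroed with probability p and the survivors are rescaled by 1/(1-p) (inverted dropout), so nothing special is needed at test time. A minimal NumPy sketch, with an illustrative rate and input:

```python
import numpy as np

def dropout(x, p=0.5, training=True, rng=None):
    """Inverted dropout: zero units with probability p, scale the rest by 1/(1-p)."""
    if not training or p == 0.0:
        return x
    rng = rng or np.random.default_rng()
    mask = rng.random(x.shape) >= p          # keep each unit with probability 1 - p
    return x * mask / (1.0 - p)              # rescale so the expected output equals x

x = np.ones((2, 4))
print(dropout(x, p=0.5, rng=np.random.default_rng(0)))
```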

[ICML15] Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift: Sergey Ioffe and Christian Szegedy at Google normalize layer inputs over each mini-batch, which speeds up training, eases initialization, and acts partly as a regularizer.

[arXiv16] Layer Normalization: Geoffrey Hinton and colleagues' variant of Batch Norm for RNNs and similar models, normalizing over the features of each example instead of over the batch.
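
Both normalizations standardize activations and then apply a learned scale and shift; they differ only in which axis the statistics are taken over (the batch for Batch Norm, the features of each example for Layer Norm). A minimal NumPy sketch of the two forward passes (shapes and the gamma/beta parameters are illustrative, and the running statistics used at inference are omitted):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch axis (axis 0)."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def layer_norm(x, gamma, beta, eps=1e-5):
    """Normalize each example over its feature axis (axis 1)."""
    mean = x.mean(axis=1, keepdims=True)
    var = x.var(axis=1, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

x = np.arange(8, dtype=float).reshape(2, 4)   # batch of 2 examples, 4 features
gamma, beta = np.ones(4), np.zeros(4)
print(batch_norm(x, gamma, beta))
print(layer_norm(x, gamma, beta))
```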

[ICLR16] Net2Net: Accelerating Learning via Knowledge Transfer: Tianqi Chen (University of Washington) and Ian Goodfellow on Net2Net, which accelerates training by transferring knowledge from a previously trained network.

[ICML16] Network Morphism: a collaboration between the University at Buffalo and MSRA on morphing a trained parent network into a child network that inherits its knowledge and can be trained into a stronger network in a short time.

 

4. Optimization (momentum, Adam, model compression / DeePhi)

[ICML13] Momentum: On the Importance of Initialization and Momentum in Deep Learning: Ilya Sutskever and colleagues (University of Toronto / Google) on momentum, a technique for accelerating the iterations of gradient descent.

[ICLR15] Adam: A Method for Stochastic Optimization: Diederik P. Kingma and Jimmy Lei Ba propose Adam, an adaptive-learning-rate method for stochastic gradient descent that combines ideas from AdaGrad and RMSProp.
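
Both methods modify plain SGD's update: momentum accumulates a running velocity of past gradients, while Adam keeps running estimates of the gradient's first and second moments and uses bias-corrected versions of each. A minimal NumPy sketch on a toy quadratic objective (the objective, step counts, and step sizes are illustrative):

```python
import numpy as np

def grad(w):                       # gradient of the toy objective f(w) = 0.5 * ||w||^2
    return w

def sgd_momentum(w, steps=100, lr=0.1, mu=0.9):
    v = np.zeros_like(w)
    for _ in range(steps):
        v = mu * v - lr * grad(w)  # accumulate velocity
        w = w + v
    return w

def adam(w, steps=100, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    m, v = np.zeros_like(w), np.zeros_like(w)
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g          # first-moment estimate
        v = b2 * v + (1 - b2) * g * g      # second-moment estimate
        m_hat = m / (1 - b1 ** t)          # bias correction
        v_hat = v / (1 - b2 ** t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

w0 = np.array([5.0, -3.0])
print(sgd_momentum(w0.copy()), adam(w0.copy()))
```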

[ICML12] Building High-level Features Using Large Scale Unsupervised Learning: Andrew Y. Ng, Jeff Dean, and colleagues on unsupervised learning of high-level features from very large amounts of data.

[25] Han, Song, Huizi Mao, and William J. Dally. "Deep compression: Compressing deep neural network with pruning, trained quantization and huffman coding." CoRR, abs/1510.00149 2 (2015). [pdf] (ICLR best paper, new direction to make NN running fast,DeePhi Tech Startup) ⭐️⭐️⭐️⭐️⭐️

[26] Iandola, Forrest N., et al. "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size." arXiv preprint arXiv:1602.07360 (2016). [pdf] (Also a new direction to optimize NN, DeePhi Tech Startup) ⭐️⭐️⭐️⭐️

5. Unsupervised Learning / Deep Generative Models (GAN, DCGAN, VAE, PixelRNN, PixelCNN)

[27] Le, Quoc V. "Building high-level features using large scale unsupervised learning." 2013 IEEE international conference on acoustics, speech and signal processing. IEEE, 2013. [pdf] (Milestone, Andrew Ng, Google Brain Project, Cat) ⭐️⭐️⭐️⭐️

[28] Kingma, Diederik P., and Max Welling. "Auto-encoding variational bayes." arXiv preprint arXiv:1312.6114 (2013). [pdf](VAE) ⭐️⭐️⭐️⭐️

[NIPS14] Generative Adversarial Nets (GAN): Ian Goodfellow and colleagues' adversarial framework in which a generator and a discriminator are trained against each other.

[arXiv15] Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks (DCGAN).

[31] Gregor, Karol, et al. "DRAW: A recurrent neural network for image generation." arXiv preprint arXiv:1502.04623 (2015). [pdf] (VAE with attention, outstanding work) ⭐️⭐️⭐️⭐️⭐️

[ICML16] PixelRNN: Pixel Recurrent Neural Networks: DeepMind's autoregressive generative model for images.

[NIPS16] PixelCNN: Conditional Image Generation with PixelCNN Decoders: DeepMind (Alex Graves and colleagues) on a conditional autoregressive model for image generation.

 

6. RNN / Sequence-to-Sequence Models (LSTM, Seq2Seq)

[arXiv13] LSTM generation: Generating Sequences With Recurrent Neural Networks: Alex Graves (University of Toronto) uses LSTMs to generate text and handwriting in different styles.
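
For reference, an LSTM keeps a memory cell c and uses input, forget, and output gates to decide what gets written, kept, and read at each step. A minimal NumPy sketch of a single LSTM step (the weight shapes, gate ordering, and random toy sequence are illustrative, and peephole connections used in some variants are omitted):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step. W: (input, 4*hidden), U: (hidden, 4*hidden), b: (4*hidden,).

    Gate order in the stacked weights: input, forget, output, candidate.
    """
    z = x @ W + h @ U + b
    H = h.shape[-1]
    i, f, o, g = z[:H], z[H:2*H], z[2*H:3*H], z[3*H:]
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c_new = f * c + i * g                 # update the memory cell
    h_new = o * np.tanh(c_new)            # expose a gated view of the cell
    return h_new, c_new

rng = np.random.default_rng(0)
d_in, d_h = 3, 5
W = rng.normal(size=(d_in, 4 * d_h))
U = rng.normal(size=(d_h, 4 * d_h))
b = np.zeros(4 * d_h)
h, c = np.zeros(d_h), np.zeros(d_h)
for x in rng.normal(size=(4, d_in)):      # run a short illustrative sequence
    h, c = lstm_step(x, h, c, W, U, b)
print(h)
```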

[EMNLP14] GRU & Seq2Seq: Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation: the first paper from Yoshua Bengio's group (University of Montreal) to apply an RNN encoder-decoder to statistical machine translation; it also introduces the GRU.

[NIPS14] Sequence to Sequence Learning with Neural Networks: Ilya Sutskever and colleagues at Google on end-to-end sequence-to-sequence learning.

[ICLR15] Neural Machine Translation by Jointly Learning to Align and Translate: Dzmitry Bahdanau, KyungHyun Cho, and Yoshua Bengio (University of Montreal) on a machine translation model that learns to align and translate jointly via an attention mechanism.
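
The core of the paper is an attention (alignment) step: a small feed-forward network scores each encoder state against the current decoder state, a softmax turns the scores into weights, and the weighted sum of encoder states becomes the context vector. A minimal NumPy sketch of that additive attention step (all shapes and weight matrices here are illustrative):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention(s_prev, H, W_s, W_h, v):
    """One attention step: score, normalize, and pool the encoder states.

    s_prev : (d_s,)   previous decoder hidden state
    H      : (T, d_h) encoder hidden states, one row per source position
    """
    scores = np.tanh(s_prev @ W_s + H @ W_h) @ v   # e_j = v^T tanh(W_s s + W_h h_j)
    alpha = softmax(scores)                        # alignment weights over source positions
    context = alpha @ H                            # weighted sum of encoder states
    return context, alpha

rng = np.random.default_rng(0)
T, d_h, d_s, d_a = 5, 8, 8, 16
H = rng.normal(size=(T, d_h))
s_prev = rng.normal(size=d_s)
W_s, W_h, v = rng.normal(size=(d_s, d_a)), rng.normal(size=(d_h, d_a)), rng.normal(size=d_a)
context, alpha = additive_attention(s_prev, H, W_s, W_h, v)
print(alpha, context.shape)
```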

[ICML15] A Neural Conversational Model: Oriol Vinyals and Quoc V. Le at Google on a sequence-to-sequence chatbot.

 

7. NLP (Natural Language Processing)

[AISTATS12] Joint Learning of Words and Meaning Representations for Open-Text Semantic Parsing: Yoshua Bengio and co-authors on jointly learning word and meaning representations for open-text semantic parsing.

[NIPS13] Distributed Representations of Words and Phrases and their Compositionality: the word2vec paper from Tomas Mikolov, Ilya Sutskever, Greg Corrado, and Jeffrey Dean at Google; cited more than 12,000 times and a recommended reading for CS224n Lecture 1.

[3] Sutskever, et al. "Sequence to sequence learning with neural networks." NIPS (2014) [pdf] ⭐️⭐️⭐️

[ICLR13] Efficient Estimation of Word Representations in Vector Space: the first word2vec paper from Tomas Mikolov, Greg Corrado, and Jeffrey Dean at Google; cited more than 10,000 times and a recommended reading for CS224n Lecture 1. [github]
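
At the heart of both word2vec papers is the skip-gram objective with negative sampling: a center word's vector is pushed toward the vectors of words that actually appear in its context and away from a few randomly sampled "negative" words. A minimal NumPy sketch of one such update (the vocabulary size, dimensionality, and learning rate are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_step(V_in, V_out, center, context, negatives, lr=0.025):
    """One skip-gram negative-sampling SGD step.

    Loss: -log sigma(u_o . v_c) - sum_k log sigma(-u_k . v_c)
    """
    v_c = V_in[center].copy()
    ids = np.array([context] + list(negatives))
    labels = np.array([1.0] + [0.0] * len(negatives))
    u = V_out[ids]                        # output vectors of context + negatives
    scores = sigmoid(u @ v_c)
    g = scores - labels                   # gradient of the loss w.r.t. the scores
    V_in[center] -= lr * (g @ u)          # update the center word's input vector
    V_out[ids] -= lr * np.outer(g, v_c)   # update the output vectors

rng = np.random.default_rng(0)
vocab, dim = 100, 16
V_in = rng.normal(scale=0.1, size=(vocab, dim))
V_out = np.zeros((vocab, dim))
sgns_step(V_in, V_out, center=3, context=7, negatives=rng.integers(0, vocab, 5))
```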

[ICML16] Ask Me Anything: Dynamic Memory Networks for Natural Language Processing: Richard Socher and colleagues on question answering with dynamic memory networks.

[AAAI16] Character-Aware Neural Language Models: Yoon Kim (Harvard) on character-level neural language models. [github]

[ICLR16] Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks: Tomas Mikolov and colleagues at FAIR on a set of toy question-answering tasks (bAbI) for benchmarking.

[NIPS15] Teaching Machines to Read and Comprehend: Karl Moritz Hermann and colleagues at DeepMind on machine reading comprehension and question answering.

[EACL17] Very Deep Convolutional Networks for Text Classification: Yann LeCun and colleagues on text classification with very deep character-level CNNs.

[EACL17] Bag of Tricks for Efficient Text Classification: Tomas Mikolov and colleagues at FAIR on simple, efficient tricks for text classification (fastText). [github]

 

8. Object Detection (R-CNN, Fast R-CNN, YOLO, SSD, R-FCN, Mask R-CNN)

[NIPS13] Deep Neural Networks for Object Detection: (#37 of the planned 126 introductory deep learning papers) Christian Szegedy and colleagues at Google on an early deep-neural-network approach to object detection.

[CVPR14] R-CNN: Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation: (#38 of 126) Ross Girshick and colleagues at Berkeley on region-based CNN feature hierarchies for object detection and semantic segmentation. [github]

[ECCV14] SPPNet: Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition: (#39 of 126) Kaiming He, Jian Sun, and colleagues on spatial pyramid pooling for visual recognition. [github]

[ICCV15] Fast R-CNN: (#40 of 126) Ross Girshick (Microsoft Research) on a faster, single-network successor to R-CNN for object detection. [github]

[NIPS15] Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks: (#41 of 126) Kaiming He, Ross Girshick, Jian Sun, and colleagues at MSRA improve on Fast R-CNN with a learned region proposal network. [github]

[CVPR16] YOLO (You Only Look Once): Unified, Real-Time Object Detection: (#43 of 126) Joseph Redmon and colleagues at the University of Washington, with Ross Girshick, frame detection as a single regression pass over the image. (This is YOLO v1; the series has since reached v3.) [website]

[ECCV16] SSD: Single Shot MultiBox Detector: (#44 of 126) Christian Szegedy (Google) and researchers from UNC and UMich on a new detector whose single-shot idea is similar in spirit to YOLO. [github]

[NIPS16] R-FCN: Object Detection via Region-based Fully Convolutional Networks: (#45 of 126) Jian Sun and Kaiming He at MSRA on a new region-based, fully convolutional detection method. [github]

[ICCV17] Mask R-CNN: (#42 of 126) Kaiming He, Ross Girshick, and colleagues at FAIR extend Faster R-CNN for object detection and instance segmentation. [github]
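
All of the detectors above are scored with intersection-over-union (IoU), and most of them prune overlapping candidate boxes with greedy non-maximum suppression (NMS). A minimal NumPy sketch of both, assuming boxes in [x1, y1, x2, y2] format and an illustrative 0.5 overlap threshold:

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as [x1, y1, x2, y2]."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy non-maximum suppression: keep the best box, drop heavy overlaps."""
    order = np.argsort(scores)[::-1]
    keep = []
    while len(order) > 0:
        best = order[0]
        keep.append(best)
        order = np.array([i for i in order[1:] if iou(boxes[best], boxes[i]) < thresh])
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))   # -> [0, 2]
```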

 


2.7 Deep Transfer Learning / Lifelong Learning / especially for RL

[53] Bengio, Yoshua. "Deep Learning of Representations for Unsupervised and Transfer Learning." ICML Unsupervised and Transfer Learning 27 (2012): 17-36. [pdf] (A Tutorial) ⭐️⭐️⭐️

[54] Silver, Daniel L., Qiang Yang, and Lianghao Li. "Lifelong Machine Learning Systems: Beyond Learning Algorithms." AAAI Spring Symposium: Lifelong Machine Learning. 2013. [pdf] (A brief discussion about lifelong learning) ⭐️⭐️⭐️

[55] Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. "Distilling the knowledge in a neural network." arXiv preprint arXiv:1503.02531 (2015). [pdf] (Godfather's Work) ⭐️⭐️⭐️⭐️
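
The paper's recipe is to train a small student network to match the teacher's softened class probabilities, obtained by dividing the logits by a temperature T > 1, alongside the usual hard-label cross-entropy. A minimal NumPy sketch of that combined loss (the logits, temperature, and weighting below are illustrative):

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, hard_label, T=4.0, alpha=0.5):
    """Weighted sum of soft-target cross-entropy (at temperature T) and hard-label loss."""
    p_teacher = softmax(teacher_logits, T)         # softened teacher probabilities
    p_student_T = softmax(student_logits, T)       # student at the same temperature
    soft_ce = -np.sum(p_teacher * np.log(p_student_T))
    hard_ce = -np.log(softmax(student_logits)[hard_label])
    return alpha * (T ** 2) * soft_ce + (1 - alpha) * hard_ce  # T^2 rescales soft gradients

teacher = np.array([6.0, 2.0, 1.0])
student = np.array([3.0, 2.5, 0.5])
print(distillation_loss(student, teacher, hard_label=0))
```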

[56] Rusu, Andrei A., et al. "Policy distillation." arXiv preprint arXiv:1511.06295 (2015). [pdf] (RL domain) ⭐️⭐️⭐️

[57] Parisotto, Emilio, Jimmy Lei Ba, and Ruslan Salakhutdinov. "Actor-mimic: Deep multitask and transfer reinforcement learning." arXiv preprint arXiv:1511.06342 (2015). [pdf] (RL domain) ⭐️⭐️⭐️

[58] Rusu, Andrei A., et al. "Progressive neural networks." arXiv preprint arXiv:1606.04671 (2016). [pdf] (Outstanding Work, A novel idea) ⭐️⭐️⭐️⭐️⭐️

 

Updated April 9; to be continued.

 

2.5 Neural Turing Machine

[39] Graves, Alex, Greg Wayne, and Ivo Danihelka. "Neural turing machines." arXiv preprint arXiv:1410.5401 (2014). [pdf] (Basic Prototype of Future Computer) ⭐️⭐️⭐️⭐️⭐️

[40] Zaremba, Wojciech, and Ilya Sutskever. "Reinforcement learning neural Turing machines." arXiv preprint arXiv:1505.00521 362 (2015). [pdf] ⭐️⭐️⭐️

[41] Weston, Jason, Sumit Chopra, and Antoine Bordes. "Memory networks." arXiv preprint arXiv:1410.3916 (2014). [pdf]⭐️⭐️⭐️

[42] Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. "End-to-end memory networks." Advances in neural information processing systems. 2015. [pdf] ⭐️⭐️⭐️⭐️

[43] Vinyals, Oriol, Meire Fortunato, and Navdeep Jaitly. "Pointer networks." Advances in Neural Information Processing Systems. 2015. [pdf] ⭐️⭐️⭐️⭐️

[44] Graves, Alex, et al. "Hybrid computing using a neural network with dynamic external memory." Nature (2016). [pdf](Milestone,combine above papers' ideas) ⭐️⭐️⭐️⭐️⭐️

2.6 Deep Reinforcement Learning

[45] Mnih, Volodymyr, et al. "Playing atari with deep reinforcement learning." arXiv preprint arXiv:1312.5602 (2013). [pdf](First Paper named deep reinforcement learning) ⭐️⭐️⭐️⭐️

[46] Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529-533. [pdf] (Milestone) ⭐️⭐️⭐️⭐️⭐️
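
The core of these DQN papers is a bootstrapped regression target: each transition is regressed toward r + gamma * max_a' Q(s', a') computed with a periodically frozen target network, with the bootstrap term dropped at episode ends. A minimal NumPy sketch of that target computation (the batch, Q-values, and discount here are illustrative stand-ins for network outputs):

```python
import numpy as np

def dqn_targets(rewards, next_q_values, dones, gamma=0.99):
    """y_i = r_i + gamma * max_a' Q_target(s'_i, a'), zeroing the bootstrap at episode ends."""
    return rewards + gamma * next_q_values.max(axis=1) * (1.0 - dones)

# Illustrative batch: 3 transitions, 2 actions, Q-values from a frozen target network.
rewards = np.array([1.0, 0.0, -1.0])
next_q = np.array([[0.5, 0.2], [0.1, 0.4], [0.0, 0.0]])
dones = np.array([0.0, 0.0, 1.0])           # the third transition ends its episode
print(dqn_targets(rewards, next_q, dones))  # -> [1.495, 0.396, -1.0]
```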

[47] Wang, Ziyu, Nando de Freitas, and Marc Lanctot. "Dueling network architectures for deep reinforcement learning." arXiv preprint arXiv:1511.06581 (2015). [pdf] (ICLR best paper,great idea) ⭐️⭐️⭐️⭐️

[48] Mnih, Volodymyr, et al. "Asynchronous methods for deep reinforcement learning." arXiv preprint arXiv:1602.01783 (2016). [pdf] (State-of-the-art method) ⭐️⭐️⭐️⭐️⭐️

[49] Lillicrap, Timothy P., et al. "Continuous control with deep reinforcement learning." arXiv preprint arXiv:1509.02971 (2015). [pdf] (DDPG) ⭐️⭐️⭐️⭐️

[50] Gu, Shixiang, et al. "Continuous Deep Q-Learning with Model-based Acceleration." arXiv preprint arXiv:1603.00748 (2016). [pdf] (NAF) ⭐️⭐️⭐️⭐️

[51] Schulman, John, et al. "Trust region policy optimization." CoRR, abs/1502.05477 (2015). [pdf] (TRPO) ⭐️⭐️⭐️⭐️

[52] Silver, David, et al. "Mastering the game of Go with deep neural networks and tree search." Nature 529.7587 (2016): 484-489. [pdf] (AlphaGo) ⭐️⭐️⭐️⭐️⭐️

 

2.8 One Shot Deep Learning

[59] Lake, Brenden M., Ruslan Salakhutdinov, and Joshua B. Tenenbaum. "Human-level concept learning through probabilistic program induction." Science 350.6266 (2015): 1332-1338. [pdf] (No Deep Learning,but worth reading) ⭐️⭐️⭐️⭐️⭐️

[60] Koch, Gregory, Richard Zemel, and Ruslan Salakhutdinov. "Siamese Neural Networks for One-shot Image Recognition."(2015) [pdf] ⭐️⭐️⭐️

[61] Santoro, Adam, et al. "One-shot Learning with Memory-Augmented Neural Networks." arXiv preprint arXiv:1605.06065 (2016). [pdf] (A basic step to one shot learning) ⭐️⭐️⭐️⭐️

[62] Vinyals, Oriol, et al. "Matching Networks for One Shot Learning." arXiv preprint arXiv:1606.04080 (2016). [pdf] ⭐️⭐️⭐️

[63] Hariharan, Bharath, and Ross Girshick. "Low-shot visual object recognition." arXiv preprint arXiv:1606.02819 (2016). [pdf](A step to large data) ⭐️⭐️⭐️⭐️

3 Applications

3.3 Visual Tracking

[1] Wang, Naiyan, and Dit-Yan Yeung. "Learning a deep compact image representation for visual tracking." Advances in neural information processing systems. 2013. [pdf] (First Paper to do visual tracking using Deep Learning,DLT Tracker) ⭐️⭐️⭐️

[2] Wang, Naiyan, et al. "Transferring rich feature hierarchies for robust visual tracking." arXiv preprint arXiv:1501.04587 (2015). [pdf] (SO-DLT) ⭐️⭐️⭐️⭐️

[3] Wang, Lijun, et al. "Visual tracking with fully convolutional networks." Proceedings of the IEEE International Conference on Computer Vision. 2015. [pdf] (FCNT) ⭐️⭐️⭐️⭐️

[4] Held, David, Sebastian Thrun, and Silvio Savarese. "Learning to Track at 100 FPS with Deep Regression Networks." arXiv preprint arXiv:1604.01802 (2016). [pdf] (GOTURN,Really fast as a deep learning method,but still far behind un-deep-learning methods) ⭐️⭐️⭐️⭐️

[5] Bertinetto, Luca, et al. "Fully-Convolutional Siamese Networks for Object Tracking." arXiv preprint arXiv:1606.09549 (2016). [pdf] (SiameseFC,New state-of-the-art for real-time object tracking) ⭐️⭐️⭐️⭐️

[6] Martin Danelljan, Andreas Robinson, Fahad Khan, Michael Felsberg. "Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking." ECCV (2016) [pdf] (C-COT) ⭐️⭐️⭐️⭐️

[7] Nam, Hyeonseob, Mooyeol Baek, and Bohyung Han. "Modeling and Propagating CNNs in a Tree Structure for Visual Tracking." arXiv preprint arXiv:1608.07242 (2016). [pdf] (VOT2016 Winner,TCNN) ⭐️⭐️⭐️⭐️

3.4 Image Caption

[1] Farhadi, Ali, et al. "Every picture tells a story: Generating sentences from images." In Computer Vision - ECCV 2010. Springer Berlin Heidelberg: 15-29, 2010. [pdf] ⭐️⭐️⭐️

[2] Kulkarni, Girish, et al. "Baby talk: Understanding and generating image descriptions". In Proceedings of the 24th CVPR, 2011. [pdf]⭐️⭐️⭐️⭐️

[3] Vinyals, Oriol, et al. "Show and tell: A neural image caption generator". In arXiv preprint arXiv:1411.4555, 2014. [pdf]⭐️⭐️⭐️

[4] Donahue, Jeff, et al. "Long-term recurrent convolutional networks for visual recognition and description". In arXiv preprint arXiv:1411.4389 ,2014. [pdf]

[5] Karpathy, Andrej, and Li Fei-Fei. "Deep visual-semantic alignments for generating image descriptions". In arXiv preprint arXiv:1412.2306, 2014. [pdf]⭐️⭐️⭐️⭐️⭐️

[6] Karpathy, Andrej, Armand Joulin, and Fei Fei F. Li. "Deep fragment embeddings for bidirectional image sentence mapping". In Advances in neural information processing systems, 2014. [pdf]⭐️⭐️⭐️⭐️

[7] Fang, Hao, et al. "From captions to visual concepts and back". In arXiv preprint arXiv:1411.4952, 2014. [pdf]⭐️⭐️⭐️⭐️⭐️

[8] Chen, Xinlei, and C. Lawrence Zitnick. "Learning a recurrent visual representation for image caption generation". In arXiv preprint arXiv:1411.5654, 2014. [pdf]⭐️⭐️⭐️⭐️

[9] Mao, Junhua, et al. "Deep captioning with multimodal recurrent neural networks (m-rnn)". In arXiv preprint arXiv:1412.6632, 2014. [pdf]⭐️⭐️⭐️

[10] Xu, Kelvin, et al. "Show, attend and tell: Neural image caption generation with visual attention". In arXiv preprint arXiv:1502.03044, 2015. [pdf]⭐️⭐️⭐️⭐️⭐️

3.5 Machine Translation

Some milestone papers are listed in RNN / Seq-to-Seq topic.

[1] Luong, Minh-Thang, et al. "Addressing the rare word problem in neural machine translation." arXiv preprint arXiv:1410.8206 (2014). [pdf] ⭐️⭐️⭐️⭐️

[2] Sennrich, et al. "Neural Machine Translation of Rare Words with Subword Units". In arXiv preprint arXiv:1508.07909, 2015. [pdf]⭐️⭐️⭐️

[3] Luong, Minh-Thang, Hieu Pham, and Christopher D. Manning. "Effective approaches to attention-based neural machine translation." arXiv preprint arXiv:1508.04025 (2015). [pdf] ⭐️⭐️⭐️⭐️

[4] Chung, et al. "A Character-Level Decoder without Explicit Segmentation for Neural Machine Translation". In arXiv preprint arXiv:1603.06147, 2016. [pdf]⭐️⭐️

[5] Lee, et al. "Fully Character-Level Neural Machine Translation without Explicit Segmentation". In arXiv preprint arXiv:1610.03017, 2016. [pdf]⭐️⭐️⭐️⭐️⭐️

[6] Wu, Schuster, Chen, Le, et al. "Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation". In arXiv preprint arXiv:1609.08144v2, 2016. [pdf] (Milestone) ⭐️⭐️⭐️⭐️

3.6 Robotics

[1] Koutník, Jan, et al. "Evolving large-scale neural networks for vision-based reinforcement learning." Proceedings of the 15th annual conference on Genetic and evolutionary computation. ACM, 2013. [pdf] ⭐️⭐️⭐️

[2] Levine, Sergey, et al. "End-to-end training of deep visuomotor policies." Journal of Machine Learning Research 17.39 (2016): 1-40. [pdf] ⭐️⭐️⭐️⭐️⭐️

[3] Pinto, Lerrel, and Abhinav Gupta. "Supersizing self-supervision: Learning to grasp from 50k tries and 700 robot hours." arXiv preprint arXiv:1509.06825 (2015). [pdf] ⭐️⭐️⭐️

[4] Levine, Sergey, et al. "Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection." arXiv preprint arXiv:1603.02199 (2016). [pdf] ⭐️⭐️⭐️⭐️

[5] Zhu, Yuke, et al. "Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning." arXiv preprint arXiv:1609.05143 (2016). [pdf] ⭐️⭐️⭐️⭐️

[6] Yahya, Ali, et al. "Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search." arXiv preprint arXiv:1610.00673 (2016). [pdf] ⭐️⭐️⭐️⭐️

[7] Gu, Shixiang, et al. "Deep Reinforcement Learning for Robotic Manipulation." arXiv preprint arXiv:1610.00633 (2016). [pdf]⭐️⭐️⭐️⭐️

[8] A Rusu, M Vecerik, Thomas Rothörl, N Heess, R Pascanu, R Hadsell."Sim-to-Real Robot Learning from Pixels with Progressive Nets." arXiv preprint arXiv:1610.04286 (2016). [pdf] ⭐️⭐️⭐️⭐️

[9] Mirowski, Piotr, et al. "Learning to navigate in complex environments." arXiv preprint arXiv:1611.03673 (2016). [pdf]⭐️⭐️⭐️⭐️

3.7 Art (Deep Dream, style transfer)

[1] Mordvintsev, Alexander; Olah, Christopher; Tyka, Mike (2015). "Inceptionism: Going Deeper into Neural Networks". Google Research. [html] (Deep Dream) ⭐️⭐️⭐️⭐️

[2] Gatys, Leon A., Alexander S. Ecker, and Matthias Bethge. "A neural algorithm of artistic style." arXiv preprint arXiv:1508.06576 (2015). [pdf] (Outstanding Work, most successful method currently) ⭐️⭐️⭐️⭐️⭐️

[3] Zhu, Jun-Yan, et al. "Generative Visual Manipulation on the Natural Image Manifold." European Conference on Computer Vision. Springer International Publishing, 2016. [pdf] (iGAN) ⭐️⭐️⭐️⭐️

[4] Champandard, Alex J. "Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks." arXiv preprint arXiv:1603.01768 (2016). [pdf] (Neural Doodle) ⭐️⭐️⭐️⭐️

[5] Zhang, Richard, Phillip Isola, and Alexei A. Efros. "Colorful Image Colorization." arXiv preprint arXiv:1603.08511 (2016). [pdf]⭐️⭐️⭐️⭐️

[6] Johnson, Justin, Alexandre Alahi, and Li Fei-Fei. "Perceptual losses for real-time style transfer and super-resolution." arXiv preprint arXiv:1603.08155 (2016). [pdf] ⭐️⭐️⭐️⭐️

[7] Vincent Dumoulin, Jonathon Shlens and Manjunath Kudlur. "A learned representation for artistic style." arXiv preprint arXiv:1610.07629 (2016). [pdf] ⭐️⭐️⭐️⭐️

[8] Gatys, Leon and Ecker, et al."Controlling Perceptual Factors in Neural Style Transfer." arXiv preprint arXiv:1611.07865 (2016). [pdf] (control style transfer over spatial location,colour information and across spatial scale)⭐️⭐️⭐️⭐️

[9] Ulyanov, Dmitry and Lebedev, Vadim, et al. "Texture Networks: Feed-forward Synthesis of Textures and Stylized Images." arXiv preprint arXiv:1603.03417(2016). [pdf] (texture generation and style transfer) ⭐️⭐️⭐️⭐️

3.8 Object Segmentation

[1] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation.” in CVPR, 2015. [pdf]⭐️⭐️⭐️⭐️⭐️

[2] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille. "Semantic image segmentation with deep convolutional nets and fully connected crfs." In ICLR, 2015. [pdf] ⭐️⭐️⭐️⭐️⭐️

[3] Pinheiro, P.O., Collobert, R., Dollar, P. "Learning to segment object candidates." In: NIPS. 2015. [pdf] ⭐️⭐️⭐️⭐️

[4] Dai, J., He, K., Sun, J. "Instance-aware semantic segmentation via multi-task network cascades." in CVPR. 2016 [pdf]⭐️⭐️⭐️

[5] Dai, J., He, K., Sun, J. "Instance-sensitive Fully Convolutional Networks." arXiv preprint arXiv:1603.08678 (2016). [pdf]⭐️⭐️⭐️

 
