References
[1] Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: A large-scale hierarchical image database. In: Proc. of Computer Vision and Pattern Recognition (CVPR). (2009)
[2] Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images.
[3] Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft COCO: Common objects in context. In: Proc. of European Conf. on Computer Vision (ECCV). (2014)
[4] LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11) (1998) 2278–2324
[5] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proc. of Computer Vision and Pattern Recognition (CVPR). (2016)
[6] Zagoruyko, S., Komodakis, N.: Wide residual networks. arXiv preprint arXiv:1605.07146 (2016)
[7] Xie, S., Girshick, R., Dollár, P., Tu, Z., He, K.: Aggregated residual transformations for deep neural networks. arXiv preprint arXiv:1611.05431 (2016)
[8] Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.A.: Inception-v4, Inception-ResNet and the impact of residual connections on learning. In: Proc. of Association for the Advancement of Artificial Intelligence (AAAI). (2017)
[9] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
[10] Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: Proc. of Computer Vision and Pattern Recognition (CVPR). (2015)
[11] Chollet, F.: Xception: Deep learning with depthwise separable convolutions. arXiv preprint arXiv:1610.02357 (2016)
[12] Mnih, V., Heess, N., Graves, A., et al.: Recurrent models of visual attention. In: Proc. of Neural Information Processing Systems (NIPS). (2014)
[13] Ba, J., Mnih, V., Kavukcuoglu, K.: Multiple object recognition with visual attention. (2014)
[14] Bahdanau, D., Cho, K., Bengio, Y.: Neural machine translation by jointly learning to align and translate. (2014)
[15] Xu, K., Ba, J., Kiros, R., Cho, K., Courville, A., Salakhudinov, R., Zemel, R., Bengio, Y.: Show, attend and tell: Neural image caption generation with visual attention. (2015)
[16] Gregor, K., Danihelka, I., Graves, A., Rezende, D.J., Wierstra, D.: DRAW: A recurrent neural network for image generation. (2015)
[17] Jaderberg, M., Simonyan, K., Zisserman, A., et al.: Spatial transformer networks. In: Proc. of Neural Information Processing Systems (NIPS). (2015)
[18] Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: Visual explanations from deep networks via gradient-based localization. In: Proc. of International Conf. on Computer Vision (ICCV). (2017) 618–626
[19] Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Proc. of Neural Information Processing Systems (NIPS). (2012)
[20] Han, D., Kim, J., Kim, J.: Deep pyramidal residual networks. In: Proc. of Computer Vision and Pattern Recognition (CVPR). (2017)
[21] Huang, G., Liu, Z., Weinberger, K.Q., van der Maaten, L.: Densely connected convolutional networks. arXiv preprint arXiv:1608.06993 (2016)
[22] Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: Proc. of Computer Vision and Pattern Recognition (CVPR). (2016)
[23] Itti, L., Koch, C., Niebur, E.: A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) (1998)
[24] Rensink, R.A.: The dynamic representation of scenes. Visual Cognition 7(1-3) (2000)
[25] Corbetta, M., Shulman, G.L.: Control of goal-directed and stimulus-driven attention in the brain. Nature Reviews Neuroscience 3(3) (2002)
[26] Larochelle, H., Hinton, G.E.: Learning to combine foveal glimpses with a third-order Boltzmann machine. In: Proc. of Neural Information Processing Systems (NIPS). (2010)
[27] Wang, F., Jiang, M., Qian, C., Yang, S., Li, C., Zhang, H., Wang, X., Tang, X.: Residual attention network for image classification. arXiv preprint arXiv:1704.06904 (2017)
[28] Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. arXiv preprint arXiv:1709.01507 (2017)
[29] Chen, L., Zhang, H., Xiao, J., Nie, L., Shao, J., Chua, T.S.: SCA-CNN: Spatial and channel-wise attention in convolutional networks for image captioning. In: Proc. of Computer Vision and Pattern Recognition (CVPR). (2017)
[30] Woo, S., Hwang, S., Kweon, I.S.: StairNet: Top-down semantic aggregation for accurate one-shot detection. In: Proc. of Winter Conference on Applications of Computer Vision (WACV). (2018)
[31] Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Proc. of European Conf. on Computer Vision (ECCV). (2014)
[32] Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A.: Learning deep features for discriminative localization. In: Proc. of Computer Vision and Pattern Recognition (CVPR). (2016) 2921–2929
[33] Zagoruyko, S., Komodakis, N.: Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer. In: ICLR. (2017)
[34] Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., Adam, H.: MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017)
[35] PyTorch. http://pytorch.org/ Accessed: 2017-11-08.
[36] He, K., Zhang, X., Ren, S., Sun, J.: Identity mappings in deep residual networks. In: Proc. of European Conf. on Computer Vision (ECCV). (2016)
[37] Huang, G., Sun, Y., Liu, Z., Sedra, D., Weinberger, K.Q.: Deep networks with stochastic depth. In: Proc. of European Conf. on Computer Vision (ECCV). (2016)
[38] Bell, S., Lawrence Zitnick, C., Bala, K., Girshick, R.: Inside-outside net: Detecting objects in context with skip pooling and recurrent neural networks. In: Proc. of Computer Vision and Pattern Recognition (CVPR). (2016)
[39] Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., Berg, A.C.: SSD: Single shot multibox detector. In: Proc. of European Conf. on Computer Vision (ECCV). (2016)
[40] Chen, X., Gupta, A.: An implementation of Faster RCNN with study for region sampling. arXiv preprint arXiv:1702.02138 (2017)
[41] Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: Towards real-time object detection with region proposal networks. In: Proc. of Neural Information Processing Systems (NIPS). (2015)