ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
Abstract. Currently, neural network architecture design is mostly
guided by the indirect metric of computation complexity, i.e., FLOPs.
However, the direct metric, e.g., speed, also depends on other factors
such as memory access cost and platform characteristics. Thus, this work
proposes to evaluate the direct metric on the target platform, beyond
only considering FLOPs. Based on a series of controlled experiments,
this work derives several practical guidelines for efficient network de-
sign. Accordingly, a new architecture is presented, called ShuffleNet V2.
Comprehensive ablation experiments verify that our model is state-
of-the-art in terms of the speed-accuracy tradeoff.
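The "direct metric" the abstract argues for (actual speed on the target platform, rather than FLOP counts) can be approximated with a simple wall-clock harness. This is a generic measurement sketch, not code from the paper; the warmup/iteration counts are arbitrary choices:

```python
import time

def measure_latency(fn, warmup=10, iters=100):
    """Measure the average wall-clock latency of a callable.

    Timing the actual operation on the target platform captures effects
    (memory access cost, platform characteristics) that FLOP counting
    misses -- the point the ShuffleNet V2 abstract makes.
    """
    # Warmup runs let caches, JITs, and frequency scaling settle.
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    # Average seconds per call.
    return (time.perf_counter() - start) / iters
```

Any layer or model forward pass can stand in for `fn`; comparing two candidate blocks this way ranks them by the direct metric rather than by FLOPs.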
ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
Abstract
We introduce an extremely computation-efficient CNN
architecture named ShuffleNet, which is designed specially
for mobile devices with very limited computing power (e.g.,
10-150 MFLOPs). The new architecture utilizes two new
operations, pointwise group convolution and channel shuf-
fle, to greatly reduce computation cost while maintaining
accuracy. Experiments on ImageNet classification and MS
COCO object detection demonstrate the superior perfor-
mance of ShuffleNet over other structures, e.g., lower top-1
error (absolute 7.8%) than the recent MobileNet [12] on the
ImageNet classification task, under a computation budget of
40 MFLOPs. On an ARM-based mobile device, ShuffleNet
achieves ∼13× actual speedup over AlexNet while main-
taining comparable accuracy.
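The channel shuffle operation the abstract names has a simple form: reshape the channel dimension to (groups, channels_per_group), transpose, and flatten, so that channels produced by different group convolutions get interleaved. A minimal sketch over a flat channel list (the real operation acts on a 4-D feature tensor, but the index permutation is the same):

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups (ShuffleNet's channel shuffle).

    Equivalent to: reshape to (groups, n) -> transpose -> flatten,
    which lets information flow between group convolutions.
    """
    n = len(channels) // groups
    assert n * groups == len(channels), "channel count must divide by groups"
    # After the shuffle, position (i, g) holds the element that was at (g, i).
    return [channels[g * n + i] for i in range(n) for g in range(groups)]

# Six channels in two groups: [0,1,2] from group 0, [3,4,5] from group 1.
# After shuffling, every consecutive pair mixes both source groups.
print(channel_shuffle([0, 1, 2, 3, 4, 5], 2))  # -> [0, 3, 1, 4, 2, 5]
```

Because it is a fixed permutation, the operation costs no FLOPs to speak of, which is why it pairs well with cheap pointwise group convolutions.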
Fine-Grained Head Pose Estimation Without Keypoints
Abstract
Estimating the head pose of a person is a crucial prob-
lem that has a large amount of applications such as aiding
in gaze estimation, modeling attention, fitting 3D models
to video and performing face alignment. Traditionally head
pose is computed by estimating some keypoints from the tar-
get face and solving the 2D to 3D correspondence problem
with a mean human head model. We argue that this is a
fragile method because it relies entirely on landmark detec-
tion performance, the extraneous head model and an ad-hoc
fitting step. We present an elegant and robust way to deter-
mine pose by training a multi-loss convolutional neural net-
work on 300W-LP, a large synthetically expanded dataset,
to predict intrinsic Euler angles (yaw, pitch and roll) di-
rectly from image intensities through joint binned pose clas-
sification and regression. We present empirical tests on
common in-the-wild pose benchmark datasets which show
state-of-the-art results. Additionally we test our method on
a dataset usually used for pose estimation using depth and
start to close the gap with state-of-the-art depth pose meth-
ods. We open-source our training and testing code as well
as release our pre-trained models.
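The "joint binned pose classification and regression" idea reduces to a small computation: classify the angle into coarse bins, then recover a continuous angle as the softmax expectation over bin centers. A sketch of that readout step; the 3-degree bins over [-99, 99) follow the released code for this method and should be treated as an assumption here:

```python
import math

def expected_angle(logits, bin_width=3.0, angle_min=-99.0):
    """Continuous Euler angle from per-bin classification scores.

    Softmax over the bins, then the expectation of the bin centers,
    giving a fine-grained angle from a coarse binned classifier.
    """
    # Numerically stable softmax.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Center of bin i is angle_min + bin_width * (i + 0.5).
    centers = [angle_min + bin_width * (i + 0.5) for i in range(len(logits))]
    return sum(p * c for p, c in zip(probs, centers))
```

At training time this pairs a cross-entropy loss on the bins with a regression loss on the expectation, which is the multi-loss setup the abstract refers to.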
MobileNetV2: Inverted Residuals and Linear Bottlenecks
Abstract
In this paper we describe a new mobile architecture,
MobileNetV2, that improves the state of the art perfor-
mance of mobile models on multiple tasks and bench-
marks as well as across a spectrum of different model
sizes. We also describe efficient ways of applying these
mobile models to object detection in a novel framework
we call SSDLite. Additionally, we demonstrate how
to build mobile semantic segmentation models through
a reduced form of DeepLabv3 which we call Mobile
DeepLabv3.
MobileNetV2 is based on an inverted residual structure where
the shortcut connections are between the thin bottle-
neck layers. The intermediate expansion layer uses
lightweight depthwise convolutions to filter features as
a source of non-linearity. Additionally, we find that it is
important to remove non-linearities in the narrow layers
in order to maintain representational power. We demon-
strate that this improves performance and provide an in-
tuition that led to this design.
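The block structure the abstract describes (thin input, wide expansion with a non-linearity, linear projection back, shortcut between the thin layers) can be illustrated at the shape level. The "weights" below are placeholders (replication and averaging), so only the data flow is meaningful, not the learned filtering:

```python
def inverted_residual(x, expand_ratio=6):
    """Structural sketch of an inverted residual block on one channel vector.

    expand (1x1) -> non-linearity in the wide space -> LINEAR projection
    back to the thin dimension -> shortcut between the thin layers.
    Real blocks use learned 1x1 and depthwise convolutions; here the ops
    are identity-like placeholders.
    """
    c = len(x)
    # 1x1 expansion to c * expand_ratio channels (placeholder: replicate).
    expanded = [v for v in x for _ in range(expand_ratio)]
    # Non-linearity (ReLU6) applied only in the wide expansion space.
    activated = [min(max(v, 0.0), 6.0) for v in expanded]
    # Linear 1x1 projection back to c channels -- deliberately NO
    # non-linearity here, which the paper argues preserves
    # representational power in the narrow layers.
    projected = [sum(activated[i * expand_ratio:(i + 1) * expand_ratio]) / expand_ratio
                 for i in range(c)]
    # The shortcut connects the thin bottleneck layers, not the wide ones.
    return [a + b for a, b in zip(x, projected)]
```

The inversion relative to a classic residual block is that the shortcut skips over the *wide* part: memory-heavy activations live only inside the block.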
Finally, our approach allows decoupling of the in-
put/output domains from the expressiveness of the trans-
formation, which provides a convenient framework for
further analysis. We measure our performance on
ImageNet [1] classification, COCO object detection [2],
VOC image segmentation [3]. We evaluate the trade-offs
between accuracy, and number of operations measured
by multiply-adds (MAdd), as well as actual latency, and
the number of parameters.
DSFD: Dual Shot Face Detector
Abstract
Recently, Convolutional Neural Network (CNN) has
achieved great success in face detection. However, it re-
mains a challenging problem for the current face detection
methods owing to high degree of variability in scale, pose,
occlusion, expression, appearance and illumination. In this
paper, we propose a novel face detection network named
Dual Shot Face Detector (DSFD), which inherits the archi-
tecture of SSD and introduces a Feature Enhance Module
(FEM) that transfers the original feature maps to extend the
single-shot detector into a dual-shot detector. Specifically, a
Progressive Anchor Loss (PAL), computed using two sets of
anchors, is adopted to effectively facilitate the features. Ad-
ditionally, we propose an Improved Anchor Matching (IAM)
method that integrates novel data augmentation techniques
and an anchor design strategy into our DSFD to provide better
initialization for the regressor. Extensive experiments on
popular benchmarks: WIDER FACE (easy: 0.966, medium:
0.957, hard: 0.904) and FDDB (discontinuous: 0.991,
continuous: 0.862) demonstrate the superiority of DSFD
over the state-of-the-art face detectors (e.g., PyramidBox
and SRN). Code will be made available upon publication.
Cascade R-CNN: Delving into High Quality Object Detection
In object detection, an intersection over union (IoU)
threshold is required to define positives and negatives. An
object detector, trained with low IoU threshold, e.g. 0.5,
usually produces noisy detections. However, detection per-
formance tends to degrade as the IoU threshold increases.
Two main factors are responsible for this: 1) overfitting
during training, due to exponentially vanishing positive
samples, and 2) an inference-time mismatch between the
IoUs for which the detector is optimal and those of the in-
put hypotheses. A multi-stage object detection architecture,
the Cascade R-CNN, is proposed to address these prob-
lems. It consists of a sequence of detectors trained with
increasing IoU thresholds, to be sequentially more selec-
tive against close false positives. The detectors are trained
stage by stage, leveraging the observation that the out-
put of a detector is a good distribution for training the
next higher quality detector. The resampling of progres-
sively improved hypotheses guarantees that all detectors
have a positive set of examples of equivalent size, reduc-
ing the overfitting problem. The same cascade procedure
is applied at inference, enabling a closer match between
the hypotheses and the detector quality of each stage. A
simple implementation of the Cascade R-CNN is shown to
surpass all single-model object detectors on the challeng-
ing COCO dataset. Experiments also show that the Cas-
cade R-CNN is widely applicable across detector architec-
tures, achieving consistent gains independently of the base-
line detector strength. The code will be made available at
https://github.com/zhaoweicai/cascade-rcnn.
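The core mechanism is concrete: each cascade stage labels proposals as positive or negative under a progressively higher IoU threshold. A sketch of that labeling step; the 0.5/0.6/0.7 schedule matches the setting described for Cascade R-CNN, while the boxes and helper names are illustrative:

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def stage_positives(proposals, gt, thresholds=(0.5, 0.6, 0.7)):
    """Proposals labeled positive at each cascade stage.

    Thresholds increase stage by stage, so later stages are more
    selective against close false positives; in the full method each
    stage also resamples the *refined* boxes from the previous stage,
    which keeps the positive set from vanishing.
    """
    return [[p for p in proposals if iou(p, gt) >= t] for t in thresholds]
```

The point the abstract makes is visible here: with a single fixed set of proposals, raising the threshold shrinks the positive set rapidly, which is why each stage instead consumes the improved hypotheses of the stage before it.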
Deep Learning (latest Chinese edition, PDF)
Deep Learning book, latest high-resolution PDF from September 2017, beta edition
Chapter 1
Introduction
1.1 Who Should Read This Book
1.2 Historical Trends in Deep Learning
1.2.1 The Many Names and Changing Fortunes of Neural Networks
1.2.2 Increasing Dataset Sizes
1.2.3 Increasing Model Sizes
1.2.4 Increasing Accuracy, Complexity, and Real-World Impact
How to Write makefile
On writing makefiles: How to Write makefile.pdf. English-only edition.