
include/darknet.h:25:43: fatal error: opencv2/highgui/highgui_c.h: No such file or directory

When building the yolo9000 project with OpenCV enabled, compilation fails with: include/darknet.h:25:43: fatal error: opencv2/highgui/highgui_c.h: No such file or directory. Fix: install the OpenCV development headers with sudo apt-get install libopencv-dev. Reference: https://groups.g...

2018-03-21 10:52:19

ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design

Abstract. Currently, the neural network architecture design is mostly guided by the indirect metric of computation complexity, i.e., FLOPs. However, the direct metric, e.g., speed, also depends on other factors such as memory access cost and platform characteristics. Thus, this work proposes to evaluate the direct metric on the target platform, beyond only considering FLOPs. Based on a series of controlled experiments, this work derives several practical guidelines for efficient network design. Accordingly, a new architecture is presented, called ShuffleNet V2. Comprehensive ablation experiments verify that our model is the state-of-the-art in terms of speed and accuracy tradeoff.
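The gap between FLOPs and actual speed that the abstract describes can be illustrated numerically with the paper's first guideline (equal input/output channel widths minimize memory access cost for a fixed FLOPs budget). The sketch below is illustrative only; the function names are mine, not from the paper.

```python
# Sketch of ShuffleNet V2's guideline G1: for a 1x1 convolution on an h*w
# feature map, FLOPs B = h*w*c1*c2 while the memory access cost
# MAC = h*w*(c1 + c2) + c1*c2 (read input, write output, read weights).
# For a fixed B, MAC is minimized when input channels c1 == output channels c2.

def conv1x1_flops(h, w, c1, c2):
    """FLOPs of a 1x1 convolution on an h*w feature map."""
    return h * w * c1 * c2

def conv1x1_mac(h, w, c1, c2):
    """Memory access cost: input reads + output writes + weight reads."""
    return h * w * (c1 + c2) + c1 * c2

if __name__ == "__main__":
    h = w = 56
    budget = conv1x1_flops(h, w, 128, 128)  # fix a FLOPs budget
    for c1, c2 in [(128, 128), (64, 256), (32, 512)]:
        assert conv1x1_flops(h, w, c1, c2) == budget  # identical FLOPs...
        print(c1, c2, conv1x1_mac(h, w, c1, c2))      # ...but rising MAC
```

All three layer shapes cost the same FLOPs, yet the memory traffic grows as the channel widths become more unbalanced, which is exactly why FLOPs alone mispredict speed.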

2018-12-03

ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices

Abstract We introduce an extremely computation-efficient CNN architecture named ShuffleNet, which is designed specially for mobile devices with very limited computing power (e.g., 10-150 MFLOPs). The new architecture utilizes two new operations, pointwise group convolution and channel shuffle, to greatly reduce computation cost while maintaining accuracy. Experiments on ImageNet classification and MS COCO object detection demonstrate the superior performance of ShuffleNet over other structures, e.g. lower top-1 error (absolute 7.8%) than recent MobileNet [12] on ImageNet classification task, under the computation budget of 40 MFLOPs. On an ARM-based mobile device, ShuffleNet achieves ∼13× actual speedup over AlexNet while maintaining comparable accuracy.
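The channel shuffle operation the abstract mentions is simple enough to sketch on a plain list of channel indices: reshape the channels into (groups, n), transpose, and flatten, so that the next group convolution sees channels from every previous group. A minimal sketch, using lists in place of tensors:

```python
# Channel shuffle: with g groups, view the channel list as a (g, n) grid,
# transpose it to (n, g), and flatten. After a group convolution this lets
# information flow across groups. Pure-Python illustration, not a tensor op.

def channel_shuffle(channels, groups):
    n = len(channels) // groups
    grouped = [channels[i * n:(i + 1) * n] for i in range(groups)]
    # transpose (g, n) -> (n, g), then flatten back into one channel list
    return [grouped[g][i] for i in range(n) for g in range(groups)]

if __name__ == "__main__":
    print(channel_shuffle([0, 1, 2, 3, 4, 5], groups=2))  # [0, 3, 1, 4, 2, 5]
```

Note the result interleaves the two groups [0, 1, 2] and [3, 4, 5]; in the network the same permutation is realized by a reshape-transpose-reshape on the channel dimension.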

2018-12-03

Fine-Grained Head Pose Estimation Without Keypoints

Abstract Estimating the head pose of a person is a crucial problem that has a large amount of applications such as aiding in gaze estimation, modeling attention, fitting 3D models to video and performing face alignment. Traditionally head pose is computed by estimating some keypoints from the target face and solving the 2D to 3D correspondence problem with a mean human head model. We argue that this is a fragile method because it relies entirely on landmark detection performance, the extraneous head model and an ad-hoc fitting step. We present an elegant and robust way to determine pose by training a multi-loss convolutional neural network on 300W-LP, a large synthetically expanded dataset, to predict intrinsic Euler angles (yaw, pitch and roll) directly from image intensities through joint binned pose classification and regression. We present empirical tests on common in-the-wild pose benchmark datasets which show state-of-the-art results. Additionally we test our method on a dataset usually used for pose estimation using depth and start to close the gap with state-of-the-art depth pose methods. We open-source our training and testing code as well as release our pre-trained models.
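The "joint binned pose classification and regression" idea can be sketched concretely: softmax over angle bins, then take the expectation of the bin centers to recover a continuous angle. The 3-degree bins over roughly [-99°, 99°] follow the paper's setup; everything else below (function names, the demo logits) is illustrative.

```python
# Recover a continuous Euler angle from per-bin classification logits:
# probabilities via softmax, then the expectation of the bin centers.
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def expected_angle(logits, bin_centers):
    """Continuous angle as the probability-weighted mean of bin centers."""
    probs = softmax(logits)
    return sum(p * c for p, c in zip(probs, bin_centers))

if __name__ == "__main__":
    centers = list(range(-99, 100, 3))                 # 3-degree bins
    logits = [10.0 if c == 30 else 0.0 for c in centers]
    print(expected_angle(logits, centers))             # close to 30 degrees
```

The classification loss pushes probability mass into the right bin, while a regression loss on this expectation keeps the final angle continuous rather than quantized to 3-degree steps.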

2018-12-03

Focal Loss for Dense Object Detection

Abstract The highest accuracy object detectors to date are based on a two-stage approach popularized by R-CNN, where a classifier is applied to a sparse set of candidate object locations. In contrast, one-stage detectors that are applied over a regular, dense sampling of possible object locations have the potential to be faster and simpler, but have trailed the accuracy of two-stage detectors thus far. In this paper, we investigate why this is the case. We discover that the extreme foreground-background class imbalance encountered during training of dense detectors is the central cause. We propose to address this class imbalance by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples. Our novel Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training. To evaluate the effectiveness of our loss, we design and train a simple dense detector we call RetinaNet. Our results show that when trained with the focal loss, RetinaNet is able to match the speed of previous one-stage detectors while surpassing the accuracy of all existing state-of-the-art two-stage detectors.
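The reshaped cross entropy the abstract describes is the focal loss FL(p_t) = -α_t (1 - p_t)^γ log(p_t), which reduces to the (α-weighted) cross entropy at γ = 0. A minimal binary-classification sketch, with the paper's default γ = 2 and α = 0.25:

```python
# Focal loss for a single binary prediction. The (1 - p_t)**gamma factor
# down-weights well-classified examples so easy negatives do not dominate.
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """p: predicted foreground probability; y: label (1 = foreground)."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

if __name__ == "__main__":
    # An easy, well-classified example (p_t = 0.9) is penalized far less
    # than under plain cross entropy (gamma = 0):
    print(focal_loss(0.9, 1))             # focal loss, gamma = 2
    print(focal_loss(0.9, 1, gamma=0.0))  # alpha-weighted cross entropy
```

With p_t = 0.9 the modulating factor is (1 - 0.9)² = 0.01, so the easy example contributes a hundredth of its cross-entropy loss, while hard examples (small p_t) are almost unaffected.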

2018-12-03

MobileNetV2: Inverted Residuals and Linear Bottlenecks

Abstract In this paper we describe a new mobile architecture, MobileNetV2, that improves the state of the art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes. We also describe efficient ways of applying these mobile models to object detection in a novel framework we call SSDLite. Additionally, we demonstrate how to build mobile semantic segmentation models through a reduced form of DeepLabv3 which we call Mobile DeepLabv3. The MobileNetV2 architecture is based on an inverted residual structure where the shortcut connections are between the thin bottleneck layers. The intermediate expansion layer uses lightweight depthwise convolutions to filter features as a source of non-linearity. Additionally, we find that it is important to remove non-linearities in the narrow layers in order to maintain representational power. We demonstrate that this improves performance and provide an intuition that led to this design. Finally, our approach allows decoupling of the input/output domains from the expressiveness of the transformation, which provides a convenient framework for further analysis. We measure our performance on ImageNet [1] classification, COCO object detection [2], VOC image segmentation [3]. We evaluate the trade-offs between accuracy, and number of operations measured by multiply-adds (MAdd), as well as actual latency, and the number of parameters.
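The inverted residual structure can be made concrete with a back-of-the-envelope parameter count: a 1x1 expansion (factor t), a 3x3 depthwise convolution on the expanded channels, and a linear 1x1 projection back to a thin bottleneck. The sketch below ignores biases and batch norm and is my own illustration, not the full MobileNetV2 definition.

```python
# Parameter count of one inverted residual block versus an ordinary 3x3
# convolution at the same (expanded) width. The expansion factor t = 6
# follows the paper; helper names are illustrative.

def inverted_residual_params(c_in, c_out, t=6, k=3):
    expanded = t * c_in
    expand = c_in * expanded        # 1x1 expansion conv
    depthwise = expanded * k * k    # kxk depthwise conv, one filter per channel
    project = expanded * c_out      # linear 1x1 projection (no non-linearity)
    return expand + depthwise + project

def standard_conv_params(c_in, c_out, k=3):
    return c_in * c_out * k * k     # ordinary kxk convolution

if __name__ == "__main__":
    # Processing a 192-wide representation via the inverted residual is far
    # cheaper than an ordinary 3x3 convolution at that width:
    print(inverted_residual_params(32, 32))   # 14016 parameters
    print(standard_conv_params(192, 192))     # 331776 parameters
```

This is the sense in which the block decouples the thin input/output domains (32 channels) from the expressiveness of the transformation, which happens in the cheap 192-channel expanded space.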

2018-12-03

DSFD: Dual Shot Face Detector

Abstract Recently, Convolutional Neural Network (CNN) has achieved great success in face detection. However, it remains a challenging problem for the current face detection methods owing to high degree of variability in scale, pose, occlusion, expression, appearance and illumination. In this paper, we propose a novel face detection network named Dual Shot Face Detector (DSFD), which inherits the architecture of SSD and introduces a Feature Enhance Module (FEM) for transferring the original feature maps to extend the single shot detector to dual shot detector. Specifically, a Progressive Anchor Loss (PAL) computed using two sets of anchors is adopted to effectively facilitate the features. Additionally, we propose an Improved Anchor Matching (IAM) method by integrating novel data augmentation techniques and anchor design strategy in our DSFD to provide better initialization for the regressor. Extensive experiments on popular benchmarks: WIDER FACE (easy: 0.966, medium: 0.957, hard: 0.904) and FDDB (discontinuous: 0.991, continuous: 0.862) demonstrate the superiority of DSFD over the state-of-the-art face detectors (e.g., PyramidBox and SRN). Code will be made available upon publication.

2018-12-03

Cascade R-CNN paper

In object detection, an intersection over union (IoU) threshold is required to define positives and negatives. An object detector, trained with a low IoU threshold, e.g. 0.5, usually produces noisy detections. However, detection performance tends to degrade with increasing IoU thresholds. Two main factors are responsible for this: 1) overfitting during training, due to exponentially vanishing positive samples, and 2) inference-time mismatch between the IoUs for which the detector is optimal and those of the input hypotheses. A multi-stage object detection architecture, the Cascade R-CNN, is proposed to address these problems. It consists of a sequence of detectors trained with increasing IoU thresholds, to be sequentially more selective against close false positives. The detectors are trained stage by stage, leveraging the observation that the output of a detector is a good distribution for training the next higher quality detector. The resampling of progressively improved hypotheses guarantees that all detectors have a positive set of examples of equivalent size, reducing the overfitting problem. The same cascade procedure is applied at inference, enabling a closer match between the hypotheses and the detector quality of each stage. A simple implementation of the Cascade R-CNN is shown to surpass all single-model object detectors on the challenging COCO dataset. Experiments also show that the Cascade R-CNN is widely applicable across detector architectures, achieving consistent gains independently of the baseline detector strength. The code will be made available at https://github.com/zhaoweicai/cascade-rcnn.
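The core mechanism, relabeling the same proposals under increasing IoU thresholds so each stage trains a more selective detector, can be sketched in a few lines. The thresholds 0.5 → 0.6 → 0.7 follow the paper; the box format and helper names below are my own.

```python
# Assign positive/negative labels to proposals at increasing IoU thresholds,
# as each Cascade R-CNN stage does. Boxes are axis-aligned (x1, y1, x2, y2).

def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def assign_labels(proposals, gt, threshold):
    """True (positive) iff a proposal overlaps some ground truth >= threshold."""
    return [any(iou(p, g) >= threshold for g in gt) for p in proposals]

if __name__ == "__main__":
    gt = [(0, 0, 10, 10)]
    proposals = [(0, 0, 10, 10), (2, 0, 12, 10), (5, 0, 15, 10)]
    for t in (0.5, 0.6, 0.7):   # the paper's per-stage thresholds
        print(t, assign_labels(proposals, gt, t))
```

At 0.5 the slightly shifted box (IoU 2/3) still counts as positive, but at 0.7 it becomes a "close false positive" and is rejected; in the full cascade, each stage's regressed boxes (not the original proposals) feed the next, stricter stage, so the positive set does not vanish.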

2018-12-03

Android development for External Displays.pdf

Table of Contents (headings formatted in bold-italic have changed since the last version):

• Preface
  ◦ The Extract and the Book (iii)
  ◦ Getting Help (iii)
  ◦ Source Code And Its License (iv)
  ◦ Acknowledgments (iv)
• Supporting External Displays
  ◦ Prerequisites (1)
  ◦ A History of external displays (1)
  ◦ What is a Presentation? (2)
  ◦ Playing with External Displays (3)
  ◦ Detecting Displays (9)
  ◦ A Simple Presentation (10)
  ◦ A Simpler Presentation (16)
  ◦ Presentations and Configuration Changes (21)
  ◦ Presentations as Fragments (22)
  ◦ Another Sample Project: Slides (32)
  ◦ Device Support for Presentation (39)
  ◦ Hey, What About Chromecast? (40)
• Where To Now?
  ◦ The Full Book (41)
  ◦ Searches (42)
  ◦ Questions. Sometimes, With Answers. (42)

2018-07-31

Deep Learning, latest Chinese edition (PDF)

The Deep Learning book, latest high-definition PDF as of September 2017 (beta edition).

Chapter 1: Introduction (1)
  1.1 Who Should Read This Book (10)
  1.2 Historical Trends in Deep Learning (11)
    1.2.1 The Many Names and Changing Fortunes of Neural Networks (12)
    1.2.2 Increasing Dataset Sizes (17)
    1.2.3 Increasing Model Sizes (19)
    1.2.4 Increasing Accuracy, Complexity and Real-World Impact (22)

2018-01-15

How to Write makefile

On writing makefiles: How to Write makefile.pdf. English-only edition.

2015-07-07
