Darknet: Open Source Neural Networks in C - Tiny Darknet

Yongqiang Cheng

Article link: https://blog.csdn.net/chengyq116/article/details/85640009

https://pjreddie.com/darknet/

Darknet is an open source neural network framework written in C and CUDA. It is fast, easy to install, and supports CPU and GPU computation. You can find the source on GitHub or you can read more about what Darknet can do right here:
https://github.com/pjreddie/darknet

1. Tiny Darknet

Image classification made tiny.

I’ve heard a lot of people talking about SqueezeNet.

SqueezeNet is cool but it’s JUST optimizing for parameter count. When most high-quality images are 10 MB or more, why do we care if our models are 5 MB or 50 MB? If you want a small model that’s actually FAST, why not check out the Darknet reference network? It’s only 28 MB but, more importantly, it’s only 800 million floating point operations. The original AlexNet is 2.3 billion. Darknet is 2.9 times faster, it’s small, and it’s 4% more accurate.
[SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size]
[Darknet Reference Model]
[ImageNet Classification with Deep Convolutional Neural Networks]

So what about SqueezeNet? Sure, the weights are only 4.8 MB, but a forward pass is still 2.2 billion operations. AlexNet was a great first pass at classification, but we shouldn’t be stuck back in the days when networks this bad are also this slow!

stick [stɪk]: vt. to stab, poke, put out, paste; vi. to persist, protrude, adhere; n. a rod, a walking stick, a dull person. Past tense: stuck; past participle: stuck.

But anyway, people are super into SqueezeNet so if you really insist on small networks, use this:

1.1 Tiny Darknet

Model               Top-1 (%)   Top-5 (%)   Ops        Size
AlexNet             57.0        80.3        2.27 Bn    238 MB
Darknet Reference   61.1        83.0        0.81 Bn    28 MB
SqueezeNet          57.5        80.3        2.17 Bn    4.8 MB
Tiny Darknet        58.7        81.7        0.98 Bn    4.0 MB

The real winner here is clearly the Darknet reference model but if you insist on wanting a small model, use Tiny Darknet. Or train your own, it should be easy!
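
The comparison can be checked with a little arithmetic. The sketch below is only an illustration: the figures are copied from the table above, and the helper names `ops_ratio` and `top1_gain` are made up for this example.

```python
# Figures copied from the table above:
# (top-1 %, top-5 %, ops in billions, size in MB).
MODELS = {
    "AlexNet": (57.0, 80.3, 2.27, 238.0),
    "Darknet Reference": (61.1, 83.0, 0.81, 28.0),
    "SqueezeNet": (57.5, 80.3, 2.17, 4.8),
    "Tiny Darknet": (58.7, 81.7, 0.98, 4.0),
}


def ops_ratio(baseline, model):
    """How many times fewer operations `model` needs than `baseline`."""
    return MODELS[baseline][2] / MODELS[model][2]


def top1_gain(baseline, model):
    """Top-1 accuracy difference in percentage points."""
    return MODELS[model][0] - MODELS[baseline][0]


for name in ("Darknet Reference", "Tiny Darknet"):
    print(f"{name} vs AlexNet: {ops_ratio('AlexNet', name):.2f}x fewer ops, "
          f"{top1_gain('AlexNet', name):+.1f} top-1 points")

print(f"Tiny Darknet vs SqueezeNet: "
      f"{ops_ratio('SqueezeNet', 'Tiny Darknet'):.2f}x fewer ops, "
      f"{top1_gain('SqueezeNet', 'Tiny Darknet'):+.1f} top-1 points")
```

With the table's figures, the reference model needs roughly 2.8 times fewer operations than AlexNet, which is in line with the 2.9 quoted earlier from rounded numbers. Operation count is only a proxy for speed, so measured run times may differ.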

Here’s how to use it in Darknet (and also how to install Darknet):

git clone https://github.com/pjreddie/darknet
cd darknet
make
wget https://pjreddie.com/media/files/tiny.weights
./darknet classify cfg/tiny.cfg tiny.weights data/dog.jpg

1.1.1 tiny.cfg

[net]
# Train
# batch=128
# subdivisions=1
# Test
batch=1
subdivisions=1
height=224
width=224
channels=3
momentum=0.9
decay=0.0005
max_crop=320
......
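
To see which settings are actually active in the fragment above, here is a minimal sketch of reading the `[net]` section. It is a hand-rolled parser for illustration, not Darknet's own; the function name `parse_net_section` is hypothetical, and only the lines shown above are included (the elided remainder of the file is left out).

```python
def parse_net_section(text):
    """Collect key=value pairs from the [net] section, skipping comments."""
    options = {}
    in_net = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and commented-out settings are ignored
        if line.startswith("["):
            in_net = (line == "[net]")  # track which section we are in
            continue
        if in_net and "=" in line:
            key, value = line.split("=", 1)
            options[key.strip()] = value.strip()
    return options


# Only the part of tiny.cfg quoted above; the rest of the file is elided there.
SNIPPET = """
[net]
# Train
# batch=128
# subdivisions=1
# Test
batch=1
subdivisions=1
height=224
width=224
channels=3
momentum=0.9
decay=0.0005
max_crop=320
"""

net = parse_net_section(SNIPPET)
print(net["batch"], net["width"], net["height"], net["channels"])
```

The commented-out `batch=128` line is the training setting, so as shown the file runs with the test setting `batch=1`.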

1.1.2 Makefile

GPU=1
CUDNN=1
OPENCV=0
OPENMP=1
DEBUG=0
......

strong@foreverstrong:~/darknet_work/darknet_180906/darknet$ make clean
strong@foreverstrong:~/darknet_work/darknet_180906/darknet$ make

1.1.3 classify and classifier

./darknet classify ./cfg/tiny.cfg ./tiny.weights ./data/dog.jpg
./darknet classifier predict ./cfg/imagenet1k.data ./cfg/tiny.cfg ./tiny.weights ./data/dog.jpg

Hopefully you see something like this:

data/dog.jpg: Predicted in 0.160994 seconds.
malamute: 0.167168
Eskimo dog: 0.065828
dogsled: 0.063020
standard schnauzer: 0.051153
Siberian husky: 0.037506
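
The five lines are the classes with the highest predicted probability. As a rough sketch of that last step, assuming the network ends in a softmax over class scores: the scores below are made up, and `softmax` and `top_k` are illustrative helpers rather than Darknet functions.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probabilities, labels, k=5):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(zip(labels, probabilities),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy scores for six classes; a real ImageNet model scores 1000 classes.
labels = ["malamute", "Eskimo dog", "dogsled",
          "standard schnauzer", "Siberian husky", "tabby"]
scores = [2.0, 1.1, 1.0, 0.8, 0.5, -1.0]

probabilities = softmax(scores)
best = top_k(probabilities, labels)
for label, p in best:
    print(f"{label}: {p:.6f}")
```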

strong@foreverstrong:~/darknet_work/darknet_180906/darknet$ wget https://pjreddie.com/media/files/tiny.weights
--2019-01-03 10:30:45--  https://pjreddie.com/media/files/tiny.weights
Resolving pjreddie.com (pjreddie.com)... 128.208.3.39
Connecting to pjreddie.com (pjreddie.com)|128.208.3.39|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4185968 (4.0M) [application/octet-stream]
Saving to: ‘tiny.weights’

tiny.weights        100%[===================>]   3.99M   137KB/s    in 1m 41s  

2019-01-03 10:32:32 (40.5 KB/s) - ‘tiny.weights’ saved [4185968/4185968]

strong@foreverstrong:~/darknet_work/darknet_180906/darknet$ ./darknet classify ./cfg/tiny.cfg ./tiny.weights ./data/dog.jpg
layer     filters    size              input                output
    0 conv     16  3 x 3 / 1   224 x 224 x   3   ->   224 x 224 x  16  0.043 BFLOPs
    1 max          2 x 2 / 2   224 x 224 x  16   ->   112 x 112 x  16
    2 conv     32  3 x 3 / 1   112 x 112 x  16   ->   112 x 112 x  32  0.116 BFLOPs
    3 max          2 x 2 / 2   112 x 112 x  32   ->    56 x  56 x  32
    4 conv     16  1 x 1 / 1    56 x  56 x  32   ->    56 x  56 x  16  0.003 BFLOPs
    5 conv    128  3 x 3 / 1    56 x  56 x  16   ->    56 x  56 x 128  0.116 BFLOPs
    6 conv     16  1 x 1 / 1    56 x  56 x 128   ->    56 x  56 x  16  0.013 BFLOPs
    7 conv    128  3 x 3 / 1    56 x  56 x  16   ->    56 x  56 x 128  0.116 BFLOPs
    8 max          2 x 2 / 2    56 x  56 x 128   ->    28 x  28 x 128
    9 conv     32  1 x 1 / 1    28 x  28 x 128   ->    28 x  28 x  32  0.006 BFLOPs
   10 conv    256  3 x 3 / 1    28 x  28 x  32   ->    28 x  28 x 256  0.116 BFLOPs
   11 conv     32  1 x 1 / 1    28 x  28 x 256   ->    28 x  28 x  32  0.013 BFLOPs
   12 conv    256  3 x 3 / 1    28 x  28 x  32   ->    28 x  28 x 256  0.116 BFLOPs
   13 max          2 x 2 / 2    28 x  28 x 256   ->    14 x  14 x 256
   14 conv     64  1 x 1 / 1    14 x  14 x 256   ->    14 x  14 x  64  0.006 BFLOPs
   15 conv    512  3 x 3 / 1    14 x  14 x  64   ->    14 x  14 x 512  0.116 BFLOPs
   16 conv     64  1 x 1 / 1    14 x  14 x 512   ->    14 x  14 x  64  0.013 BFLOPs
   17 conv    512  3 x 3 / 1    14 x  14 x  64   ->    14 x  14 x 512  0.116 BFLOPs
   18 conv    128  1 x 1 / 1    14 x  14 x 512   ->    14 x  14 x 128  0.026 BFLOPs
   19 conv   1000  1 x 1 / 1    14 x  14 x 128   ->    14 x  14 x1000  0.050 BFLOPs
   20 avg                       14 x  14 x1000   ->  1000
   21 softmax                                        1000
Loading weights from ./tiny.weights...Done!
./data/dog.jpg: Predicted in 0.002268 seconds.
14.50%: malamute
 6.08%: Newfoundland
 5.59%: dogsled
 4.56%: standard schnauzer
 4.05%: Eskimo dog
strong@foreverstrong:~/darknet_work/darknet_180906/darknet$

strong@foreverstrong:~/darknet_work/darknet_180906/darknet$ ./darknet classifier predict ./cfg/imagenet1k.data ./cfg/tiny.cfg ./tiny.weights ./data/dog.jpg
layer     filters    size              input                output
    0 conv     16  3 x 3 / 1   224 x 224 x   3   ->   224 x 224 x  16  0.043 BFLOPs
    1 max          2 x 2 / 2   224 x 224 x  16   ->   112 x 112 x  16
    2 conv     32  3 x 3 / 1   112 x 112 x  16   ->   112 x 112 x  32  0.116 BFLOPs
    3 max          2 x 2 / 2   112 x 112 x  32   ->    56 x  56 x  32
    4 conv     16  1 x 1 / 1    56 x  56 x  32   ->    56 x  56 x  16  0.003 BFLOPs
    5 conv    128  3 x 3 / 1    56 x  56 x  16   ->    56 x  56 x 128  0.116 BFLOPs
    6 conv     16  1 x 1 / 1    56 x  56 x 128   ->    56 x  56 x  16  0.013 BFLOPs
    7 conv    128  3 x 3 / 1    56 x  56 x  16   ->    56 x  56 x 128  0.116 BFLOPs
    8 max          2 x 2 / 2    56 x  56 x 128   ->    28 x  28 x 128
    9 conv     32  1 x 1 / 1    28 x  28 x 128   ->    28 x  28 x  32  0.006 BFLOPs
   10 conv    256  3 x 3 / 1    28 x  28 x  32   ->    28 x  28 x 256  0.116 BFLOPs
   11 conv     32  1 x 1 / 1    28 x  28 x 256   ->    28 x  28 x  32  0.013 BFLOPs
   12 conv    256  3 x 3 / 1    28 x  28 x  32   ->    28 x  28 x 256  0.116 BFLOPs
   13 max          2 x 2 / 2    28 x  28 x 256   ->    14 x  14 x 256
   14 conv     64  1 x 1 / 1    14 x  14 x 256   ->    14 x  14 x  64  0.006 BFLOPs
   15 conv    512  3 x 3 / 1    14 x  14 x  64   ->    14 x  14 x 512  0.116 BFLOPs
   16 conv     64  1 x 1 / 1    14 x  14 x 512   ->    14 x  14 x  64  0.013 BFLOPs
   17 conv    512  3 x 3 / 1    14 x  14 x  64   ->    14 x  14 x 512  0.116 BFLOPs
   18 conv    128  1 x 1 / 1    14 x  14 x 512   ->    14 x  14 x 128  0.026 BFLOPs
   19 conv   1000  1 x 1 / 1    14 x  14 x 128   ->    14 x  14 x1000  0.050 BFLOPs
   20 avg                       14 x  14 x1000   ->  1000
   21 softmax                                        1000
Loading weights from ./tiny.weights...Done!
./data/dog.jpg: Predicted in 0.002295 seconds.
14.50%: malamute
 6.08%: Newfoundland
 5.59%: dogsled
 4.56%: standard schnauzer
 4.05%: Eskimo dog
strong@foreverstrong:~/darknet_work/darknet_180906/darknet$
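
The BFLOPs column in the listings above can be reproduced by hand. The sketch below assumes each multiply-add counts as two operations; that formula is inferred from the printed values rather than quoted from the Darknet source, and `conv_bflops` is a name made up for this example.

```python
# (filters, kernel size, input channels, output height, output width) for each
# conv layer, copied from the layer listing above.
CONV_LAYERS = [
    (16, 3, 3, 224, 224),
    (32, 3, 16, 112, 112),
    (16, 1, 32, 56, 56),
    (128, 3, 16, 56, 56),
    (16, 1, 128, 56, 56),
    (128, 3, 16, 56, 56),
    (32, 1, 128, 28, 28),
    (256, 3, 32, 28, 28),
    (32, 1, 256, 28, 28),
    (256, 3, 32, 28, 28),
    (64, 1, 256, 14, 14),
    (512, 3, 64, 14, 14),
    (64, 1, 512, 14, 14),
    (512, 3, 64, 14, 14),
    (128, 1, 512, 14, 14),
    (1000, 1, 128, 14, 14),
]


def conv_bflops(filters, size, channels, out_h, out_w):
    """Billions of operations, counting each multiply-add as two operations."""
    return 2.0 * filters * size * size * channels * out_h * out_w / 1e9


per_layer = [conv_bflops(*layer) for layer in CONV_LAYERS]
for layer, bflops in zip(CONV_LAYERS, per_layer):
    print(layer, f"{bflops:.3f} BFLOPs")
print(f"total: {sum(per_layer):.3f} BFLOPs")
```

The per-layer values match the log, and the sum comes to about 0.98 billion operations, which agrees with the Ops figure given for Tiny Darknet in the table.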

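malamute ['mæləmjuːt]: n. Arctic sled dog, Eskimo dog
dogsled ['dɔɡslɛd]: n. a sled pulled by dogs
standard schnauzer: the standard-sized Schnauzer dog breed
Siberian husky: a sled dog breed from Siberia
Eskimo ['eskiməu]: n. an Eskimo person, the Eskimo language; adj. of the Eskimo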

Here’s the config file: tiny.cfg
[darknet/cfg/tiny.cfg]

The model is just some 3x3 and 1x1 convolutional layers:

layer     filters    size              input                output
    0 conv     16  3 x 3 / 1   224 x 224 x   3   ->   224 x 224 x  16
    1 max          2 x 2 / 2   224 x 224 x  16   ->   112 x 112 x  16
    2 conv     32  3 x 3 / 1   112 x 112 x  16   ->   112 x 112 x  32
    3 max          2 x 2 / 2   112 x 112 x  32   ->    56 x  56 x  32
    4 conv     16  1 x 1 / 1    56 x  56 x  32   ->    56 x  56 x  16
    5 conv    128  3 x 3 / 1    56 x  56 x  16   ->    56 x  56 x 128
    6 conv     16  1 x 1 / 1    56 x  56 x 128   ->    56 x  56 x  16
    7 conv    128  3 x 3 / 1    56 x  56 x  16   ->    56 x  56 x 128
    8 max          2 x 2 / 2    56 x  56 x 128   ->    28 x  28 x 128
    9 conv     32  1 x 1 / 1    28 x  28 x 128   ->    28 x  28 x  32
   10 conv    256  3 x 3 / 1    28 x  28 x  32   ->    28 x  28 x 256
   11 conv     32  1 x 1 / 1    28 x  28 x 256   ->    28 x  28 x  32
   12 conv    256  3 x 3 / 1    28 x  28 x  32   ->    28 x  28 x 256
   13 max          2 x 2 / 2    28 x  28 x 256   ->    14 x  14 x 256
   14 conv     64  1 x 1 / 1    14 x  14 x 256   ->    14 x  14 x  64
   15 conv    512  3 x 3 / 1    14 x  14 x  64   ->    14 x  14 x 512
   16 conv     64  1 x 1 / 1    14 x  14 x 512   ->    14 x  14 x  64
   17 conv    512  3 x 3 / 1    14 x  14 x  64   ->    14 x  14 x 512
   18 conv    128  1 x 1 / 1    14 x  14 x 512   ->    14 x  14 x 128
   19 conv   1000  1 x 1 / 1    14 x  14 x 128   ->    14 x  14 x1000
   20 avg                       14 x  14 x1000   ->  1000
   21 softmax                                        1000
   22 cost                                           1000
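
The listing also explains why the weights file is so small. The sketch below counts the stored values per conv layer under two assumptions that the text here does not confirm: every conv layer except the last uses batch normalization (adding a scale, running mean and running variance per filter on top of the bias), and every value is a 32-bit float. The name `layer_floats` is hypothetical.

```python
# (filters, kernel size, input channels) for each conv layer, from the listing above.
CONV_LAYERS = [
    (16, 3, 3), (32, 3, 16), (16, 1, 32), (128, 3, 16),
    (16, 1, 128), (128, 3, 16), (32, 1, 128), (256, 3, 32),
    (32, 1, 256), (256, 3, 32), (64, 1, 256), (512, 3, 64),
    (64, 1, 512), (512, 3, 64), (128, 1, 512), (1000, 1, 128),
]


def layer_floats(filters, size, channels, batch_norm):
    """Number of stored values for one conv layer."""
    weights = filters * size * size * channels
    # One bias per filter; batch norm (ASSUMED) adds scale, mean and variance.
    per_filter = 4 if batch_norm else 1
    return weights + per_filter * filters


total_floats = 0
for index, (filters, size, channels) in enumerate(CONV_LAYERS):
    is_last = index == len(CONV_LAYERS) - 1
    # ASSUMPTION: batch norm on every conv layer except the last one.
    total_floats += layer_floats(filters, size, channels, batch_norm=not is_last)

total_bytes = total_floats * 4  # ASSUMPTION: 32-bit floats
print(total_floats, "values,", total_bytes, "bytes,",
      f"{total_bytes / 2**20:.2f} MiB")
```

Under these assumptions the count lands 16 bytes short of the 4185968 bytes that wget reported for tiny.weights. That gap would be consistent with a small file header, though that is a guess.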

Wordbook

you only look once (YOLO)
Visual Object Classes (VOC)
Pattern Analysis, Statistical Modelling and Computational Learning (PASCAL)
mean Average Precision (mAP)
floating point operations per second (FLOPS)
frame rate or frame frequency, frames per second (FPS)
hertz (Hz)
billion (Bn)
operations (Ops)
configuration (cfg)
ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
Microsoft Common Objects in Context (MS COCO)