From: http://dlbench.comp.hkbu.edu.hk/
The source code and experimental data of
Benchmarking State-of-the-Art Deep Learning Software Tools
(Version 7, 17 Feb 2017)
Declaration: The runtime performance of each software tool depends not only on the hardware platform, but also on the third-party libraries and the network configuration files. Our results reflect only the performance of the tested networks, with the associated configuration files and the specified third-party libraries, on our testing machines. This is not necessarily the best performance that each software tool can achieve.
Tested Software Tools
Tested Neural Networks
- Fully Connected Networks
- Convolutional Neural Networks: AlexNet, ResNet
- Recurrent Neural Networks
Tested Hardware
Computational Unit | Cores | Memory | OS | CUDA |
---|---|---|---|---|
Intel CPU i7-3820 | 4 | 64 GB | Ubuntu 14.04 | - |
Intel CPU E5-2630v3 x2 | 16 | 128 GB | CentOS 7.2 | - |
GTX 980 | 2048 | 4 GB | Ubuntu 14.04 | 8.0 |
GTX 1080 | 2560 | 8 GB | Ubuntu 14.04 | 8.0 |
Tesla K80 | 2496 | 12 GB | CentOS 7.2 | 8.0 |
Tested Hardware For Data Parallelization
GPUs | CPU | Memory | PCIe | OS | CUDA |
---|---|---|---|---|---|
K80x2 | E5-2630v4 | 128 GB | PCIe 3.0 | CentOS 7.2 | 8.0 |
Source Code
GitHub repository: https://github.com/hclhkbu/dlbench (commit ID: ceac0ef)
Acknowledgements
- Alexey-Kamenev: https://github.com/Alexey-Kamenev/Benchmarks.
- Soumith: https://github.com/soumith/convnet-benchmarks.
- Glample: https://github.com/glample/rnn-benchmarks.
- The CNTK Team, for providing feedback and configuration files.
Results
Item in cell: batchTime(totalTime)
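To work with the result tables programmatically, each cell can be split into its two numbers. The following is a minimal sketch, assuming a cell is a string such as `0.25(12.5)`. The exact number formatting and the time units are assumptions, since this page only states the `batchTime(totalTime)` layout, and the names `CellTiming` and `parse_cell` are hypothetical helpers, not part of the dlbench code.

```python
import re
from typing import NamedTuple


class CellTiming(NamedTuple):
    """One table cell: time per batch and total time (units not assumed)."""
    batch_time: float
    total_time: float


# Assumed cell layout: "<batchTime>(<totalTime>)", e.g. "0.25(12.5)".
_CELL_RE = re.compile(r"^\s*([0-9.eE+-]+)\s*\(\s*([0-9.eE+-]+)\s*\)\s*$")


def parse_cell(cell: str) -> CellTiming:
    """Split a 'batchTime(totalTime)' string into two floats."""
    match = _CELL_RE.match(cell)
    if match is None:
        raise ValueError(f"not in batchTime(totalTime) form: {cell!r}")
    return CellTiming(float(match.group(1)), float(match.group(2)))
```

For example, `parse_cell("0.25(12.5)")` yields `batch_time == 0.25` and `total_time == 12.5`. Cells that do not match the assumed layout (for instance, an empty cell for a configuration that was not run) raise `ValueError` rather than being silently skipped.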
fcn5 on i7-3820
i7-3820: Desktop CPU with 4 physical cores, Network: fcn5
fcn5 on E5-2630v3
E5-2630v3: Server CPU with 8x2 physical cores, Network: fcn5
fcn5 on GTX980
GTX 980, CUDA: 8.0, CUDNN: v5.1, CUDA_DRIVER: 367.48, Network: fcn5
fcn5 on GTX1080
GTX 1080, CUDA: 8.0, CUDNN: v5.1, CUDA_DRIVER: 367.48, Network: fcn5
alexnet on i7-3820
i7-3820: Desktop CPU with 4 physical cores, Network: alexnet
alexnet on GTX980
GTX 980, CUDA: 8.0, CUDNN: v5.1, CUDA_DRIVER: 367.48, Network: alexnet
alexnet on E5-2630v3
E5-2630v3: Server CPU with 8x2 physical cores, Network: alexnet
alexnet on K80
Tesla K80, CUDA: 8.0, CUDNN: v5.1, CUDA_DRIVER: 367.48, Network: alexnet
fcn5 on K80
Tesla K80, CUDA: 8.0, CUDNN: v5.1, CUDA_DRIVER: 367.48, Network: fcn5
alexnet on GTX1080
GTX 1080, CUDA: 8.0, CUDNN: v5.1, CUDA_DRIVER: 367.48, Network: alexnet
resnet on E5-2630v3
E5-2630v3: Server CPU with 8x2 physical cores, Network: resnet
resnet on i7-3820
i7-3820: Desktop CPU with 4 physical cores, Network: resnet
resnet on GTX1080
GTX 1080, CUDA: 8.0, CUDNN: v5.1, CUDA_DRIVER: 367.48, Network: resnet
resnet on GTX980
GTX 980, CUDA: 8.0, CUDNN: v5.1, CUDA_DRIVER: 367.48, Network: resnet
lstm on GTX1080
GTX 1080, CUDA: 8.0, CUDNN: v5.1, CUDA_DRIVER: 367.48, Network: lstm
lstm on GTX980
GTX 980, CUDA: 8.0, CUDNN: v5.1, CUDA_DRIVER: 367.48, Network: lstm
lstm on i7-3820
i7-3820: Desktop CPU with 4 physical cores, Network: lstm
lstm on E5-2630v3
E5-2630v3: Server CPU with 8x2 physical cores, Network: lstm
lstm on K80
Tesla K80, CUDA: 8.0, CUDNN: v5.1, CUDA_DRIVER: 367.48, Network: lstm
resnet on K80
Tesla K80, CUDA: 8.0, CUDNN: v5.1, CUDA_DRIVER: 367.48, Network: resnet