http://www.inspur.com/lcjtww/2315499/2315503/2316283/2318425/2318473/2340051/index.html ***Inspur Caffe-MPI
http://hidl.cse.ohio-state.edu/userguide/osucaffe/0.9/#_installing_osu_caffe ****S-Caffe
Performance Modeling and Evaluation of Distributed Deep Learning Frameworks on GPUs ***paper comparing four MPI frameworks
https://www.nextplatform.com/2017/02/07/pushing-mpi-deep-learning-training-stack/ ***
https://devblogs.nvidia.com/fast-multi-gpu-collectives-nccl/ ***NCCL
https://blog.csdn.net/litdaguang/article/details/55259389 ***translation
http://baijiahao.baidu.com/s?id=1581386178946489641&wfr=spider&for=pc ***How to understand NVIDIA's multi-GPU communication framework NCCL?
NCCL2 install:
https://developer.nvidia.com/nccl/nccl-download *download:
http://tech.amikelive.com/node-735/how-to-install-nvidia-collective-communications-library-nccl-2-for-tensorflow-on-ubuntu-16-04/ *install guide:
https://docs.nvidia.com/deeplearning/sdk/nccl-install-guide/index.html ***NCCL2
***Intel multi-cluster Caffe
https://github.com/intel/caffe/wiki/Recommendations-to-achieve-best-performance
https://github.com/intel/caffe/wiki/Multinode-guide
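https://my.oschina.net/u/1459307/blog/1650028 ***nvidia-nccl study notes
https://www.jiqizhixin.com/articles/Detectron ***Detectron close-reading series, part 1: learning-rate tuning and pitfalls
http://www.sohu.com/a/192850201_470008 ***ImageNet training completed in 24 minutes, setting a new world record (Yang You, 尤洋)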
https://aws.amazon.com/cn/blogs/machine-learning/scalable-multi-node-deep-learning-training-using-gpus-in-the-aws-cloud/ ***Scalable multi-node deep learning training using GPUs in the AWS Cloud
http://www.fast.ai/2018/08/10/fastai-diu-imagenet/ ***Now anyone can train Imagenet in 18 minutes (source code available)