Where ONNX Runtime gets its speedups over PyTorch

First, an answer from Stack Overflow:

https://stackoverflow.com/questions/67943173/onnxruntime-vs-pytorch

ONNX Runtime uses static ONNX graph, so it has full view of the graph and can do a lot of optimizations that are impossible/harder to do with PyTorch. In a sense, it's similar to compiled vs interpreted programming language implementations.
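
To make the compiled-vs-interpreted analogy concrete, here is a minimal sketch (the model and shapes are illustrative) of exporting a PyTorch model to a static ONNX graph and running it with ONNX Runtime's whole-graph optimizations enabled:

```python
import torch
import torch.nn as nn
import onnxruntime as ort

# Toy model; any module torch.onnx.export supports works the same way.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10)).eval()
dummy = torch.randn(1, 128)

# "Compile" step: export once to a static ONNX graph.
torch.onnx.export(model, dummy, "model.onnx",
                  input_names=["x"], output_names=["y"])

# ONNX Runtime sees the whole graph up front, so it can apply constant
# folding, node fusion, etc. before the first input ever arrives.
opts = ort.SessionOptions()
opts.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
sess = ort.InferenceSession("model.onnx", opts,
                            providers=["CPUExecutionProvider"])

out = sess.run(["y"], {"x": dummy.numpy()})[0]
print(out.shape)  # (1, 10)
```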

The announcement of accelerated training support in ONNX Runtime:

https://cloudblogs.microsoft.com/opensource/2020/05/19/announcing-support-for-accelerated-training-with-onnx-runtime/

This deep dive actually summarizes most of the techniques ONNX Runtime uses (though for the concrete implementations you still need to read the source code):

https://techcommunity.microsoft.com/t5/azure-ai/onnx-runtime-training-technical-deep-dive/ba-p/1398310
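
For training specifically, the deep dive describes building a static gradient graph ahead of time and optimizing it as a whole. A minimal sketch of what that looks like from user code, assuming the onnxruntime-training package's ORTModule wrapper (the article's era exposed a trainer-style API instead, and the import path varies across versions):

```python
import torch
import torch.nn as nn
# ORTModule ships with the onnxruntime-training package; some versions
# re-export it as torch_ort.ORTModule, so the import path may differ.
from onnxruntime.training.ortmodule import ORTModule

model = ORTModule(
    nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10)))

opt = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(32, 128)
target = torch.randint(0, 10, (32,))

opt.zero_grad()
loss = loss_fn(model(x), target)
loss.backward()  # the gradient graph is built and optimized by ORT
opt.step()
```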

Pipeline parallelism:

https://blog.csdn.net/dQCFKyQDXYm3F8rB0/article/details/102878315

https://pytorch.org/docs/master/pipeline.html
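
The linked PyTorch API splits an nn.Sequential across devices and pipelines micro-batches through the stages. A minimal sketch against that API as documented at the link (it requires multiple CUDA devices, and later PyTorch releases deprecated this module):

```python
import os
import torch
import torch.nn as nn
from torch.distributed import rpc
from torch.distributed.pipeline.sync import Pipe

# Pipe uses the RPC framework internally, even in a single process.
os.environ["MASTER_ADDR"] = "localhost"
os.environ["MASTER_PORT"] = "29500"
rpc.init_rpc("worker", rank=0, world_size=1)

# Two pipeline stages on two GPUs; each mini-batch is sliced into
# `chunks` micro-batches so the stages can overlap.
fc1 = nn.Linear(128, 256).to("cuda:0")
fc2 = nn.Linear(256, 10).to("cuda:1")
model = Pipe(nn.Sequential(fc1, fc2), chunks=8)

x = torch.randn(32, 128, device="cuda:0")
out = model(x).local_value()  # forward returns an RRef; fetch the tensor
print(out.shape)  # torch.Size([32, 10])
```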

DeepSpeed:

https://github.com/microsoft/DeepSpeed

https://www.microsoft.com/en-us/research/project/ai-at-scale/

https://zhuanlan.zhihu.com/p/106783111

https://www.microsoft.com/en-us/research/blog/deepspeed-extreme-scale-model-training-for-everyone/#toc-heading-3

https://www.oschina.net/news/113328/microsoft-opensource-deepspeed

https://www.microsoft.com/en-us/research/blog/zero-2-deepspeed-shattering-barriers-of-deep-learning-speed-scale/
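
At the API level, DeepSpeed wraps a model into an engine that owns the optimizer, mixed precision, and ZeRO partitioning. A minimal sketch, assuming a CUDA device and the usual deepspeed launcher environment (config values are illustrative; older releases take the config as a JSON file path instead of a dict):

```python
import torch
import torch.nn as nn
import deepspeed

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))

ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-3}},
}

# The engine owns the optimizer, loss scaling, and gradient sync.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config)

x = torch.randn(32, 128, device=engine.device).half()
target = torch.randint(0, 10, (32,), device=engine.device)

loss = nn.functional.cross_entropy(engine(x), target)
engine.backward(loss)  # replaces loss.backward()
engine.step()          # replaces optimizer.step()
```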

The ZeRO technique used in DeepSpeed (paper, tutorial, and blog post):

https://arxiv.org/pdf/1910.02054.pdf

https://www.deepspeed.ai/tutorials/zero-offload/

https://www.microsoft.com/en-us/research/blog/zero-deepspeed-new-system-optimizations-enable-training-models-with-over-100-billion-parameters/
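
The short version of the paper: ZeRO stage 1 partitions optimizer states across data-parallel ranks, stage 2 additionally partitions gradients, and stage 3 also partitions the parameters themselves; ZeRO-Offload moves optimizer state and its update step to CPU memory. A sketch of the corresponding config block, with illustrative values and key names following the linked tutorial:

```python
# Illustrative DeepSpeed config: ZeRO stage 2 with optimizer state
# offloaded to CPU, roughly as in the linked ZeRO-Offload tutorial.
ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,                # 1: optimizer states, 2: +gradients,
                                   # 3: +parameters are partitioned
        "offload_optimizer": {     # ZeRO-Offload: keep optimizer state
            "device": "cpu",       # and its update step in CPU memory
            "pin_memory": True,
        },
        "overlap_comm": True,           # overlap reduce with backward
        "contiguous_gradients": True,   # avoid memory fragmentation
    },
}
```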

DeepSpeed configuration reference:

https://www.deepspeed.ai/docs/config-json/#sparse-attention
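
For example, the sparse-attention section of that reference configures a block-sparse attention pattern roughly like this (values are illustrative; key names follow the linked page):

```python
# Illustrative "sparse_attention" block from the linked reference.
ds_config = {
    "sparse_attention": {
        "mode": "fixed",              # sparsity pattern; the docs also list
                                      # "dense", "bigbird", "bslongformer", "variable"
        "block": 16,                  # block size of the block-sparse layout
        "different_layout_per_head": True,
        "num_local_blocks": 4,        # blocks in each local attention window
        "num_global_blocks": 1,
        "attention": "unidirectional",
        "horizontal_global_attention": False,
        "num_different_global_patterns": 4,
    }
}
```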

An ONNX Runtime training benchmark (GPT-2 fine-tuning):

https://cloudblogs.microsoft.com/opensource/2020/08/24/pytorch-gpt-2-fine-tuning-onnx-runtime-speedup-training-time/

Megatron-LM code:

https://github.com/NVIDIA/Megatron-LM
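
Megatron-LM's core trick is intra-layer (tensor) model parallelism: the weight matrices of the MLP and attention blocks are split column- or row-wise across GPUs. A single-process sketch of the column-parallel case (the real code gathers the slices with NCCL collectives across GPUs):

```python
import torch

# Single-process sketch of Megatron-LM's column-parallel linear layer:
# the weight of Y = X @ W is split column-wise across "ranks"; each rank
# computes its slice independently, and concatenating the slices (an
# all-gather in the real multi-GPU code) recovers the full output.
torch.manual_seed(0)
X = torch.randn(4, 8)
W = torch.randn(8, 6)

W0, W1 = W.chunk(2, dim=1)      # each "GPU" holds half of the columns
Y0, Y1 = X @ W0, X @ W1         # no communication needed in forward
Y = torch.cat([Y0, Y1], dim=1)

assert torch.allclose(Y, X @ W, atol=1e-6)
```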
