Porting third-party libraries for TensorFlow, PyTorch, Paddle, MXNet, etc. from the CUDA platform

技术瘾君子1573

Published 2024-08-24 00:00:00

Category: Artificial Intelligence & Deep Learning & Machine Learning. Tags: tensorflow pytorch paddle CUDA Rocm DTK

Article link: https://blog.csdn.net/qq_27815483/article/details/141188896

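The mainstream deep learning frameworks today include TensorFlow, PyTorch, Paddle, and MXNet. Because model operators keep evolving, each framework provides a way for users to add their own kernels/ops. Different hardware platforms use different programming languages, so this code has to be ported. This article uses mmcv as an example to describe how a third-party PyTorch library written for the CUDA platform is converted.

The conversion mainly targets the parts of the third-party library that involve GPU computation: the CUDA C code is replaced with HIP code and adapted to the PyTorch extension rules. mmcv is a Python library for computer vision research that supports work such as object detection benchmarks, 3D object detection, semantic segmentation, and pose estimation.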

Conversion tools

System conversion tools

hipify-perl

usage: hipify-perl [OPTIONS] INPUT_FILE

Converts a single CUDA source file.

hipify-clang

USAGE: hipify-clang [options] <source0> [... <sourceN>]

Converts multiple CUDA source files in one batch.

PyTorch(cuda_to_hip_mappings)

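See torch/utils/hipify/cuda_to_hip_mappings.py on the main branch of the pytorch/pytorch repository on GitHub.

Beyond the standard CUDA kernel APIs and related library APIs, cuda_to_hip_mappings.py adds conversion rules for PyTorch framework APIs, so that third-party libraries can be made compatible with different compute platforms more easily and quickly.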
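To make the idea of a mapping table concrete, here is a minimal sketch of table-driven renaming. The dictionary is a tiny illustrative subset chosen for this example, and hipify_text is a hypothetical helper, not PyTorch's implementation; the real table in cuda_to_hip_mappings.py is far larger and carries extra metadata.

```python
import re

# Illustrative subset only (an assumption for this sketch); the real
# table in cuda_to_hip_mappings.py covers many more names.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
    "cudaStream_t": "hipStream_t",
}

# Match whole identifiers only, so names that merely contain a CUDA
# identifier (e.g. my_cudaFree_wrapper) are left untouched.
_PATTERN = re.compile(
    r"(?<![A-Za-z0-9_])("
    + "|".join(re.escape(k) for k in sorted(CUDA_TO_HIP, key=len, reverse=True))
    + r")(?![A-Za-z0-9_])"
)


def hipify_text(source: str) -> str:
    """Replace every known CUDA name in source with its HIP counterpart."""
    return _PATTERN.sub(lambda m: CUDA_TO_HIP[m.group(1)], source)
```

Names that are not in the table pass through unchanged, which is why a manual review against the full mapping file is still needed after an automated pass.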

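Conversion approach

A third-party extension for the CUDA build of PyTorch consists of two parts.

Accordingly, two things need to be done:

1. Port the CUDA C code to HIP code that runs on the ROCm platform

2. Correctly match the PyTorch extension rules and the source conversion conventions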

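The following uses MMCV as an example to outline the conversion process.

Adapting MMCV

Directory structure

First, understand the code layout to determine which code needs to be ported and what the build rules are.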

.
├── CONTRIBUTING.md
├── Dockerfile
├── docs
├── examples
├── git_stats
├── Jenkinsfile
├── LICENSE
├── MANIFEST.in
├── mmcv         # main directory of the mmcv library
│   ├── arraymisc
│   ├── cnn
│   ├── engine
│   ├── fileio
│   ├── image
│   ├── __init__.py
│   ├── model_zoo
│   ├── onnx
│   ├── ops      # defines the added kernels
│   │   ├── bbox.py
│   │   ├── box_iou_rotated.py
│   │   ├── carafe.py
│   │   ├── cc_attention.py
│   │   ├── corner_pool.py
│   │   ├── csrc   
│   │   │   ├── bbox_overlaps_cuda_kernel.cuh
│   │   │   ├── box_iou_rotated_cuda.cuh
│   │   │   ├── box_iou_rotated_utils.hpp
│   │   │   ├── carafe_cuda_kernel.cuh
│   │   │   ├── carafe_naive_cuda_kernel.cuh
│   │   │   ├── cc_attention_cuda_kernel.cuh
│   │   │   ├── common_cuda_helper.hpp
│   │   │   ├── deform_conv_cuda_kernel.cuh
│   │   │   ├── deform_roi_pool_cuda_kernel.cuh
│   │   │   ├── masked_conv2d_cuda_kernel.cuh
│   │   │   ├── modulated_deform_conv_cuda_kernel.cuh
│   │   │   ├── nms_cuda_kernel.cuh
│   │   │   ├── nms_rotated_cuda.cuh
│   │   │   ├── onnxruntime
│   │   │   ├── parrots
│   │   │   ├── parrots_cpp_helper.hpp
│   │   │   ├── parrots_cuda_helper.hpp
│   │   │   ├── parrots_cudawarpfunction.cuh
│   │   │   ├── psamask_cuda_kernel.cuh
│   │   │   ├── pytorch  # CUDA sources to be converted
│   │   │   ├── pytorch_cpp_helper.hpp
│   │   │   ├── pytorch_cuda_helper.hpp
│   │   │   ├── roi_align_cuda_kernel.cuh
│   │   │   ├── roi_align_rotated_cuda_kernel.cuh
│   │   │   ├── roi_pool_cuda_kernel.cuh
│   │   │   ├── sigmoid_focal_loss_cuda_kernel.cuh
│   │   │   ├── softmax_focal_loss_cuda_kernel.cuh
│   │   │   ├── sync_bn_cuda_kernel.cuh
│   │   │   ├── tensorrt
│   │   │   └── tin_shift_cuda_kernel.cuh
│   ├── parallel
│   ├── runner
│   ├── tensorrt
│   ├── utils
│   ├── version.py
│   ├── video
│   └── visualization
├── README.md
├── README_zh-CN.md
├── requirements
├── requirements.txt
├── setup.cfg
├── setup.py      # defines the build and install interface
└── tests

The build section of setup.py

setup(
    name='mmcv' if os.getenv('MMCV_WITH_OPS', '0') == '0' else 'mmcv-full',    # name of the generated egg
    version=get_version(),
    description='OpenMMLab Computer Vision Foundation',
    keywords='computer vision',
    packages=find_packages(),
    include_package_data=True,
    classifiers=[
        'Development Status :: 4 - Beta',
        'License :: OSI Approved :: Apache Software License',
        'Operating System :: OS Independent',
        'Programming Language :: Python :: 3',
        'Programming Language :: Python :: 3.6',
        'Programming Language :: Python :: 3.7',
        'Programming Language :: Python :: 3.8',
        'Topic :: Utilities',
    ],
    url='https://github.com/open-mmlab/mmcv',
    author='MMCV Authors',
    author_email='openmmlab@gmail.com',
    setup_requires=['pytest-runner'],
    tests_require=['pytest'],
    install_requires=install_requires,  # required dependency packages
    ext_modules=get_extensions(),   # list of Extension instances; each Extension also defines parameters (sources, include files, etc.)
    cmdclass=cmd_class,
    zip_safe=False)

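The files to be converted are passed as arguments inside get_extensions(), and the converted sources are added there.

That covers how the converted code gets compiled and invoked. Next, the conversion process itself.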
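The body of get_extensions() is not shown here, so the following is only a sketch of one way the source list could be chosen. It assumes the converted files live in a pytorch_rocm directory next to pytorch; collect_sources and use_rocm are hypothetical names introduced for illustration, not part of mmcv.

```python
from pathlib import Path


def collect_sources(csrc_dir, use_rocm):
    """Return the sorted .cpp/.cu source paths for the extension build.

    Assumption: converted HIP files are in csrc_dir/pytorch_rocm and the
    original CUDA files are in csrc_dir/pytorch.
    """
    backend_dir = Path(csrc_dir) / ("pytorch_rocm" if use_rocm else "pytorch")
    sources = [p for p in backend_dir.iterdir() if p.suffix in {".cpp", ".cu"}]
    return sorted(str(p) for p in sources)
```

In a real setup.py the returned list would typically be passed as the sources argument of an extension object such as torch.utils.cpp_extension.CUDAExtension.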

Conversion example

As shown above, all MMCV files to be converted are in mmcv/ops/csrc/pytorch.

Create a sibling directory named pytorch_rocm.

Convert

hipify-perl pytorch/bbox_overlaps_cuda.cu > pytorch_rocm/bbox_overlaps_cuda.cu

Note: for parts that do not convert cleanly, fix them by hand using cuda_to_hip_mappings.py as a reference.

Build and install
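The single-file command above can be repeated over the whole directory. Here is a minimal sketch, assuming hipify-perl is on PATH and writes the converted source to standard output, as the redirect in the command implies. plan_hipify and run_hipify are hypothetical helper names.

```python
import shutil
import subprocess
from pathlib import Path


def plan_hipify(src_dir, dst_dir):
    """Pair each .cu/.cuh file in src_dir with a same-named path in dst_dir."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    return [
        (p, dst_dir / p.name)
        for p in sorted(src_dir.iterdir())
        if p.suffix in {".cu", ".cuh"}
    ]


def run_hipify(src_dir, dst_dir, tool="hipify-perl"):
    """Run the tool on every planned file, saving its stdout as the output."""
    if shutil.which(tool) is None:
        raise FileNotFoundError(f"{tool} not found on PATH")
    Path(dst_dir).mkdir(parents=True, exist_ok=True)
    for src, dst in plan_hipify(src_dir, dst_dir):
        with open(dst, "w") as out:
            # Equivalent to: hipify-perl SRC > DST
            subprocess.run([tool, str(src)], stdout=out, check=True)
```

Calling run_hipify("pytorch", "pytorch_rocm") from mmcv/ops/csrc would mirror the manual step. Files that are not CUDA sources, such as .cpp bindings, are skipped by this sketch and would need to be copied separately.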

MMCV_WITH_OPS=1 ROCM_HOME=<rocm path> python3 setup.py install
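Before building, it can help to confirm that the installed PyTorch is itself a ROCm build. The sketch below relies on the torch.version.hip and torch.version.cuda attributes; describe_backend is a hypothetical helper that accepts any object exposing those attributes, so it can be tried without PyTorch installed.

```python
def describe_backend(version_info):
    """Classify a torch.version-like object as 'rocm', 'cuda', or 'cpu'."""
    # ROCm builds of PyTorch set version.hip to a version string.
    if getattr(version_info, "hip", None):
        return "rocm"
    # CUDA builds set version.cuda instead.
    if getattr(version_info, "cuda", None):
        return "cuda"
    return "cpu"
```

With PyTorch installed, calling describe_backend(torch.version) after import torch reports which backend the build command above will compile against.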
