Learning Vulkan: machine learning and data processing on the GPU with Vulkan Kompute

Machine learning, together with many other advanced data processing paradigms, is an excellent fit for the parallel-processing architecture that GPU computing offers.

[Image: designed to be cross-platform and cross-vendor]

In this article you’ll learn how to write your own ML algorithm from scratch in GPU-optimized code that can run on virtually any hardware, including your mobile phone. We’ll introduce core GPU and ML concepts and show how you can use the Kompute framework to implement them in only a handful of lines of code.

We will first build a simple algorithm that multiplies two arrays in parallel, which introduces the fundamentals of GPU processing. We will then write a Logistic Regression algorithm from scratch on the GPU. You can find the repo with the full code in the following links:

Motivation

The potential and adoption of GPU computing have been exploding in recent years; you can get a glimpse of how quickly adoption is growing from the charts in the image below. In deep learning there has been a massive increase in the use of GPUs for processing, together with paradigms that distribute massively parallelizable compute tasks across an increasing number of GPU nodes. There is a lot of exciting research proposing new approaches to model parallelism and data parallelism, which allow algorithms and data, respectively, to be subdivided in a broad range of ways to maximize processing efficiency.

Image source: Ben-Nun, Tal, and Torsten Hoefler. “Demystifying parallel and distributed deep learning: An in-depth concurrency analysis.” ACM Computing Surveys (CSUR) 52.4 (2019): 1–43.

In this article we outline the theory and the hands-on tools that will enable both beginners and seasoned GPU compute practitioners to make use of, and contribute to, the current development and discussions across these fascinating high-performance computing areas.

The Vulkan Framework

Before diving right in, it’s worth introducing the core framework that makes it possible to build hyper-optimized, cross-platform and scalable GPU algorithms: the Vulkan Framework.

[Image: playing “Where’s Waldo” with Khronos membership logos]

Vulkan is an open standard led by the Khronos Group, a consortium of a very large number of tech companies that have come together to define and advance open standards for mobile and desktop media (and compute) technologies. The Khronos membership logos in the accompanying image show how broad that range of members is.

You may be wondering why we need yet another GPU framework when there are already many options available for writing parallelizable GPU code. The main reason is that, unlike some of its proprietary counterparts (e.g. NVIDIA’s CUDA or Apple’s Metal), Vulkan is an open, royalty-free standard, and unlike some of the older options (e.g. OpenGL), Vulkan is built with modern GPU architecture in mind, providing very granular access to GPU optimizations. Finally, whilst some alternatives only support GPUs from a specific vendor, Vulkan provides cross-platform and cross-vendor support, which opens the door to opportunities in mobile processing, edge computing, and more.

Vulkan’s C API also provides very low-level access to GPUs, which allows for very specialized optimizations. This is a great asset for GPU developers. The main disadvantage is the verbosity involved: it takes 500–2000+ lines of code just to write the base boilerplate required before you can even start on the application logic. This not only results in expensive developer cycles but also makes the code prone to small errors that can lead to larger problems.

Enter Vulkan Kompute

Vulkan Kompute is a framework built on top of the Vulkan SDK, specifically designed to extend its compute capabilities as a simple-to-use, highly optimized, and mobile-friendly general-purpose GPU computing framework.

Kompute was not built to hide any of the core Vulkan concepts, since the Vulkan API is very well designed. Instead, it augments Vulkan’s computing capabilities with a BYOV (bring your own Vulkan) design, which helps developers by reducing the boilerplate code required and automating some of the more common workflows involved in writing Vulkan applications.

For new developers curious to learn more, it provides a solid base for getting started with GPU computing. More advanced Vulkan developers can integrate Kompute into their existing Vulkan applications and perform very granular optimizations by accessing all of the Vulkan internals when required. The project is fully open source, and we welcome bug reports, documentation extensions, new examples or suggestions; please feel free to open an issue in the repo.

Writing your first Kompute

To build our first simple array-multiplication GPU computing application using Kompute, we will create the following:

  • Two Kompute Tensors to store the input data

  • One Kompute Tensor to store the output data

  • A Kompute Operation to create and copy the tensors to the GPU

  • A Kompute Operation with a Kompute Algorithm that will hold the code to be executed in the GPU (called a “shader”)

  • A Kompute Operation
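
The components listed above can be imitated on the CPU to show what the array-multiplication shader is meant to compute. The sketch below is plain Python and does not use the Kompute API; the names `multiply_kernel` and `dispatch` are hypothetical and stand in for the shader and for the GPU dispatch respectively. On a real GPU the per-index invocations would run in parallel rather than in a loop.

```python
from typing import Callable, List


def multiply_kernel(index: int, a: List[float], b: List[float], out: List[float]) -> None:
    """Hypothetical stand-in for the shader: each invocation handles one index."""
    out[index] = a[index] * b[index]


def dispatch(kernel: Callable[..., None], size: int, *buffers: List[float]) -> None:
    """Hypothetical stand-in for a GPU dispatch.

    A GPU would launch `size` invocations in parallel; here they run
    one after another, which gives the same result for this kernel
    because every invocation writes to a different output element.
    """
    for index in range(size):
        kernel(index, *buffers)


# Two input "tensors" and one output "tensor", as in the list above.
tensor_a = [2.0, 4.0, 6.0]
tensor_b = [0.0, 1.0, 2.0]
tensor_out = [0.0] * len(tensor_a)

dispatch(multiply_kernel, len(tensor_a), tensor_a, tensor_b, tensor_out)
print(tensor_out)  # [0.0, 4.0, 12.0]
```

Because each invocation reads and writes only its own index, the work can be split across as many GPU threads as there are elements, which is what makes this a convenient first example.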
