WebGPU-6

Latest recommended article published on 2024-04-26 09:45:11

caxieyou

Category: Rendering. Tags: WebGL, WebGL2, WebGPU

Article link: https://blog.csdn.net/caxieyou/article/details/94987754

Next up is the third example, textured_cube.
[Image: the textured_cube sample]

Let's start with the shaders:

        const vertexShaderGLSL = `#version 450
			layout(set = 0, binding = 0) uniform Uniforms {
			    mat4 modelViewProjectionMatrix;
			} uniforms;
			
			layout(location = 0) in vec4 position;
			layout(location = 1) in vec2 uv;
			
			layout(location = 0) out vec2 fragUV;
			
			void main() {
			    gl_Position = uniforms.modelViewProjectionMatrix * position;
			    fragUV = uv;
			}
`

        const fragmentShaderGLSL = `#version 450
			layout(set = 0, binding = 1) uniform sampler mySampler;
			layout(set = 0, binding = 2) uniform texture2D myTexture;
			
			layout(location = 0) in vec2 fragUV;
			layout(location = 0) out vec4 outColor;
			
			void main() {
			    outColor =  texture(sampler2D(myTexture, mySampler), fragUV);
			}
`

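Compared with the previous chapter, the shaders change very little and remain easy to read. The differences are these. In the vertex shader we no longer need color; it is replaced by the uv coordinate, which says where to sample the texture, and the interpolated output is now fragUV instead of color. In the fragment shader we add a sampler and a texture2D binding and output the sampled texel directly. A side-by-side comparison makes this more obvious:
[Image: comparison of the previous chapter's shaders with this chapter's]
Not much of a difference, it seems. -_-!!!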

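Next comes the vertex data, which is also much the same as before. I will only compare the two here; see the source code for the details rather than repeating them.
[Image: comparison of the two vertex arrays]
The comparison shows that each vertex in the new sample carries a two-component uv instead of an rgba color, and that is all.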
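To make the change in vertex data concrete, here is a minimal sketch of the byte layout for one vertex made of a vec4 position followed by a vec2 uv. It assumes the attributes are interleaved in that order in a single Float32Array; the two vertices and the helper `uvOf` are made up for illustration and are not taken from the sample's cube data.

```javascript
// Illustrative layout only: 4 position floats followed by 2 uv floats per vertex.
const FLOAT_BYTES = Float32Array.BYTES_PER_ELEMENT; // 4
const POSITION_FLOATS = 4; // vec4 position (location = 0)
const UV_FLOATS = 2;       // vec2 uv       (location = 1)

const vertexStride = (POSITION_FLOATS + UV_FLOATS) * FLOAT_BYTES; // 24 bytes
const positionOffset = 0;
const uvOffset = POSITION_FLOATS * FLOAT_BYTES; // 16 bytes

// Two made-up vertices, not the sample's cube data.
const vertices = new Float32Array([
  // x,  y,  z,  w,   u, v
   1, -1,  1,  1,   1, 1,
  -1, -1,  1,  1,   0, 1,
]);

// Read the uv of vertex `index` back out of the interleaved array.
function uvOf(data, index) {
  const base = (index * vertexStride + uvOffset) / FLOAT_BYTES;
  return [data[base], data[base + 1]];
}

console.log(vertexStride, uvOffset, uvOf(vertices, 1)); // 24 16 [ 0, 1 ]
```

With an rgba color in place of the uv, the same arithmetic would give a stride of 32 bytes, which is the only number that changes between the two samples under this assumption.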

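The init function is also almost identical to the previous one. -_-!!!
The difference lies in the setup for the newly added texture data.
[Image: the bind group layout code]
One thing to add here is the visibility of each binding in the layout. In the previous article visibility was simply written as 1; the sample's author was probably being lazy, or did not want to complicate that example, so it was simplified. Let's go back to what the spec says: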

interface GPUShaderStageBit {
    const u32 NONE = 0;
    const u32 VERTEX = 1;
    const u32 FRAGMENT = 2;
    const u32 COMPUTE = 4;
};
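Because these constants are bit flags, a binding that several stages need can OR them together. A minimal sketch, with the constants re-declared locally so it runs outside a browser and the three bindings of this sample written as plain objects; the `type` strings and the helper `visibleIn` are assumptions for illustration, not code copied from the sample.

```javascript
// Local copy of the constants from the IDL above, so the sketch runs anywhere.
const ShaderStage = { NONE: 0, VERTEX: 1, FRAGMENT: 2, COMPUTE: 4 };

// Plain-object description of the three bindings used by the shaders.
// The `type` strings are assumptions, not copied from the sample.
const layoutEntries = [
  { binding: 0, visibility: ShaderStage.VERTEX, type: "uniform-buffer" },
  { binding: 1, visibility: ShaderStage.FRAGMENT, type: "sampler" },
  { binding: 2, visibility: ShaderStage.FRAGMENT, type: "sampled-texture" },
];

// True when `visibility` includes the given stage bit.
function visibleIn(visibility, stage) {
  return (visibility & stage) !== 0;
}

// A resource needed by both stages would OR the two bits together.
const both = ShaderStage.VERTEX | ShaderStage.FRAGMENT;
console.log(both, visibleIn(both, ShaderStage.COMPUTE)); // 3 false
```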

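Every bound resource needs a visibility setting. In the previous example the camera vp matrix was used in the vertex shader to transform the cube in 3D, so its visibility was 1, i.e. VERTEX in the spec's GPUShaderStageBit. In the textured cube that matrix still goes to the vertex shader, while the fragment shader needs two additional resources, both with FRAGMENT visibility. The first is the texture uploaded to the GPU, i.e. the sampled-texture image data. The second is the sampler itself, which describes how the texture is sampled. A quick look at how it is configured: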

			const sampler = device.createSampler({
			    magFilter: "linear",
			    minFilter: "linear",
			});

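This looks very familiar. In WebGL the sampling parameters had to be set on the texture in advance, whereas in WebGPU the sampler is a separate object that is combined with the texture in the shader (sampler2D(myTexture, mySampler)), so the sampling method can be chosen flexibly. A pleasant surprise~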

The pipeline is exactly the same as in the previous chapter, so I won't paste the code again.

Now the uniformBindGroup:
[Image: the uniformBindGroup code]
The sampler and the texture are added, and each is assigned its own binding index.
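The binding indices given to the bind group have to match the `binding = N` declarations in the shaders. As a rough sanity check, the sketch below pulls those indices out of GLSL text with a regular expression and reports any that a bind group would not supply. The helpers `declaredBindings` and `missingBindings` and the list `bindGroupIndices` are hypothetical and only illustrate the idea; nothing like this is part of the sample.

```javascript
// Collect every `binding = N` index declared in a GLSL source string.
function declaredBindings(glsl) {
  const found = [];
  const pattern = /binding\s*=\s*(\d+)/g;
  let match;
  while ((match = pattern.exec(glsl)) !== null) {
    found.push(Number(match[1]));
  }
  return found;
}

// Shortened copies of the declarations from the two shaders above.
const vertexDecls = `layout(set = 0, binding = 0) uniform Uniforms { mat4 m; } uniforms;`;
const fragmentDecls = `
  layout(set = 0, binding = 1) uniform sampler mySampler;
  layout(set = 0, binding = 2) uniform texture2D myTexture;`;

// Hypothetical list of binding indices supplied by the bind group.
const bindGroupIndices = [0, 1, 2];

// Return the shader bindings that the bind group does not provide.
function missingBindings(shaderSources, provided) {
  const needed = shaderSources.flatMap(declaredBindings);
  return needed.filter((index) => !provided.includes(index));
}

console.log(missingBindings([vertexDecls, fragmentDecls], bindGroupIndices)); // []
console.log(missingBindings([fragmentDecls], [0, 1])); // [ 2 ]
```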

The frame drawing function that follows is also no different from the previous chapter.

OK, now for the main point: how our texture is created.

const cubeTexture = await Utils.createTextureFromImage(device, 'assets/img/Di-3d.png', GPUTextureUsage.SAMPLED);

It is wrapped up by the framework, so let's take the internals apart directly:

	async function createTextureFromImage(device, src, usage) {
	  // Decode the image, then draw it to a 2D canvas to read back its RGBA pixels.
	  const img = document.createElement('img');
	  img.src = src;
	  await img.decode();

	  const imageCanvas = document.createElement('canvas');
	  imageCanvas.width = img.width;
	  imageCanvas.height = img.height;

	  const imageCanvasContext = imageCanvas.getContext('2d');
	  imageCanvasContext.drawImage(img, 0, 0, img.width, img.height);
	  const imageData = imageCanvasContext.getImageData(0, 0, img.width, img.height);

	  let data = null;

	  // Rows in the staging buffer are rowPitch bytes apart, rounded up to a multiple of 256.
	  const bytesPerRow = img.width * 4;
	  const rowPitch = Math.ceil(bytesPerRow / 256) * 256;
	  if (rowPitch === bytesPerRow) {
	    data = imageData.data;
	  } else {
	    data = new Uint8Array(rowPitch * img.height);
	    // Source rows are tightly packed (bytesPerRow); destination rows are padded (rowPitch).
	    for (let y = 0; y < img.height; ++y) {
	      for (let x = 0; x < img.width; ++x) {
	        const srcIndex = x * 4 + y * bytesPerRow;
	        const dstIndex = x * 4 + y * rowPitch;
	        data[dstIndex] = imageData.data[srcIndex];
	        data[dstIndex + 1] = imageData.data[srcIndex + 1];
	        data[dstIndex + 2] = imageData.data[srcIndex + 2];
	        data[dstIndex + 3] = imageData.data[srcIndex + 3];
	      }
	    }
	  }

	  // The texture must be a copy destination in addition to the caller's usage.
	  const texture = device.createTexture({
	    size: {
	      width: img.width,
	      height: img.height,
	      depth: 1,
	    },
	    arrayLayerCount: 1,
	    mipLevelCount: 1,
	    sampleCount: 1,
	    dimension: "2d",
	    format: "rgba8unorm",
	    usage: GPUTextureUsage.TRANSFER_DST | usage,
	  });

	  // Staging buffer that receives the pixels from the CPU side.
	  const textureDataBuffer = device.createBuffer({
	    size: data.byteLength,
	    usage: GPUBufferUsage.TRANSFER_DST | GPUBufferUsage.TRANSFER_SRC,
	  });

	  textureDataBuffer.setSubData(0, data);

	  // Record the buffer-to-texture copy and submit it to the GPU queue.
	  const commandEncoder = device.createCommandEncoder({});
	  commandEncoder.copyBufferToTexture({
	    buffer: textureDataBuffer,
	    rowPitch: rowPitch,
	    arrayLayer: 0,
	    mipLevel: 0,
	    imageHeight: 0,
	  }, {
	      texture: texture,
	      mipLevel: 0,
	      arrayLayer: 0,
	      origin: { x: 0, y: 0, z: 0 }
	    }, {
	      width: img.width,
	      height: img.height,
	      depth: 1,
	    });

	  device.getQueue().submit([commandEncoder.finish()]);

	  return texture;
	}

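The function is a bit long. The first half reads the pixel data out of the image, which is not hard to follow, although I don't quite see why the image has to be drawn to a canvas first; surely there is a better way? What puzzles me more is this part: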

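	  // Rows in the staging buffer are rowPitch bytes apart, rounded up to a multiple of 256.
	  const bytesPerRow = img.width * 4;
	  const rowPitch = Math.ceil(bytesPerRow / 256) * 256;
	  if (rowPitch === bytesPerRow) {
	    data = imageData.data;
	  } else {
	    data = new Uint8Array(rowPitch * img.height);
	    // Source rows are tightly packed (bytesPerRow); destination rows are padded (rowPitch).
	    for (let y = 0; y < img.height; ++y) {
	      for (let x = 0; x < img.width; ++x) {
	        const srcIndex = x * 4 + y * bytesPerRow;
	        const dstIndex = x * 4 + y * rowPitch;
	        data[dstIndex] = imageData.data[srcIndex];
	        data[dstIndex + 1] = imageData.data[srcIndex + 1];
	        data[dstIndex + 2] = imageData.data[srcIndex + 2];
	        data[dstIndex + 3] = imageData.data[srcIndex + 3];
	      }
	    }
	  }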

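This is clearly meant to align the data. Textures are commonly required to have power-of-two dimensions, but that is obviously not what is being enforced here, and the spec does not state explicitly what texture data is valid to upload. I plan to ask on GitHub and will update this post once there is an answer.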
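The padding step can be tried in isolation, without a GPU. Below is a minimal sketch that copies tightly packed RGBA rows into a buffer whose rows start at multiples of 256 bytes; the function name `padRows` and the tiny 3x2 test image are my own, chosen only so the effect is easy to see.

```javascript
// Copy tightly packed RGBA rows into a buffer whose row stride is
// rounded up to a multiple of `alignment` bytes.
function padRows(pixels, width, height, alignment = 256) {
  const bytesPerRow = width * 4; // 4 bytes per RGBA pixel
  const rowPitch = Math.ceil(bytesPerRow / alignment) * alignment;
  if (rowPitch === bytesPerRow) {
    return { data: pixels, rowPitch }; // already aligned, no copy needed
  }
  const data = new Uint8Array(rowPitch * height);
  for (let y = 0; y < height; ++y) {
    // Row y starts at y * bytesPerRow in the source and y * rowPitch in the target.
    const row = pixels.subarray(y * bytesPerRow, (y + 1) * bytesPerRow);
    data.set(row, y * rowPitch);
  }
  return { data, rowPitch };
}

// A 3x2 image: 24 source bytes numbered 0..23.
const source = Uint8Array.from({ length: 24 }, (_, i) => i);
const padded = padRows(source, 3, 2);
console.log(padded.rowPitch, padded.data.length); // 256 512
console.log(padded.data[256], padded.data[12]); // 12 0
```

The second row of the source starts at byte 12, but in the padded buffer it starts at byte 256, and the bytes in between stay zero.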

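The next part sends the data to the GPU. It is easier to follow if we reorder the steps slightly.
First, device.createBuffer creates a handle to the buffer that will hold the texture data,
then setSubData fills that buffer with the pixel data.

Next, device.createTexture creates a texture handle, and commandEncoder.copyBufferToTexture ties the pixel data to the texture.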

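This raises another question: why is filling the buffer a synchronous call, while connecting the buffer's data to the texture handle can only be done through the commandEncoder's command queue?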

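My personal understanding is this. OpenGL has always described itself as a client-server system, and I have never heard of WebGL working with multiple contexts; the graphics card we talk to is always a single device that runs asynchronously. The CPU and the GPU work at the same time, and the GPU has its own command queue: a command runs when its turn comes and waits until then. Remember that glReadPixels has always been a painful call, because every time it is used to read data back it blocks the whole pipeline. By the same logic, sending the image's pixel data to the GPU is CPU-GPU communication and has to be synchronous, so that step does not go through the GPU command queue. Hooking the GPU buffer's data up to the texture handle, on the other hand, is an operation inside the GPU; nobody knows whether the GPU is busy or how long the step will take. Once the work leaves the CPU it is out of our sight. So we put the copyBufferToTexture command into the GPU's queue and let the GPU decide when to run it.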
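The record-now, run-later idea can be shown with a toy model. This is not the WebGPU API: `ToyCommandEncoder` and `ToyQueue` are invented classes, and the toy queue runs its commands synchronously inside `submit`, whereas a real GPU would pick them up on its own schedule. The only point is that recording a copy does not perform it.

```javascript
// Toy model, not the WebGPU API: an encoder only records work, a queue runs it later.
class ToyCommandEncoder {
  constructor() {
    this.commands = [];
  }
  // Record a copy; nothing is copied yet.
  copy(source, target) {
    this.commands.push(() => target.set(source));
  }
  finish() {
    return this.commands;
  }
}

class ToyQueue {
  // Run every recorded command, in submission order.
  submit(commandBuffers) {
    for (const commands of commandBuffers) {
      for (const run of commands) run();
    }
  }
}

const staging = new Uint8Array(4);
const texels = new Uint8Array(4);

staging.set([10, 20, 30, 40]); // takes effect immediately, like the buffer upload
const encoder = new ToyCommandEncoder();
encoder.copy(staging, texels); // recorded only
const before = Array.from(texels);

new ToyQueue().submit([encoder.finish()]);
const after = Array.from(texels);

console.log(before, after); // [ 0, 0, 0, 0 ] [ 10, 20, 30, 40 ]
```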

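Update: I raised these questions on GitHub and am sharing the response here (this group answers questions really quickly, which is great).
[Image: screenshot of the GitHub reply]
I asked three questions. The first was about texture size; the answer is that WebGPU has no particular requirement on dimensions, and for the sample code it is enough not to exceed 4k or 8k. That is not hard to understand, since OpenGL has always had upper limits on texture and FBO sizes.
rowPitch has to be a multiple of 256. Anyone who has worked with images will know the idea: the row width is always set to a number that is easier to load, such as a multiple of 4 or 8, and padded if it is not. Best to avoid odd numbers, primes and other such magical values.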
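As a quick worked example of the 256-byte rule for an rgba8unorm image (4 bytes per pixel), the sketch below prints the row pitch and the wasted padding for a few widths. The helper `alignUp` and the widths are picked purely for illustration.

```javascript
// Round a byte count up to the next multiple of `alignment`.
function alignUp(bytes, alignment) {
  return Math.ceil(bytes / alignment) * alignment;
}

// Illustrative widths (in pixels) for an RGBA8 image, 4 bytes per pixel.
const rows = [1, 64, 100, 512].map((width) => {
  const used = width * 4;
  const rowPitch = alignUp(used, 256);
  return { width, used, rowPitch, padding: rowPitch - used };
});

console.table(rows);
```

For this format any width that is a multiple of 64 pixels needs no padding, since 64 * 4 = 256; a width of 100 uses 400 bytes per row and is padded to 512.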