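- Note: I wrote this article several years ago while I was still a beginner (young and naive). Many statements in it may be wrong, so please read it critically to avoid being misled.
Ambient Occlusion
So far, when computing ambient light, we assumed that every point receives the same ambient intensity and simply multiplied the diffuse albedo by the ambient light intensity.
Take a skull model and render only the ambient term, with no other lighting. The result is shown in the figure below.
This result is clearly wrong. Some parts of the scene are more occluded, so reflected light has a harder time reaching them, while other parts are less occluded and receive reflected light more easily. The ambient intensity should therefore differ from point to point.
So how do we compute how occluded a point in the scene is? Below I describe one offline algorithm and one online algorithm. The online one is SSAO, which we often see in games.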
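Offline ambient occlusion: ray casting
Ray casting means shooting many rays from the point whose ambient occlusion we want and checking whether each ray hits another surface. If the hit distance t is small, that direction is considered occluded; if the hit is far away or there is no hit at all, the direction is considered unoccluded. If h out of N rays are occluded, the ambient occlusion is
$$\text{occlusion} = \frac{h}{N}$$
as shown in the figure below.
We call 1 - occlusion the accessibility of the point p.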
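The ratio above can be sketched in a few lines of C++. This is only an illustration: `AmbientAccess`, `kNoHit` and `maxOccluderDist` are hypothetical names, and the ray hits are assumed to be available already as a list of distances.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Assumed convention: a ray that escapes the scene reports an infinite distance.
constexpr float kNoHit = std::numeric_limits<float>::infinity();

// hitDistances[i] is the distance t at which ray i hit a surface (or kNoHit).
// Returns accessibility = 1 - h/N, where h counts hits closer than maxOccluderDist.
float AmbientAccess(const std::vector<float>& hitDistances, float maxOccluderDist)
{
    assert(!hitDistances.empty());
    int occluded = 0; // h
    for (float t : hitDistances)
    {
        // Only nearby hits count as occluding; far hits and misses do not.
        if (t < maxOccluderDist)
            ++occluded;
    }
    const float occlusion =
        static_cast<float>(occluded) / static_cast<float>(hitDistances.size());
    return 1.0f - occlusion;
}
```

For example, four rays of which two hit something closer than `maxOccluderDist` give an occlusion of 0.5 and an accessibility of 0.5.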
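To compute ambient occlusion offline for every point on the model, we take each triangle, cast n rays from its centroid, and compute the occlusion of that triangle. The occlusion of a vertex is then the average of the occlusion values of all triangles that share that vertex.
The code is as follows: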
void AmbientOcclusionApp::BuildVertexAmbientOcclusion(std::vector<Vertex::AmbientOcclusion>& vertices,
const std::vector<UINT>& indices)
{
UINT vcount = vertices.size();
UINT tcount = indices.size()/3;
std::vector<XMFLOAT3> positions(vcount);
for(UINT i = 0; i < vcount; ++i)
positions[i] = vertices[i].Pos;
Octree octree;
octree.Build(positions, indices);
// For each vertex, count how many triangles contain
// the vertex.
std::vector<int> vertexSharedCount(vcount);
// Cast rays for each triangle, and average triangle occlusion
// with the vertices that share this triangle.
for(UINT i = 0; i < tcount; ++i)
{
UINT i0 = indices[i*3+0];
UINT i1 = indices[i*3+1];
UINT i2 = indices[i*3+2];
XMVECTOR v0 = XMLoadFloat3(&vertices[i0].Pos);
XMVECTOR v1 = XMLoadFloat3(&vertices[i1].Pos);
XMVECTOR v2 = XMLoadFloat3(&vertices[i2].Pos);
XMVECTOR edge0 = v1 - v0;
XMVECTOR edge1 = v2 - v0;
XMVECTOR normal = XMVector3Normalize(XMVector3Cross(edge0, edge1));
XMVECTOR centroid = (v0 + v1 + v2)/3.0f;
// Offset to avoid self intersection.
centroid += 0.001f*normal;
const int NumSampleRays = 32;
float numUnoccluded = 0;
for(int j = 0; j < NumSampleRays; ++j)
{
XMVECTOR randomDir = MathHelper::RandHemisphereUnitVec3(normal);
// Test if the random ray intersects the scene mesh.
//
// TODO: Technically we should not count intersections
// that are far away as occluding the triangle, but
// this is OK for demo.
if( !octree.RayOctreeIntersect(centroid, randomDir) )
{
numUnoccluded++;
}
}
float ambientAccess = numUnoccluded / NumSampleRays;
// Average with vertices that share this face.
vertices[i0].AmbientAccess += ambientAccess;
vertices[i1].AmbientAccess += ambientAccess;
vertices[i2].AmbientAccess += ambientAccess;
vertexSharedCount[i0]++;
vertexSharedCount[i1]++;
vertexSharedCount[i2]++;
}
// Finish average by dividing by the number of samples we added,
// and store in the vertex attribute.
for(UINT i = 0; i < vcount; ++i)
{
vertices[i].AmbientAccess /= vertexSharedCount[i];
}
}
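The listing above relies on a helper that returns a random unit direction in the hemisphere around the triangle normal. One common way to write such a helper is rejection sampling, sketched below in plain C++. The names `Vec3`, `Dot` and `SampleHemisphere` are hypothetical, and this is not claimed to be how `MathHelper::RandHemisphereUnitVec3` is actually implemented.

```cpp
#include <cassert>
#include <cmath>
#include <random>

struct Vec3 { float x, y, z; };

float Dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Returns a random unit vector v with Dot(v, n) >= 0. n must be unit length.
Vec3 SampleHemisphere(const Vec3& n, std::mt19937& rng)
{
    assert(std::fabs(Dot(n, n) - 1.0f) < 1e-3f);
    std::uniform_real_distribution<float> dist(-1.0f, 1.0f);
    for (;;)
    {
        // Pick a point in the cube [-1,1]^3.
        const Vec3 v{ dist(rng), dist(rng), dist(rng) };
        const float lenSq = Dot(v, v);
        // Keep only points inside the unit ball so directions stay evenly
        // distributed; also skip points too close to the origin to normalize.
        if (lenSq > 1.0f || lenSq < 1e-6f)
            continue;
        // Reject directions pointing below the surface.
        if (Dot(v, n) < 0.0f)
            continue;
        const float invLen = 1.0f / std::sqrt(lenSq);
        return Vec3{ v.x * invLen, v.y * invLen, v.z * invLen };
    }
}
```

Rejecting samples in the lower hemisphere throws away about half of the candidates, which is acceptable for an offline bake.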
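After computing ambient occlusion this way, we multiply the original ambient term by the accessibility of the point. The rendered result is shown below.
This method is slow: a single model like this one can take several seconds, so it cannot be used in real time. It also only suits static objects. If something in the scene moves, or the object itself is animated, the occlusion at every point changes all the time and the offline approach no longer applies.
Online algorithm: SSAO
SSAO stands for Screen Space Ambient Occlusion. The occlusion is computed in screen space, so it can be thought of as a kind of post-processing.
We start from the depth buffer of the whole screen, which lets us reconstruct the layer of geometry nearest to the camera. We transform x and y to view space (or world space, although the extra transform is unnecessary) and transform the depth as well, which recovers the 3D shape of the nearest visible layer of the scene. (I recall a shader book using the same trick to implement motion blur.)
The rough idea of SSAO is as follows. In a first pass we render the screen's normals and depth to a render target and the depth-stencil buffer. In a second pass we use the normals and depth of the whole screen to compute ambient occlusion, as sketched in the figure below.
Starting from the point p being shaded, we take a set of random offset vectors inside the hemisphere around p's normal; the end point of each vector is a sample point q. We project q to screen space using its x and y, sample the depth buffer there, and transform the result back to view space to get r, the nearest visible point along the eye ray through q. This r is the point that might occlude p. We then look at the distance between p and r and at the dot product between the normal and the direction from p to r, which tells us whether r lies behind p; together these decide whether r can occlude p. This test is the reason we render a normal pass first. If r does occlude p, the amount of occlusion is computed from the distance, and averaging over all samples gives the ambient occlusion of the point. The detailed steps follow.
First we reconstruct the point in view space. The code is as follows: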
static const float2 gTexCoords[6] =
{
float2(0.0f, 1.0f),
float2(0.0f, 0.0f),
float2(1.0f, 0.0f),
float2(0.0f, 1.0f),
float2(1.0f, 0.0f),
float2(1.0f, 1.0f)
};
// Draw call with 6 vertices
VertexOut VS(uint vid : SV_VertexID)
{
VertexOut vout;
vout.TexC = gTexCoords[vid];
// Quad covering screen in NDC space.
vout.PosH = float4(2.0f*vout.TexC.x - 1.0f, 1.0f - 2.0f*vout.TexC.y, 0.0f, 1.0f);
// Transform quad corners to view space near plane.
float4 ph = mul(vout.PosH, gInvProj);
vout.PosV = ph.xyz / ph.w;
return vout;
}
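The input is six vertices, i.e. two triangles that form a quad covering the screen. In the vertex shader we only need to compute the quad's corner points (their NDC positions and the corresponding view-space points on the near plane); by the time we reach the pixel shader, the positions in between have been interpolated. Note that there is no vertex input here: binding null for the vertex buffer and index buffer is enough. The vertex shader has to run 6 times, so pass 6 as the first argument of the draw call.
The book's code divides by w at the end. At first I suspected this was a mistake, but it is correct: for a standard D3D perspective matrix, multiplying the NDC point (x, y, 0, 1) by gInvProj yields a homogeneous point whose w is 1/n, so dividing by w lands on the near plane. An equivalent alternative is to multiply the NDC point by the near-plane distance n first (this gives the clip-space point, because clip-space w equals view-space z, which is n on the near plane) and then multiply by gInvProj, in which case no divide is needed afterwards.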
cmdList->DrawInstanced(6, 1, 0, 0);
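To check whether dividing by w and pre-multiplying by n give the same near-plane point, here is a small numeric sketch in C++. It assumes a standard left-handed D3D perspective matrix with row vectors, where a = Proj[0][0], b = Proj[1][1], A = Proj[2][2] = f/(f-n) and B = Proj[3][2] = -nf/(f-n). All function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Vec4 { float x, y, z, w; };

// Applies the inverse of the assumed perspective matrix to a clip-space point.
// Forward: clip = (a*x, b*y, A*z + B*w, z), so the inverse is read off directly.
Vec4 InvProject(const Vec4& c, float a, float b, float n, float f)
{
    assert(n > 0.0f && f > n);
    const float A = f / (f - n);      // Proj[2][2]
    const float B = -n * f / (f - n); // Proj[3][2]
    return Vec4{ c.x / a, c.y / b, c.w, (c.z - A * c.w) / B };
}

// Route 1 (as in the shader): unproject the NDC point with w = 1, then divide by w.
Vec4 NearPlaneViaDivide(float ndcX, float ndcY, float a, float b, float n, float f)
{
    const Vec4 ph = InvProject(Vec4{ ndcX, ndcY, 0.0f, 1.0f }, a, b, n, f);
    return Vec4{ ph.x / ph.w, ph.y / ph.w, ph.z / ph.w, 1.0f };
}

// Route 2: scale by n to get the clip-space point first; no divide is needed.
Vec4 NearPlaneViaScale(float ndcX, float ndcY, float a, float b, float n, float f)
{
    return InvProject(Vec4{ ndcX * n, ndcY * n, 0.0f, n }, a, b, n, f);
}
```

Both routes return a point with z = n and the same x and y, so the choice between them is a matter of taste.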
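At this point, what we get from interpolation is only the point v on the near plane. The actual point is p = tv, where t = p.z/v.z. So we first sample the depth in NDC and transform it to view space to obtain p.z. Since v is already known, so is v.z, and we can recover the point from p = (p.z/v.z)v, as follows: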
float NdcDepthToViewDepth(float z_ndc)
{
// We can invert the calculation from NDC space to view space for the
// z-coordinate. We have that
// z_ndc = A + B/viewZ, where gProj[2,2]=A and gProj[3,2]=B.
// Therefore…
float viewZ = gProj[3][2] / (z_ndc - gProj[2][2]);
return viewZ;
}
float4 PS(VertexOut pin) : SV_Target
{
// Get z-coord of this pixel in NDC space from depth map.
float pz = gDepthMap.SampleLevel(gsamDepthMap, pin.TexC, 0.0f).r;
// Transform depth to view space.
pz = NdcDepthToViewDepth(pz);
// Reconstruct the view space position of the point with depth pz.
float3 p = (pz/pin.PosV.z)*pin.PosV;
[…]
}
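The same reconstruction can be tried on the CPU. The following C++ sketch mirrors the shader logic under the assumption that z_ndc = A + B/viewZ with A = f/(f-n) and B = -nf/(f-n); `ViewDepthToNdcDepth` and `ReconstructViewPos` are hypothetical helper names added for the round trip.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Forward mapping of a perspective projection: z_ndc = A + B / viewZ.
float ViewDepthToNdcDepth(float viewZ, float A, float B)
{
    return A + B / viewZ;
}

// Inverse mapping, same formula as NdcDepthToViewDepth in the shader.
float NdcDepthToViewDepth(float zNdc, float A, float B)
{
    return B / (zNdc - A);
}

// Scales the near-plane point v along the eye ray until its depth equals pz:
// p = (pz / v.z) * v.
Vec3 ReconstructViewPos(const Vec3& v, float pz)
{
    assert(v.z > 0.0f);
    const float t = pz / v.z;
    return Vec3{ t * v.x, t * v.y, t * v.z };
}
```

With n = 1 and f = 100, a view depth of 10 maps to an NDC depth of about 0.909 and back to 10, and the near-plane point (0.2, -0.1, 1) reconstructs to (2, -1, 10).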
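Next we need a set of random vectors for the occlusion computation. We start from 14 vectors spread evenly over the whole sphere (the 8 corners of a cube and the 6 centers of its faces). At run time we generate one random vector and reflect each of the 14 vectors about it; the reflected set is still evenly spread. Then we flip every vector that falls outside the hemisphere so that it points into the hemisphere. The result is 14 random, evenly distributed vectors in the hemisphere.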
void Ssao::BuildOffsetVectors()
{
// Start with 14 uniformly distributed vectors. We choose the
// 8 corners of the cube and the 6 center points along each
// cube face. We always alternate the points on opposite sides
// of the cubes. This way we still get the vectors spread out
// even if we choose to use less than 14 samples.
// 8 cube corners
mOffsets[0] = XMFLOAT4(+1.0f, +1.0f, +1.0f, 0.0f);
mOffsets[1] = XMFLOAT4(-1.0f, -1.0f, -1.0f, 0.0f);
mOffsets[2] = XMFLOAT4(-1.0f, +1.0f, +1.0f, 0.0f);
mOffsets[3] = XMFLOAT4(+1.0f, -1.0f, -1.0f, 0.0f);
mOffsets[4] = XMFLOAT4(+1.0f, +1.0f, -1.0f, 0.0f);
mOffsets[5] = XMFLOAT4(-1.0f, -1.0f, +1.0f, 0.0f);
mOffsets[6] = XMFLOAT4(-1.0f, +1.0f, -1.0f, 0.0f);
mOffsets[7] = XMFLOAT4(+1.0f, -1.0f, +1.0f, 0.0f);
// 6 centers of cube faces
mOffsets[8] = XMFLOAT4(-1.0f, 0.0f, 0.0f, 0.0f);
mOffsets[9] = XMFLOAT4(+1.0f, 0.0f, 0.0f, 0.0f);
mOffsets[10] = XMFLOAT4(0.0f, -1.0f, 0.0f, 0.0f);
mOffsets[11] = XMFLOAT4(0.0f, +1.0f, 0.0f, 0.0f);
mOffsets[12] = XMFLOAT4(0.0f, 0.0f, -1.0f, 0.0f);
mOffsets[13] = XMFLOAT4(0.0f, 0.0f, +1.0f, 0.0f);
for(int i = 0; i < 14; ++i)
{
// Create random lengths in [0.25, 1.0].
float s = MathHelper::RandF(0.25f, 1.0f);
XMVECTOR v = s *
XMVector4Normalize(XMLoadFloat4(&mOffsets[i]));
XMStoreFloat4(&mOffsets[i], v);
}
}
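The reflect-and-flip step described before the listing can be sketched in C++ as follows. `Reflect` uses the same formula as the HLSL `reflect` intrinsic; `HemisphereOffset` is a hypothetical name, and treating a zero dot product as "already in the hemisphere" is my own simplification.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

float Dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Reflects i about the plane with unit normal n: i - 2 * dot(i, n) * n.
Vec3 Reflect(const Vec3& i, const Vec3& n)
{
    assert(std::fabs(Dot(n, n) - 1.0f) < 1e-3f);
    const float d = 2.0f * Dot(i, n);
    return Vec3{ i.x - d * n.x, i.y - d * n.y, i.z - d * n.z };
}

// Randomizes a fixed offset by reflecting it about randVec, then flips it
// into the hemisphere around the surface normal if it points away from it.
Vec3 HemisphereOffset(const Vec3& offset, const Vec3& randVec, const Vec3& normal)
{
    const Vec3 o = Reflect(offset, randVec);
    const float flip = (Dot(o, normal) < 0.0f) ? -1.0f : 1.0f;
    return Vec3{ flip * o.x, flip * o.y, flip * o.z };
}
```

Reflection preserves length, so the random lengths assigned to the offsets in `BuildOffsetVectors` survive this step.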
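Next we generate the potential occluder points. Suppose we have already generated a random sample point q around p and now want r. We multiply q by the proj and tex matrices to transform it into texture coordinates, sample the depth at the resulting x and y, transform that depth back to view space, and obtain r from r = (rz/qz)*q.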
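As a rough illustration of the projection to texture coordinates, the C++ sketch below does the same thing without matrices. It assumes a symmetric perspective projection with a = Proj[0][0] and b = Proj[1][1], and the usual mapping from NDC to texture space (u = 0.5x + 0.5, v = -0.5y + 0.5). `ViewToTexCoord` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { float u, v; };
struct Vec3 { float x, y, z; };

// Projects a view-space point q to depth-map texture coordinates.
Vec2 ViewToTexCoord(const Vec3& q, float a, float b)
{
    assert(q.z > 0.0f);
    // Perspective divide: clip-space w equals view-space z.
    const float ndcX = a * q.x / q.z;
    const float ndcY = b * q.y / q.z;
    // NDC is [-1, 1] with y up; texture space is [0, 1] with v down.
    return Vec2{ 0.5f * ndcX + 0.5f, -0.5f * ndcY + 0.5f };
}
```

A point on the view axis maps to the center of the texture, (0.5, 0.5). Once the depth sampled there has been converted to a view-space depth rz, r follows from r = (rz/qz)*q.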
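Then comes the occlusion test. We compute the distance between p and r. If it is too small, we treat the two points as lying on the same surface, so there is no occlusion. Beyond that, the occlusion falls off linearly with distance and drops to 0 past a certain range, where r can no longer occlude p. In addition, we multiply the occlusion factor by the dot product of n and the normalized direction (r-p), clamped to be non-negative: a point directly in front occludes strongly, a point off to the side occludes only a little, and a point behind does not occlude at all.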
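A minimal C++ sketch of such a test is shown below. The parameter names `surfaceEpsilon`, `fadeStart` and `fadeEnd` and the use of the view-space depth difference p.z - r.z as the distance are assumptions made for illustration; the actual shader may differ in detail.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

float Dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Hypothetical tuning parameters, in view-space units.
struct OcclusionParams { float surfaceEpsilon, fadeStart, fadeEnd; };

// 0 if r is too close to p (same surface); otherwise 1 up to fadeStart,
// then a linear fade down to 0 at fadeEnd.
float OcclusionFalloff(float distZ, const OcclusionParams& prm)
{
    assert(prm.fadeEnd > prm.fadeStart);
    if (distZ <= prm.surfaceEpsilon)
        return 0.0f;
    const float t = (prm.fadeEnd - distZ) / (prm.fadeEnd - prm.fadeStart);
    return std::min(std::max(t, 0.0f), 1.0f);
}

// Occlusion contributed to point p (unit normal n) by the candidate occluder r.
float SampleOcclusion(const Vec3& p, const Vec3& n, const Vec3& r,
                      const OcclusionParams& prm)
{
    const Vec3 d{ r.x - p.x, r.y - p.y, r.z - p.z };
    const float len = std::sqrt(Dot(d, d));
    if (len < 1e-6f)
        return 0.0f; // r coincides with p
    // How far in front of the tangent plane at p the occluder lies.
    const float dp = std::max(Dot(n, d) / len, 0.0f);
    // r occludes only if it is closer to the camera than p.
    const float distZ = p.z - r.z;
    return dp * OcclusionFalloff(distZ, prm);
}
```

The per-sample values are then summed and averaged to obtain the occlusion of the pixel.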
Finally we average the occlusion values and set access to 1-occlusion. We can then raise access to a power to increase the contrast.
occlusionSum /= gSampleCount;
float access = 1.0f - occlusionSum;
// Sharpen the contrast of the SSAO map to make the SSAO effect more dramatic.
return saturate(pow(access, 4.0f));
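The occlusion computed this way is rather noisy, so we also apply a few blur passes. This time it is not a plain Gaussian blur but an edge-preserving blur: we detect edges using the normals and depth, and a sample that lies across an edge is discarded and does not take part in the average.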
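The idea can be sketched in one dimension in C++. The `Pixel` layout, the function name and the two thresholds (`minNormalDot`, `maxDepthDiff`) are hypothetical; a real implementation would run as a horizontal and a vertical pass in a shader.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

float Dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Pixel { float ao; float depth; Vec3 normal; };

// Blurs the ao values of one row. A neighbor contributes only if its normal
// and depth are close to the center pixel's, i.e. it is on the same surface.
std::vector<float> EdgePreservingBlur1D(const std::vector<Pixel>& row,
                                        const std::vector<float>& weights,
                                        float minNormalDot, float maxDepthDiff)
{
    assert(weights.size() % 2 == 1);
    const int radius = static_cast<int>(weights.size() / 2);
    assert(weights[radius] > 0.0f);
    const int count = static_cast<int>(row.size());
    std::vector<float> out(row.size());
    for (int i = 0; i < count; ++i)
    {
        const Pixel& center = row[i];
        // The center sample always contributes.
        float sum = weights[radius] * center.ao;
        float totalWeight = weights[radius];
        for (int k = -radius; k <= radius; ++k)
        {
            const int j = i + k;
            if (k == 0 || j < 0 || j >= count)
                continue;
            const Pixel& nb = row[j];
            const bool sameSurface =
                Dot(nb.normal, center.normal) >= minNormalDot &&
                std::fabs(nb.depth - center.depth) <= maxDepthDiff;
            if (!sameSurface)
                continue; // sample lies across an edge: discard it
            sum += weights[k + radius] * nb.ao;
            totalWeight += weights[k + radius];
        }
        // Renormalize by the weights actually used.
        out[i] = sum / totalWeight;
    }
    return out;
}
```

Dividing by the weights actually used keeps the result in range when some samples are discarded.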
SSAO Demo
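Next we implement a complete SSAO demo and show the code for its key parts.
First we wrap everything in an Ssao class that stores the DSV, SRVs, resources and so on that we need.
Some of its key methods are shown below.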
void Ssao::BuildDescriptors(
ID3D12Resource* depthStencilBuffer,
CD3DX12_CPU_DESCRIPTOR_HANDLE hCpuSrv,
CD3DX12_GPU_DESCRIPTOR_HANDLE hGpuSrv,
CD3DX12_CPU_DESCRIPTOR_HANDLE hCpuRtv,
UINT cbvSrvUavDescriptorSize,
UINT rtvDescriptorSize)
{
// Save references to the descriptors. The Ssao reserves heap space
// for 5 contiguous Srvs.
mhAmbientMap0CpuSrv = hCpuSrv;
mhAmbientMap1CpuSrv = hCpuSrv.Offset(1, cbvSrvUavDescriptorSize);
mhNormalMapCpuSrv = hCpuSrv.Offset(1, cbvSrvUavDescriptorSize);
mhDepthMapCpuSrv = hCpuSrv.Offset(1, cbvSrvUavDescriptorSize);
mhRandomVectorMapCpuSrv = hCpuSrv.Offset(1, cbvSrvUavDescriptorSize);
mhAmbientMap0GpuSrv = hGpuSrv;
mhAmbientMap1GpuSrv = hGpuSrv.Offset(1, cbvSrvUavDescriptorSize);
mhNormalMapGpuSrv = hGpuSrv.Offset(1, cbvSrvUavDescriptorSize);
mhDepthMapGpuSrv = hGpuSrv.Offset(1, cbvSrvUavDescriptorSize);
mhRandomVectorMapGpuSrv = hGpuSrv.Offset(1, cbvSrvUavDescriptorSize);
mhNormalMapCpuRtv = hCpuRtv;
mhAmbientMap0CpuRtv = hCpuRtv.Offset(1, rtvDescriptorSize);
mhAmbientMap1CpuRtv = hCpuRtv.Offset(1, rtvDescriptorSize);
// Create the descriptors
RebuildDescriptors(depthStencilBuffer);
}
void Ssao::BuildResources()
{
// Free the old resources if they exist.
mNormalMap = nullptr;
mAmbientMap0 = nullptr;
mAmbientMap1 = nullptr;
D3D12_RESOURCE_DESC texDesc;
ZeroMemory(&texDesc, sizeof(D3D12_RESOURCE_DESC));
texDesc.Dimension = D3D12_RESOURCE_DIMENSION_TEXTURE2D;
texDesc.Alignment = 0;
texDesc.Width = mRenderTargetWidth;
texDesc.Height = mRenderTargetHeight;
texDesc.DepthOrArraySize = 1;
texDesc.MipLevels = 1;
texDesc.Format = Ssao::NormalMapFormat;
texDesc.SampleDesc.Count = 1;
texDesc.SampleDesc.Quality = 0;
texDesc.Layout = D3D12_TEXTURE_LAYOUT_UNKNOWN;
texDesc.Flags = D3D12_RESOURCE_FLAG_ALLOW_RENDER_TARGET;
float normalClearColor[] = {
0.0f, 0.0f, 1.0f, 0.0f };
CD3DX12_CLEAR_VALUE optClear(NormalMapFormat, normalClearColor);
ThrowIfFailed(md3dDevice->CreateCommittedResource(
&CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_DEFAULT),
D3D12_HEAP_FLAG_NONE,
&texDesc,
D3D12_RESOURCE_STATE_COMMON,
&optClear,
IID_PPV_ARGS(&mNormalMap)));
// Ambient occlusion maps are at half resolution.
texDesc.Width = mRenderTargetWidth / 2;
texDesc.Height = mRenderTargetHeight / 2;
texDesc.Format = Ssao::AmbientMapFormat;
float ambientClearColor[] = {
1.0f, 1.0f, 1.0f, 1.0f };
optClear = CD3DX12_CLEAR_VALUE(AmbientMapFormat, ambientClearColor);
ThrowIfFailed(md3dDevice->CreateCommittedResource(
&CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_DEFAULT),
D3D12_HEAP_FLAG_NONE,
&texDesc,
D3D12_RESOURCE_STATE_GENERIC_READ,
&optClear,
IID_PPV_ARGS(&mAmbientMap0)));
ThrowIfFailed(md3dDevice->CreateCommittedResource(
&CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_DEFAULT),
D3D12_HEAP_FLAG_NONE,
&texDesc,
D3D12_RESOURCE_STATE_GENERIC_READ,
&optClear,
IID_PPV_ARGS(&mAmbientMap1)));
}
void Ssao::BuildRandomVectorTexture(ID3D12GraphicsCommandList* cmdList)
{
D3D12_RESOURCE_DESC texDesc;
ZeroMemory(&texDesc, sizeof(D3D12_RESOURCE_DESC));
texDesc.Dimension = D3D12_RESOURCE_DIMENSION_TEXTURE2D;
texDesc.Alignment = 0;
texDesc.Width = 256;
texDesc.Height = 256;
texDesc.DepthOrArraySize = 1;
texDesc.MipLevels = 1;
texDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
texDesc.SampleDesc.Count = 1;
texDesc.SampleDesc.Quality = 0;
texDesc.Layout = D3D12_TEXTURE_LAYOUT_UNKNOWN;
texDesc.Flags = D3D12_RESOURCE_FLAG_NONE;
ThrowIfFailed(md3dDevice->CreateCommittedResource(
&CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_DEFAULT)