Dynamic Volumetric Cloud Rendering for Games on Multi-Core Platforms

https://software.intel.com/en-us/articles/dynamic-volumetric-cloud-rendering-for-games-on-multi-core-platforms
Code is available, but it is only meaningful once fully configured; otherwise the paper itself is of little use. Most of the article emphasizes how to use multithreading to render clouds.

introduction

Clouds play an important role in creating images of outdoor scenery. Most games render clouds by mapping planar cloud textures onto the sky dome. This method is suitable when the viewpoint is close to the ground, but it is not visually convincing when the viewpoint approaches or passes through the clouds. For a realistic experience in flying games, players should see clouds that appear three-dimensional, with irregular shapes and realistic illumination. Implementing these features requires volumetric techniques to model, illuminate, and render clouds. However, due to the inherent computational intensity of volumetric cloud techniques, applying them in games can be a challenge. Although there have been cloud systems that support real-time rendering of large-scale volumetric clouds in games, for performance reasons these systems generally must abandon realistic dynamic cloud features at run time.

Currently, multi-core platforms are the mainstream of the PC market. However, because traditional game architectures were not designed for multi-core systems, most games running on multi-core processors are not able to make full use of the power of all cores. Using all cores would provide performance headroom for games and flight simulators to render more realistic volumetric clouds.

This article presents a technique for games running on mainstream multi-core platforms to render dynamic volumetric clouds. The technique is based on existing algorithms and uses a multithreaded framework and Intel® Streaming SIMD Extensions (Intel® SSE) instructions to improve the implementation and optimize game performance. A demo called LuckyCloud was developed to implement and evaluate our solution. The LuckyCloud benchmark demonstrates that this technique scales well on multi-core platforms. Compared to previous static cloud systems, this solution enables real-time dynamic simulation and illumination with no additional performance impact during game play.

background

Over the past several decades, research in computer graphics has produced many volumetric techniques that can simulate, illuminate, and render clouds. Most of these techniques require a great deal of computing resources, preventing them from achieving interactive performance. Some recent techniques achieve real-time performance by implementing the algorithms in modern GPU shaders. However, implementing volumetric clouds in PC games remains quite challenging. Unlike applications dedicated to generating volumetric cloud images, games have a very limited performance budget for rendering clouds because they must process game logic and render other scene objects at the same time. To guarantee optimal performance, some cloud systems used in games have to pre-compute complex calculations offline, such as modeling and illumination. This leaves many dynamic effects to be desired, for example, physics-based evolution, variable natural scattering illumination, or convincing fly-through effects.

Our solution is mainly inspired by Harris2001 [2] and Dobashi2000 [1]. Harris proposed a cloud system that could be used in a flying game, using a particle system to model and render volumetric clouds. The Harris system accelerated cloud rendering by using impostors for distant clouds, meaning the system could generate large-scale clouds while still achieving real-time performance. Harris used a simplified Rayleigh scattering model to implement multiple-forward-scattering illumination. This model can achieve anisotropic light scattering in clouds, so that different cloud colors can be observed from different angles. To accelerate the shading, Harris calculated the incident light intensity for each cloud particle on the GPU. Unfortunately, this method requires reading back the pixel color from the frame buffer every time a cloud particle billboard is rendered. The pixel read-back operation is very expensive in the graphics pipeline, and because there can be hundreds of thousands of cloud particles in the system, the benefit of GPU shading is easily lost to the frequent read-back overhead, which instead becomes a severe performance bottleneck. To render the volumetric clouds in real time, Harris' system must shade the cloud particles offline and merely render them at runtime. As a result, in Harris' solution the light intensity and direction are fixed. The LuckyCloud solution adopts Harris' illumination model, but uses a different implementation to enable dynamic interaction between the clouds and lights at runtime.

Another problem is that Harris' approach can only render static clouds. To simulate the dynamic evolution of clouds, LuckyCloud adopts Dobashi's cloud simulation method based on a cellular automaton. In this method, the simulation space is represented by a 3D grid. Each cell of the grid is assigned three binary state variables: humidity (hum), clouds (cld), and activation factors (act). The state of each variable is either 0 or 1. Cloud evolution is simulated by applying a set of simple transition rules at each time step, as shown in Figure 1. The transition rules represent formation, extinction, and advection by winds. Density at each point is calculated by smoothing the binary distribution of the surrounding cells' cloud states. Compared to other simulation methods, the Dobashi method is able to produce realistic cloud animation at a smaller performance cost, which is the main reason this method was chosen.

Figure 1: Dobashi’s Simulation Process [1]
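As a concrete illustration, the transition rules can be sketched in a few lines of C++. Below is a minimal 1D version of the three core rules (humidity is consumed by activation, activation produces cloud, and activation spreads to neighboring cells that still hold humidity); the real method operates on a 3D grid and also models wind advection and probabilistic extinction, which are omitted here.

```cpp
#include <array>

constexpr int N = 16;

// One cell of the simulation grid, holding the three binary state variables.
struct Cell { bool hum = false, cld = false, act = false; };

// One simulation time step applying Dobashi's transition rules.
void step(std::array<Cell, N>& grid) {
    std::array<Cell, N> next = grid;
    for (int i = 0; i < N; ++i) {
        // f(i): OR of the activation states of the surrounding cells
        bool neighborAct = (i > 0 && grid[i - 1].act) ||
                           (i < N - 1 && grid[i + 1].act);
        next[i].hum = grid[i].hum && !grid[i].act;      // humidity consumed by activation
        next[i].cld = grid[i].cld || grid[i].act;       // activation creates cloud
        next[i].act = !grid[i].act && grid[i].hum && neighborAct; // growth spreads
    }
    grid = next;
}
```

Seeding a single activated cell in a humid region and stepping the grid makes cloud appear at the seed and then grow outward, which is the formation behavior described above.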

the solution

In this solution, the process of generating cloud images in real time consists of three primary steps, executed in order: simulation, illumination, and rendering. The simulation and illumination are performed on the CPU, and rendering is primarily completed on the GPU. The simulation step uses Dobashi's method to model dynamic clouds and generates the density distribution of the cloud media. The illumination step calculates the scattering colors of cloud particles according to the light passing through the cloud density space. The illumination model is the same as Harris', but it is implemented on the CPU instead of the GPU. The rendering step is similar to Dobashi's and Harris' implementations: it synthesizes the final cloud image by drawing the shaded cloud particles with the traditional billboard splatting technique.

The CPU approach is proposed for the simulation and illumination based on the following considerations:

a) CPU-based illumination avoids the performance bottleneck caused by the frequent read-back operations of the frame buffer in Harris’ implementation.
b) CPU-based implementation can reduce the GPU resources consumed by cloud rendering and lower the requirement for GPU functionality and performance so that a game can be compatible with a wider range of graphics cards.
c) Multi-core has become the mainstream PC gaming platform, but most games do not take full advantage of all the cores of a multi-core processor. Those available computing resources can be used to handle and accelerate the cloud simulation and illumination, thus minimizing the performance impact on the game loop.

Cloud particles are shaded by the CPU-based illumination method, as illustrated in Figure 2 and described here:

a) Cast a ray from the sun to the cloud particle, parallel to the sunlight direction. Several sampling points are generated in the simulation space along the ray.
b) Iteratively calculate the incident light intensity at every sampling point based on Harris' shading equations [2] until the cloud particle is reached. During this process, the density at each sampling point is interpolated from the surrounding cells' densities in the simulation space.
c) Calculate the scattering light intensity of the cloud particle according to Harris’ shading equations and use it as the color of the cloud particle.

Figure 2: The Illumination Method
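A minimal sketch of steps a-c is shown below, assuming an isotropic phase function and illustrative constants for the extinction coefficient, albedo, and step length (Harris' full equations also accumulate a forward-scattered term at each step, omitted here for brevity). `sampleDensity` is a hypothetical stand-in for the trilinear interpolation of the grid densities.

```cpp
#include <cmath>

// Hypothetical density lookup: in the real system this trilinearly
// interpolates the simulation grid; stubbed with a constant here.
float sampleDensity(float /*x*/, float /*y*/, float /*z*/) { return 0.5f; }

// Attenuate the sun's intensity through nSamples points along the light
// ray (steps a-b), then scale by the albedo to obtain the particle's
// scattered color (step c). All constants are illustrative assumptions.
float shadeParticle(float sunIntensity, int nSamples) {
    const float extinction = 80.0f;  // optical extinction coefficient (assumed)
    const float albedo     = 0.9f;   // single-scattering albedo (assumed)
    const float ds         = 0.01f;  // step length between sampling points

    float I = sunIntensity;
    for (int k = 0; k < nSamples; ++k) {
        float rho = sampleDensity(0.0f, 0.0f, k * ds);  // interpolated density
        I *= std::exp(-extinction * rho * ds);          // attenuate over one step
    }
    // Step c: scattered intensity used as the particle color
    return albedo * I;
}
```

Particles deeper inside the cloud accumulate more attenuation along their light ray, so they come out darker, which is what produces the volumetric shading gradient across a cloud.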

In addition to implementing the relevant cloud algorithms, a multithreading framework was developed to render the whole cloudscape, and Intel® Streaming SIMD Extensions (Intel® SSE) instructions were used to optimize the illumination performance for each cloud particle.
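The article does not list its SSE code; as an assumed illustration of the optimization pattern, the intrinsics below process four cloud particles at once, attenuating four incident intensities by four transmittance factors in a single multiply.

```cpp
#include <xmmintrin.h>  // SSE intrinsics

// Attenuate four particles' incident light intensities in one SSE
// instruction: intensity[i] *= transmittance[i] for i = 0..3.
// (Illustrative pattern only; not the article's actual SSE code.)
void attenuate4(float* intensity, const float* transmittance) {
    __m128 I = _mm_loadu_ps(intensity);       // load 4 intensities
    __m128 T = _mm_loadu_ps(transmittance);   // load 4 transmittance factors
    _mm_storeu_ps(intensity, _mm_mul_ps(I, T)); // multiply and store back
}
```

Since the per-particle shading loop applies the same arithmetic to every sampling point, batching particles in groups of four maps naturally onto the 4-wide SSE registers.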

multithreading framework
The multithread framework consists of two levels. The higher level performs task decomposition. To minimize the performance impact of cloud rendering on game play, the simulation and illumination steps of the cloudscape are separated from the main thread of the game loop and placed into a separate thread for execution. Because graphics middleware such as Direct3D* and OpenGL* does not recommend distributing rendering tasks across different threads, the rendering step remains in the main thread to render with the other portions of the game scene.

Clouds and light usually change slowly in games, so it is unnecessary to update the cloud simulation and illumination every frame. That is, the main thread does not need to wait for the cloudscape thread to produce the latest data; it can render the cloudscape using the old data more than once and simply obtain the new data at a specified synchronization point. In this way, the cloudscape thread can amortize its heavy load over multiple frames. There are several possible synchronization schemes between the main thread and the cloudscape thread: for example, synchronizing every few frames, after a specified amount of time, or when the cloudscape thread has completed its task. The last scheme is also called free step mode, which prevents the main thread from stalling while waiting for the cloudscape thread to complete. In this way, the technique achieves the same performance as rendering static clouds. This solution uses free step mode as the default synchronization method.

The lower level of our multithread framework achieves data decomposition using the fork-join model in the cloudscape thread. Because there are usually many clouds in the cloudscape and the simulation and illumination of each cloud is independent of every other cloud, each cloud instance's update task is used as the unit of decomposition and can be performed by different sub-threads in parallel. Data decomposition further enhances multi-core utilization and the frequency of cloudscape updates.

Our multithread framework is implemented with the Intel® Threading Building Blocks (Intel® TBB) task manager and the TBB parallel_for construct. TBB provides C++ templates for parallel programming and enables developers to focus on tasks rather than thread details [6]. The pseudocode of the framework is as follows:


bool bSubmitNewTask = false;
if (bFreeStepMode) {
    bSubmitNewTask = pTaskManager->isJobDone();
}
else {
    pTaskManager->waitUntilDone();
    bSubmitNewTask = true;
}
if (bSubmitNewTask) {
    GetNewDataFromCloudScapeThread();
    pTaskManager->submitJob(CloudScapeThreadFunction);
}
……
for (int i = 0; i < uNumClouds; i++)
    cloudArray[i].render();

Figure 3: Pseudo Codes in the Main Thread (Game Loop)

TaskManager manages the cloudscape thread and implements task parallelism (Figure 3). The simulation and illumination tasks in CloudScapeThreadFunction are submitted to the cloudscape thread at the appropriate time.

tbb::parallel_for( tbb::blocked_range<int>( 0, uNumClouds, uNumClouds/uNumThreads ),
                   *pForLoopToUpdateClouds );

Figure 4: Parallel_for Construction in the Cloudscape Thread Function
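The body object passed to parallel_for is not shown in the article. The sketch below illustrates the same fork-join data decomposition using only the C++ standard library (std::thread in place of TBB), with a hypothetical Cloud::update() standing in for one cloud's simulation and illumination work.

```cpp
#include <algorithm>
#include <thread>
#include <vector>

// Hypothetical cloud instance; update() stands in for one cloud's
// independent simulation + illumination pass.
struct Cloud {
    bool updated = false;
    void update() { updated = true; }
};

// Fork-join data decomposition: split the cloud array into contiguous
// chunks, update each chunk on its own sub-thread, then join them all.
void updateCloudsParallel(std::vector<Cloud>& clouds, unsigned numThreads) {
    std::vector<std::thread> workers;
    size_t chunk = (clouds.size() + numThreads - 1) / numThreads;
    for (unsigned t = 0; t < numThreads; ++t) {
        size_t begin = t * chunk;
        size_t end = std::min(begin + chunk, clouds.size());
        if (begin >= end) break;
        // fork: each worker owns a disjoint slice, so no locking is needed
        workers.emplace_back([&clouds, begin, end] {
            for (size_t i = begin; i < end; ++i)
                clouds[i].update();
        });
    }
    for (auto& w : workers) w.join();  // join: wait for every sub-thread
}
```

Because each cloud's update touches only its own data, the slices need no synchronization between workers, which is exactly the independence property that makes per-cloud decomposition a good granularity.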
