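PhysX SDK Documentation Translation - Task Management


Translated by rikpan, proofread by ericating88. Please credit the source when reposting.
Our ability is limited; corrections of any errors or omissions are welcome.


PhysX version 3.2.1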

Task Management

PxTask is a subsystem for managing compute resources for PhysX and APEX. It manages CPU and GPU compute resources, as well as SPUs on PlayStation 3, by distributing Tasks to a user-implemented dispatcher and resolving Task dependencies such that Tasks are run in a given order.

Middleware products typically do not want to create CPU threads for their own use. This is especially true on consoles, where execution threads can have significant overhead. In the PxTask model, the computational work is broken into jobs that are submitted to the game's thread pool as they become ready to run.

The following classes comprise the PxTask CPU resource management.

TaskManager

A TaskManager manages inter-task dependencies and dispatches ready tasks to their respective dispatchers. The TaskManager is assigned one dispatcher each for CPU tasks, GPU tasks, and SPU tasks.

TaskManagers are created and owned by the SDK. Each PxScene allocates its own TaskManager instance, which users can configure with dispatchers either through the PxSceneDesc or directly through the TaskManager interface.

CpuDispatcher

The CpuDispatcher is an abstract class the SDK uses for interfacing with the application's thread pool. Typically, there will be a single CpuDispatcher for the entire application, since there is rarely a need for more than one thread pool. A CpuDispatcher instance may be shared by more than one TaskManager, for example if multiple scenes are being used.

PxTask includes a default CpuDispatcher implementation, but we prefer applications to implement this class themselves so PhysX and APEX can efficiently share CPU resources with the application. (Translator's note: "we" here refers to NVIDIA.)

Note
The TaskManager will call CpuDispatcher::submitTask() either from the context of API calls (for example, scene::simulate()) or from other running tasks, so submitTask() must be thread-safe.

An implementation of the CpuDispatcher interface must call the following two methods on each submitted task for it to run correctly:
baseTask->run();        // optionally call runProfiled() to wrap with PVD profiling events
baseTask->release();
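As a rough illustration of the two requirements above (a thread-safe submitTask() and a run()/release() pair on every task), the following is a minimal sketch and not the SDK's implementation. BaseTask here is a hypothetical stand-in for the real interface, and SketchCpuDispatcher, CountingTask and countCompletedTasks are names invented for this example.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the SDK's task interface (not the real PhysX header).
struct BaseTask {
    virtual ~BaseTask() = default;
    virtual void run() = 0;
    virtual void release() = 0;
};

// Sketch of a dispatcher backed by a small pool of worker threads.
class SketchCpuDispatcher {
public:
    explicit SketchCpuDispatcher(unsigned workerCount) {
        for (unsigned i = 0; i < workerCount; ++i)
            mWorkers.emplace_back([this] { workerLoop(); });
    }
    ~SketchCpuDispatcher() {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mStopping = true;
        }
        mWake.notify_all();
        for (std::thread& worker : mWorkers) worker.join();
    }
    // Safe to call from any thread, including from inside a running task.
    void submitTask(BaseTask& task) {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mPending.push_back(&task);
        }
        mWake.notify_one();
    }
private:
    void workerLoop() {
        for (;;) {
            BaseTask* task = nullptr;
            {
                std::unique_lock<std::mutex> lock(mMutex);
                mWake.wait(lock, [this] { return mStopping || !mPending.empty(); });
                if (mPending.empty()) return;  // stopping and nothing left to run
                task = mPending.front();
                mPending.pop_front();
            }
            task->run();      // both calls are required for every submitted task
            task->release();
        }
    }
    std::mutex mMutex;
    std::condition_variable mWake;
    std::deque<BaseTask*> mPending;
    std::vector<std::thread> mWorkers;
    bool mStopping = false;
};

// Example task that records the calls it receives.
struct CountingTask : BaseTask {
    int runs = 0;
    int releases = 0;
    void run() override { ++runs; }
    void release() override { ++releases; }
};

// Submits taskCount tasks and returns how many were run and released exactly once.
int countCompletedTasks(unsigned workerCount, int taskCount) {
    std::vector<CountingTask> tasks(static_cast<std::size_t>(taskCount));
    {
        SketchCpuDispatcher dispatcher(workerCount);
        for (CountingTask& task : tasks) dispatcher.submitTask(task);
    }  // the destructor drains the queue and joins the workers
    int completed = 0;
    for (const CountingTask& task : tasks)
        if (task.runs == 1 && task.releases == 1) ++completed;
    return completed;
}
```

Calling countCompletedTasks(2, 8) should report that all 8 tasks completed. A production dispatcher would forward to the game's existing thread pool rather than own its threads.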
The PxExtensions library has default implementations for all dispatcher types. The following code snippets are taken from SampleParticles and SampleBase and show how the default dispatchers are created. mNbThreads, which is passed to PxDefaultCpuDispatcherCreate, defines how many worker threads the CPU dispatcher will have.

Best performance is usually achieved if the number of threads is equal to the number of hardware threads available on the platform you are running on (translator's note: that is, the number of CPU cores):
    PxSceneDesc sceneDesc(mPhysics->getTolerancesScale());
    [...]
    // create CPU dispatcher with mNbThreads worker threads
    mCpuDispatcher = PxDefaultCpuDispatcherCreate(mNbThreads);
    if(!mCpuDispatcher)
        fatalError("PxDefaultCpuDispatcherCreate failed!");
    sceneDesc.cpuDispatcher = mCpuDispatcher;
#ifdef PX_WINDOWS
    // create GPU dispatcher
    pxtask::CudaContextManagerDesc cudaContextManagerDesc;
    mCudaContextManager = pxtask::createCudaContextManager(cudaContextManagerDesc);
    sceneDesc.gpuDispatcher = mCudaContextManager->getGpuDispatcher();
#endif
    [...]
    mScene = mPhysics->createScene(sceneDesc);
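One way to pick a value such as mNbThreads is to query the standard library for the hardware thread count. The sketch below assumes a C++11 toolchain; chooseWorkerThreadCount is a hypothetical helper, not part of PhysX. std::thread::hardware_concurrency() may return 0 when the count cannot be determined, so the helper falls back to a single worker.

```cpp
#include <cassert>
#include <thread>

// Hypothetical helper: picks a worker-thread count for the CPU dispatcher.
// The 'detected' parameter exists so the fallback can be exercised directly.
unsigned chooseWorkerThreadCount(unsigned detected = std::thread::hardware_concurrency()) {
    // hardware_concurrency() is only a hint and may be 0 if it is not computable.
    return detected == 0 ? 1u : detected;
}
```

Whether to reserve a hardware thread for the main or render thread is a tuning decision for the application and is not addressed here.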
Note
CudaContextManagerDesc now supports appGUID, which only works in release builds. If your application employs PhysX modules that use CUDA, you need to use a GUID so that patches for new architectures can be released for your game. You can obtain a GUID for your application from NVIDIA. The application should log failures to a file that can be sent to NVIDIA for support.

CpuDispatcher Implementation Guidelines

After the scene's TaskManager has found a ready-to-run task and submitted it to the appropriate dispatcher, it is up to the dispatcher implementation to decide how and when the task will be run.
In game scenarios the rigid body simulation is often time-critical, and the goal is to reduce the latency from simulate() to the completion of fetchResults(). The lowest possible latency is achieved when the PhysX tasks have exclusive access to CPU resources during the update. In reality, PhysX will have to share compute resources with other game tasks. Below are some guidelines to help ensure a balance between throughput and latency when mixing the PhysX update with other work.

Avoid interleaving long-running tasks with PhysX tasks; this helps reduce latency.
Avoid assigning worker threads to the same execution core as higher-priority threads. If a PhysX task is context-switched out during execution, the rest of the rigid body pipeline may be stalled, increasing latency.
PhysX occasionally submits tasks and then immediately waits for them to complete. Because of this, executing tasks in LIFO (stack) order may perform better than FIFO (queue) order.
PhysX is not a perfectly parallel SDK, so interleaving small to medium granularity tasks will generally result in higher overall throughput. (Translator's note: PhysX starts its multithreaded simulation after simulate() is called; simulate() returns immediately, and fetchResults() waits for the simulation to finish. Calling fetchResults() right after simulate() means a long wait, during which PhysX is not doing compute-intensive work the whole time, so some CPU capacity sits idle. Inserting small to medium granularity tasks between simulate() and fetchResults() therefore makes better use of the CPU. The samples insert rendering work there.)
If your thread pool has per-thread job queues, then queuing tasks on the thread from which they were submitted may result in better CPU cache coherence; however, this is not required.

For more details, see the default CpuDispatcher implementation that comes as part of the PxExtensions package. It uses worker threads that each have their own task queue and steal tasks from the back of other workers' queues (LIFO order) to improve workload distribution.
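The effect of the LIFO guideline can be illustrated with a deliberately simplified, single-threaded sketch. OrderedJobList and orderWhenChildIsSubmittedLast are hypothetical names, and this is not how the PxExtensions dispatcher is written. Two unrelated game jobs G and H are already queued when a running task submits child c and starts waiting for it.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>

// Hypothetical job list used only to compare LIFO and FIFO execution order.
class OrderedJobList {
public:
    explicit OrderedJobList(bool lifo) : mLifo(lifo) {}

    void submit(std::string name) { mJobs.push_back(std::move(name)); }

    // Removes every job and returns the names in the order they would execute.
    std::string drain() {
        std::string order;
        while (!mJobs.empty()) {
            if (mLifo) {
                order += mJobs.back();   // stack order: newest job first
                mJobs.pop_back();
            } else {
                order += mJobs.front();  // queue order: oldest job first
                mJobs.pop_front();
            }
        }
        return order;
    }
private:
    bool mLifo;
    std::deque<std::string> mJobs;
};

// G and H are queued first; c is submitted last by a task that now waits for it.
std::string orderWhenChildIsSubmittedLast(bool lifo) {
    OrderedJobList jobs(lifo);
    jobs.submit("G");
    jobs.submit("H");
    jobs.submit("c");
    return jobs.drain();
}
```

With LIFO order the awaited child c is picked up first; with FIFO order it waits behind G and H, which lengthens the time the submitting task spends blocked.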

BaseTask

BaseTask is the abstract base class for all PxTask task types. All task run() functions will be executed on application threads, so they need to be careful with their stack usage, using as little stack as possible, and they should never block for any reason.

Task

The Task class is the standard task type. Tasks must be submitted to the TaskManager each simulation step for them to be executed. Tasks may be named at submission time, which makes them discoverable. Tasks are given a reference count of 1 when they are submitted, and the TaskManager::startSimulation() function decrements the reference count of all tasks and dispatches all Tasks whose reference count reaches zero. Before TaskManager::startSimulation() is called, Tasks can set dependencies on each other to control the order in which they are dispatched. Once simulation has started, it is still possible to submit new tasks and add dependencies, but it is up to the programmer to avoid race hazards. You cannot add dependencies to tasks that have already been dispatched, and a newly submitted Task must have its reference count decremented before it will be allowed to execute.

Synchronization points can also be defined using Task names. The TaskManager will assign the name a TaskID with no Task implementation. When all of the named TaskID's dependencies are met, the TaskManager will decrement the reference count of all Tasks that depend on that name.
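The reference-counting rules described for Task can be modeled in a few lines. The following is a hypothetical single-threaded model and not the PhysX TaskManager: ModelTaskManager, submit, startAfter and dispatchOrderForChain are names made up for this sketch, and "dispatch" simply appends the task name to a string instead of handing the task to a dispatcher.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of the reference-counting rules; not PhysX code.
class ModelTaskManager {
public:
    using TaskID = std::size_t;

    // A submitted task starts with a reference count of 1.
    TaskID submit(const std::string& name) {
        mTasks.push_back(Task{name, 1, {}});
        return mTasks.size() - 1;
    }
    // 'later' gains a reference that is dropped only when 'earlier' finishes.
    void startAfter(TaskID later, TaskID earlier) {
        ++mTasks[later].refCount;
        mTasks[earlier].dependents.push_back(later);
    }
    // Drops the submission reference of every task; returns the dispatch order.
    std::string startSimulation() {
        const std::size_t count = mTasks.size();
        for (TaskID id = 0; id < count; ++id) removeReference(id);
        return mOrder;
    }
private:
    struct Task {
        std::string name;
        int refCount;
        std::vector<TaskID> dependents;
    };
    void removeReference(TaskID id) {
        if (--mTasks[id].refCount == 0) dispatch(id);
    }
    void dispatch(TaskID id) {
        mOrder += mTasks[id].name;  // stands in for run() on a worker thread
        const std::vector<TaskID> dependents = mTasks[id].dependents;
        for (TaskID dependent : dependents) removeReference(dependent);
    }
    std::vector<Task> mTasks;
    std::string mOrder;
};

// Submits C, B, A in that order, then declares that B follows A and C follows B.
std::string dispatchOrderForChain() {
    ModelTaskManager manager;
    const ModelTaskManager::TaskID c = manager.submit("C");
    const ModelTaskManager::TaskID b = manager.submit("B");
    const ModelTaskManager::TaskID a = manager.submit("A");
    manager.startAfter(b, a);
    manager.startAfter(c, b);
    return manager.startSimulation();
}
```

Although the tasks are submitted in the order C, B, A, the model dispatches them as A, B, C, because each dependency holds an extra reference on the later task until the earlier one completes.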

APEX uses the Task class almost exclusively to manage CPU resources. The ApexScene defines a number of named Tasks that the modules use to schedule their own Tasks (for example: start after LOD calculations are complete, finish before the PhysX scene is stepped).

LightCpuTask

LightCpuTask is another subclass of BaseTask that is explicitly scheduled by the programmer. LightCpuTasks have a reference count of 1 when they are initialized, so their reference count must be decremented before they are dispatched. LightCpuTasks increment their continuation task's reference count when they are initialized, and decrement it when they are released (after completing their run() function).

PhysX 3.0 uses LightCpuTasks almost exclusively to manage CPU resources. For example, each stage of the simulation update may consist of multiple parallel tasks. When each of these tasks has finished execution, it decrements the reference count on the next task in the update chain. That next task is then automatically dispatched for execution when its reference count reaches zero.
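The continuation counting described above can be sketched with a small model. ModelLightTask and runFanIn are hypothetical names, the model runs tasks inline on the calling thread instead of submitting them to a dispatcher, and it only mirrors the rules stated in the text rather than the actual LightCpuTask source. Two parallel tasks p0 and p1 share the continuation next.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical model of the LightCpuTask counting rules; not the PhysX class.
class ModelLightTask {
public:
    ModelLightTask(std::string name, std::string& log)
        : mName(std::move(name)), mLog(log) {}

    // Initializes the task: its count starts at 1 and the continuation gains a reference.
    void setContinuation(ModelLightTask* continuation) {
        mRefCount = 1;
        mContinuation = continuation;
        if (mContinuation) mContinuation->addReference();
    }
    void addReference() { ++mRefCount; }
    // Reaching zero stands in for dispatch; the model runs the task inline.
    void removeReference() {
        if (--mRefCount == 0) {
            run();
            release();
        }
    }
private:
    void run() { mLog += mName; mLog += ' '; }
    // Releasing a finished task drops its reference on the continuation.
    void release() {
        if (mContinuation) mContinuation->removeReference();
    }
    std::string mName;
    std::string& mLog;
    int mRefCount = 0;
    ModelLightTask* mContinuation = nullptr;
};

// Two parallel tasks feed one continuation; returns the execution order.
std::string runFanIn() {
    std::string log;
    ModelLightTask next("next", log), p0("p0", log), p1("p1", log);
    next.setContinuation(nullptr);  // next: count 1
    p0.setContinuation(&next);      // next: count 2
    p1.setContinuation(&next);      // next: count 3
    next.removeReference();         // next: count 2, still waiting on p0 and p1
    p0.removeReference();           // p0 runs, then next: count 1
    p1.removeReference();           // p1 runs, then next reaches 0 and runs
    return log;
}
```

The continuation only runs after both parallel tasks have been released, which is the fan-in behavior the update chain relies on.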

Note
Even when using LightCpuTasks exclusively to manage CPU resources, the TaskManager startSimulation() and stopSimulation() calls must be made each simulation step to keep the GpuDispatcher synchronized.

The following code snippets show how the crabs' A.I. in SampleSubmarine is run as a CPU Task. By doing so, the crab A.I. runs as a background Task in parallel with the PhysX simulation update.

For a CPU task that does not need to handle multiple continuations, LightCpuTask can be subclassed.

A LightCpuTask subclass requires that the getName() and run() methods be defined:
class Crab: public ClassType, public physx::pxtask::LightCpuTask, public SampleAllocateable
{
public:
    Crab(SampleSubmarine& sample, const PxVec3& crabPos, RenderMaterial* material);
    ~Crab();
    [...]

    // Implements LightCpuTask
    virtual  const char*    getName() const { return "Crab AI Task"; }
    virtual  void           run();

    [...]
};
After PxScene::simulate() has been called and the simulation has started, the application calls removeReference() on each Crab task, which in turn causes it to be submitted to the CpuDispatcher for update. Note that it is also possible to submit tasks to the dispatcher directly (without manipulating reference counts) as follows:
pxtask::LightCpuTask& task = mCrab;    // bind a reference to the Crab task (use *mCrab if mCrab is a pointer)
mCpuDispatcher->submitTask(task);
Once the task is queued for execution by the CpuDispatcher, one of the thread pool's worker threads will eventually call the task's run method. In this example the Crab task performs raycasts against the scene and updates its internal state machine:
void Crab::run()
{
    // run as a separate task/thread
    scanForObstacles();
    updateState();
}
It is safe to perform API read calls, such as scene queries, from multiple threads while simulate() is running. However, care must be taken not to overlap API read and write calls from multiple threads; in that case the SDK will issue an error. See Data Access and Buffering for more information.
An example of explicit reference count modification and task dependency setup:
// assume all tasks have a refcount of 1 and are submitted to the task manager
// 3 task chains a0-a2, b0-b2, c0-c2
// b0 shall start after a1
// the a and c chains have no dependencies and shall run in parallel
//
// a0-a1-a2
//      \
//       b0-b1-b2
// c0-c1-c2

// set up the 3 chains
for(PxU32 i = 0; i < 2; i++)
{
    a[i].setContinuation(&a[i+1]);
    b[i].setContinuation(&b[i+1]);
    c[i].setContinuation(&c[i+1]);
}

// b0 shall start after a1
b[0].startAfter(a[1].getTaskID());

// setup is done, now start all tasks by decrementing their refcount by 1
// tasks with refcount == 0 will be submitted to the dispatcher (a0 & c0 will start).
for(PxU32 i = 0; i < 3; i++)
{
    a[i].removeReference();
    b[i].removeReference();
    c[i].removeReference();
}


