这里写目录标题
Marker abstraction
The CUSW_UNIT_SYNC unit provides mechanisms to track work completion on GPU, CPU, and other Tegra engines (such as ISP, VIC, PVA, DLA). GPUs use several hardware synchronization primitives for task tracking and synchronization, which are abstracted away into this unit. The unit provides functionality to set up temporal dependencies between GPU task, CPU task and Tegra Engine Task. This unit also allows importing synchronization primitives from other CUDA or Graphic contexts. This allows CUDA task to wait for a Graphics task and vice versa. CUSW_UNIT_CHANNEL and CUSW_UNIT_STREAM units are the primary users of this unit’s interfaces. CUSW_UNIT_STREAM unit uses CUSW_UNIT_SYNC to implement stream ordering and Deferred Procedure Calls (DPC). CUSW_UNIT_CHANNEL uses CUSW_UNIT_SYNC for tracking work submission to a GPU channel.
The most important abstraction around which the unit is based, is described in the following section, whereas functionalities constructed around these abstractions are a part of the unit’s interface and described in the subsequent section.


Marker
A marker is a context-wide task tracker. It is the main data type through which the driver tracks task completion. A marker tracks work in various channels in CUDA context.
A marker can be used to setup dependency between:
GPU task and CPU task and vice-versa, for ex. cudaDeviceSynchronize(), stream-callback
Two GPU tasks in different channels for ex. cudaStreamWaitEvent()
Tegra engine task and GPU task and vice-versa for ex. PVA, DLA waiting for CUDA work to complete
While marker is tracking a GPU task, it can go through state changes indicating whether the tasks it tracks are yet to be pushed to the GPU, have already been pushed to the GPU, or they have been completed by the GPU. The marker-status indicates the state in which a marker instance is; if all the tasks being tracked by a marker are completed, the marker itself is considered to be in completed state.
When waiting for a marker to complete on CPU, different types of marker-wait behavior can be chosen by the client. The client can choose to use interrupt based wait or do a busy wait for the marker to complete.
Marker Entry
A marker is a collection of one or more marker entries. A marker entry represents a hardware or a software synchronization primitive. Every marker entry can be in two states, complete or not-complete. Each marker entry in a marker must be complete before the marker can be marked complete. A marker-entry- kind specifies the particular type of a marker entry.
The different categories of marker entries are:
GPU Semaphore tracking work in a GPU channel (could be GPU channel from different context)
GPU Semaphore to track UVM operations
QMD semaphore to track cuda kernel completion.
CPU semaphores : Track CPU work.
Tegra engine sync-points: These are managed by Tegra RM
Tegra syncpoints are managed externally by the operating system through the NvRm library on the respective platform and merely consumed by the marker abstraction to allow client code to wait on it. CUDA driver doesn’t manage sync-points on its own. Tegra syncpoints are registers on a Tegra chip and these are used for synchronization between different engines. There are fixed number of syncpoints registers. The number depends on the Tegra arch family the chip belongs to. The integrated GPU has means to wait for a Tegra syncpoint to complete Or can signal a Tegra syncpoint. The OS/NvRm library manages (allocates/deallocates) these syncpoints because these are limited system resources.
The mutual relationships between the various marker data-structures such as marker, marker-entry, marker-status, marker-kind, and marker-wait behavior can be described through the following class-diagram.
Detailed functional description
To manage the life-times of a marker: create, initialize, and destroy markers,
To populate entries into a marker. This functionality is used by CUSW_UNIT_CHANNEL and CUSW_UNIT_STREAM unit to record work into Marker.
Purge marker entries from a marker that have moved to completed state,
To compare markers, copy markers, merge and sort markers.
Provides interface for CPU to wait for a marker to complete.
Provides an interface to flush any pending tasks that the marker tracks to the relevant engine (GPU semaphores and QMDs).
Get the current state of a marker, having inspected and accounted for states of all marker entries.
Provides functions to reallocate memory for more marker entries in a marker, when current allocations are not sufficient, for example, while copying entries from another marker, for adding dependency (for example, to add an event’s entries to a stream’s, if a stream needs to wait on the event.
Code
Marker type
/**
* \brief Different types of Marker Entry
*
*/
typedef enum
{
CU_CTX_MARKER_ENTRY_TYPE_CHANNEL_TRACKING_SEMAPHORE = 1,
CU_CTX_MARKER_ENTRY_TYPE_PENDING_QMD = 3,
CU_CTX_MARKER_ENTRY_TYPE_CROSS_CTX_SEMAPHORE = 4,
CU_CTX_MARKER_ENTRY_TYPE_NVN_SEMAPHORE = 5,
CU_CTX_MARKER_ENTRY_TYPE_UVM_SEMAPHORE = 6,
CU_CTX_MARKER_ENTRY_TYPE_CPU_SEMAPHORE = 16,
CU_CTX_MARKER_ENTRY_TYPE_INVALID = 32,
} CUctxMarkerEntryType;
/*
* \brief The state of a channel in a context marker
*
*/
struct CUctxMarkerChannel_st
{
//! The channel being tracked
CUnvchannel *channel;
//! Value to compare against a channel's tracking semaphore value
NvU64 trackingSemaphoreValue;
};
/**
* \brief A launched QMD tracked by a marker
*/
struct CUctxMarkerPendingQMD_st
{
//! The launched QMD
CUqmd *qmd;
//! The id of the launch, used to calculate the QMD semaphore completion value
NvU64 launchId;
//! The channel marker of the launch
CUctxMarkerChannel launch;
};
/**
* \brief A semaphore released by another CUcontext
*/
struct CUctxMarkerCrossCtxSema_st
{
//! The tracking semaphore
CUcrossCtxSema *crossCtxSema;
//! Value to compare against the tracking semaphore value
NvU64 trackingSemaphoreValue;
};
// A semaphore released from CPU
struct CUctxMarkerCpuSema_st
{
//! Host payload value
volatile NvU32 *payload;
//! Value to compare against the semaphore payload
NvU32 completionValue;
// Memobj associated with CPU semaphore page
CUmemobj *memobj;
// Offset to the CPU semaphore inside page
NvU64 offsetFromPage;
};
/**
* \brief A semaphore released by graphics for interop
*/
struct CUctxMarkerNVNInterop_st
{
//! NVN Sync Data
CUINVNSyncData *syncData;
//! Should the sync data use the nvn interop sync or the cuda semaphore.
NvBool useInteropSync;
//! The tracking semaphore value of the sync.
NvU64 trackingSemaphoreValue;
};
// A semaphore used for UVM prefetching
struct CUctxMarkerUvmSema_st
{
//! UVM semaphore
NvU64 offset;
//! Value to compare against the tracking semaphore value
NvU64 trackingSemaphoreValue;
//! Associated stream, if any
CUIstream *stream;
};
Marker 定义
CUctxMarker_st管理着一个array CUctxMarkerEntry_st
/**
*
*/
struct CUctxMarkerEntry_st
{
CUctxMarkerEntryType type;
union {
CUctxMarkerChannel channelTrackSemVal;
CUctxMarkerPendingQMD pendingQmd;
CUctxMarkerCrossCtxSema crossCtxSem;
CUctxMarkerCpuSema cpuSem;
CUctxMarkerNVNInterop graphicsSem;
CUctxMarkerUvmSema uvmSema;
} data;
};
/*
* \brief context-wide tracking marker
*
*/
struct CUctxMarker_st
{
//! The channel manager which is the parent of all channels waited on
CUchannelManager *channelManager;
//! Array of entries
NvU32 numEntries;
CUctxMarkerEntry *entries;
//! Static array of entries
NvU32 numEntriesMax;
CUctxMarkerEntry entriesStatic[CU_CTX_MARKER_NUM_ENTRIES_STATIC];
};
函数
/**
* \fn CUresult ctxMarkerCreate(CUctxMarker** ctxMarker, CUchannelManager* channelManager);
* \brief Allocates and an initializes a new marker
*
* \detailDescription This function allocates and initializes the memory of a new marker
* and assigns to ctxMarker. The function assumes that ctxMarker is Valid.
*
* \param[out] ctxMarker context marker , None
* \param[in] channelManager channel Manager, None
*
* \return CUresult This captures whether the function was successful or not.
*
* \retval
* CUDA_SUCCESS
* \retval
* CUDA_ERROR_OUT_OF_MEMORY
*
*\additionalNotes
* Destroys the marker when the memory allocation fails.
*\additionalNotes
* Initializes the marker with default values.
* \notesync
* \noteReentrant{pool->mutex,pool}
*
* \endfn
*/
CUDA_TEST_EXPORT CUresult ctxMarkerCreate(CUctxMarker** ctxMarker, CUchannelManager* channelManager);
/**
*
* \brief Destroys the marker
*
* \detailDescription This function deallocates the marker.
*
* \param[in] ctxMarker Context marker which will be destroyed
*
* \return void
*
* \additionalNotes
* ctxMarkerDestroy() shall just return without doing anything if ctxMarker is NULL.
* \additionalNotes
* ctxMarkerDestroy() shall call de-initialize on the marker before freeing the memory
* for it. This shall free any dynamic memory allocated for marker entries.
*
*
* \endfn
*/
CUDA_TEST_EXPORT void ctxMarkerDestroy(CUctxMarker* ctxMarker);
/**
* \brief This function initializes the members of the marker
*
* \detailDescription This function just initializes the members of the CUctxMarker struct.
* to the default values. The function assumes that ctxMarker is valid.
*
* \param ctxMarker marker which shall be initialized
* \param channelManager channel manager to which marker should be associated with
*
* \return void
*
* \additionalNotes
* ctxMarkerInitialize() shall set default values for all fields i.e
* numEntries to 0, numEntriesMax to CU_CTX_MARKER_NUM_ENTRIES_STATIC in the marker struct.
*
*\endfn
*/
CUDA_TEST_EXPORT void ctxMarkerInitialize(CUctxMarker *ctxMarker, CUchannelManager* channelManager);
/**
* \brief Function ctxMarkerDeinitialize de-initializes the marker
*
* \detailDescription This function frees any dynamic memory allocated for the marker
* entries by checking if ctxMarker->entries != ctxMarker->entriesStatic.
* It de-initializes all values by setting to 0/NULL. This function assumes that ctxMarker
* is valid.
*
* \param ctxMarker marker which should be de-initialized
*
* \return void
*
* \additionalNotes
* ctxMarkerDeinitialize() shall free any dynamic memory allocated for the marker entries.
* \additionalNotes
* ctxMarkerDeinitialize() does not validate channel manager.
*
* \endfn
*/
CUDA_TEST_EXPORT void ctxMarkerDeinitialize(CUctxMarker *ctxMarker);
CUresult
ctxMarkerCreate(CUctxMarker** pCtxMarker, CUchannelManager* channelManager)
{
CUresult status;
CUctxMarker* ctxMarker;
CU_TRACE_FUNCTION();
ctxMarker = (CUctxMarker* )malloc(sizeof(CUctxMarker));
if (ctxMarker == NULL) {
CU_ERROR_PRINT(("Failed to allocate ctxMarker\n"));
status = CUDA_ERROR_OUT_OF_MEMORY;
goto Error;
}
ctxMarkerInitialize(ctxMarker, channelManager);
*pCtxMarker = ctxMarker;
return CUDA_SUCCESS;
Error:
ctxMarkerDestroy(ctxMarker);
return status;
}
void
ctxMarkerDestroy(CUctxMarker* ctxMarker)
{
CU_TRACE_FUNCTION();
if (ctxMarker) {
ctxMarkerDeinitialize(ctxMarker);
free(ctxMarker);
}
}
void
ctxMarkerInitialize(CUctxMarker *ctxMarker, CUchannelManager* channelManager)
{
ctxMarker->channelManager = channelManager;
ctxMarker->numEntries = 0;
ctxMarker->numEntriesMax = CU_CTX_MARKER_NUM_ENTRIES_STATIC;
ctxMarker->entries = ctxMarker->entriesStatic;
}
void
ctxMarkerDeinitialize(CUctxMarker *ctxMarker)
{
CU_ASSERT(ctxMarker);
if (ctxMarker->entries != ctxMarker->entriesStatic) {
free(ctxMarker->entries);
}
ctxMarker->channelManager = NULL;
ctxMarker->entries = NULL;
ctxMarker->numEntries = 0;
ctxMarker->numEntriesMax = 0;
}
CUresult
ctxMarkerAppendEntry(CUctxMarkerEntry **outEntry, CUctxMarker *ctxMarker)
{
CUresult status = CUDA_SUCCESS;
CU_TRACE_FUNCTION();
if (ctxMarker->numEntries == ctxMarker->numEntriesMax) {
// Double the size of the marker array, to make future marker additions less likely to reallocate
status = ctxMarkerResize(ctxMarker, ctxMarker->numEntries * 2);
if (status != CUDA_SUCCESS) {
CU_ERROR_PRINT(("Failed to allocate space for new marker entries\n"));
return status;
}
}
*outEntry = &ctxMarker->entries[ctxMarker->numEntries++];
return CUDA_SUCCESS;
}
其它代码
/**
*
* \brief ctxMarkerWaitForAndRemoveHostItmes removes the host items
* from the marker.
*
* \detailDescription This function waits for all the marker entries with the type
* CU_CTX_MARKER_ENTRY_TYPE_CPU_SEMAPHORE to complete and marks them invalid and calls
* ctxMarkerRemoveInvalidEntries.
* This function assumes that ctxMarker is valid.
*
* \param[in] ctxMarker Marker on which to operate
*
* \return void
*
* \additionalNotes
* The marker shall not be left with any entries of type
* CU_CTX_MARKER_ENTRY_TYPE_CPU_SEMAPHORE after the function returns.
* \additionalNotes
* ctxMarkerWaitForAndRemoveHostItems shall only adjust the marker entries such that the
* host entries are removed from tracking but it shall not free memory allocated for
* those entries.
* \additionalNotes
* The marker entries of type other than CU_CTX_MARKER_ENTRY_TYPE_CPU_SEMAPHORE shall not
* be disturbed including their order in the original marker.
*
* \endfn
*/
CUDA_TEST_EXPORT void ctxMarkerWaitForAndRemoveHostItems(CUctxMarker *ctxMarker);
/** \brief Push an awaken after any device work in the marker. This does not wait on
* CPU work.
*
* \detailDescription
* \param[in] marker The marker in which an awaken is pushed.
*
* \return CUresult
*
* \endfn
*
*/
CUDA_TEST_EXPORT CUresult ctxMarkerPushAwaken(CUctxMarker *marker);
/**
* \brief Determine if a marker should spin or yield while waiting to complete.
* \detailDescription This function can be used to find out if a marker should
* spin or yield while waiting to synchronize. If stream is NULL, it uses context
* sched flags to determine the wait behavior, else it uses the stream sched flags.
* \param[in] ctx The context used to determine wait behavior.
* \param[in] stream The stream used to determine wait behavior.
*
* \result CUctxMarkerWaitBehavior Marker wait behavior which can be passed to ctxMarkerWait()
*
* \additionalNotes
* If the stream sched flag is CU_STREAM_SCHED_AUTO, the function will use context flags
* to determine wait behavior.
*
* \endfn
*/
CUDA_TEST_EXPORT CUctxMarkerWaitBehavior ctxMarkerShouldYieldInSpinLoops(CUctx *ctx, CUIstream *stream);
/**
* \brief Wait for the marker to complete
* \detailDescription ctxMarkerWait() waits for the marker to complete. This function
* assumes that the ctxMarker is valid.
* \param ctxMarker marker on which to wait upon
* \param waitBehavior the kind of wait to perform
*
* \return CUresult the result of the function
*
* \retval CUDA_SUCCESS
* \retval CUDA_ERROR_LAUNCH_FAILED
* \retval CUDA_ERROR_LAUNCH_TIMEOUT
* \retval CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES
* \retval CUDA_ERROR_ILLEGAL_ADDRESS
* \retval CUDA_ERROR_ECC_UNCORRECTABLE
* \retval CUDA_ERROR_MAP_FAILED
* \retval CUDA_ERROR_UNMAP_FAILED
* \retval CUDA_ERROR_NOT_INITIALIZED
* \retval CUDA_ERROR_DEINITIALIZED
* \retval CUDA_ERROR_UNKNOWN
*
* \additionalNotes
* ctxMarkerWait shall return CUDA_ERROR_INVALID_VALUE if the waitBehavior value is outside
* of the values defined by waitBehaviour enum.
* \additionalNotes
* The function shall check for any errors which might have occurred in the cuda context
* during the previous operations and report back the errors.
* \additionalNotes
* \notesync
* \additionalNotes
* The function requires that DPC is woken up after ctxMarker completion when
* CU_CTX_MARKER_WAIT_BLOCKING_SYNC is used and NvRmSync is not used.
* \additionalNotes
* It flushes the marker before doing the wait.
* \additionalNotes
* It waits on the CPU for the GPU to pass the ctxMarker.
* \additionalNotes
* This will lock the context to do pushes and flushes.
*
* \endfn
*/
CUDA_TEST_EXPORT CUresult ctxMarkerWait(CUctxMarker* ctxMarker, CUctxMarkerWaitBehavior waitBehavior);
/**
* \brief Flush all the methods to the gpu
*
* \detailDescription Flushes the work to the GPU if it finds unflushed work in ctxMarker.
* This function is a no-op for all entry types except
* CU_CTX_MARKER_ENTRY_TYPE_CHANNEL_TRACKING_SEMAPHORE and
* CU_CTX_MARKER_ENTRY_TYPE_PENDING_QMD. This function assumes that ctxMarker is valid.
*
*
* \param[in] ctxMarker Marker on which to operate
*
* \return void
*
* \additionalNotes
* The marker shall not be left with any unflushed work after the function returns.
* \additionalNotes
* The function shall not modify the marker or any of it's entries.
*
*
*\endfn
*/
void ctxMarkerFlush(CUctxMarker *ctxMarker);
/**
*
* \brief Returns the status of the given marker
*
* \detailDescription The function returns the status of the marker which
* is the minimum of the status of all the marker entries in the marker.
* The function also flushes the methods in the marker if not yet flushed
* and removes the finished entries before getting the status. This
* function assumes both ctxMarker and status are valid.
*
* \param[out] status Status for the marker
* \param[in] ctxMarker Marker for which the status is queried
* \param[in] flags The additional flags which specify any operations needed to be done
* before/after querying marker
*
*
* \return CUresult the result of the function
*
* \retval CUDA_SUCCESS
* \retval CUDA_ERROR_LAUNCH_FAILED
* \retval CUDA_ERROR_LAUNCH_TIMEOUT
* \retval CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES
* \retval CUDA_ERROR_ILLEGAL_ADDRESS
* \retval CUDA_ERROR_ECC_UNCORRECTABLE
* \retval CUDA_ERROR_MAP_FAILED
* \retval CUDA_ERROR_UNMAP_FAILED
* \retval CUDA_ERROR_NOT_INITIALIZED
* \retval CUDA_ERROR_DEINITIALIZED
* \retval CUDA_ERROR_UNKNOWN
*
* \additionalNotes
* ctxMarkerGetStatus shall check for any errors which might have occurred in the cuda
* context during the previous operations and report back the errors
* \notesync
* \additionalNotes
* The function flushes the marker if it is required, but not yet flushed.
* \additionalNotes
* If CU_CTX_MARKER_FLAGS_NO_CHANNEL_FLUSH is not specified then the
* caller must hold the context lock (otherwise, no locks need be held by
* caller, function is re-entrant)
* \additionalNotes
* Checks the channel error notifier when status is CU_CTX_MARKER_COMPLETED_BY_GPU
*\endfn
*/
CUDA_TEST_EXPORT CUresult ctxMarkerGetStatus(CUctxMarkerStatus *status, CUctxMarker *ctxMarker, NvU32 flags);
/**
* \brief Sets the ctxMarkerA to the later point in time of ctxMarkerA and ctxMarkerB
*
* \detailDescription
* Merges both input markers into ctxMarkerA in a way that ctxMarkerA captures both the
* markers.
*
* \param[out] ctxMarkerA Marker which will store max of both markers
* \param[in] ctxMarkerB The other markers
*
* \return CUresult The result of the function.
*
* \retval
* CUDA_SUCCESS
* \retval CUDA_ERROR_OUT_OF_MEMORY
*
* \additionalNotes
* The function shall not modify ctxMarkerB or any of it's entries.
* \additionalNotes
* The function shall return CUDA_ERROR_OUT_OF_MEMORY if the malloc fails.
* \additionalNotes
* The marker ctxMarkerA might be resize to accommodate all the entries.
*
* \endfn
*/
CUDA_TEST_EXPORT CUresult ctxMarkerSetMax(CUctxMarker *ctxMarkerA, CUctxMarker *ctxMarkerB);
/**
* \brief Removes invalid entries in the marker.
*
* \detailDescription This function removes all entries
* from the marker whose type is CU_CTX_MARKER_ENTRY_TYPE_INVALID. It doesn't
* change the order of the rest of the entries. This function assumes the
* ctxMarker is valid.
*
*\param[in] ctxMarker Marker on which to operate, pointer
*
*\return void None
*
* \additionalNotes
* Order of Valid Entries will not be changed.
* \additionalNotes
* Memory used by the invalid will not be freed.
*
*
*\endfn
*
*/
CUDA_TEST_EXPORT void ctxMarkerRemoveInvalidEntries(CUctxMarker *ctxMarker);
/**
*
* \brief Remove entries of ctxMarkerB from ctxMarkerA.
*
* \detailDescription Removes all marker entries of ctxMarkerA which are redundant in
* ctxMarkerB or already captured by ctxMarkerB. Redundancy is established by comparing
* the entry id and value. If ctxMarkerA.id == ctxMarkerB.id &&
* ctxMarkerA.value <= ctxMarkerB.value, then that entry in ctxMarkerA is removed.
* This function assumes both ctxMarkerA and ctxMarkerB are valid and sorted.
*
* \param[in] ctxMarkerA Marker from which the second marker needs to be removed.
* \param[in] ctxMarkerB The marker which needs to be removed from the first marker.
*
* \additionalNotes
* The function shall not modify ctxMarkerB or any of it's entries.
* \additionalNotes
* ctxMarkerRemoveMarker shall only adjust the marker entries of ctxMarkerA such that
* redundant entries are removed from tracking but it shall not free memory allocated
* for those entries.
* \additionalNotes
* The non-redundant marker entries shall not be disturbed including their order in the
* original marker.
*
* \endfn
*
*/
CUDA_TEST_EXPORT void ctxMarkerRemoveMarker(CUctxMarker *ctxMarkerA, CUctxMarker *ctxMarkerB);
/**
*
* \brief Copies the srcMarker to dstMarker.
*
* \detailDescription It resizes dstMarker reallocating memory into it, if
* sufficient memory is not available for copying.
* This function assumes that both srcMarker and dstMarker are valid and sorted.
*
* \param[out] dstMarker The destination marker in which to copy
* \param[in] srcMarker The source marker from which to copy
*
* \return CUresult the result of the function
*
* \retval
* CUDA_SUCCESS
* \retval CUDA_ERROR_OUT_OF_MEMORY
*
* \additionalNotes
* The function shall not modify ctxMarkerB or any of it's entries.
* \additionalNotes
* The number of entries of destination marker are updated with the source
* marker entries.
* \endfn
*/
CUDA_TEST_EXPORT CUresult ctxMarkerCopy(CUctxMarker *dstMarker, const CUctxMarker *srcMarker);
/**
* \brief Clears the markers or set the ctxMarker to track nothing.
*
* \detailDescription Clears the markers. The marker has no entries after this function.
* This function assumes that ctxMarker is valid.
*
* \param[in] ctxMarker marker on which to operate.
*
* \return void
*
* \additionalNotes
* The function shall not change any other fields of the marker including the memory
* allocations.
*
*
* \endfn
*/
CUDA_TEST_EXPORT void ctxMarkerClear(CUctxMarker *ctxMarker);
/**
* \brief Gets the channel manager which a marker uses.
*
* \detailDescription This function returns marker->channelManager.
* This function assumes that ctxMarker is valid.
*
* \param[in] marker Marker on which to operate
*
* \return CUchannelManager * the channel manager associated with the marker.
*
* \additionalNotes
* Valid channel Manager shall be returned.
*
*
* \endfn
*
*/
CUDA_TEST_EXPORT CUchannelManager *ctxMarkerGetChannelManager(const CUctxMarker *marker);
/**
* \brief Get the context of the marker
*
* \detailDescription This function returns the context of the marker. The function
* expects that the marker to be valid.
*
* \param[in] marker The marker on which to operate.
*
* \return CUctx the marker's context.
*
* \endfn
*
*/
CUDA_TEST_EXPORT CUctx *ctxMarkerGetCtx(const CUctxMarker *marker);
/**
*
* \brief Wait for markerAcquire on stream from a different context.
*
* \detailDescription ctxMarkerAcquireRemote function updates the stream with the latest
* barrier and if the \p markerToAcquire has completed it returns early. It selects a channel
* on which has to wait from the stream, it is generally the last channel on which it pushed
* to avoid the extra semaphore acquire/release. A semaphore wait on the channel is done for
* every entry in the \p markerToAcquire. The channel also does an acquire for every entry
* type in the marker.
*
*
* \param[in] stream The stream on which to wait.
* \param[in] markerToAcquire The marker which should be acquired.
*
* \return CUresult the result of the function
*
* \retval CUDA_SUCCESS
* \retval CUDA_ERROR_NOT_SUPPORTED
* \retval CUDA_ERROR_LAUNCH_FAILED
* \retval CUDA_ERROR_LAUNCH_TIMEOUT
* \retval CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES
* \retval CUDA_ERROR_ILLEGAL_ADDRESS
* \retval CUDA_ERROR_ECC_UNCORRECTABLE
* \retval CUDA_ERROR_MAP_FAILED
* \retval CUDA_ERROR_UNMAP_FAILED
* \retval CUDA_ERROR_NOT_INITIALIZED
* \retval CUDA_ERROR_DEINITIALIZED
* \retval CUDA_ERROR_UNKNOWN.
*
* \additionalNotes
* ctxMarkerAcquireRemote function adds a barrier if the stream is NullStream or BarrierStream
* \additionalNotes
* The ctxMarkerGetStatus of the markerToAcquire should not return the status
* CU_CTX_MARKER_METHODS_FLUSHED_TO_GPU.
* \additionalNotes
* ctxMarkerAcquireRemote function does a host-side synchronization if the hardware
*
* \endfn
*/
CUresult ctxMarkerAcquireRemote(CUIstream *stream, CUctxMarker *markerToAcquire);
/**
* \brief Release a semaphore on a Remote Context
*
* \detailDescription ctxMarkerReleaseRemote function creates the markerOfSignal to
* be marker in the ctxToWait which has the property that waiting on that marker will
* have the effect of waiting on all the work pushed in streamToSignal of the other context.
* It pushes a semaphore acquire on the cross context before releasing a semaphore on the
* channel. It updates \p markerOfSignal with the details of the cross context semaphore
* release.
*
*
* \param[in] streamToSignal The stream on which the semaphore is released
* \param[in] ctxToWait This is the other context on which it should wait
* \param[out] markerOfSignal This is the marker in the ctxToWait which contains the
* cross context semaphore release.
* \param[in] streamToWaitHint The stream of the other context in which it should wait.
*
* \return CUresult the result of the function.
*
* \retval
* CUDA_SUCCESS
*
* \additionalNotes
* ctxMarkerReleaseRemote function expects that the ctxToWait and streamToSignal's context
* are different.
* \additionalNotes
* If the streamToWaitHint is set, it expects that the streamToWaitHint->ctx and ctxToWait
* are same
* \additionalNotes
* The function is re-entrant, it takes the markerMutex lock before doing any operation on the
* crossCtxSema.
*
* \endfn
*/
CUDA_TEST_EXPORT CUresult ctxMarkerReleaseRemote(CUIstream *streamToSignal, CUctx *ctxToWait, CUctxMarker *markerOfSignal, CUIstream *streamToWaitHint);
33

被折叠的 条评论
为什么被折叠?



