Video Codec (6): The Main cuvid APIs of NVDEC

jrglinux

Category: Video Codec; Tags: Video Codec

Article link: https://blog.csdn.net/qq_23662505/article/details/131840889

1. Creating a decoder: cuvidCreateDecoder

/*****************************************************************************************************/
//! \fn CUresult CUDAAPI cuvidCreateDecoder(CUvideodecoder *phDecoder, CUVIDDECODECREATEINFO *pdci)
//! Create the decoder object based on pdci. A handle to the created decoder is returned
/*****************************************************************************************************/
extern CUresult CUDAAPI cuvidCreateDecoder(CUvideodecoder *phDecoder, CUVIDDECODECREATEINFO *pdci);

CUresult

CUresult is the return type used by CUDA driver API functions. It is an enum whose value tells you whether a call succeeded and, if it failed, why.

typedef enum cudaError_enum {
    CUDA_SUCCESS = 0,
    CUDA_ERROR_INVALID_VALUE = 1,
    CUDA_ERROR_OUT_OF_MEMORY = 2,
    //...
} CUresult;
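
Since every call returns a CUresult, it helps to route results through one small checking helper. The sketch below is only an illustration: the DemoCUresult enum and the demo_* functions are hypothetical stand-ins that mirror the three values quoted above, so that the snippet compiles without cuda.h. In real code you would include cuda.h and could obtain the name from the driver API's cuGetErrorName() instead.

```c
#include <stdio.h>

/* Local stand-in for the three CUresult values quoted above, so this
 * sketch compiles without cuda.h. Real code would include <cuda.h>. */
typedef enum {
    DEMO_CUDA_SUCCESS = 0,
    DEMO_CUDA_ERROR_INVALID_VALUE = 1,
    DEMO_CUDA_ERROR_OUT_OF_MEMORY = 2
} DemoCUresult;

/* Hypothetical helper: map a result code to a readable name. */
const char *demo_result_name(DemoCUresult rc)
{
    switch (rc) {
    case DEMO_CUDA_SUCCESS:             return "CUDA_SUCCESS";
    case DEMO_CUDA_ERROR_INVALID_VALUE: return "CUDA_ERROR_INVALID_VALUE";
    case DEMO_CUDA_ERROR_OUT_OF_MEMORY: return "CUDA_ERROR_OUT_OF_MEMORY";
    default:                            return "unknown CUresult";
    }
}

/* Hypothetical helper: log a failure to stderr and return 1 on success,
 * 0 on failure, so callers can write `if (!demo_check(...)) return;`. */
int demo_check(DemoCUresult rc, const char *what)
{
    if (rc != DEMO_CUDA_SUCCESS) {
        fprintf(stderr, "%s failed: %s (%d)\n",
                what, demo_result_name(rc), (int)rc);
        return 0;
    }
    return 1;
}
```

A caller would wrap each API call, for example `if (!demo_check(rc, "cuvidCreateDecoder")) return;`, so that the failure reason is never silently dropped.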

CUDAAPI

CUDAAPI is a macro used in the CUDA driver API headers to specify the calling convention for functions. It is defined as __stdcall on Windows and is empty on other platforms. It appears in the declarations of CUDA API functions and in function pointer types. On Linux, CUDAAPI can simply be treated as empty.

CUvideodecoder

typedef void *CUvideodecoder;

CUvideodecoder is a void * pointer type, so the first parameter of cuvidCreateDecoder():

CUvideodecoder *phDecoder is a pointer to a pointer (void **).

The first argument therefore needs care when calling cuvidCreateDecoder():

CUvideodecoder decoder = NULL;
cuvidCreateDecoder(&decoder, &vdci);  // pass the address of decoder; vdci is a filled-in CUVIDDECODECREATEINFO
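
The reason for passing an address is that the function returns the new handle through that pointer (an "out parameter"). The following self-contained sketch shows the same pattern with hypothetical stand-ins: DemoHandle, demo_create() and demo_destroy() are not part of any NVIDIA API, and the return codes are chosen for illustration only.

```c
#include <stdlib.h>

/* Hypothetical opaque handle, declared the same way as CUvideodecoder. */
typedef void *DemoHandle;

/* Hypothetical stand-in for a cuvidCreateDecoder-style function:
 * it writes the new handle through ph and returns 0 on success. */
int demo_create(DemoHandle *ph)
{
    if (ph == NULL)
        return 1;                 /* nowhere to store the handle */
    *ph = malloc(1);              /* pretend this is the decoder object */
    return (*ph != NULL) ? 0 : 2; /* 2 stands for "out of memory" here */
}

/* Hypothetical counterpart that releases the handle. */
void demo_destroy(DemoHandle h)
{
    free(h);
}
```

Calling `demo_create(&h)` fills in `h`; passing `h` by value instead would leave the caller's variable unchanged, which is exactly the mistake the note above warns about.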

CUVIDDECODECREATEINFO

This structure carries the main input parameters used when creating a decoder.

/**************************************************************************************************************/
//! \struct CUVIDDECODECREATEINFO
//! This structure is used in cuvidCreateDecoder API
/**************************************************************************************************************/
typedef struct _CUVIDDECODECREATEINFO
{
    unsigned long ulWidth;              /**< IN: Coded sequence width in pixels */
    unsigned long ulHeight;             /**< IN: Coded sequence height in pixels */
    unsigned long ulNumDecodeSurfaces;  /**< IN: Maximum number of internal decode surfaces */
    cudaVideoCodec CodecType;           /**< IN: cudaVideoCodec_XXX */
    cudaVideoChromaFormat ChromaFormat; /**< IN: cudaVideoChromaFormat_XXX */
    unsigned long ulCreationFlags;      /**< IN: Decoder creation flags (cudaVideoCreateFlags_XXX)*/
    unsigned long bitDepthMinus8;       /**< IN: The value "BitDepth minus 8" */
    unsigned long ulIntraDecodeOnly;    /**< IN: Set 1 only if video has all intra frames (default value is 0). This will optimize video memory for Intra frames only decoding. The support is limited to specific codecs - H264, HEVC, VP9, the flag will be ignored for codecs which are not supported. However decoding might fail if the flag is enabled in case of supported codecs for regular bit streams having P and/or B frames. */
    unsigned long ulMaxWidth;           /**< IN: Coded sequence max width in pixels used with reconfigure Decoder */
    unsigned long ulMaxHeight;          /**< IN: Coded sequence max height in pixels used with reconfigure Decoder */                                           
    unsigned long Reserved1;            /**< Reserved for future use - set to zero */
    /**
    * IN: area of the frame that should be displayed
    */
    struct {
        short left;
        short top;
        short right;
        short bottom;
    } display_area;

    cudaVideoSurfaceFormat OutputFormat;       /**< IN: cudaVideoSurfaceFormat_XXX  */
    cudaVideoDeinterlaceMode DeinterlaceMode;  /**< IN: cudaVideoDeinterlaceMode_XXX */
    unsigned long ulTargetWidth;               /**< IN: Post-processed output width (Should be aligned to 2) */
    unsigned long ulTargetHeight;              /**< IN: Post-processed output height (Should be aligned to 2)          */
    unsigned long ulNumOutputSurfaces;         /**< IN: Maximum number of output surfaces simultaneously mapped        */
    CUvideoctxlock vidLock;                    /**< IN: If non-NULL, context lock used for synchronizing ownership of  the cuda context. Needed for cudaVideoCreate_PreferCUDA decode     */
    /**
    * IN: target rectangle in the output frame (for aspect ratio conversion)
    * if a null rectangle is specified, {0,0,ulTargetWidth,ulTargetHeight} will be used
    */
    struct {
        short left;
        short top;
        short right;
        short bottom;
    } target_rect;

    unsigned long enableHistogram;             /**< IN: enable histogram output, if supported */
    unsigned long Reserved2[4];                /**< Reserved for future use - set to zero */
} CUVIDDECODECREATEINFO;
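
To make the field comments more concrete, here is a minimal sketch of how such a structure is typically prepared: zero everything first (the reserved fields must be zero), then fill in the inputs. DemoCreateInfo is a cut-down, hypothetical mirror of a few fields so that the snippet compiles without nvcuvid.h, and the surface counts 8 and 2 are arbitrary example values, not recommendations.

```c
#include <string.h>

/* Cut-down, hypothetical mirror of a few CUVIDDECODECREATEINFO fields.
 * Field names follow the struct above; the real type has more members. */
typedef struct {
    unsigned long ulWidth, ulHeight;
    unsigned long ulNumDecodeSurfaces;
    unsigned long bitDepthMinus8;
    unsigned long ulTargetWidth, ulTargetHeight;
    unsigned long ulNumOutputSurfaces;
    struct { short left, top, right, bottom; } display_area;
    unsigned long Reserved2[4];
} DemoCreateInfo;

/* Round up to the next even value ("Should be aligned to 2"). */
unsigned long demo_align2(unsigned long v)
{
    return (v + 1UL) & ~1UL;
}

/* Fill the structure for a stream of w x h pixels at the given bit depth. */
void demo_fill_create_info(DemoCreateInfo *ci, unsigned long w,
                           unsigned long h, unsigned long bit_depth)
{
    memset(ci, 0, sizeof(*ci));          /* reserved fields stay zero */
    ci->ulWidth  = w;
    ci->ulHeight = h;
    ci->bitDepthMinus8 = bit_depth - 8;  /* 8-bit -> 0, 10-bit -> 2 */
    ci->ulNumDecodeSurfaces = 8;         /* example value (assumption) */
    ci->ulNumOutputSurfaces = 2;         /* example value (assumption) */
    ci->ulTargetWidth  = demo_align2(w);
    ci->ulTargetHeight = demo_align2(h);
    ci->display_area.right  = (short)w;  /* show the full coded frame */
    ci->display_area.bottom = (short)h;
}
```

With the real header, the same steps would be applied to a CUVIDDECODECREATEINFO variable before passing its address to cuvidCreateDecoder().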

cudaVideoCodec

The cudaVideoCodec enum lists the codecs that cuvid can decode.

/*********************************************************************************/
//! \enum cudaVideoCodec
//! Video codec enums
//! These enums are used in CUVIDDECODECREATEINFO and CUVIDDECODECAPS structures
/*********************************************************************************/
typedef enum cudaVideoCodec_enum {
    cudaVideoCodec_MPEG1=0,                                         /**<  MPEG1             */
    cudaVideoCodec_MPEG2,                                           /**<  MPEG2             */
    cudaVideoCodec_MPEG4,                                           /**<  MPEG4             */
    cudaVideoCodec_VC1,                                             /**<  VC1               */
    cudaVideoCodec_H264,                                            /**<  H264              */
    cudaVideoCodec_JPEG,                                            /**<  JPEG              */
    cudaVideoCodec_H264_SVC,                                        /**<  H264-SVC          */
    cudaVideoCodec_H264_MVC,                                        /**<  H264-MVC          */
    cudaVideoCodec_HEVC,                                            /**<  HEVC              */
    cudaVideoCodec_VP8,                                             /**<  VP8               */
    cudaVideoCodec_VP9,                                             /**<  VP9               */
    cudaVideoCodec_AV1,                                             /**<  AV1               */
    cudaVideoCodec_NumCodecs,                                       /**<  Max codecs        */
    // Uncompressed YUV
    cudaVideoCodec_YUV420 = (('I'<<24)|('Y'<<16)|('U'<<8)|('V')),   /**< Y,U,V (4:2:0)      */
    cudaVideoCodec_YV12   = (('Y'<<24)|('V'<<16)|('1'<<8)|('2')),   /**< Y,V,U (4:2:0)      */
    cudaVideoCodec_NV12   = (('N'<<24)|('V'<<16)|('1'<<8)|('2')),   /**< Y,UV  (4:2:0)      */
    cudaVideoCodec_YUYV   = (('Y'<<24)|('U'<<16)|('Y'<<8)|('V')),   /**< YUYV/YUY2 (4:2:2)  */
    cudaVideoCodec_UYVY   = (('U'<<24)|('Y'<<16)|('V'<<8)|('Y'))    /**< UYVY (4:2:2)       */
} cudaVideoCodec;
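
The uncompressed YUV entries are FourCC codes: four ASCII characters packed into one 32-bit value, most significant byte first. The sketch below reproduces that packing so the numeric values can be checked; demo_fourcc() and demo_fourcc_to_text() are hypothetical helper names, not SDK functions.

```c
#include <stdint.h>

/* Pack four characters the same way the enum above does:
 * (a << 24) | (b << 16) | (c << 8) | d. */
uint32_t demo_fourcc(char a, char b, char c, char d)
{
    return ((uint32_t)(unsigned char)a << 24) |
           ((uint32_t)(unsigned char)b << 16) |
           ((uint32_t)(unsigned char)c << 8)  |
            (uint32_t)(unsigned char)d;
}

/* Unpack a FourCC into a NUL-terminated string (out needs 5 bytes). */
void demo_fourcc_to_text(uint32_t code, char out[5])
{
    out[0] = (char)((code >> 24) & 0xFF);
    out[1] = (char)((code >> 16) & 0xFF);
    out[2] = (char)((code >> 8) & 0xFF);
    out[3] = (char)(code & 0xFF);
    out[4] = '\0';
}
```

For example, 'N','V','1','2' packs to 0x4E563132. Because these values are far larger than cudaVideoCodec_NumCodecs, they cannot collide with the sequential codec IDs at the top of the enum.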

cudaVideoChromaFormat

An enum that defines the chroma format: monochrome, 4:2:0, 4:2:2 or 4:4:4.

/**************************************************************************************************************/
//! \enum cudaVideoChromaFormat
//! Chroma format enums
//! These enums are used in CUVIDDECODECREATEINFO and CUVIDDECODECAPS structures
/**************************************************************************************************************/
typedef enum cudaVideoChromaFormat_enum {
    cudaVideoChromaFormat_Monochrome=0,  /**< MonoChrome */
    cudaVideoChromaFormat_420,           /**< YUV 4:2:0  */
    cudaVideoChromaFormat_422,           /**< YUV 4:2:2  */
    cudaVideoChromaFormat_444            /**< YUV 4:4:4  */
} cudaVideoChromaFormat;

cudaVideoSurfaceFormat

An enum that defines the output surface format, such as NV12, P016, YUV444 and YUV444_16Bit.

/*********************************************************************************/
//! \enum cudaVideoSurfaceFormat
//! Video surface format enums used for output format of decoded output
//! These enums are used in CUVIDDECODECREATEINFO structure
/*********************************************************************************/
typedef enum cudaVideoSurfaceFormat_enum {
    cudaVideoSurfaceFormat_NV12=0,          /**< Semi-Planar YUV [Y plane followed by interleaved UV plane]     */
    cudaVideoSurfaceFormat_P016=1,          /**< 16 bit Semi-Planar YUV [Y plane followed by interleaved UV plane]. Can be used for 10 bit(6LSB bits 0), 12 bit (4LSB bits 0)      */
    cudaVideoSurfaceFormat_YUV444=2,        /**< Planar YUV [Y plane followed by U and V planes]*/
    cudaVideoSurfaceFormat_YUV444_16Bit=3,  /**< 16 bit Planar YUV [Y plane followed by U and V planes].Can be used for 10 bit(6LSB bits 0), 12 bit (4LSB bits 0)*/
} cudaVideoSurfaceFormat;
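
The four formats differ in chroma layout and in bytes per sample, which determines how much memory one decoded frame needs. The sketch below is a rough illustration under stated assumptions: it ignores row pitch and alignment, the DemoSurfaceFormat enum is a local stand-in for the values above, and the selection rule in demo_pick_format() (16-bit format whenever bitDepthMinus8 is nonzero) is a common convention rather than something this article's source prescribes.

```c
#include <stddef.h>

/* Local stand-in for the four cudaVideoSurfaceFormat values above. */
typedef enum {
    DEMO_NV12 = 0,
    DEMO_P016 = 1,
    DEMO_YUV444 = 2,
    DEMO_YUV444_16BIT = 3
} DemoSurfaceFormat;

/* Assumed rule: 4:4:4 content uses a planar 444 format, everything else a
 * semi-planar 4:2:0 format; bit depths above 8 use the 16-bit variant. */
DemoSurfaceFormat demo_pick_format(int is_444, unsigned bit_depth_minus8)
{
    if (is_444)
        return bit_depth_minus8 ? DEMO_YUV444_16BIT : DEMO_YUV444;
    return bit_depth_minus8 ? DEMO_P016 : DEMO_NV12;
}

/* Tightly packed frame size in bytes (no pitch or padding considered). */
size_t demo_frame_bytes(DemoSurfaceFormat fmt, size_t width, size_t height)
{
    size_t bytes_per_sample =
        (fmt == DEMO_P016 || fmt == DEMO_YUV444_16BIT) ? 2 : 1;
    size_t luma = width * height;
    /* 4:2:0: U and V together are half the luma size; 4:4:4: twice it. */
    size_t chroma = (fmt == DEMO_NV12 || fmt == DEMO_P016) ? luma / 2
                                                           : luma * 2;
    return (luma + chroma) * bytes_per_sample;
}

/* 10-bit samples sit in the upper bits of a 16-bit word (6 LSBs are 0). */
unsigned short demo_pack_10bit(unsigned short v)
{
    return (unsigned short)(v << 6);
}
```

For a 1920x1080 frame this gives 3110400 bytes for NV12 and twice that for P016, which is why the output format matters when sizing ulNumOutputSurfaces.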

cudaVideoDeinterlaceMode

An enum that defines the deinterlacing mode: weave, bob or adaptive.

/******************************************************************************************************************/
//! \enum cudaVideoDeinterlaceMode
//! Deinterlacing mode enums
//! These enums are used in CUVIDDECODECREATEINFO structure
//! Use cudaVideoDeinterlaceMode_Weave for progressive content and for content that doesn't need deinterlacing
//! cudaVideoDeinterlaceMode_Adaptive needs more video memory than other DImodes
/******************************************************************************************************************/
typedef enum cudaVideoDeinterlaceMode_enum {
    cudaVideoDeinterlaceMode_Weave=0,   /**< Weave both fields (no deinterlacing) */
    cudaVideoDeinterlaceMode_Bob,       /**< Drop one field                       */
    cudaVideoDeinterlaceMode_Adaptive   /**< Adaptive deinterlacing               */
} cudaVideoDeinterlaceMode;
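
As a rough intuition for the first two modes, the toy sketch below treats each scan line as a single byte. Weave interleaves the lines of the two fields into one frame, while bob keeps one field and fills the missing lines from it. This is only a conceptual model with hypothetical function names: here bob simply repeats each line, whereas a real deinterlacer would typically interpolate, and none of this reflects how NVDEC implements the modes internally.

```c
#include <stddef.h>

/* Weave: interleave top-field and bottom-field lines into one frame.
 * frame must have room for 2 * field_lines entries. */
void demo_weave(const unsigned char *top, const unsigned char *bottom,
                size_t field_lines, unsigned char *frame)
{
    for (size_t i = 0; i < field_lines; ++i) {
        frame[2 * i]     = top[i];
        frame[2 * i + 1] = bottom[i];
    }
}

/* Bob (simplified): drop one field and repeat each line of the other. */
void demo_bob(const unsigned char *field, size_t field_lines,
              unsigned char *frame)
{
    for (size_t i = 0; i < field_lines; ++i) {
        frame[2 * i]     = field[i];
        frame[2 * i + 1] = field[i];
    }
}
```

With top = {1, 3} and bottom = {2, 4}, weave yields {1, 2, 3, 4} and bob on the top field yields {1, 1, 3, 3}, which shows why weave is the right choice for progressive content.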

CUvideoctxlock

typedef struct _CUcontextlock_st *CUvideoctxlock;

2. Synchronization: cuvidCtxLock

For multithreaded code, the CUDA Video SDK provides four functions to create a lock, destroy a lock, acquire it and release it.

/********************************************************************************************************************/
//! \fn CUresult CUDAAPI cuvidCtxLockCreate(CUvideoctxlock *pLock, CUcontext ctx)
//! This API is used to create CtxLock object
/********************************************************************************************************************/
extern CUresult CUDAAPI cuvidCtxLockCreate(CUvideoctxlock *pLock, CUcontext ctx);

/********************************************************************************************************************/
//! \fn CUresult CUDAAPI cuvidCtxLockDestroy(CUvideoctxlock lck)
//! This API is used to free CtxLock object
/********************************************************************************************************************/
extern CUresult CUDAAPI cuvidCtxLockDestroy(CUvideoctxlock lck);

/********************************************************************************************************************/
//! \fn CUresult CUDAAPI cuvidCtxLock(CUvideoctxlock lck, unsigned int reserved_flags)
//! This API is used to acquire ctxlock
/********************************************************************************************************************/
extern CUresult CUDAAPI cuvidCtxLock(CUvideoctxlock lck, unsigned int reserved_flags);

/********************************************************************************************************************/
//! \fn CUresult CUDAAPI cuvidCtxUnlock(CUvideoctxlock lck, unsigned int reserved_flags)
//! This API is used to release ctxlock
/********************************************************************************************************************/
extern CUresult CUDAAPI cuvidCtxUnlock(CUvideoctxlock lck, unsigned int reserved_flags);
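
Whatever the lock protects, the important discipline is that every acquire is paired with a release on all paths, including error paths. The sketch below models that pairing with a hypothetical DemoLock that only counts how often it is held; it is not thread-safe and is not the cuvid lock. In real code, cuvidCtxLock(lck, 0) and cuvidCtxUnlock(lck, 0) would take the places of demo_lock() and demo_unlock().

```c
/* Hypothetical stand-in lock: only tracks how many times it is held. */
typedef struct { int depth; } DemoLock;

int demo_lock(DemoLock *l)   { l->depth++; return 0; }
int demo_unlock(DemoLock *l) { l->depth--; return 0; }

/* Run work(arg) while holding the lock; always release afterwards.
 * Returns the lock error if acquiring failed, otherwise work's result. */
int demo_locked_call(DemoLock *l, int (*work)(void *), void *arg)
{
    int rc = demo_lock(l);
    if (rc != 0)
        return rc;
    rc = work(arg);
    demo_unlock(l);   /* released on both success and failure */
    return rc;
}

/* Two example work functions used to exercise both paths. */
int demo_bump(void *arg) { ++*(int *)arg; return 0; }
int demo_fail(void *arg) { (void)arg; return 1; }
```

Funnelling the protected work through one wrapper like this keeps the lock balanced even when the work itself reports an error.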

CUvideoctxlock

typedef struct _CUcontextlock_st *CUvideoctxlock;

The typedef defines a pointer to the lock structure. struct _CUcontextlock_st is implemented inside the CUDA driver, and user code does not need to know its layout.

CUcontext

typedef struct CUctx_st *CUcontext;

cuCtxPushCurrent/cuCtxPopCurrent

//Pushes a context on the current CPU thread.
CUresult cuCtxPushCurrent(CUcontext ctx);

//Pops the current CUDA context from the current CPU thread.
CUresult cuCtxPopCurrent(CUcontext *pctx);

cuCtxPushCurrent() pushes the given context ctx onto the CPU thread’s stack of current contexts. The specified context becomes the CPU thread’s current context, so all CUDA functions that operate on the current context are affected. In short, it makes ctx the current context of the calling CPU thread.

cuCtxPopCurrent(), in turn, pops the current CUDA context from the CPU thread and passes back the old context handle in *pctx. That context may then be made current to a different CPU thread by calling cuCtxPushCurrent(). If a context was current to the CPU thread before cuCtxCreate() or cuCtxPushCurrent() was called, this function makes that context current to the CPU thread again.
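
The push/pop behaviour described above is that of an ordinary stack whose top element is the "current" context. The toy model below illustrates this with hypothetical types (DemoCtx, DemoCtxStack) and a fixed capacity chosen for the example; it models one thread's stack only and says nothing about how the driver stores contexts.

```c
#include <stddef.h>

#define DEMO_STACK_MAX 8   /* arbitrary capacity for this sketch */

typedef struct { const char *name; } DemoCtx;

/* One thread's stack of contexts; the top entry is the current one. */
typedef struct {
    DemoCtx *items[DEMO_STACK_MAX];
    size_t   depth;
} DemoCtxStack;

/* Push ctx so that it becomes the current context. Returns 0 on success. */
int demo_push(DemoCtxStack *s, DemoCtx *ctx)
{
    if (ctx == NULL || s->depth == DEMO_STACK_MAX)
        return 1;
    s->items[s->depth++] = ctx;
    return 0;
}

/* Pop the current context and hand it back through pctx (may be NULL).
 * The previously pushed context, if any, becomes current again. */
int demo_pop(DemoCtxStack *s, DemoCtx **pctx)
{
    if (s->depth == 0)
        return 1;
    DemoCtx *old = s->items[--s->depth];
    if (pctx != NULL)
        *pctx = old;
    return 0;
}

/* The current context, or NULL if the stack is empty. */
DemoCtx *demo_current(const DemoCtxStack *s)
{
    return s->depth ? s->items[s->depth - 1] : NULL;
}
```

Pushing A and then B makes B current; popping returns B and makes A current again, which matches the "makes that context current to the CPU thread again" wording above.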

So how do push/pop differ from cuvidCtxLock/cuvidCtxUnlock?

cuCtxPushCurrent and cuvidCtxLock are two different functions: the former belongs to the CUDA driver API, the latter to the NVCUVID video decoding API.

cuCtxPushCurrent is used to make a CUDA context current. A CUDA context is a set of resources and a state that can be used by a CUDA device for executing operations. The cuCtxPushCurrent function makes the specified context the current context on the calling thread. This means that subsequent CUDA operations will be executed on the resources associated with this context. This function can be used to switch between multiple contexts in a single thread.

On the other hand, cuvidCtxLock …