CUDA cusparseSnnz Explained

cusparseStatus_t cusparseSnnz(cusparseHandle_t handle, cusparseDirection_t dirA, int m, int n, const cusparseMatDescr_t descrA, const float *A, int lda, int *nnzPerRowColumn, int *nnzTotalDevHostPtr)


This function computes the number of nonzero elements per row or column and the total number of nonzero elements in a dense matrix.


This function requires no extra storage. It executes asynchronously with respect to the host and may return control to the application before the result is ready, so it is safest to synchronize (for example with cudaDeviceSynchronize()) before reading the results on the host.

Input
handle: handle to the cuSPARSE library context.
dirA: direction that specifies whether to count nonzero elements by row (CUSPARSE_DIRECTION_ROW) or by column (CUSPARSE_DIRECTION_COLUMN).
m: number of rows of matrix A.
n: number of columns of matrix A.
descrA: the descriptor of matrix A, which tells the function what kind of input matrix to expect. The supported matrix type is CUSPARSE_MATRIX_TYPE_GENERAL. The supported index bases are CUSPARSE_INDEX_BASE_ZERO and CUSPARSE_INDEX_BASE_ONE.
A: array of dimensions (lda, n); pointer to the input matrix.
lda: leading dimension of dense array A, that is, the distance in elements between the starts of two consecutive columns. It must be at least m, and it equals the number of rows m when the matrix is stored without padding.
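To make the role of lda concrete, here is a small host-side sketch in plain C++ of the column-major addressing described above: element (i, j) lives at A[i + j * lda]. The names denseAt and paddedExample are hypothetical helpers for illustration only, not cuSPARSE functions, and the value -1 is just a marker for padding slots.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not a cuSPARSE function: returns element (i, j) of a
// dense matrix stored column by column with leading dimension lda.
inline float denseAt(const std::vector<float>& A, int lda, int i, int j) {
    assert(i >= 0 && j >= 0 && i < lda);
    return A[static_cast<std::size_t>(i) + static_cast<std::size_t>(j) * lda];
}

// A 2 x 3 matrix stored with lda = 3: each column occupies 3 slots, and the
// third slot (marked -1 here) is padding that is not part of the matrix.
inline std::vector<float> paddedExample() {
    return {1.0f, 0.0f, -1.0f,   // column 0
            2.0f, 3.0f, -1.0f,   // column 1
            0.0f, 2.0f, -1.0f};  // column 2
}
```

With lda = 3 and m = 2, only rows 0 and 1 of each column belong to the matrix, which is why lda may be larger than m but never smaller.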


Output
nnzPerRowColumn: array of size m or n containing the number of nonzero elements per row or column, respectively.
nnzTotalDevHostPtr: total number of nonzero elements, written to device or host memory.
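As a way to check these outputs by hand, the following is a host-side reference sketch in plain C++. It is an assumption about what the counts mean, based on the description above, and not the library's implementation. The names Direction and referenceNnz are hypothetical; Direction stands in for cusparseDirection_t.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Direction { Row, Column };  // hypothetical stand-in for cusparseDirection_t

// Host-side reference, not the cuSPARSE implementation: counts the nonzero
// elements per row or per column of a column-major m x n matrix with leading
// dimension lda, and returns the total number of nonzeros.
inline int referenceNnz(Direction dir, int m, int n,
                        const std::vector<float>& A, int lda,
                        std::vector<int>& nnzPerRowColumn) {
    assert(m >= 0 && n >= 0 && lda >= m);
    const std::size_t count = static_cast<std::size_t>(dir == Direction::Row ? m : n);
    nnzPerRowColumn.assign(count, 0);
    int total = 0;
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            const std::size_t idx =
                static_cast<std::size_t>(i) + static_cast<std::size_t>(j) * lda;
            if (A[idx] != 0.0f) {
                ++nnzPerRowColumn[dir == Direction::Row ? i : j];
                ++total;
            }
        }
    }
    return total;
}
```

For the array {1, 0, 2, 3, 0, 2} with m = 2, n = 3 and lda = 2, this returns 4 with per-row counts {2, 2}; counting by column instead gives {1, 2, 1}.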

Status Returned
CUSPARSE_STATUS_SUCCESS: the operation completed successfully.
CUSPARSE_STATUS_NOT_INITIALIZED: the library was not initialized.
CUSPARSE_STATUS_ALLOC_FAILED: the resources could not be allocated.
CUSPARSE_STATUS_INVALID_VALUE: invalid parameters were passed (m, n < 0).
CUSPARSE_STATUS_ARCH_MISMATCH: the device does not support double precision.
CUSPARSE_STATUS_EXECUTION_FAILED: the function failed to launch on the GPU.
CUSPARSE_STATUS_INTERNAL_ERROR: an internal operation failed.
CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED: the matrix type is not supported.

Test code:

#include <iostream>
#include <cuda_runtime.h>
#include "cusparse.h"
using namespace std;
int main() {
    cusparseStatus_t status;
    cusparseHandle_t handle = 0;
    cusparseMatDescr_t descr = 0;

    status = cusparseCreate(&handle);
    if (status != CUSPARSE_STATUS_SUCCESS) {
        cout << "CUSPARSE Library initialization failed" << endl;
        return 1;
    }
    status = cusparseCreateMatDescr(&descr);
    if (status != CUSPARSE_STATUS_SUCCESS) {
        cout << "Matrix descriptor initialization failed" << endl;
        return 1;
    }
    status = cusparseSetMatType(descr, CUSPARSE_MATRIX_TYPE_GENERAL);
    if (status != CUSPARSE_STATUS_SUCCESS) {
        cout << "cusparseSetMatType failed" << endl;
    }
    status = cusparseSetMatIndexBase(descr, CUSPARSE_INDEX_BASE_ZERO);
    if (status != CUSPARSE_STATUS_SUCCESS) {
        cout << "cusparseSetMatIndexBase failed" << endl;
    }

    const int m = 2, n = 3, lda = 2;  // 2 x 3 matrix stored without padding
    int *nnzPerRow = 0;
    int nnzTotal = 0;
    float *d_Temp = 0;

    // Allocate managed storage for the matrix and fill it in column-major order.
    cudaMallocManaged(&d_Temp, sizeof(float) * lda * n);
    d_Temp[0] = 1.0f;
    d_Temp[1] = 0.0f;
    d_Temp[2] = 2.0f;
    d_Temp[3] = 3.0f;
    d_Temp[4] = 0.0f;
    d_Temp[5] = 2.0f;

    // One counter per row, because we count with CUSPARSE_DIRECTION_ROW.
    cudaMallocManaged(&nnzPerRow, sizeof(int) * m);

    status = cusparseSnnz(handle, CUSPARSE_DIRECTION_ROW, m, n, descr, d_Temp, lda, nnzPerRow, &nnzTotal);
    if (status != CUSPARSE_STATUS_SUCCESS) {
        cout << "nnz calculation failed" << endl;
        cout << "status = " << status << endl;
    }

    // The call may return before the GPU has finished, so wait before
    // reading the managed buffer on the host.
    cudaDeviceSynchronize();

    cout << "nnzTotal = " << nnzTotal << endl;
    cout << "nnzPerRow[0] = " << nnzPerRow[0] << endl;
    cout << "nnzPerRow[1] = " << nnzPerRow[1] << endl;

    cudaFree(d_Temp);
    cudaFree(nnzPerRow);
    cusparseDestroyMatDescr(descr);
    cusparseDestroy(handle);
    return 0;
}

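Original array:

1.0  0.0  2.0  3.0  0.0  2.0

Original matrix (cuSPARSE reads the array in column-major order, so with lda = 2 each consecutive pair of values forms one column):

1.0   2.0  0.0
0.0   3.0  2.0

Result:

nnzTotal = 4
nnzPerRow[0] = 2
nnzPerRow[1] = 2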

