Learning the BLAS Library -- GEMM

Article link: https://blog.csdn.net/cocoonyang/article/details/58602654

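This article examines the GEMM routine in the BLAS library. It covers the Fortran and C versions of sgemm and presents two test scenarios with their results, showing how to perform matrix multiplication and analyze its performance.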


Function syntax:

SGEMM( TRANSA, TRANSB, M, N, K, ALPHA, A, LDA, B, LDB, BETA, C, LDC)

Purpose:

Matrix-matrix multiply (row-major order)

| C C C C C |           | A A A |                      | C C C C C |                        
| C C C C C |           | A A A | |B B B B B|          | C C C C C |
| C C C C C | = alpha * | A A A | |B B B B B| + beta * | C C C C C |  
| C C C C C |           | A A A | |B B B B B|          | C C C C C |
| C C C C C |           | A A A |                      | C C C C C | 

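int m = 5;   /* rows of A and C */
int n = 5;   /* columns of B and C */
int k = 3;   /* columns of A, rows of B */
float alpha = 1.0f;
float beta = 1.0f;

/* Row-major storage: each leading dimension is the row length of its array. */
sgemm('N', 'N', m, n, k, alpha, A, k, B, n, beta, C, n);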
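To make the operation in the diagram concrete, here is a minimal sketch of a naive row-major reference routine. The name `sgemm_rowmajor_ref` is hypothetical and is not part of BLAS; it handles only the non-transposed case and assumes each leading dimension is the row length of the corresponding array. Reading the shapes off the diagram (C is 5 x 5, A is 5 x 3, B is 3 x 5) gives M = 5, N = 5, K = 3.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical reference routine (not part of BLAS): computes
 *   C := alpha * A * B + beta * C
 * for row-major A (m x k), B (k x n) and C (m x n), without transposition.
 * lda, ldb and ldc are the row lengths of the arrays holding A, B and C.
 */
void sgemm_rowmajor_ref(int m, int n, int k, float alpha,
                        const float *A, int lda,
                        const float *B, int ldb,
                        float beta, float *C, int ldc)
{
    /* Basic argument checks, mirroring the constraints in the SGEMM docs. */
    assert(m >= 0 && n >= 0 && k >= 0);
    assert(lda >= (k > 1 ? k : 1));
    assert(ldb >= (n > 1 ? n : 1));
    assert(ldc >= (n > 1 ? n : 1));

    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            /* Dot product of row i of A with column j of B. */
            float acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += A[(size_t)i * lda + p] * B[(size_t)p * ldb + j];

            float *cij = &C[(size_t)i * ldc + j];
            /* When beta is zero, C need not be set on input, so it is not read. */
            *cij = (beta == 0.0f) ? alpha * acc : alpha * acc + beta * (*cij);
        }
    }
}
```

With every entry of A, B and C set to 1 and alpha = beta = 1, a call such as `sgemm_rowmajor_ref(5, 5, 3, 1.0f, A, 3, B, 5, 1.0f, C, 5)` leaves every entry of C equal to 4: three products of 1 from the inner sum, plus the original 1 scaled by beta. The triple loop is meant only to check results on small inputs; optimized BLAS implementations are organized very differently.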

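Parameters:

transa -- type: char; selects whether matrix A is transposed: 'N' or 'n' means no transpose; 'T', 't', 'C' or 'c' means A is transposed.
transb -- type: char; selects whether matrix B is transposed: 'N' or 'n' means no transpose; 'T', 't', 'C' or 'c' means B is transposed.
m -- type: int; number of rows of op(A) and of matrix C.
n -- type: int; number of columns of op(B) and of matrix C.
k -- type: int; number of columns of op(A) and number of rows of op(B).
alpha -- type: float; scalar multiplier applied to the product op(A)*op(B).
a -- type: float array; holds matrix A.
lda -- type: int; leading dimension of A, i.e. the stride between consecutive columns in the column-major Fortran routine (between consecutive rows in row-major storage).
b -- type: float array; holds matrix B.
ldb -- type: int; leading dimension of B.
beta -- type: float; scalar multiplier applied to C.
c -- type: float array; holds matrix C; the result is written to C.
ldc -- type: int; leading dimension of C.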
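The transpose flags and the leading-dimension parameters are easiest to understand through the index arithmetic they imply. The sketch below is illustrative only: `is_trans` and `op_elem` are hypothetical helpers, not BLAS functions, and they assume row-major storage, where the leading dimension is the length of one row of the array that holds the matrix (in the column-major Fortran routine it is the number of rows of the declared array instead).

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when the flag requests a transpose ('T', 't', 'C', 'c'),
 * and 0 for 'N' or 'n'. For real matrices 'C' behaves like 'T'. */
static int is_trans(char flag)
{
    assert(flag == 'N' || flag == 'n' || flag == 'T' || flag == 't' ||
           flag == 'C' || flag == 'c');
    return !(flag == 'N' || flag == 'n');
}

/* Element (i, j) of op(X), where X is stored row-major with leading
 * dimension ldx. A transpose swaps the roles of i and j; no data moves. */
static float op_elem(char trans, const float *X, int ldx, int i, int j)
{
    return is_trans(trans) ? X[(size_t)j * ldx + i]
                           : X[(size_t)i * ldx + j];
}
```

Because the leading dimension is passed separately from the matrix shape, a routine can operate on a sub-block of a larger array. For a 3 x 4 row-major array `P`, for example, the 2 x 2 block starting at row 1, column 1 is addressed by passing the pointer `&P[1 * 4 + 1]` together with a leading dimension of 4 rather than 2.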

Fortran version of sgemm

Source code:

      SUBROUTINE SGEMM(TRANSA,TRANSB,M,N,K,ALPHA,A,LDA,B,LDB,BETA,C,LDC)
*     .. Scalar Arguments ..
      REAL ALPHA,BETA
      INTEGER K,LDA,LDB,LDC,M,N
      CHARACTER TRANSA,TRANSB
*     ..
*     .. Array Arguments ..
      REAL A(LDA,*),B(LDB,*),C(LDC,*)
*     ..
*
*  Purpose
*  =======
*
*  SGEMM  performs one of the matrix-matrix operations
*
*     C := alpha*op( A )*op( B ) + beta*C,
*
*  where  op( X ) is one of
*
*     op( X ) = X   or   op( X ) = X',
*
*  alpha and beta are scalars, and A, B and C are matrices, with op( A )
*  an m by k matrix,  op( B )  a  k by n matrix and  C an m by n matrix.
*
*  Arguments
*  ==========
*
*  TRANSA - CHARACTER*1.
*           On entry, TRANSA specifies the form of op( A ) to be used in
*           the matrix multiplication as follows:
*
*              TRANSA = 'N' or 'n',  op( A ) = A.
*
*              TRANSA = 'T' or 't',  op( A ) = A'.
*
*              TRANSA = 'C' or 'c',  op( A ) = A'.
*
*           Unchanged on exit.
*
*  TRANSB - CHARACTER*1.
*           On entry, TRANSB specifies the form of op( B ) to be used in
*           the matrix multiplication as follows:
*
*              TRANSB = 'N' or 'n',  op( B ) = B.
*
*              TRANSB = 'T' or 't',  op( B ) = B'.
*
*              TRANSB = 'C' or 'c',  op( B ) = B'.
*
*           Unchanged on exit.
*
*  M      - INTEGER.
*           On entry,  M  specifies  the number  of rows  of the  matrix
*           op( A )  and of the  matrix  C.  M  must  be at least  zero.
*           Unchanged on exit.
*
*  N      - INTEGER.
*           On entry,  N  specifies the number  of columns of the matrix
*           op( B ) and the number of columns of the matrix C. N must be
*           at least zero.
*           Unchanged on exit.
*
*  K      - INTEGER.
*           On entry,  K  specifies  the number of columns of the matrix
*           op( A ) and the number of rows of the matrix op( B ). K must
*           be at least  zero.
*           Unchanged on exit.
*
*  ALPHA  - REAL            .
*           On entry, ALPHA specifies the scalar alpha.
*           Unchanged on exit.
*
*  A      - REAL             array of DIMENSION ( LDA, ka ), where ka is
*           k  when  TRANSA = 'N' or 'n',  and is  m  otherwise.
*           Before entry with  TRANSA = 'N' or 'n',  the leading  m by k
*           part of the array  A  must contain the matrix  A,  otherwise
*           the leading  k by m  part of the array  A  must contain  the
*           matrix A.
*           Unchanged on exit.
*
*  LDA    - INTEGER.
*           On entry, LDA specifies the first dimension of A as declared
*           in the calling (sub) program. When  TRANSA = 'N' or 'n' then
*           LDA must be at least  max( 1, m ), otherwise  LDA must be at
*           least  max( 1, k ).
*           Unchanged on exit.
*
*  B      - REAL             array of DIMENSION ( LDB, kb ), where kb is
*           n  when  TRANSB = 'N' or 'n',  and is  k  otherwise.
*           Before entry with  TRANSB = 'N' or 'n',  the leading  k by n
*           part of the array  B  must contain the matrix  B,  otherwise
*           the leading  n by k  part of the array  B  must contain  the
*           matrix B.
*           Unchanged on exit.
*
*  LDB    - INTEGER.
*           On entry, LDB specifies the first dimension of B as declared
*           in the calling (sub) program. When  TRANSB = 'N' or 'n' then
*           LDB must be at least  max( 1, k ), otherwise  LDB must be at
*           least  max( 1, n ).
*           Unchanged on exit.
*
*  BETA   - REAL            .
*           On entry,  BETA  specifies the scalar  beta.  When  BETA  is
*           supplied as zero then C need not be set on input.
*           Unchanged on exit.
*
*  C      - REAL             array of DIMENSION ( LDC, n ).
*           Before entry, the leading  m by n  part of the array  C must
*           contain the matrix  C,  except when  beta  is zero, in which
*           case C need not be set on entry.
*           On exit, the array  C  is overwritten by the  m by n  matrix
*           ( alpha*op( A )*op( B ) + beta*C ).
*
*  LDC    - INTEGER.
*           On entry, LDC specifies the first dimension of C as declared
*           in  the  calling  (sub)  program.   LDC  must  be  at  least
*           max( 1, m ).
*           Unchanged on exit.
*
*
*  Level 3 Blas routine.
*
*  -- Written on 8-February-1989.
*     Jack Dongarra, Argonne National Laboratory.
*     Iain Duff, AERE Harwell.
*     Jeremy Du Croz, Numerical Algorithms Group Ltd.
*     Sven Hammarli