tf.matmul - Matrix Multiplication

This article covers TensorFlow's tf.matmul function in detail: how to perform matrix multiplication, transposition, and conjugation, and how to speed up the computation with the sparse-matrix flags. Several examples demonstrate multiplication between tensors of different ranks.

https://github.com/tensorflow/docs/tree/r1.4/site/en/api_docs/api_docs/python/tf
site/en/api_docs/api_docs/python/tf/matmul.md

matmul(
    a,
    b,
    transpose_a=False,
    transpose_b=False,
    adjoint_a=False,
    adjoint_b=False,
    a_is_sparse=False,
    b_is_sparse=False,
    name=None
)

Defined in tensorflow/python/ops/math_ops.py.
See the guide: Math > Matrix Math Functions

Multiplies matrix a by matrix b, producing a * b.

The inputs must, following any transpositions, be tensors of rank >= 2 where the inner 2 dimensions specify valid matrix multiplication arguments, and any further outer dimensions match.
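
A small sketch of this shape requirement (the shapes here are made up for illustration, and it is assumed that shape inference catches the mismatch at graph-construction time):

import numpy as np
import tensorflow as tf

# Inner dimensions must pair up ([..., 2, 3] x [..., 3, 2]) and the
# outer (batch) dimensions must match exactly.
ok_a = tf.constant(np.ones([2, 2, 3], dtype=np.float32))
ok_b = tf.constant(np.ones([2, 3, 2], dtype=np.float32))
ok = tf.matmul(ok_a, ok_b)  # shape [2, 2, 2]

# A mismatched batch dimension (2 vs 3) is rejected during shape
# inference, before the graph ever runs.
bad_b = tf.constant(np.ones([3, 3, 2], dtype=np.float32))
try:
    tf.matmul(ok_a, bad_b)
except ValueError as e:
    print("ValueError:", e)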

Both matrices must be of the same type. The supported types are: float16, float32, float64, int32, complex64, complex128.

Either matrix can be transposed or adjointed (conjugated and transposed) on the fly by setting one of the corresponding flags to True. These are False by default. A sketch follows.
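
A minimal sketch of these flags, with made-up values (adjoint_a differs from transpose_a only for complex dtypes, where it also conjugates):

import tensorflow as tf

# `x` has shape [3, 2]; transpose_a=True uses it as a [2, 3] matrix,
# so it can be multiplied with the [3, 2] matrix `y`.
x = tf.constant([[1., 2.], [3., 4.], [5., 6.]])  # shape [3, 2]
y = tf.constant([[1., 0.], [0., 1.], [1., 1.]])  # shape [3, 2]
xt_y = tf.matmul(x, y, transpose_a=True)         # shape [2, 2]

# For complex types, adjoint_a conjugates and transposes in one step.
z = tf.constant([[1 + 1j, 2 - 1j]], dtype=tf.complex64)  # shape [1, 2]
zh_z = tf.matmul(z, z, adjoint_a=True)                   # shape [2, 2]

with tf.Session() as sess:
    print(sess.run(xt_y))
    print(sess.run(zh_z))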

If one or both of the matrices contain a lot of zeros, a more efficient multiplication algorithm can be used by setting the corresponding a_is_sparse or b_is_sparse flag to True. These are False by default. This optimization is only available for plain matrices (rank-2 tensors) with datatypes bfloat16 or float32.
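
A hedged sketch of the sparse hint: the inputs remain ordinary dense Tensors, the flag merely tells TensorFlow it may pick a multiplication kernel that exploits the zeros, and the result matches a plain tf.matmul (rank-2, float32, as required; the values below are made up):

import numpy as np
import tensorflow as tf

# A mostly-zero float32 matrix.
a_dense = np.zeros((4, 4), dtype=np.float32)
a_dense[0, 1] = 2.0
a_dense[3, 2] = 5.0

a = tf.constant(a_dense)
b = tf.constant(np.ones((4, 2), dtype=np.float32))

# Same values as tf.matmul(a, b); only the kernel choice may differ.
c = tf.matmul(a, b, a_is_sparse=True)

with tf.Session() as sess:
    print(sess.run(c))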


For example:

# 2-D tensor `a`
# [[1, 2, 3],
#  [4, 5, 6]]
a = tf.constant([1, 2, 3, 4, 5, 6], shape=[2, 3])

# 2-D tensor `b`
# [[ 7,  8],
#  [ 9, 10],
#  [11, 12]]
b = tf.constant([7, 8, 9, 10, 11, 12], shape=[3, 2])

# `a` * `b`
# [[ 58,  64],
#  [139, 154]]
c = tf.matmul(a, b)


# 3-D tensor `a`
# [[[ 1,  2,  3],
#   [ 4,  5,  6]],
#  [[ 7,  8,  9],
#   [10, 11, 12]]]
a = tf.constant(np.arange(1, 13, dtype=np.int32),
                shape=[2, 2, 3])

# 3-D tensor `b`
# [[[13, 14],
#   [15, 16],
#   [17, 18]],
#  [[19, 20],
#   [21, 22],
#   [23, 24]]]
b = tf.constant(np.arange(13, 25, dtype=np.int32),
                shape=[2, 3, 2])

# `a` * `b`
# [[[ 94, 100],
#   [229, 244]],
#  [[508, 532],
#   [697, 730]]]
c = tf.matmul(a, b)

# Since python >= 3.5 the @ operator is supported (see PEP 465).
# In TensorFlow, it simply calls the `tf.matmul()` function, so the
# following lines are equivalent (note that dtypes and ranks must line
# up: with the int32 3-D `a` and `b` defined above, this float 2-D
# constant would need casting in practice):
d = a @ b @ [[10.], [11.]]
d = tf.matmul(tf.matmul(a, b), [[10.], [11.]])

1. Args

  • a: Tensor of type float16, float32, float64, int32, complex64, complex128 and rank > 1.
  • b: Tensor with the same type and rank as a.
  • transpose_a: If True, a is transposed before multiplication.
  • transpose_b: If True, b is transposed before multiplication.
  • adjoint_a: If True, a is conjugated and transposed before multiplication.
  • adjoint_b: If True, b is conjugated and transposed before multiplication.
  • a_is_sparse: If True, a is treated as a sparse matrix.
  • b_is_sparse: If True, b is treated as a sparse matrix.
  • name: Name for the operation (optional).

2. Returns

A Tensor of the same type as a and b where each inner-most matrix is the product of the corresponding matrices in a and b, e.g. if all transpose or adjoint attributes are False:

output[…, i, j] = sum_k (a[…, i, k] * b[…, k, j]),
for all indices i, j.

  • Note: This is the matrix product, not the element-wise product; see the comparison sketch below.
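
To make the note concrete, a small comparison between the matrix product and the element-wise product (the values are illustrative):

import tensorflow as tf

m = tf.constant([[1., 2.], [3., 4.]])
n = tf.constant([[5., 6.], [7., 8.]])

matrix_product = tf.matmul(m, n)  # => [[19. 22.], [43. 50.]]
elementwise = tf.multiply(m, n)   # => [[ 5. 12.], [21. 32.]]

with tf.Session() as sess:
    print(sess.run(matrix_product))
    print(sess.run(elementwise))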

3. Raises

  • ValueError: If transpose_a and adjoint_a, or transpose_b and adjoint_b, are both set to True. A minimal reproduction follows.
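
A sketch of this failure mode (the exact error message may vary by version); the check fires when the op is constructed, before anything runs:

import tensorflow as tf

p = tf.constant([[1., 2.], [3., 4.]])

try:
    tf.matmul(p, p, transpose_a=True, adjoint_a=True)
except ValueError as e:
    print("ValueError:", e)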

4. Example

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from __future__ import absolute_import
from __future__ import print_function
from __future__ import division

import os
import sys
import numpy as np
import tensorflow as tf

sys.path.append(os.path.dirname(os.path.abspath(__file__)))
current_directory = os.path.dirname(os.path.abspath(__file__))

print(16 * "++--")
print("current_directory:", current_directory)
print(16 * "++--")

# 2-D tensor `a`
a = tf.constant([1, 2, 3, 4, 5, 6], shape=[2, 3])
# => [[1 2 3]
#     [4 5 6]]

# 2-D tensor `b`
b = tf.constant([7, 8, 9, 10, 11, 12], shape=[3, 2])
# => [[ 7  8]
#     [ 9 10]
#     [11 12]]

c = tf.matmul(a, b)
# => [[ 58  64]
#     [139 154]]

with tf.Session() as sess:
    input_a = sess.run(a)
    print("input_a.shape:", input_a.shape)
    print("input_a:\n", input_a)
    print('\n')

    input_b = sess.run(b)
    print("input_b.shape:", input_b.shape)
    print("input_b:\n", input_b)
    print('\n')

    output_c = sess.run(c)
    print("output_c.shape:", output_c.shape)
    print("output_c:\n", output_c)
    print('\n')
/usr/bin/python2.7 /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow/yongqiang.py
++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--
current_directory: /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow
++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--
2019-08-21 20:31:03.554301: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-08-21 20:31:03.621830: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-21 20:31:03.622083: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties: 
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.7335
pciBusID: 0000:01:00.0
totalMemory: 7.92GiB freeMemory: 7.31GiB
2019-08-21 20:31:03.622093: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
input_a.shape: (2, 3)
input_a:
 [[1 2 3]
 [4 5 6]]


input_b.shape: (3, 2)
input_b:
 [[ 7  8]
 [ 9 10]
 [11 12]]


output_c.shape: (2, 2)
output_c:
 [[ 58  64]
 [139 154]]

Process finished with exit code 0

5. Example

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from __future__ import absolute_import
from __future__ import print_function
from __future__ import division

import os
import sys
import numpy as np
import tensorflow as tf

sys.path.append(os.path.dirname(os.path.abspath(__file__)))
current_directory = os.path.dirname(os.path.abspath(__file__))

print(16 * "++--")
print("current_directory:", current_directory)
print(16 * "++--")

# 3-D tensor `a`
a = tf.constant(np.arange(1, 13, dtype=np.int32), shape=[2, 2, 3])
# => [[[ 1  2  3]
#      [ 4  5  6]]
#     [[ 7  8  9]
#      [10 11 12]]]

a0 = tf.constant(np.arange(1, 7, dtype=np.int32), shape=[2, 3])
# =>  [[1 2 3]
#      [4 5 6]]

a1 = tf.constant(np.arange(7, 13, dtype=np.int32), shape=[2, 3])
# =>  [[ 7  8  9]
#      [10 11 12]]

# 3-D tensor `b`
b = tf.constant(np.arange(13, 25, dtype=np.int32), shape=[2, 3, 2])
# => [[[13 14]
#      [15 16]
#      [17 18]]
#     [[19 20]
#      [21 22]
#      [23 24]]]

b0 = tf.constant(np.arange(13, 19, dtype=np.int32), shape=[3, 2])
# =>  [[13 14]
#      [15 16]
#      [17 18]]

b1 = tf.constant(np.arange(19, 25, dtype=np.int32), shape=[3, 2])
# =>  [[19 20]
#      [21 22]
#      [23 24]]

a0b0 = tf.matmul(a0, b0)
a0b1 = tf.matmul(a0, b1)
a1b0 = tf.matmul(a1, b0)
a1b1 = tf.matmul(a1, b1)

c = tf.matmul(a, b)
# => [[[ 94 100]
#      [229 244]],
#     [[508 532]
#      [697 730]]]

with tf.Session() as sess:
    input_a = sess.run(a)
    print("input_a.shape:", input_a.shape)
    print("input_a:\n", input_a)
    print('\n')

    input_b = sess.run(b)
    print("input_b.shape:", input_b.shape)
    print("input_b:\n", input_b)
    print('\n')

    output_c = sess.run(c)
    print("output_c.shape:", output_c.shape)
    print("output_c:\n", output_c)
    print('\n')

    input_a0 = sess.run(a0)
    print("input_a0.shape:", input_a0.shape)
    print("input_a0:\n", input_a0)
    print('\n')

    input_a1 = sess.run(a1)
    print("input_a1.shape:", input_a1.shape)
    print("input_a1:\n", input_a1)
    print('\n')

    input_b0 = sess.run(b0)
    print("input_b0.shape:", input_b0.shape)
    print("input_b0:\n", input_b0)
    print('\n')

    input_b1 = sess.run(b1)
    print("input_b1.shape:", input_b1.shape)
    print("input_b1:\n", input_b1)
    print('\n')

    output_a0b0 = sess.run(a0b0)
    print("output_a0b0.shape:", output_a0b0.shape)
    print("output_a0b0:\n", output_a0b0)
    print('\n')

    output_a0b1 = sess.run(a0b1)
    print("output_a0b1.shape:", output_a0b1.shape)
    print("output_a0b1:\n", output_a0b1)
    print('\n')

    output_a1b0 = sess.run(a1b0)
    print("output_a1b0.shape:", output_a1b0.shape)
    print("output_a1b0:\n", output_a1b0)
    print('\n')

    output_a1b1 = sess.run(a1b1)
    print("output_a1b1.shape:", output_a1b1.shape)
    print("output_a1b1:\n", output_a1b1)
    print('\n')

    print("output_a0b0 + a1b1:\n")
    print(output_a0b0)
    print(output_a1b1)
/usr/bin/python2.7 /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow/yongqiang.py
++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--
current_directory: /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow
++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--++--
2019-08-21 20:57:43.726875: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-08-21 20:57:43.792803: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-21 20:57:43.793048: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties: 
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.7335
pciBusID: 0000:01:00.0
totalMemory: 7.92GiB freeMemory: 7.31GiB
2019-08-21 20:57:43.793059: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
input_a.shape: (2, 2, 3)
input_a:
 [[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]


input_b.shape: (2, 3, 2)
input_b:
 [[[13 14]
  [15 16]
  [17 18]]

 [[19 20]
  [21 22]
  [23 24]]]


output_c.shape: (2, 2, 2)
output_c:
 [[[ 94 100]
  [229 244]]

 [[508 532]
  [697 730]]]


input_a0.shape: (2, 3)
input_a0:
 [[1 2 3]
 [4 5 6]]


input_a1.shape: (2, 3)
input_a1:
 [[ 7  8  9]
 [10 11 12]]


input_b0.shape: (3, 2)
input_b0:
 [[13 14]
 [15 16]
 [17 18]]


input_b1.shape: (3, 2)
input_b1:
 [[19 20]
 [21 22]
 [23 24]]


output_a0b0.shape: (2, 2)
output_a0b0:
 [[ 94 100]
 [229 244]]


output_a0b1.shape: (2, 2)
output_a0b1:
 [[130 136]
 [319 334]]


output_a1b0.shape: (2, 2)
output_a1b0:
 [[364 388]
 [499 532]]


output_a1b1.shape: (2, 2)
output_a1b1:
 [[508 532]
 [697 730]]


output_a0b0 and output_a1b1:

[[ 94 100]
 [229 244]]
[[508 532]
 [697 730]]

Process finished with exit code 0
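
Note how the batch product decomposes slice-wise: output_a0b0 equals output_c[0] and output_a1b1 equals output_c[1], which is exactly what the final two prints confirm.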

6. TensorFlow tf.matmul vs. PyTorch torch.bmm

Basic definitions. TensorFlow's tf.matmul performs matrix multiplication between two 2-D tensors, or batch-wise between higher-rank tensors. PyTorch's torch.bmm is designed specifically for 3-D tensors and performs batch matrix multiplication (one matrix product per batch element).

Input-rank requirements. When tf.matmul receives tensors with more than two dimensions, it treats the last two dimensions as matrices and multiplies them batch-wise; the leading dimensions act as batch dimensions. torch.bmm, by contrast, strictly requires 3-D inputs of shapes (batch_size, n, m) and (batch_size, m, p) and returns a tensor of shape (batch_size, n, p).

Performance. torch.bmm focuses on one specific scenario, efficiently multiplying many independent small matrices, and can exploit the hardware accordingly. tf.matmul handles the same workload, but as a general-purpose op it is not specially tuned for that case.

A simple side-by-side comparison (note that this snippet uses the TensorFlow 2.x tf.random.uniform API, unlike the TF 1.x examples above):

import tensorflow as tf
import torch

# Example using TensorFlow's matmul
a_tf = tf.random.uniform((5, 3, 4))  # batch size of 5, matrices are 3x4
b_tf = tf.random.uniform((5, 4, 6))  # batch size of 5, matrices are 4x6
result_tf = tf.matmul(a_tf, b_tf)    # result has shape (5, 3, 6)
print("Result from TF:", result_tf.shape)

# Equivalent operation in PyTorch with bmm
a_pt = torch.rand(5, 3, 4)           # same dimensions, now torch tensors
b_pt = torch.rand(5, 4, 6)
result_pt = torch.bmm(a_pt, b_pt)    # also yields a tensor shaped (5, 3, 6)
print("Result from PT:", result_pt.size())

As the example shows, the two frameworks provide similar functionality under the same logic, each with its own emphasis.

Usage notes. In practice, make sure the library versions are compatible and the device environment (CPU/GPU) is set up correctly; both can significantly affect runtime efficiency and may even cause errors.