Triton Normalization Tutorial

youzjuer

Last modified: 2024-05-06 16:52:44

Column: 通俗易懂技术站. Tags: cuda, triton, python

First published: 2024-05-03 14:21:25

Article link: https://blog.csdn.net/youzjuer/article/details/138415670

1. Forward pass

In the layer normalization formula, x denotes a tensor.
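As a rough illustration, and assuming the formula in question is the standard layer normalization y = (x - E[x]) / sqrt(Var[x] + eps) * w + b, the work done for a single row can be sketched in plain Python. The function name `layer_norm_row` and the default value of `eps` are assumptions made for this sketch. It is a reference for the arithmetic only, not the Triton kernel itself.

```python
import math


def layer_norm_row(x, w, b, eps=1e-5):
    """Normalize one row x, then scale by w and shift by b (hypothetical helper)."""
    n = len(x)
    mean = sum(x) / n
    # Biased variance (divide by n), the usual convention for layer normalization.
    var = sum((v - mean) ** 2 for v in x) / n
    # Reciprocal standard deviation; eps guards against division by zero.
    rstd = 1.0 / math.sqrt(var + eps)
    y = [(v - mean) * rstd * wi + bi for v, wi, bi in zip(x, w, b)]
    return y, mean, rstd


row = [1.0, 2.0, 3.0, 4.0]
y, mean, rstd = layer_norm_row(row, w=[1.0] * 4, b=[0.0] * 4)
```

The returned `mean` and `rstd` correspond to the per-row values that the kernel below stores through its `Mean` and `Rstd` pointers.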

import torch

import triton
import triton.language as tl

try:
    # This is https://github.com/NVIDIA/apex, NOT the apex on PyPI, so it
    # should not be added to extras_require in setup.py.
    import apex
    HAS_APEX = True
except ModuleNotFoundError:
    HAS_APEX = False


@triton.jit
def _layer_norm_fwd_fused(
    X,  # pointer to the input
    Y,  # pointer to the output
    W,  # pointer to the weights
    B,  # pointer to the biases
    Mean,  # pointer to the mean
    Rstd,  # pointer to the 1/std
    stride,  # how much to increase the pointer when moving by 1 row
    N,  # number of columns in X
    eps,  # epsilon to avoid division by zero
    BLOCK_SIZE: tl.constexpr,
):
    # Map the program id to the row of X and Y it should compute.
    row = tl.
