Implementation details of maxpooling2d in C++

SimpleLearing

Last modified 2024-05-09 20:47:03


Categories: 多模态理解 (Multimodal Understanding), Aigc. Tags: c++, deep learning

First published 2024-05-09 20:45:56

Article link: https://blog.csdn.net/yiqiedouhao11/article/details/138630861


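Max pooling is a common operation that shrinks the input feature map while keeping its most salient features. PyTorch provides the torch.nn.functional.max_pool2d function for this. If no PyTorch environment is available, the operation can be implemented in C++, which also makes the underlying principle easier to see.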

PyTorch implementation

Function call

import torch

# Example input tensor
head_out = torch.randn(1, 3, 28, 28)

# Apply max pooling
output = torch.nn.functional.max_pool2d(head_out, (1, 3), stride=(1, 1), padding=(0, 1))

# Size of the output tensor
print("Output size:", output.size())

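Parameters

Input tensor: head_out is a 4-D tensor holding the input feature maps; here its shape is (1, 3, 28, 28).
Kernel size: (1, 3) gives the height and width of the pooling kernel.
Stride: (1, 1) gives the step along the height and width.
Padding: (0, 1) gives the padding along the height and width of the input tensor, applied on both sides of each dimension.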
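
The output size for these parameters can be checked by hand. The sketch below assumes floor rounding (ceil_mode=False) and dilation 1; the helper name pooled_size is hypothetical and is not a PyTorch API.

```cpp
#include <cassert>

// Hypothetical helper (not a PyTorch API): output length along one dimension,
// assuming floor rounding (ceil_mode=False) and dilation 1.
int pooled_size(int input, int kernel, int stride, int padding) {
    assert(kernel > 0 && stride > 0 && padding >= 0);
    assert(input + 2 * padding >= kernel);  // at least one full window must fit
    return (input + 2 * padding - kernel) / stride + 1;
}
```

For the call above, the height gives (28 + 2*0 - 1) / 1 + 1 = 28 and the width gives (28 + 2*1 - 3) / 1 + 1 = 28, so the output keeps the shape (1, 3, 28, 28).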

C++ implementation details

Function definition

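#include <algorithm>
#include <iostream>
#include <limits>
#include <vector>

// Max pooling over one 2-D feature map with a square kernel.
// Assumes a non-empty input and padding <= kernel_size / 2, so that every
// window covers at least one real element.
std::vector<std::vector<float>> max_pool2d(const std::vector<std::vector<float>>& input, int kernel_size, int stride, int padding) {
    int input_height = static_cast<int>(input.size());
    int input_width = static_cast<int>(input[0].size());

    // Compute the size of the output
    int output_height = (input_height + 2 * padding - kernel_size) / stride + 1;
    int output_width = (input_width + 2 * padding - kernel_size) / stride + 1;

    // Initialize the output
    std::vector<std::vector<float>> output(output_height, std::vector<float>(output_width, 0.0f));

    // Apply max pooling to the input
    for (int i = 0; i < output_height; ++i) {
        for (int j = 0; j < output_width; ++j) {
            // Position of the current pooling window, clipped to the input.
            // Without the clamp to 0, the start index is negative inside the
            // top/left padding and input[h][w] would read out of bounds.
            int start_h = std::max(i * stride - padding, 0);
            int start_w = std::max(j * stride - padding, 0);
            int end_h = std::min(i * stride - padding + kernel_size, input_height);
            int end_w = std::min(j * stride - padding + kernel_size, input_width);

            // Find the maximum inside the window; padded cells are skipped,
            // which is equivalent to padding with negative infinity
            float max_val = std::numeric_limits<float>::lowest();
            for (int h = start_h; h < end_h; ++h) {
                for (int w = start_w; w < end_w; ++w) {
                    max_val = std::max(max_val, input[h][w]);
                }
            }
            output[i][j] = max_val;
        }
    }

    return output;
}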

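Parameters

Input tensor: input is a 2-D vector holding a single input feature map.
Kernel size: kernel_size is the side length of the square pooling kernel.
Stride: stride is the step over the input feature map, shared by the height and width directions.
Padding: padding is the number of padded rows and columns on each side of the input feature map.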
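
The function above takes a single kernel_size, stride, and padding, so it cannot express the (1, 3) kernel of the PyTorch call directly. Below is a minimal sketch of a per-dimension variant, assuming floor rounding and dilation 1. The name max_pool2d_hw, its signature, and the Matrix alias are assumptions made for illustration and do not come from the original post.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// Hypothetical variant: max pooling over one 2-D feature map with separate
// height/width settings, floor rounding, and dilation 1.
Matrix max_pool2d_hw(const Matrix& input,
                     int kernel_h, int kernel_w,
                     int stride_h, int stride_w,
                     int pad_h, int pad_w) {
    assert(!input.empty() && !input[0].empty());
    assert(stride_h > 0 && stride_w > 0);
    // Padding at most half the kernel (the same limit PyTorch checks), so every
    // window contains at least one real element.
    assert(pad_h >= 0 && pad_h <= kernel_h / 2);
    assert(pad_w >= 0 && pad_w <= kernel_w / 2);

    const int in_h = static_cast<int>(input.size());
    const int in_w = static_cast<int>(input[0].size());
    assert(in_h + 2 * pad_h >= kernel_h && in_w + 2 * pad_w >= kernel_w);

    const int out_h = (in_h + 2 * pad_h - kernel_h) / stride_h + 1;
    const int out_w = (in_w + 2 * pad_w - kernel_w) / stride_w + 1;

    Matrix output(out_h, std::vector<float>(out_w, 0.0f));
    for (int i = 0; i < out_h; ++i) {
        for (int j = 0; j < out_w; ++j) {
            // Clip the window to the input; padded cells act as -inf and never win.
            const int h0 = std::max(i * stride_h - pad_h, 0);
            const int w0 = std::max(j * stride_w - pad_w, 0);
            const int h1 = std::min(i * stride_h - pad_h + kernel_h, in_h);
            const int w1 = std::min(j * stride_w - pad_w + kernel_w, in_w);

            float max_val = std::numeric_limits<float>::lowest();
            for (int h = h0; h < h1; ++h) {
                for (int w = w0; w < w1; ++w) {
                    max_val = std::max(max_val, input[h][w]);
                }
            }
            output[i][j] = max_val;
        }
    }
    return output;
}
```

Calling max_pool2d_hw(input, 1, 3, 1, 1, 0, 1) mirrors the kernel (1, 3), stride (1, 1), and padding (0, 1) of the PyTorch call on one channel. On the 2x4 input {{1, 5, 2, 0}, {3, -1, 4, 7}} it returns {{5, 5, 5, 2}, {3, 4, 7, 7}}: each value is the maximum over itself and its left and right neighbours in the same row.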

My expertise is limited, so feel free to get in touch with any questions~
