Autumn Campus Recruitment: Computer Vision

吃肉不能购

Published 2024-07-31 15:48:08

Category: Autonomous Driving. Tags: computer vision, artificial intelligence

Article link: https://blog.csdn.net/weixin_44298961/article/details/140821439

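Basic algorithms and data structures

- Sorting algorithms such as quicksort and merge sort.

- Binary search and binary tree traversal (pre-order, in-order, post-order).

- Linked list operations (insertion, deletion, reversal).

- Implementing stacks and queues, and their applications.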
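
As a warm-up for the items above, here is a minimal sketch of two of them: quicksort and binary search. The function names `quick_sort` and `binary_search` are illustrative choices, and the quicksort shown is the simple list-building variant, not the in-place partitioning version an interviewer may ask for.

```python
def quick_sort(items):
    """Return a new sorted list using quicksort (not in-place)."""
    if len(items) <= 1:
        return list(items)
    pivot = items[len(items) // 2]
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return quick_sort(smaller) + equal + quick_sort(larger)


def binary_search(sorted_items, target):
    """Return an index of target in sorted_items, or -1 if it is absent."""
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            lo = mid + 1   # target can only be in the right half
        else:
            hi = mid - 1   # target can only be in the left half
    return -1


data = quick_sort([5, 2, 9, 1, 5, 6])
print(data)                    # [1, 2, 5, 5, 6, 9]
print(binary_search(data, 6))  # 4
print(binary_search(data, 7))  # -1
```

The list-building version uses O(n) extra memory per level; it trades efficiency for readability.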

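Machine learning basics

- Implement a simple linear regression or logistic regression model.

- Write the K-nearest neighbors (KNN) algorithm.

- Implement a simple version of a decision tree or random forest.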
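
Of the items above, KNN is the shortest to write from scratch. The sketch below assumes Euclidean distance and a plain majority vote; `knn_predict` and the toy data are made up for illustration.

```python
from collections import Counter

import numpy as np


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    # Euclidean distance from the query to every training point
    distances = np.linalg.norm(train_x - query, axis=1)
    # Indices of the k closest training points
    nearest = np.argsort(distances)[:k]
    # Majority vote over their labels
    votes = Counter(train_y[nearest].tolist())
    return votes.most_common(1)[0][0]


# Two well-separated toy clusters
train_x = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
train_y = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(train_x, train_y, np.array([1.1, 1.0])))  # 0
print(knn_predict(train_x, train_y, np.array([5.1, 5.0])))  # 1
```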

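Deep learning

- Write the basic layers of a convolutional neural network (CNN), such as the convolution layer and the pooling layer.

Convolution layer: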

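import numpy as np

def conv2d(input_array, kernel_array, stride=1, padding=0):
    # Input shape: (batch, in_channels, height, width)
    # Kernel shape: (out_channels, in_channels, kernel_height, kernel_width)
    n, c, h, w = input_array.shape
    k, _, kh, kw = kernel_array.shape

    # Compute the size of the output feature map
    h_out = (h - kh + 2 * padding) // stride + 1
    w_out = (w - kw + 2 * padding) // stride + 1

    # Initialize the output feature map
    output = np.zeros((n, k, h_out, w_out))

    # Zero-pad the spatial dimensions of the input
    input_padded = np.pad(input_array, ((0, 0), (0, 0), (padding, padding), (padding, padding)), mode='constant')

    # Apply the convolution (strictly a cross-correlation: the kernel is not flipped,
    # which is also what deep learning frameworks do)
    for i in range(n):  # each sample in the batch
        for j in range(k):  # each output channel
            for x in range(h_out):
                for y in range(w_out):
                    x1 = x * stride
                    y1 = y * stride
                    x2 = x1 + kh
                    y2 = y1 + kw
                    # Sum over all input channels and the kernel window
                    output[i, j, x, y] = np.sum(input_padded[i, :, x1:x2, y1:y2] * kernel_array[j])
    return output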

# Test the convolution layer
# Create a single-channel input that simulates a 5x5 image
input_array = np.random.rand(1, 1, 5, 5)

# Create a 3x3 kernel with one input channel and one output channel
kernel_array = np.random.rand(1, 1, 3, 3)

# Apply the convolution layer
output_array = conv2d(input_array, kernel_array, stride=1, padding=1)

# Print the results
print("Input shape:", input_array.shape)
print("Output shape:", output_array.shape)
print("Output data:\n", output_array)
# Output (the values differ between runs because the input is random):
Input shape: (1, 1, 5, 5)
Output shape: (1, 1, 5, 5)
Output data:
 [[[[0.75773916 1.41080892 1.41720873 2.23616    1.2913502 ]
   [1.96894655 2.63949014 2.817716   3.29313139 1.69216063]
   [1.95485893 3.15706786 2.57267158 3.87890898 1.98338294]
   [2.05593676 3.34350405 2.54709454 2.97866573 1.37964358]
   [1.08824088 1.49975143 1.45773857 1.86624518 0.76132244]]]]

Pooling layer:

import numpy as np

def max_pooling(input_array, pool_size=2, stride=2):
    # Input shape: (batch, channels, height, width)
    n, c, h, w = input_array.shape

    # Compute the size of the output feature map
    h_out = (h - pool_size) // stride + 1
    w_out = (w - pool_size) // stride + 1

    # Initialize the output feature map
    output = np.zeros((n, c, h_out, w_out))

    # Apply max pooling
    for i in range(n):  # each sample in the batch
        for j in range(c):  # each channel
            for x in range(h_out):
                for y in range(w_out):
                    x1 = x * stride
                    y1 = y * stride
                    x2 = x1 + pool_size
                    y2 = y1 + pool_size
                    # Keep the largest value in the pooling window
                    output[i, j, x, y] = np.max(input_array[i, j, x1:x2, y1:y2])
    return output

# Test the max pooling layer
# Create a single-channel input that simulates a 4x4 feature map
# Note: reshape to the 4-D (batch, channels, height, width) layout the function expects
input_array = np.array([[
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16]
]]).reshape(1, 1, 4, 4)

# Apply the max pooling layer
output_array = max_pooling(input_array, pool_size=2, stride=2)

# Print the results
print("Input shape:", input_array.shape)
print("Output shape:", output_array.shape)
print("Output data:\n", output_array)
# Output:
Input shape: (1, 1, 4, 4)
Output shape: (1, 1, 2, 2)
Output data:
 [[[[ 6.  8.]
   [14. 16.]]]]

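- Implement a simple recurrent neural network (RNN) or long short-term memory (LSTM) network.

- Write the forward and backward propagation of a neural network.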
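
For the forward/backward propagation item, a minimal sketch could look like the following. It assumes a two-layer fully connected network with a tanh hidden layer and a mean-squared-error loss; the names `forward`, `backward`, and `mse_loss` are illustrative. The analytic gradient is compared with a numerical one, which is a handy sanity check when writing backpropagation by hand.

```python
import numpy as np


def forward(x, w1, b1, w2, b2):
    """Forward pass: linear -> tanh -> linear. Returns the prediction and a cache."""
    a1 = np.tanh(x @ w1 + b1)
    y_hat = a1 @ w2 + b2
    return y_hat, (x, a1)


def mse_loss(y_hat, y):
    """Half the squared error, summed over outputs and averaged over samples."""
    return 0.5 * np.mean(np.sum((y_hat - y) ** 2, axis=1))


def backward(y_hat, y, cache, w2):
    """Backward pass: gradients of mse_loss with respect to every parameter."""
    x, a1 = cache
    n = x.shape[0]
    d_y = (y_hat - y) / n             # dL/dy_hat
    d_w2 = a1.T @ d_y
    d_b2 = d_y.sum(axis=0)
    d_a1 = d_y @ w2.T
    d_z1 = d_a1 * (1.0 - a1 ** 2)     # derivative of tanh
    d_w1 = x.T @ d_z1
    d_b1 = d_z1.sum(axis=0)
    return d_w1, d_b1, d_w2, d_b2


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
y = rng.normal(size=(4, 2))
w1, b1 = rng.normal(size=(3, 5)), np.zeros(5)
w2, b2 = rng.normal(size=(5, 2)), np.zeros(2)

y_hat, cache = forward(x, w1, b1, w2, b2)
d_w1, d_b1, d_w2, d_b2 = backward(y_hat, y, cache, w2)

# Numerical gradient of the loss with respect to w1[0, 0] (central difference)
eps = 1e-5
w1_plus, w1_minus = w1.copy(), w1.copy()
w1_plus[0, 0] += eps
w1_minus[0, 0] -= eps
numeric = (mse_loss(forward(x, w1_plus, b1, w2, b2)[0], y)
           - mse_loss(forward(x, w1_minus, b1, w2, b2)[0], y)) / (2 * eps)
grad_error = abs(numeric - d_w1[0, 0])
print("Gradient check error:", grad_error)  # should be very close to zero
```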

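Computer vision

- Implement image filtering (for example Gaussian blur and median filtering).

- Write a simplified version of the SIFT or SURF feature extraction algorithm.

- Implement an edge detection algorithm (for example Canny edge detection).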
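
The filtering item can be sketched with plain NumPy loops, in the same style as the convolution layer earlier in this post. This version assumes a 2-D grayscale image, an odd window size, and edge-replicated padding; `gaussian_kernel`, `filter2d`, `gaussian_blur`, and `median_filter` are illustrative names, not library functions.

```python
import numpy as np


def gaussian_kernel(size=5, sigma=1.0):
    """Build a normalized size x size Gaussian kernel (size should be odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def filter2d(image, size, reducer):
    """Slide a size x size window over a 2-D image and apply `reducer` to each window."""
    pad = size // 2
    padded = np.pad(image, pad, mode='edge')  # replicate border pixels
    out = np.zeros(image.shape, dtype=float)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = reducer(padded[r:r + size, c:c + size])
    return out


def gaussian_blur(image, size=5, sigma=1.0):
    """Weighted average of each neighborhood using Gaussian weights."""
    kernel = gaussian_kernel(size, sigma)
    return filter2d(image, size, lambda window: np.sum(window * kernel))


def median_filter(image, size=3):
    """Replace each pixel with the median of its neighborhood."""
    return filter2d(image, size, np.median)


# A black image with one bright "salt" pixel in the middle
noisy = np.zeros((5, 5))
noisy[2, 2] = 255.0
print(median_filter(noisy))          # the isolated bright pixel is removed
print(gaussian_blur(noisy).round(1)) # the bright pixel is spread over its neighbors
```

The median filter removes the isolated outlier completely, while the Gaussian blur only smears it, which is why median filtering is the usual choice for salt-and-pepper noise.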

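Multi-sensor fusion

- Implement a Kalman filter or a particle filter.

- Write data association and tracking algorithms, such as the nearest-neighbor algorithm.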
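
For the Kalman filter item, the simplest case to practice is a scalar filter. The sketch below assumes a one-dimensional state that is roughly constant over time and is observed directly with noise; `kalman_1d` and its default variances are illustrative values, not tuned for any real sensor.

```python
import numpy as np


def kalman_1d(measurements, process_var=1e-4, meas_var=0.25, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a (nearly) constant state observed with noise."""
    x, p = x0, p0              # state estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: the state is modeled as constant, so only the uncertainty grows
        p = p + process_var
        # Update: blend the prediction and the measurement using the Kalman gain
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return np.array(estimates)


# Noisy readings of a true value of 5.0
rng = np.random.default_rng(42)
measurements = 5.0 + rng.normal(0.0, 0.5, size=200)
estimates = kalman_1d(measurements)
print("Last measurement:", measurements[-1])
print("Last estimate:", estimates[-1])
```

A multi-dimensional filter replaces the scalars with a state vector, a covariance matrix, and transition and observation matrices, but the predict/update structure stays the same.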

Other common algorithms

- Implement IoU computation and non-maximum suppression (NMS).

IoU computation:

def iou(box1, box2):
    """
    Compute the IoU (intersection over union) of two rectangles.
    Box format: (x1, y1, x2, y2)
    """
    # Unpack the box coordinates
    x1, y1, x2, y2 = box1
    x1_prime, y1_prime, x2_prime, y2_prime = box2

    # Coordinates of the intersection rectangle
    x1_intersect = max(x1, x1_prime)
    y1_intersect = max(y1, y1_prime)
    x2_intersect = min(x2, x2_prime)
    y2_intersect = min(y2, y2_prime)

    # Area of the intersection (zero if the boxes do not overlap)
    if x1_intersect >= x2_intersect or y1_intersect >= y2_intersect:
        return 0.0
    width_intersect = x2_intersect - x1_intersect
    height_intersect = y2_intersect - y1_intersect
    area_intersect = width_intersect * height_intersect

    # Area of each box
    area_box1 = (x2 - x1) * (y2 - y1)
    area_box2 = (x2_prime - x1_prime) * (y2_prime - y1_prime)

    # Area of the union
    area_union = area_box1 + area_box2 - area_intersect

    # IoU = intersection / union
    return area_intersect / area_union

# print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 0.14285714285714285
print(iou((1, 1, 3, 3), (0, 0, 2, 2)))  # 0.14285714285714285 (IoU is symmetric)
print(iou((0, 0, 1, 1), (2, 2, 4, 4)))  # 0.0 (no overlap)

Non-maximum suppression (NMS):


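import numpy as np

# Uses the iou() function defined above
def nms(boxes, scores, threshold):
    # Sort the box indices by score, highest first
    sorted_indices = np.argsort(scores)[::-1]

    keep_boxes = []  # indices of the boxes that are kept

    while sorted_indices.size > 0:
        # Pick the box with the highest remaining score
        box_idx = sorted_indices[0]
        keep_boxes.append(box_idx)

        # IoU between the picked box and every remaining box
        ious = np.array([iou(boxes[box_idx], boxes[other_idx]) for other_idx in sorted_indices[1:]])

        # Drop the boxes whose IoU reaches the threshold;
        # +1 because ious is aligned with sorted_indices[1:]
        keep_indices = np.where(ious < threshold)[0] + 1
        print("Keep indices:", keep_indices)  # debug: positions that survive this round
        sorted_indices = sorted_indices[keep_indices]

    return keep_boxes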

# Test example
boxes = np.array([
    [20, 20, 60, 60],  # x1, y1, x2, y2
    [30, 30, 70, 70],
    [25, 25, 65, 65],
    [100, 100, 140, 140],
    [95, 95, 135, 135],
])
scores = np.array([0.9, 0.8, 0.7, 0.95, 0.85])
threshold = 0.5

selected_indices = nms(boxes, scores, threshold)  # kept box indices: 3, 0, 1
selected_boxes = boxes[selected_indices]
print("Selected boxes:", selected_boxes)
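
Interviewers sometimes ask for a vectorized follow-up to the loop-based NMS. The sketch below is one possible version, assuming boxes in (x1, y1, x2, y2) format with positive area; `nms_vectorized` is an illustrative name. It computes the IoU of the best box against all remaining boxes in one NumPy expression instead of calling `iou` per pair.

```python
import numpy as np


def nms_vectorized(boxes, scores, threshold):
    """Greedy NMS; returns the indices of the kept boxes, highest score first."""
    boxes = np.asarray(boxes, dtype=float)
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = np.argsort(scores)[::-1]  # indices sorted by descending score

    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        # Intersection of the best box with every remaining box
        inter_w = np.maximum(0.0, np.minimum(x2[best], x2[rest]) - np.maximum(x1[best], x1[rest]))
        inter_h = np.maximum(0.0, np.minimum(y2[best], y2[rest]) - np.maximum(y1[best], y1[rest]))
        inter = inter_w * inter_h
        ious = inter / (areas[best] + areas[rest] - inter)
        # Keep only the boxes that overlap the best box less than the threshold
        order = rest[ious < threshold]
    return keep


demo_boxes = np.array([
    [20, 20, 60, 60],
    [30, 30, 70, 70],
    [25, 25, 65, 65],
    [100, 100, 140, 140],
    [95, 95, 135, 135],
])
demo_scores = np.array([0.9, 0.8, 0.7, 0.95, 0.85])
print(nms_vectorized(demo_boxes, demo_scores, 0.5))  # [3, 0, 1]
```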