Algorithm implementation: imitating an OpenCV edge detection algorithm with machine learning

Most recently recommended article published on 2024-03-08 16:02:29

weixin_39808726

Article link: https://blog.csdn.net/weixin_39808726/article/details/111605511

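Summary (auto-generated by CSDN): This article describes how to turn OpenCV's edge detection algorithm into a TensorFlow.js model. First, a video is converted to images to collect data, which are then converted to grayscale and run through edge detection. Next, a simple image model is trained and converted to the TensorFlow.js format. Finally, the model runs in the browser to perform edge detection. The model has very few parameters, so its results are not ideal, but the experiment shows the approach is feasible in specific scenarios.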

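A few words first

This article takes the edge detection algorithm from OpenCV, trains a TensorFlow model to imitate it, and runs that model in the browser with TensorFlow.js, walking through the whole process.
This is not a serious method. Its core idea, like machine learning itself, is general-purpose, but I am not claiming it is always a reasonable approach.
There are always many methods and algorithms for reaching a given goal, and in some special situations the approach in this article can be useful.

All project code: github.com/qhduan/tfjs_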

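Collecting data from a machine learning perspective

First, find any video and convert it to images

Extracting frames from a video is a quick way to collect images. Here ffmpeg converts a video into a set of images, one per second.

The following command takes input.mp4, samples it at one frame per second (fps=1), scales each frame to 400x225, and writes the results to the imgs directory (which must already exist, since ffmpeg does not create it):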

$ ffmpeg -i 'input.mp4' -vf fps=1,scale=400:225 'imgs/out%05d.png'
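
The `out%05d.png` part of the output path is a printf-style pattern: ffmpeg fills in the frame counter, zero-padded to five digits, and by default starts counting at 1. Python's `%` operator follows the same convention, so a small sketch can show which file names to expect. The helper name `expected_frame_names` is made up for illustration and is not part of the project.

```python
from typing import List


def expected_frame_names(pattern: str, n_frames: int, start: int = 1) -> List[str]:
    """Expand a printf-style pattern the way ffmpeg numbers its output frames."""
    return [pattern % i for i in range(start, start + n_frames)]


# At fps=1, a three-second clip should produce three files.
names = expected_frame_names('imgs/out%05d.png', 3)
print(names)
```

These are the same names that the later scripts pick up from the imgs directory.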

Convert the images to grayscale and write out the edge-detected images

import os
from pathlib import Path

import cv2
from tqdm import tqdm
from joblib import Parallel, delayed


# Run the actual edge detection and invert black and white
# (255 becomes 0, 0 becomes 255), so edges are black on a white background
def get_image(gray, a, b):
    return 255 - cv2.Canny(gray, a, b)


def get_gray(fpath):
    # Read the image
    img = cv2.imread(fpath)
    if img is None:
        raise ValueError(f'cannot read image: {fpath}')
    height, width = 224, 400
    # Resize
    img = cv2.resize(img, (width, height))
    # Convert to a grayscale image
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    return gray


imgs_root = Path('imgs')
# a and b are the two thresholds of OpenCV's Canny edge detector;
# these values were picked by eye because the result looked decent
a, b = 10, 40
# Output directory
os.makedirs('bin_imgs', exist_ok=True)


def convert(inpath, outpath):
    img = get_image(get_gray(inpath), a, b)
    cv2.imwrite(outpath, img)


# Collect the images to convert and build the argument list
params = []
for dirname, _, filenames in os.walk(imgs_root):
    for f in filenames:
        if f.endswith('.png'):
            inpath = str(Path(dirname) / f)
            outpath = f'bin_imgs/{f}'
            params.append((inpath, outpath))

# Use joblib to convert in multiple processes, which is a bit faster
_ = Parallel(n_jobs=-1, verbose=0)(
    delayed(convert)(inpath, outpath) for inpath, outpath in tqdm(params)
)
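
To make the label convention concrete (edges are 0, background is 255 after the `255 - ...` inversion), here is a minimal pure-Python sketch on a tiny image. It is deliberately not Canny: it only thresholds central differences, whereas Canny also applies non-maximum suppression and hysteresis with two thresholds. The function name `inverted_edges` and the toy image are assumptions for illustration.

```python
def inverted_edges(gray, threshold):
    """Mark pixels whose horizontal or vertical central difference exceeds
    `threshold` as edges (0); everything else stays background (255)."""
    h, w = len(gray), len(gray[0])
    out = [[255] * w for _ in range(h)]
    # Border pixels are skipped because they lack a neighbour on one side.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = abs(gray[y][x + 1] - gray[y][x - 1])
            gy = abs(gray[y + 1][x] - gray[y - 1][x])
            if max(gx, gy) > threshold:
                out[y][x] = 0
    return out


# A 5x6 toy image: dark left half, bright right half.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
result = inverted_edges(image, threshold=40)
for row in result:
    print(row)
```

The vertical boundary in the middle comes out as a black stripe on white, which is the kind of target image the model is later trained to reproduce.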

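Preview of the input images and the edge-detected target images:

Training

After the steps above, we have inputs and outputs that can be used to train a machine learning model.

The input data is a set of original images, and the output data is the same set of images after edge detection.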
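
Each input image is paired with its target purely by file name: the same name under bin_imgs instead of imgs. A minimal sketch of that pairing with Python's `re` module is shown here; the helper name `label_path_for` is an assumption for illustration. Note that a plain pattern replacement rewrites every occurrence of `imgs` in the path, not only the directory name.

```python
import re


def label_path_for(image_path: str) -> str:
    """Map an input image path to its edge-map path by replacing 'imgs'."""
    return re.sub('imgs', 'bin_imgs', image_path)


inputs = ['imgs/out00001.png', 'imgs/out00002.png']
pairs = [(p, label_path_for(p)) for p in inputs]
print(pairs)

# Caveat: every occurrence is replaced, including one inside a file name.
print(label_path_for('imgs/imgs_01.png'))
```

With the file names produced by the ffmpeg command this is harmless, but it is worth keeping in mind if the images live under a path that contains `imgs` more than once.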

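Model

The model here is a very simple image model with no tuning, and its results are far from the best achievable. Why use a model that performs poorly? Because it has only 817 parameters, so it can run in the browser at close to full frame rate.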
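
The figure of 817 can be checked by hand from the layer sizes of the model in this section: a 3x3 convolution from 3 to 8 channels, a 3x3 convolution from 8 to 8 channels, and a 1x1 convolution from 8 to 1 channel. A convolution layer with bias has `kernel * kernel * in_channels * filters + filters` parameters. The helper name `conv2d_params` is made up for this sketch.

```python
def conv2d_params(kernel_size, in_channels, filters, use_bias=True):
    """Number of trainable parameters in a square 2D convolution layer."""
    weights = kernel_size * kernel_size * in_channels * filters
    return weights + (filters if use_bias else 0)


# (kernel size, input channels, filters) for the three layers
layers = [(3, 3, 8), (3, 8, 8), (1, 8, 1)]
counts = [conv2d_params(k, c_in, c_out) for k, c_in, c_out in layers]
total = sum(counts)
print(counts, total)
```

The per-layer counts match the Param # column printed by `model.summary()`.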

model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(IMG_HEIGHT, IMG_WIDTH, 3)),
    tf.keras.layers.Conv2D(8, 3, padding='same', activation='relu'),
    tf.keras.layers.Conv2D(8, 3, padding='same', activation='relu'),
    tf.keras.layers.Conv2D(1, 1, padding='same'),
])


model.summary()

"""
Model: "sequential_23"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_134 (Conv2D)          (None, 224, 400, 8)       224
_________________________________________________________________
conv2d_135 (Conv2D)          (None, 224, 400, 8)       584
_________________________________________________________________
conv2d_136 (Conv2D)          (None, 224, 400, 1)       9
=================================================================
Total params: 817
Trainable params: 817
Non-trainable params: 0
_________________________________________________________________
"""

The rest of the training code

from pathlib import Path
import tensorflow as tf

IMG_HEIGHT, IMG_WIDTH = 224, 400


def decode_img(img):
    # Convert the compressed string to a 3D uint8 tensor
    img = tf.image.decode_png(img, channels=3)
    # Use `convert_image_dtype` to convert to floats in the [0, 1] range
    img = tf.image.convert_image_dtype(img, tf.float32)
    # Resize to the model's input size (224 x 400)
    img = tf.image.resize(img, [IMG_HEIGHT, IMG_WIDTH])
    return img


def process_path(file_path):
    img = tf.io.read_file(file_path)
    img = decode_img(img)

    # The label has the same file name under bin_imgs instead of imgs
    label_path = tf.strings.regex_replace(file_path, 'imgs', 'bin_imgs')
    label = tf.io.read_file(label_path)
    label = decode_img(label)
    # Binarize the label and keep a single channel
    label = tf.cast(label > 0.5, tf.float32)[:, :, :1]
    return img, label


data_dir = Path('imgs')
input_ds = tf.data.Dataset.list_files(str(data_dir / '*.png'))
input_ds = input_ds.map(process_path)

train_ds = input_ds.shuffle(buffer_size=100).repeat().batch(32)

# `model` is the Sequential model from the Model section above;
# build it after the imports and constants in this snippet
model.compile(
    loss=tf.keras.losses.MeanSquaredError(),
    optimizer=tf.keras.optimizers.Adam(5e-5),
    metrics=['acc']
)

model.fit(train_ds, steps_per_epoch=50, epochs=50)

# In TF 2.x Keras this writes a SavedModel directory;
# with Keras 3, use model.export('./model_export') instead
model.save('./model_export')

The last line of the code above saves the model to the model_export folder.
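
Because the dataset is repeated indefinitely, the length of an epoch is set by `steps_per_epoch` rather than by the number of images. A quick calculation shows how much data the settings above push through the model. The image count of 600 (a ten-minute video at one frame per second) is a made-up figure for illustration, as is the helper name `training_budget`.

```python
def training_budget(steps_per_epoch, epochs, batch_size, n_images):
    """Return total optimizer steps, samples seen, and passes over the data."""
    total_steps = steps_per_epoch * epochs
    samples_seen = total_steps * batch_size
    passes = samples_seen / n_images
    return total_steps, samples_seen, passes


# Values from the training code; n_images is a hypothetical dataset size.
steps, samples, passes = training_budget(50, 50, 32, n_images=600)
print(steps, samples, round(passes, 1))
```

With a small dataset the model sees every image many times, which is acceptable here because the target function (edge detection) is the same for every image.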

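Preview of the output images after training

As you can see, the model does not do very well, mainly because we limited its parameter count so severely.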
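
Besides the parameter count, the amount of context each output pixel can see is also small. For stacked stride-1 convolutions the receptive field grows by `kernel - 1` per layer, so two 3x3 layers followed by a 1x1 layer see only a 5x5 patch. Canny's hysteresis step, by contrast, can connect weak edge pixels to strong ones further along a contour, which a 5x5 window cannot fully reproduce. The following is a sketch of the standard receptive-field recurrence; the function name is chosen for illustration.

```python
def receptive_field(kernel_sizes, strides=None):
    """Receptive field (in input pixels) of a stack of convolution layers."""
    strides = strides or [1] * len(kernel_sizes)
    rf, jump = 1, 1
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump  # each layer widens the view by (k - 1) steps
        jump *= s             # a stride multiplies the step size of later layers
    return rf


rf = receptive_field([3, 3, 1])
print(rf)
```

Adding layers or using larger kernels would widen this window, at the cost of more parameters and slower inference in the browser.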

Running

Convert the model to the TensorFlow.js format

This requires the Python package tensorflowjs (for example, pip install tensorflowjs).

Then run:

tensorflowjs_converter ./model_export ./modeljs

This produces a model.json file that the browser can load.

Load and run the model in TensorFlow.js

Preview of the finished result:
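
The converter writes model.json together with one or more binary weight files, and the browser fetches those files relative to model.json, so the whole output directory has to be served. A small sketch for listing the weight files referenced by a model.json follows. The JSON used here is a hand-written stand-in that only mimics the `weightsManifest` layout, not real converter output, and `weight_files` is a hypothetical helper.

```python
import json


def weight_files(model_json_text):
    """List the weight shard paths referenced by a TensorFlow.js model.json."""
    manifest = json.loads(model_json_text).get('weightsManifest', [])
    return [path for group in manifest for path in group.get('paths', [])]


# Hand-written example with the same top-level keys as a converted model.
example = json.dumps({
    'format': 'graph-model',
    'weightsManifest': [{'paths': ['group1-shard1of1.bin'], 'weights': []}],
})
files = weight_files(example)
print(files)
```

For a real model, pass the contents of the generated model.json and check that every listed file sits next to it on the web server.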

<html>

<head>
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@2.0.0/dist/tf.min.js"></script>
</head>

<body>

    <video id="video" controls="controls" style="height: 224px; width: 400px; opacity: 1;"></video>
    <canvas id="render" style="height: 224px; width: 400px; opacity: 1;"></canvas>
    <button id='play'>play</button>
    <button id='stop'>stop</button>

    <script>

        const height = 224
        const width = 400
        let isStop = false

        let model = null

        // Load the model (the directory written by tensorflowjs_converter)
        async function loadModel() {
            model = await tf.loadGraphModel('modeljs/model.json')
        }
        loadModel()

        const video = document.querySelector("video")
        const canvasRender = document.querySelector("#render")
        const contextRender = canvasRender.getContext('2d')

        // Ask for camera permission
        if (navigator.mediaDevices.getUserMedia) {
            navigator.mediaDevices.getUserMedia({ video: true })
            .then(function (stream) {
                video.srcObject = stream;
            })
            .catch(function (err) {
                console.log("Something went wrong!", err);
            });
        }

        // Grab one frame from the camera, resized and scaled to [0, 1]
        function getImage() {
            let img = tf.image.resizeBilinear(tf.expandDims(tf.browser.fromPixels(video), 0), [height, width])
            return tf.div(img, 255.0)
        }

        // Main loop
        async function run() {
            if (!model) {
                return window.requestAnimationFrame(run)
            }
            // tf.tidy disposes the intermediate tensors created on every frame
            const out = tf.tidy(() => {
                // Get a frame
                const input = getImage()
                // Predict
                let y = model.execute(input)
                // Clip the prediction to [0, 1]
                y = tf.clipByValue(y, 0.0, 1.0)
                // Apply a threshold of 0.75
                y = tf.cast(tf.greater(y, 0.75), 'float32')
                // Drop the extra tensor dimensions
                return y.squeeze()
            })
            // Draw the result
            await tf.browser.toPixels(out, canvasRender)
            out.dispose()
            if (!isStop) {
                setTimeout(run, 50)
            }
        }

        document.querySelector('#play').addEventListener('click', () => {
            video.play()
            isStop = false
            run()
        })

        document.querySelector('#stop').addEventListener('click', () => {
            isStop = true
            video.pause()
        })

    </script>
</body>

</html>
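
The post-processing in the main loop (clip, threshold at 0.75, cast to float) is easy to mirror in plain Python, which helps when comparing browser output with predictions made offline. The model has no output activation, so raw values can fall outside [0, 1]. This is a sketch on nested lists; the function name `postprocess` and the sample values are made up for illustration.

```python
def postprocess(pred, threshold=0.75):
    """Clip raw predictions to [0, 1], then binarize with a strict threshold."""
    out = []
    for row in pred:
        clipped = [min(max(v, 0.0), 1.0) for v in row]
        out.append([1.0 if v > threshold else 0.0 for v in clipped])
    return out


# Raw model outputs can be negative or exceed 1.
raw = [[-0.2, 0.5, 0.8], [1.3, 0.75, 0.76]]
mask = postprocess(raw)
print(mask)
```

A value of exactly 0.75 maps to 0.0 because the comparison is strict, matching `tf.greater` in the browser code. Raising the threshold turns more pixels black (edge), lowering it leaves more of the image white.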

Cover image: "Image processing experiments" by snebtoris, licensed under CC BY-NC 2.0
