Deploying YOLOv8 with Golang

云上翔宇

Last modified: 2024-03-12 20:01:47

Tags: golang, YOLO, programming languages

First published: 2024-03-12 20:01:32

Original link: http://www.zhangshiyu.com/post/69273.html

10. Creating a Web Service in Go

10.1 Creating the project

Create a new project directory, change into it, and initialize the project:

go mod init object_detector

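Install the required external modules:

go get github.com/yalue/onnxruntime_go
go get github.com/nfnt/resize

github.com/yalue/onnxruntime_go: the ONNX library for Golang.
github.com/nfnt/resize: a library for processing images.

As with Python and Node.js, only the backend program needs to change.

Create a file named main.go with the following content: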

package main

import (
    "encoding/json"
    "github.com/nfnt/resize"
    ort "github.com/yalue/onnxruntime_go"
    "image"
    _ "image/gif"
    _ "image/jpeg"
    _ "image/png"
    "io"
    "math"
    "net/http"
    "os"
    "sort"
)

func main() {
    server := http.Server{
        Addr: "0.0.0.0:8080",
    }
    http.HandleFunc("/", index)
    http.HandleFunc("/detect", detect)
    server.ListenAndServe()
}

// index serves the frontend page.
func index(w http.ResponseWriter, _ *http.Request) {
    file, _ := os.Open("index.html")
    buf, _ := io.ReadAll(file)
    w.Write(buf)
}

// detect runs object detection on the uploaded image and returns JSON.
func detect(w http.ResponseWriter, r *http.Request) {
    r.ParseMultipartForm(0)
    file, _, _ := r.FormFile("image_file")
    boxes := detect_objects_on_image(file)
    buf, _ := json.Marshal(&boxes)
    w.Write(buf)
}

func detect_objects_on_image(buf io.Reader) [][]interface{} {
    input, img_width, img_height := prepare_input(buf)
    output := run_model(input)
    return process_output(output, img_width, img_height)
}

// The three functions below are stubs; their bodies are filled in
// in sections 10.2, 10.3, and 10.4.
func prepare_input(buf io.Reader) ([]float32, int64, int64) {
}

func run_model(input []float32) []float32 {
}

func process_output(output []float32, img_width, img_height int64) [][]interface{} {
}

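First, we import the required packages:

encoding/json: encodes the bounding boxes as JSON before the response is sent.
github.com/nfnt/resize: resizes the image to 640x640.
ort "github.com/yalue/onnxruntime_go": the ONNX Runtime library, which we alias as ort.
image, image/gif, image/jpeg, image/png: the image library and the decoders for the different image formats.
io: reads data from local files.
math: provides the Max and Min functions.
net/http: creates and runs the web server.
os: opens local files.
sort: sorts the bounding boxes.

The main function then registers two HTTP handlers and starts the web server on port 8080.

The index function simply returns the contents of the index.html file.
The detect function receives the uploaded image file and passes it to detect_objects_on_image, which runs inference with the YOLOv8 model and produces the output bounding boxes. The boxes are then encoded as JSON and returned to the frontend.
The detect_objects_on_image function is the same as in the projects for the previous languages. The only difference is the type of its return value, [][]interface{}, which represents an array of bounding boxes. Each bounding box is an array of 6 items (x1, y1, x2, y2, class label, confidence).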
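To make the response format concrete, here is a minimal sketch of how a [][]interface{} of boxes is serialized by encoding/json. The helper name encodeBoxes and the sample detection are made up for illustration; unlike the handler above, the sketch returns the encoding error instead of discarding it.

```go
package main

import (
    "encoding/json"
    "fmt"
)

// encodeBoxes is a hypothetical helper: it serializes boxes the same way
// the detect handler does, but reports the error to the caller.
func encodeBoxes(boxes [][]interface{}) (string, error) {
    buf, err := json.Marshal(boxes)
    if err != nil {
        return "", err
    }
    return string(buf), nil
}

func main() {
    // One made-up detection: x1, y1, x2, y2, class label, confidence.
    boxes := [][]interface{}{
        {10.0, 20.0, 110.0, 220.0, "person", float32(0.5)},
    }
    out, err := encodeBoxes(boxes)
    if err != nil {
        fmt.Println("encode failed:", err)
        return
    }
    fmt.Println(out) // [[10,20,110,220,"person",0.5]]
}
```

Each box becomes a JSON array of mixed types, which is what the frontend receives from the /detect endpoint.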

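10.2 Preparing the input

To prepare the input for the YOLOv8 model, first load the image, resize it, and convert it to a tensor of shape (3,640,640), where the first item is the array of red components of the image pixels, the second is the array of green components, and the last is the array of blue components. In addition, the ONNX library for Go requires this tensor as a one-dimensional array, that is, the three arrays concatenated one after another, as the next image illustrates.

[Image: the red, green, and blue arrays concatenated one after another into a single array]

The code is as follows: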

func prepare_input(buf io.Reader) ([]float32, int64, int64) {
    img, _, _ := image.Decode(buf)
    size := img.Bounds().Size()
    img_width, img_height := int64(size.X), int64(size.Y)
    img = resize.Resize(640, 640, img, resize.Lanczos3)

This code loads the image and resizes it to 640x640 pixels.
Next, the pixel colors are split into separate arrays:

    red := []float32{}
    green := []float32{}
    blue := []float32{}

Then we extract the pixels and their colors from the image and normalize them:

    for y := 0; y < 640; y++ {
        for x := 0; x < 640; x++ {
            r, g, b, _ := img.At(x, y).RGBA()
            red = append(red, float32(r/257)/255.0)
            green = append(green, float32(g/257)/255.0)
            blue = append(blue, float32(b/257)/255.0)
        }
    }
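The division by 257 deserves a short note: the RGBA() method of Go's color types returns 16-bit channel values in the range 0 to 65535, so dividing by 257 maps them back to 0 to 255 before scaling to [0, 1]. The sketch below shows this on a single pixel; the helper name normalize is an assumption made for illustration.

```go
package main

import (
    "fmt"
    "image/color"
)

// normalize converts one 16-bit channel value, as returned by
// color.Color.RGBA(), into a float32 in the range [0, 1].
func normalize(v uint32) float32 {
    return float32(v/257) / 255.0
}

func main() {
    // A fully opaque orange pixel defined with 8-bit channels.
    r, g, b, _ := color.RGBA{R: 255, G: 128, B: 0, A: 255}.RGBA()
    fmt.Println(r, g, b) // 65535 32896 0 (16-bit values)
    fmt.Println(normalize(r), normalize(g), normalize(b))
}
```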

Finally, concatenate these arrays into a single array in the correct order:

    input := append(red, green...)
    input = append(input, blue...)

The complete prepare_input code is as follows:

func prepare_input(buf io.Reader) ([]float32, int64, int64) {
    // Decode the image and remember its original size.
    img, _, _ := image.Decode(buf)
    size := img.Bounds().Size()
    img_width, img_height := int64(size.X), int64(size.Y)
    img = resize.Resize(640, 640, img, resize.Lanczos3)
    red := []float32{}
    green := []float32{}
    blue := []float32{}
    for y := 0; y < 640; y++ {
        for x := 0; x < 640; x++ {
            r, g, b, _ := img.At(x, y).RGBA()
            red = append(red, float32(r/257)/255.0)
            green = append(green, float32(g/257)/255.0)
            blue = append(blue, float32(b/257)/255.0)
        }
    }
    // Planar layout: all red values, then all green, then all blue.
    input := append(red, green...)
    input = append(input, blue...)
    return input, img_width, img_height
}

10.3 Running the model

The code of run_model is as follows:

func run_model(input []float32) []float32 {
    // Load the ONNX Runtime shared library and initialize the environment.
    // Errors are ignored here for brevity.
    ort.SetSharedLibraryPath("./libonnxruntime.so")
    _ = ort.InitializeEnvironment()

    // Wrap the flat input slice in a (1,3,640,640) tensor.
    inputShape := ort.NewShape(1, 3, 640, 640)
    inputTensor, _ := ort.NewTensor(inputShape, input)

    // Preallocate the (1,84,8400) tensor that receives the model output.
    outputShape := ort.NewShape(1, 84, 8400)
    outputTensor, _ := ort.NewEmptyTensor[float32](outputShape)

    model, _ := ort.NewSession[float32]("./yolov8m.onnx",
        []string{"images"},
        []string{"output0"},
        []*ort.Tensor[float32]{inputTensor},
        []*ort.Tensor[float32]{outputTensor},
    )
    _ = model.Run()
    return outputTensor.GetData()
}

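We downloaded the matching ONNX Runtime shared library from the official ONNX website, named it libonnxruntime.so, and load it in the program. The library then needs the input converted to its internal tensor format with shape (1,3,640,640). Next, we create an empty structure for the output: the ONNX library does not return the output but writes it into a variable defined in advance. Here, outputTensor is defined as a tensor of shape (1,84,8400) that receives the data from the model. We then create a session with NewSession, which takes the arrays of input and output names and the arrays of input and output tensors. Finally, we run the model, which processes the input and writes the output to outputTensor. The outputTensor.GetData() method returns the output data as a one-dimensional array of floating-point numbers.

As a result, the function returns an array of shape (1,84,8400), which can be thought of as an 84x8400 matrix. Because it is returned as a one-dimensional array, you cannot transpose it directly.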
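Since the output is flat, element (row, col) of the conceptual matrix has to be addressed by computing its index by hand. A minimal sketch of that arithmetic, using a tiny 3x4 matrix as a stand-in for the 84x8400 one (the helper name at is an assumption for illustration):

```go
package main

import "fmt"

// at returns the element at (row, col) of a row-major matrix stored in a
// flat slice with the given number of columns.
func at(data []float32, cols, row, col int) float32 {
    return data[row*cols+col]
}

func main() {
    // A 3x4 stand-in for the 84x8400 output; each value equals its flat index.
    const rows, cols = 3, 4
    data := make([]float32, rows*cols)
    for i := range data {
        data[i] = float32(i)
    }
    fmt.Println(at(data, cols, 0, 2)) // 2
    fmt.Println(at(data, cols, 2, 1)) // 9
}
```

With cols set to 8400, the same formula gives the index row*8400+col into the model output.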

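10.4 Processing the output

The process_output function uses the IoU (intersection over union) measure to filter out overlapping boxes. Rewriting the iou, intersection, and union functions from Python in Go is straightforward. Add them next to the process_output function: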

func iou(box1, box2 []interface{}) float64 {
    return intersection(box1, box2) / union(box1, box2)
}

func union(box1, box2 []interface{}) float64 {
    box1_x1, box1_y1, box1_x2, box1_y2 := box1[0].(float64), box1[1].(float64), box1[2].(float64), box1[3].(float64)
    box2_x1, box2_y1, box2_x2, box2_y2 := box2[0].(float64), box2[1].(float64), box2[2].(float64), box2[3].(float64)
    box1_area := (box1_x2 - box1_x1) * (box1_y2 - box1_y1)
    box2_area := (box2_x2 - box2_x1) * (box2_y2 - box2_y1)
    return box1_area + box2_area - intersection(box1, box2)
}

func intersection(box1, box2 []interface{}) float64 {
    box1_x1, box1_y1, box1_x2, box1_y2 := box1[0].(float64), box1[1].(float64), box1[2].(float64), box1[3].(float64)
    box2_x1, box2_y1, box2_x2, box2_y2 := box2[0].(float64), box2[1].(float64), box2[2].(float64), box2[3].(float64)
    x1 := math.Max(box1_x1, box2_x1)
    y1 := math.Max(box1_y1, box2_y1)
    x2 := math.Min(box1_x2, box2_x2)
    y2 := math.Min(box1_y2, box2_y2)
    // Clamp to zero so that boxes that do not overlap have no intersection.
    return math.Max(0, x2-x1) * math.Max(0, y2-y1)
}

Likewise, define the class labels:

var yolo_classes = []string{
    "person", "bicycle", "car", "motorcycle", "airplane", "bus", "train",
    "truck", "boat", "traffic light", "fire hydrant", "stop sign",
    "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep",
    "cow", "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella",
    "handbag", "tie", "suitcase", "frisbee", "skis", "snowboard", "sports ball",
    "kite", "baseball bat", "baseball glove", "skateboard", "surfboard",
    "tennis racket", "bottle", "wine glass", "cup", "fork", "knife",
    "spoon", "bowl", "banana", "apple", "sandwich", "orange", "broccoli",
    "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch",
    "potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse",
    "remote", "keyboard", "cell phone", "microwave", "oven", "toaster", "sink",
    "refrigerator", "book", "clock", "vase", "scissors", "teddy bear",
    "hair drier", "toothbrush",
}

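As mentioned above, the function receives the output as a flat array laid out as an 84x8400 matrix. This is handled the same way as in the Node.js version described earlier: the one-dimensional array can only be reshaped into an 84x8400 matrix mentally, and these "virtual rows" and "virtual columns" are then used to compute the absolute indices.
The figure below illustrates this:

[Image: the flat output array viewed as an 84x8400 matrix, with the absolute index written in each cell]

Here, we virtually reshape the output array of 705600 items into an 84x8400 matrix. It has 8400 columns, indexed from 0 to 8399, and 84 rows, indexed from 0 to 83. The absolute index of each value is written inside its cell. Each detected object is represented by one column of this matrix. The first 4 rows of each column, indexed 0 to 3, hold the coordinates of the object's bounding box: x_center, y_center, width, and height. The cells in the other 80 rows (4 to 83) hold the probabilities that the object belongs to each of the 80 YOLO classes.

The code is as follows: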

boxes := [][]interface{}{}
for index := 0; index < 8400; index++ {
    xc := output[index]
    yc := output[8400+index]
    w := output[2*8400+index]
    h := output[3*8400+index]
}

Then compute the corners of the bounding box and scale them to the size of the original image:

x1 := (xc - w/2) / 640 * float32(img_width)
y1 := (yc - h/2) / 640 * float32(img_height)
x2 := (xc + w/2) / 640 * float32(img_width)
y2 := (yc + h/2) / 640 * float32(img_height)
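A worked example may help here. The sketch below wraps the four formulas in a function and applies them to a made-up box for a 1280x960 source image; the function name toCorners and the sample numbers are assumptions chosen so the arithmetic is easy to follow.

```go
package main

import "fmt"

// toCorners converts a center-format box in 640x640 model space to corner
// coordinates in the original image.
func toCorners(xc, yc, w, h float32, imgWidth, imgHeight int64) (x1, y1, x2, y2 float32) {
    x1 = (xc - w/2) / 640 * float32(imgWidth)
    y1 = (yc - h/2) / 640 * float32(imgHeight)
    x2 = (xc + w/2) / 640 * float32(imgWidth)
    y2 = (yc + h/2) / 640 * float32(imgHeight)
    return
}

func main() {
    // A box centered at (320, 320) that covers half the model input in each
    // direction, mapped back to a 1280x960 source image.
    x1, y1, x2, y2 := toCorners(320, 320, 320, 320, 1280, 960)
    fmt.Println(x1, y1, x2, y2) // 320 240 960 720
}
```

The box spans from one quarter to three quarters of the model input, so it spans the same fractions of the original width and height.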

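Now, in the same way, read the object's class probabilities from rows 4 to 83, find the largest one and its index, and store these values in the prob and class_id variables:

class_id, prob := 0, float32(0.0)
for col := 0; col < 80; col++ {
    if output[8400*(col+4)+index] > prob {
        prob = output[8400*(col+4)+index]
        class_id = col
    }
}

With the maximum probability and class_id in hand, you can skip the object if the probability is below 0.5; otherwise, look up the label of the class.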
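The class search and the threshold can be tried out on a toy matrix. The following sketch assumes a made-up output with 2 candidate columns and 3 classes instead of 8400 and 80; the helper name bestClass is hypothetical.

```go
package main

import "fmt"

// bestClass scans the class rows of one column in a flat, row-major matrix
// with `cols` columns whose first 4 rows hold box coordinates. It returns
// the index and score of the highest-scoring class.
func bestClass(output []float32, cols, numClasses, index int) (int, float32) {
    classID, prob := 0, float32(0.0)
    for c := 0; c < numClasses; c++ {
        if p := output[cols*(c+4)+index]; p > prob {
            prob, classID = p, c
        }
    }
    return classID, prob
}

func main() {
    // Toy output: 2 candidate columns, 4 box rows followed by 3 class rows.
    output := []float32{
        0, 0, // x_center
        0, 0, // y_center
        0, 0, // width
        0, 0, // height
        0.1, 0.2, // class 0
        0.8, 0.3, // class 1
        0.1, 0.4, // class 2
    }
    for index := 0; index < 2; index++ {
        id, prob := bestClass(output, 2, 3, index)
        // The last value tells whether the candidate passes the 0.5 threshold.
        fmt.Println(index, id, prob, prob >= 0.5)
    }
    // Prints:
    // 0 1 0.8 true
    // 1 2 0.4 false
}
```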

The final code is as follows:

boxes := [][]interface{}{}
for index := 0; index < 8400; index++ {
    class_id, prob := 0, float32(0.0)
    for col := 0; col < 80; col++ {
        if output[8400*(col+4)+index] > prob {
            prob = output[8400*(col+4)+index]
            class_id = col
        }
    }
    if prob < 0.5 {
        continue
    }
    label := yolo_classes[class_id]
    xc := output[index]
    yc := output[8400+index]
    w := output[2*8400+index]
    h := output[3*8400+index]
    x1 := (xc - w/2) / 640 * float32(img_width)
    y1 := (yc - h/2) / 640 * float32(img_height)
    x2 := (xc + w/2) / 640 * float32(img_width)
    y2 := (yc + h/2) / 640 * float32(img_height)
    boxes = append(boxes, []interface{}{float64(x1), float64(y1), float64(x2), float64(y2), label, prob})
}

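The last step is to filter the boxes array with non-maximum suppression to drop boxes that overlap too much. The logic is the same as in the Python implementation, but it looks slightly different because of the specifics of Go:

// Sort in descending order of probability, most confident box first.
sort.Slice(boxes, func(i, j int) bool {
    return boxes[i][5].(float32) > boxes[j][5].(float32)
})
result := [][]interface{}{}
for len(boxes) > 0 {
    result = append(result, boxes[0])
    tmp := [][]interface{}{}
    for _, box := range boxes {
        if iou(boxes[0], box) < 0.7 {
            tmp = append(tmp, box)
        }
    }
    boxes = tmp
}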

First, we sort the boxes in descending order of probability, so that the most probable box comes first. In the loop, we put the box with the highest probability into the result array. We then create a temporary tmp array and, in an inner loop over all boxes, add to it only the boxes that do not overlap too much with the selected one (IoU < 0.7). After that, we overwrite the boxes array with tmp. This filters the overlapping boxes out of boxes. If any boxes remain after filtering, the loop continues until the boxes array is empty.

In the end, the result variable holds all the bounding boxes that should be returned.

The complete code is as follows:

func process_output(output []float32, img_width, img_height int64) [][]interface{} {
    boxes := [][]interface{}{}
    for index := 0; index < 8400; index++ {
        // Find the most probable class for this candidate.
        class_id, prob := 0, float32(0.0)
        for col := 0; col < 80; col++ {
            if output[8400*(col+4)+index] > prob {
                prob = output[8400*(col+4)+index]
                class_id = col
            }
        }
        if prob < 0.5 {
            continue
        }
        label := yolo_classes[class_id]
        xc := output[index]
        yc := output[8400+index]
        w := output[2*8400+index]
        h := output[3*8400+index]
        // Convert to corner coordinates in the original image.
        x1 := (xc - w/2) / 640 * float32(img_width)
        y1 := (yc - h/2) / 640 * float32(img_height)
        x2 := (xc + w/2) / 640 * float32(img_width)
        y2 := (yc + h/2) / 640 * float32(img_height)
        boxes = append(boxes, []interface{}{float64(x1), float64(y1), float64(x2), float64(y2), label, prob})
    }

    // Sort in descending order of probability, most confident box first.
    sort.Slice(boxes, func(i, j int) bool {
        return boxes[i][5].(float32) > boxes[j][5].(float32)
    })

    // Non-maximum suppression.
    result := [][]interface{}{}
    for len(boxes) > 0 {
        result = append(result, boxes[0])
        tmp := [][]interface{}{}
        for _, box := range boxes {
            if iou(boxes[0], box) < 0.7 {
                tmp = append(tmp, box)
            }
        }
        boxes = tmp
    }
    return result
}
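To see non-maximum suppression in isolation, here is a compact, self-contained sketch on three made-up detections. It uses a typed Detection struct instead of []interface{}, which avoids the type assertions; the struct, the function name nms, and the sample boxes are assumptions for illustration, not part of the article's code.

```go
package main

import (
    "fmt"
    "math"
    "sort"
)

// Detection is a hypothetical typed alternative to []interface{}.
type Detection struct {
    X1, Y1, X2, Y2 float64
    Label          string
    Prob           float32
}

// iou returns intersection over union, or 0 for disjoint boxes.
func iou(a, b Detection) float64 {
    w := math.Min(a.X2, b.X2) - math.Max(a.X1, b.X1)
    h := math.Min(a.Y2, b.Y2) - math.Max(a.Y1, b.Y1)
    if w <= 0 || h <= 0 {
        return 0
    }
    inter := w * h
    areaA := (a.X2 - a.X1) * (a.Y2 - a.Y1)
    areaB := (b.X2 - b.X1) * (b.Y2 - b.Y1)
    return inter / (areaA + areaB - inter)
}

// nms repeatedly keeps the most confident remaining box and drops every
// other box whose IoU with it is at or above threshold.
func nms(boxes []Detection, threshold float64) []Detection {
    sorted := append([]Detection(nil), boxes...) // leave the caller's slice untouched
    sort.Slice(sorted, func(i, j int) bool { return sorted[i].Prob > sorted[j].Prob })
    var result []Detection
    for len(sorted) > 0 {
        best := sorted[0]
        result = append(result, best)
        var rest []Detection
        for _, b := range sorted[1:] {
            if iou(best, b) < threshold {
                rest = append(rest, b)
            }
        }
        sorted = rest
    }
    return result
}

func main() {
    boxes := []Detection{
        {0, 0, 100, 100, "cat", 0.6},
        {2, 2, 102, 102, "cat", 0.9}, // nearly the same box, higher score
        {200, 200, 300, 300, "dog", 0.7},
    }
    for _, d := range nms(boxes, 0.7) {
        fmt.Println(d.Label, d.Prob)
    }
    // Prints:
    // cat 0.9
    // dog 0.7
}
```

The two "cat" boxes have an IoU of about 0.92, so only the more confident one survives, while the distant "dog" box is kept.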

Start the web service by running the following command:

go run main.go

Open a browser and go to http://localhost:8080 to use the service.

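11. Summary

This article showed how to use the YOLOv8 model without PyTorch and the official API, which makes it possible to deploy the model on different platforms while using about ten times fewer resources, and how to create YOLOv8-based web services in Python, Node.js, and Go.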
