Sony Spresense Eye-Following Platform: A Detailed Explanation of the Server-Side Software

We divide the server into two parts:

  1. Receive pictures
  2. Detect the image and return the results

Receive pictures

  1. Import the required libraries:
    •struct handles conversion between byte streams and Python data types; here it parses a network-byte-order integer.
    •serial provides serial-port communication.
    •time provides time-related operations such as timing and timestamp formatting.
  2. Initialization:
    •Open the file b.txt and write the initial value '0'; an external process can check it to see whether a new image has been received.
    •Define the save_to_file function, which saves binary data to a specified file.
  3. Main function main:
    •Set the serial port name PORT and the baud rate BAUDRATE; adjust these to match your actual serial port and device configuration.
    •Initialize ser to None, then try to open the specified serial port.
    •Enter an infinite loop that continuously listens for serial data.
    •Read one byte at a time from the serial port and append it to buffer.
    •Once enough data has accumulated to read the length field (assumed to be 4 bytes at the start of the packet), use struct.unpack to convert it from network byte order to a Python integer, giving the length of the image data.
    •Once the accumulated data forms a complete packet (length field plus image data), extract the image data and save it as a JPEG file with save_to_file (named ./eyes.jpg in the example; it could also be named dynamically by timestamp).
    •After the image is saved, write '1' to the helper file b.txt to signal that an image has been received.
    •Remove the processed packet from the buffer, reset the timer, and continue listening for new data.
    •Exception handling covers serial-port open failures and user interruption (Ctrl+C).
    •Finally, ensure the serial port is closed even if an exception occurs.
import struct
import serial
import time

with open('./b.txt', 'w') as fp:
    fp.write('0')
def save_to_file(data, filename):
    """Save the received data to a file."""
    with open(filename, 'wb') as f:
        f.write(data)


def main():
    PORT = 'COM6'  # Change to your actual serial port name
    BAUDRATE = 115200  # Must match the baud rate configured on the Spresense side
    ser = None  # Initialize ser to None

    try:
        ser = serial.Serial(PORT, BAUDRATE, timeout=5)  # Try to open the serial port
        print(f"Listening on serial port {PORT}...")
        start_time = time.time()
        buffer = bytearray()

        while True:
            byte = ser.read(1)
            if byte:
                buffer.extend(byte)

                # Once enough bytes have arrived to read the length field
                if len(buffer) >= 4:  # The length field is assumed to be 4 bytes
                    # Parse the length field (network byte order back to host order)
                    img_length = struct.unpack('>I', buffer[:4])[0]  # '>I' means big-endian unsigned int
                    if len(buffer) >= 4 + img_length:  # The whole packet has been received
                        # Extract the actual image data
                        img_data = buffer[4:4 + img_length]

                        # Save the image data
                        #save_to_file(img_data, f'image_{time.strftime("%Y%m%d%H%M%S")}.jpg')
                        save_to_file(img_data, './eyes.jpg')
                        print("Image saved.")

                        with open('./b.txt', 'w') as fp:
                            fp.write('1')
                        # Remove the processed packet from the buffer
                        buffer = buffer[4 + img_length:]
                        start_time = time.time()  # Reset the timer
            else:
                time.sleep(0.01)

    except serial.SerialException as e:
        print(f"Failed to open serial port: {e}")
    except KeyboardInterrupt:
        print("Reception interrupted")
    finally:
        if ser:  # Check whether ser was ever assigned
            ser.close()
        else:
            print("Serial port was not open; nothing to close.")


if __name__ == "__main__":
    main()
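The framing protocol the receiver relies on is simply a 4-byte big-endian length prefix followed by the JPEG bytes. A minimal, self-contained sketch of both sides of that framing (independent of the actual Spresense firmware, which is not shown here):

```python
import struct


def frame_image(img_data: bytes) -> bytes:
    """Prefix the payload with its length as a 4-byte big-endian unsigned int ('>I')."""
    return struct.pack('>I', len(img_data)) + img_data


def parse_frame(buffer: bytes):
    """Inverse operation: return (payload, remaining_bytes), or None if incomplete."""
    if len(buffer) < 4:
        return None  # Not enough bytes for the length field yet
    length = struct.unpack('>I', buffer[:4])[0]
    if len(buffer) < 4 + length:
        return None  # The payload has not fully arrived yet
    return bytes(buffer[4:4 + length]), bytes(buffer[4 + length:])


# Round-trip check with a dummy payload
payload = b'\xff\xd8 fake jpeg \xff\xd9'
body, rest = parse_frame(frame_image(payload))
assert body == payload and rest == b''
```

Because the receiver accumulates bytes until a whole frame is present, partial reads from the serial port are handled naturally; parse_frame simply returns None until the rest of the payload arrives.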

Detect the image

This script integrates a machine learning model, specifically YOLOv7, for real-time or on-demand object detection. It periodically checks a status file (b.txt) for a trigger (‘1’) to process an image (./eyes.jpg). Upon triggering, it loads the pre-trained model onto a CUDA-enabled GPU (if available), runs inference on the image, and filters the detection results to include only those detections that belong to class 0 (assuming this class represents a target object of interest) with confidence scores higher than 0.5. The filtered detections are then printed out, providing coordinates of bounding boxes around detected objects along with their confidence levels.

import torch
import bluetooth  # not used in this snippet


def hqa():
    # Open the status file for reading
    with open("./b.txt", "r") as f:
        data = f.read()  # Read the content of the file

    return data

# Load the local machine learning model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")  # Prefer GPU when available
model = torch.hub.load('D:/AI/yolov7-main', 'custom',  # Load custom model from local path
                       r'D:\AI\yolov7-main\weights\last2.pt',  # Raw string avoids backslash escapes
                       source='local', force_reload=False)  # Avoid reloading the model if already cached

# Continuous monitoring loop
while True:
    # Check the status file for the trigger to process an image
    if hqa() == '1':
        # Move the model to the designated computation device
        model = model.to(device)

        # Perform inference using the model on the image file './eyes.jpg'
        results = model('./eyes.jpg')

        # Extract the detection table once: bounding-box coordinates, classes, and confidence scores
        detections = results.pandas().xyxy[0]
        xmins = detections['xmin']  # Minimum x-coordinates of bounding boxes
        ymins = detections['ymin']  # Minimum y-coordinates of bounding boxes
        xmaxs = detections['xmax']  # Maximum x-coordinates of bounding boxes
        ymaxs = detections['ymax']  # Maximum y-coordinates of bounding boxes
        class_list = detections['class']  # Class labels for detections
        confidences = detections['confidence']  # Confidence scores for detections

        # Keep only detections of class 0 with a confidence above 0.5
        newlist = []  # Filtered detections
        for xmin, ymin, xmax, ymax, classitem, conf in zip(xmins, ymins, xmaxs, ymaxs, class_list, confidences):
            if classitem == 0 and conf > 0.5:
                newlist.append([int(xmin), int(ymin), int(xmax), int(ymax), conf])

        # Output the filtered detections
        print(newlist)

        # Reset the flag so the same image is not processed again
        with open('./b.txt', 'w') as fp:
            fp.write('0')
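The per-row filtering loop above can also be expressed as a single pandas selection. A stand-alone sketch with hypothetical detection values, using the same column names that `results.pandas().xyxy[0]` exposes:

```python
import pandas as pd

# Hypothetical detections in the columns the YOLOv7 hub results provide
df = pd.DataFrame({
    'xmin': [10.2, 50.7], 'ymin': [20.1, 60.3],
    'xmax': [110.9, 150.4], 'ymax': [120.5, 160.8],
    'confidence': [0.92, 0.31], 'class': [0, 0],
})

# Keep class-0 detections with confidence > 0.5, truncating coordinates to int
kept = df[(df['class'] == 0) & (df['confidence'] > 0.5)]
newlist = [[int(r.xmin), int(r.ymin), int(r.xmax), int(r.ymax), float(r.confidence)]
           for r in kept.itertuples()]
print(newlist)
```

Only the first row survives the filter here, since the second detection's confidence (0.31) falls below the 0.5 threshold.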
