Learning YOLO + Java + OpenCV: A Simple Example (Part 3)

Main topics: license plate detection + recognition (plate color and plate number).

Model roles: plate detection and plate recognition.

The link to my source code is at the end of the article.

You can also refer to my previous two posts:

Learning YOLO + Java + OpenCV: A Simple Example (Part 1) - CSDN Blog

Learning YOLO + Java + OpenCV: A Simple Example (Part 2) - CSDN Blog

Contents

I. Model Example

II. Process Walkthrough

1. Load the OpenCV library

2. Define plate colors and characters

3. Model and image paths

4. Create the output directory

5. Load the ONNX models

6. Plate detection and recognition

7. Helper methods

III. Code Implementation

1. pom.xml

2. Controller

IV. Testing


I. Model Example

II. Process Walkthrough

1. Load the OpenCV library
static {
    // load the OpenCV native library
    nu.pattern.OpenCV.loadLocally();
}
2. Define plate colors and characters
final static String[] PLATE_COLOR = new String[]{"黑牌", "蓝牌", "绿牌", "白牌", "黄牌"};
final static String PLATE_NAME= "#京沪津渝冀晋蒙辽吉黑苏浙皖闽赣鲁豫鄂湘粤桂琼川贵云藏陕甘青宁新学警港澳挂使领民航危0123456789ABCDEFGHJKLMNPQRSTUVWXYZ险品";
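The recognition model emits, per character slot, an index into PLATE_NAME; index 0 ('#') serves as the blank symbol, and consecutive duplicates are collapsed (CTC-style greedy decoding, the same logic as the decodePlate method shown later). A minimal, self-contained sketch of that decoding, using a made-up index sequence for illustration:

```java
public class PlateDecodeDemo {
    // same character table as in the article; index 0 ('#') is the blank
    static final String PLATE_NAME = "#京沪津渝冀晋蒙辽吉黑苏浙皖闽赣鲁豫鄂湘粤桂琼川贵云藏陕甘青宁新学警港澳挂使领民航危0123456789ABCDEFGHJKLMNPQRSTUVWXYZ险品";

    // CTC-style greedy decode: skip blanks and consecutive repeats
    static String decodePlate(int[] indexes) {
        int pre = 0;
        StringBuilder sb = new StringBuilder();
        for (int index : indexes) {
            if (index != 0 && pre != index) {
                sb.append(PLATE_NAME.charAt(index));
            }
            pre = index;
        }
        return sb.toString();
    }

    // helper to look up a character's index in the table
    static int p(char c) { return PLATE_NAME.indexOf(c); }

    public static void main(String[] args) {
        // a fabricated raw model output: 京 京 blank A blank 1 2 2 3
        int[] raw = {p('京'), p('京'), 0, p('A'), 0, p('1'), p('2'), p('2'), p('3')};
        System.out.println(decodePlate(raw)); // prints 京A123
    }
}
```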
3. Model and image paths
// plate detection model
String model_path1 = "./PlateDetection/src/main/resources/model/plate_detect.onnx";

// plate recognition model
String model_path2 = "./PlateDetection/src/main/resources/model/plate_rec_color.onnx";

// directory containing the images to detect
String imagePath = "./yolo-common/src/main/java/com/bluefoxyu/carImg";

// output directory
String outputDir = "./PlateDetection/output/";
4. Create the output directory
File directory = new File(outputDir);
if (!directory.exists()) {
    directory.mkdirs();  // create the directory
    System.out.println("Output directory created: " + outputDir);
}

5. Load the ONNX models
// load the ONNX model
OrtEnvironment environment = OrtEnvironment.getEnvironment();
OrtSession.SessionOptions sessionOptions = new OrtSession.SessionOptions();
OrtSession session = environment.createSession(model_path1, sessionOptions);

// load the ONNX model
OrtEnvironment environment2 = OrtEnvironment.getEnvironment();
OrtSession session2 = environment2.createSession(model_path2, sessionOptions);

ONNX Runtime loads two models: one for plate detection, the other for plate recognition. (Note that OrtEnvironment.getEnvironment() returns a shared singleton, so environment and environment2 refer to the same instance.)

6. Plate detection and recognition

(1) Process each image:

Map<String, String> map = getImagePathMap(imagePath);
for(String fileName : map.keySet()){
    // per-image processing logic
}

(2) Read and preprocess the image:

Mat img = Imgcodecs.imread(imageFilePath);
Mat image = img.clone();
Imgproc.cvtColor(image, image, Imgproc.COLOR_BGR2RGB);

OpenCV reads the image and converts its color space from BGR to RGB.

(3) Resize the image

Letterbox letterbox = new Letterbox();
image = letterbox.letterbox(image);

Letterbox resizing adapts the image to the model's expected input size while preserving its aspect ratio.
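The repo's Letterbox class isn't shown here, but the geometry it is assumed to implement is the standard one: scale the image uniformly to fit the target size (640x640 for the detector), then pad the leftover area symmetrically. The ratio/dw/dh values it exposes are exactly what the controller later uses to map boxes back to the original image via (bbox - dw) / ratio. A minimal sketch of that arithmetic (class and method names are mine, not the repo's):

```java
// Sketch of letterbox geometry: uniform scale plus symmetric padding.
public class LetterboxMath {
    final double ratio; // uniform scale factor applied to the source image
    final double dw;    // horizontal padding added on each side
    final double dh;    // vertical padding added on each side

    LetterboxMath(int srcW, int srcH, int dstW, int dstH) {
        ratio = Math.min((double) dstW / srcW, (double) dstH / srcH);
        dw = (dstW - srcW * ratio) / 2.0;
        dh = (dstH - srcH * ratio) / 2.0;
    }

    // map a coordinate from letterboxed space back to the original image
    double unmapX(double x) { return (x - dw) / ratio; }
    double unmapY(double y) { return (y - dh) / ratio; }

    public static void main(String[] args) {
        // 1280x720 -> scaled by 0.5 to 640x360, leaving 140 px of padding top and bottom
        LetterboxMath lb = new LetterboxMath(1280, 720, 640, 640);
        System.out.println(lb.ratio + " " + lb.dw + " " + lb.dh); // 0.5 0.0 140.0
    }
}
```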

(4) Model inference (plate detection)

float[] chw = ImageUtil.whc2cwh(whc);
OnnxTensor tensor = OnnxTensor.createTensor(environment, FloatBuffer.wrap(chw), shape);
OrtSession.Result output = session.run(stringOnnxTensorHashMap);

The preprocessed image is converted into an ONNX tensor and fed to the plate detection model.
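ImageUtil.whc2cwh is a utility from the repo; what it is assumed to do is the usual interleaved-to-planar reordering: OpenCV hands back pixel values as [r0,g0,b0, r1,g1,b1, ...], while the ONNX model expects one full plane per channel, [r0,r1,..., g0,g1,..., b0,b1,...]. A standalone sketch of that transform (names are mine):

```java
public class ChwDemo {
    // reorder an interleaved HWC buffer into a planar CHW buffer
    static float[] hwc2chw(float[] src, int channels, int pixels) {
        float[] dst = new float[src.length];
        int j = 0;
        for (int c = 0; c < channels; c++) {   // one output plane per channel
            for (int p = 0; p < pixels; p++) { // pick every 'channels'-th value
                dst[j++] = src[p * channels + c];
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        // two RGB pixels: (1,2,3) and (4,5,6)
        float[] planar = hwc2chw(new float[]{1, 2, 3, 4, 5, 6}, 3, 2);
        System.out.println(java.util.Arrays.toString(planar)); // [1.0, 4.0, 2.0, 5.0, 3.0, 6.0]
    }
}
```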

(5) Non-maximum suppression

List<float[]> bboxes = nonMaxSuppression(bboxes, nmsThreshold);

Non-maximum suppression (NMS) filters out overlapping detection boxes, keeping only the most likely plate locations.

(6) Plate cropping and recognition

Mat image2 = new Mat(img.clone(), rect);
// plate recognition model inference
OrtSession.Result output2 = session2.run(stringOnnxTensorHashMap2);
String plateNo = decodePlate(maxScoreIndex(result[0]));

The detected plate region is cropped from the image and passed to the recognition model, which yields the plate number and color.

(7) Display and save the results

BufferedImage bufferedImage = matToBufferedImage(img);
Graphics2D g2d = bufferedImage.createGraphics();
g2d.drawString(PLATE_COLOR[colorRResult[0].intValue()]+"-"+plateNo, (int)((bbox[0]-dw)/ratio), (int)((bbox[1]-dh)/ratio-3));
ImageIO.write(bufferedImage, "jpg", new File(outputPath));
7. Helper methods

xywh2xyxy: converts center-point coordinates to corner coordinates.

nonMaxSuppression: implements non-maximum suppression.

matToBufferedImage: converts an OpenCV Mat to a Java BufferedImage for further drawing.

argmax: returns the index of the maximum value, used when post-processing inference results.

softMax: implements the softmax function, turning raw outputs into a probability distribution.

decodePlate / decodeColor: decode the plate characters and the plate color from the model output.

III. Code Implementation

1. pom.xml
yolo-study:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.bluefoxyu</groupId>
    <artifactId>yolo-study</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>pom</packaging>
    <modules>
        <module>predict-test</module>
        <module>CameraDetection</module>
        <module>yolo-common</module>
        <module>CameraDetectionWarn</module>
        <module>PlateDetection</module>
        <module>dp</module>
    </modules>

    <properties>
        <maven.compiler.source>17</maven.compiler.source>
        <maven.compiler.target>17</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>com.microsoft.onnxruntime</groupId>
            <artifactId>onnxruntime</artifactId>
            <version>1.16.1</version>
        </dependency>
        <dependency>
            <groupId>org.openpnp</groupId>
            <artifactId>opencv</artifactId>
            <version>4.7.0-0</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.projectlombok/lombok -->
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <version>1.18.34</version>
            <scope>provided</scope>
        </dependency>

    </dependencies>

</project>
PlateDetection:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>com.bluefoxyu</groupId>
        <artifactId>yolo-study</artifactId>
        <version>1.0-SNAPSHOT</version>
    </parent>

    <artifactId>PlateDetection</artifactId>

    <properties>
        <maven.compiler.source>17</maven.compiler.source>
        <maven.compiler.target>17</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>com.bluefoxyu</groupId>
            <artifactId>yolo-common</artifactId>
            <version>1.0-SNAPSHOT</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.springframework.boot/spring-boot-starter-web -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
            <version>3.2.4</version>
        </dependency>
    </dependencies>



</project>
2. Controller
@RestController
@RequestMapping("/plate-detect")
public class PlateDetectionController {

    static {
        // load the OpenCV native library
        //System.load(ClassLoader.getSystemResource("lib/opencv_videoio_ffmpeg470_64.dll").getPath());
        nu.pattern.OpenCV.loadLocally();
    }

    final static String[] PLATE_COLOR = new String[]{"黑牌", "蓝牌", "绿牌", "白牌", "黄牌"};
    final static String PLATE_NAME= "#京沪津渝冀晋蒙辽" +
            "吉黑苏浙皖闽赣鲁豫鄂湘粤桂琼川贵云藏陕甘青宁新学警港澳挂使领民航危0123456789ABCDEFGHJKLMNPQRSTUVWXYZ险品";

    @PostMapping
    public List<String> plateDetection() throws OrtException {

        // collect the plate numbers to return to the frontend
        List<String> plateNumberList = new ArrayList<>();

        // plate detection model
        String model_path1 = "./PlateDetection/src/main/resources/model/plate_detect.onnx";

        // plate recognition model
        String model_path2 = "./PlateDetection/src/main/resources/model/plate_rec_color.onnx";

        // directory containing the images to detect
        String imagePath = "./yolo-common/src/main/java/com/bluefoxyu/carImg";

        // output directory
        String outputDir = "./PlateDetection/output/";

        // index used to distinguish the output images
        int index = 0;

        // create the output directory if it does not exist
        File directory = new File(outputDir);
        if (!directory.exists()) {
            directory.mkdirs();
            System.out.println("Output directory created: " + outputDir);
        }


        float confThreshold = 0.35F;

        float nmsThreshold = 0.45F;

        // 1. single-row blue plate
        // 2. single-row yellow plate
        // 3. new-energy plate
        // 4. white police plate
        // 5. driving-school plate
        // 6. armed-police plate
        // 7. double-row yellow plate
        // 8. double-row white plate
        // 9. embassy plate
        // 10. Hong Kong/Macau (粤Z) plate
        // 11. double-row green plate
        // 12. civil aviation plate
        String[] labels = {"1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12"};

        // load the ONNX detection model
        OrtEnvironment environment = OrtEnvironment.getEnvironment();
        OrtSession.SessionOptions sessionOptions = new OrtSession.SessionOptions();
        OrtSession session = environment.createSession(model_path1, sessionOptions);

        // load the ONNX recognition model
        OrtEnvironment environment2 = OrtEnvironment.getEnvironment();
        OrtSession session2 = environment2.createSession(model_path2, sessionOptions);

        // load labels and colors
        ODConfig odConfig = new ODConfig();
        Map<String, String> map = getImagePathMap(imagePath);
        for(String fileName : map.keySet()){

            // generate the output file name
            index++;
            String outputFileName = "temp_output_image" + "_" + index + ".jpg";
            String outputPath = outputDir + outputFileName;
            System.out.println("outputPath = " + outputPath);

            String imageFilePath = map.get(fileName);
            System.out.println(imageFilePath);
            // read the image
            Mat img = Imgcodecs.imread(imageFilePath);
            Mat image = img.clone();
            Imgproc.cvtColor(image, image, Imgproc.COLOR_BGR2RGB);

            // set box thickness and font size here (scaling them by image size works better)
            int minDwDh = Math.min(img.width(), img.height());
            int thickness = minDwDh/ODConfig.lineThicknessRatio;
            long start_time = System.currentTimeMillis();
            // resize the image
            Letterbox letterbox = new Letterbox();
            image = letterbox.letterbox(image);

            double ratio  = letterbox.getRatio();
            double dw = letterbox.getDw();
            double dh = letterbox.getDh();
            int rows  = letterbox.getHeight();
            int cols  = letterbox.getWidth();
            int channels = image.channels();

            image.convertTo(image, CvType.CV_32FC1, 1. / 255);
            float[] whc = new float[3 * 640 * 640];
            image.get(0, 0, whc);
            float[] chw = ImageUtil.whc2cwh(whc);

            // create the OnnxTensor
            long[] shape = { 1L, (long)channels, (long)rows, (long)cols };
            OnnxTensor tensor = OnnxTensor.createTensor(environment, FloatBuffer.wrap(chw),shape );
            HashMap<String, OnnxTensor> stringOnnxTensorHashMap = new HashMap<>();
            stringOnnxTensorHashMap.put(session.getInputInfo().keySet().iterator().next(), tensor);

            // run inference
            OrtSession.Result output = session.run(stringOnnxTensorHashMap);
            float[][] outputData = ((float[][][])output.get(0).getValue())[0];
            Map<Integer, List<float[]>> class2Bbox = new HashMap<>();
            for (float[] bbox : outputData) {
                float score = bbox[4];
                if (score < confThreshold) continue;

                float[] conditionalProbabilities = Arrays.copyOfRange(bbox, 5, bbox.length);
                int label = argmax(conditionalProbabilities);

                // xywh to (x1, y1, x2, y2)
                xywh2xyxy(bbox);

                // drop invalid results
                if (bbox[0] >= bbox[2] || bbox[1] >= bbox[3]) continue;

                class2Bbox.putIfAbsent(label, new ArrayList<>());
                class2Bbox.get(label).add(bbox);
            }

            List<CarDetection> CarDetections = new ArrayList<>();
            for (Map.Entry<Integer, List<float[]>> entry : class2Bbox.entrySet()) {

                List<float[]> bboxes = entry.getValue();
                bboxes = nonMaxSuppression(bboxes, nmsThreshold);
                for (float[] bbox : bboxes) {
                    String labelString = labels[entry.getKey()];
                    CarDetections.add(new CarDetection(labelString,entry.getKey(), Arrays.copyOfRange(bbox, 0, 4), bbox[4],bbox[13] == 0,0.0f,null,null));
                }
            }


            for (CarDetection carDetection : CarDetections) {
                float[] bbox = carDetection.getBbox();

                Rect rect = new Rect(new Point((bbox[0]-dw)/ratio, (bbox[1]-dh)/ratio), new Point((bbox[2]-dw)/ratio, (bbox[3]-dh)/ratio));
                // img.submat(rect)
                Mat image2 = new Mat(img.clone(), rect);
                Imgproc.cvtColor(image2, image2, Imgproc.COLOR_BGR2RGB);
                Letterbox letterbox2 = new Letterbox(168,48);
                image2 = letterbox2.letterbox(image2);

                double ratio2  = letterbox2.getRatio();
                double dw2 = letterbox2.getDw();
                double dh2 = letterbox2.getDh();
                int rows2  = letterbox2.getHeight();
                int cols2  = letterbox2.getWidth();
                int channels2 = image2.channels();

                image2.convertTo(image2, CvType.CV_32FC1, 1. / 255);
                float[] whc2 = new float[3 * 168 * 48];
                image2.get(0, 0, whc2);
                float[] chw2 = ImageUtil.whc2cwh(whc2);

                // create the OnnxTensor
                long[] shape2 = { 1L, (long)channels2, (long)rows2, (long)cols2 };
                OnnxTensor tensor2 = OnnxTensor.createTensor(environment2, FloatBuffer.wrap(chw2), shape2);
                HashMap<String, OnnxTensor> stringOnnxTensorHashMap2 = new HashMap<>();
                stringOnnxTensorHashMap2.put(session2.getInputInfo().keySet().iterator().next(), tensor2);

                // run inference
                OrtSession.Result output2 = session2.run(stringOnnxTensorHashMap2);
                float[][][] result = (float[][][]) output2.get(0).getValue();
                String plateNo = decodePlate(maxScoreIndex(result[0]));
                System.err.println("Plate number: " + plateNo);
                plateNumberList.add(plateNo);
                // plate color recognition
                float[][] color = (float[][]) output2.get(1).getValue();
                double[] colorSoftMax = softMax(floatToDouble(color[0]));
                Double[] colorRResult = decodeColor(colorSoftMax);
                carDetection.setPlateNo(plateNo);
                carDetection.setPlateColor( PLATE_COLOR[colorRResult[0].intValue()]);

                Point topLeft = new Point((bbox[0]-dw)/ratio, (bbox[1]-dh)/ratio);
                Point bottomRight = new Point((bbox[2]-dw)/ratio, (bbox[3]-dh)/ratio);
                Imgproc.rectangle(img, topLeft, bottomRight, new Scalar(0,255,0), thickness);
                // draw the label text above the box
                BufferedImage bufferedImage = matToBufferedImage(img);
                Point boxNameLoc = new Point((bbox[0]-dw)/ratio, (bbox[1]-dh)/ratio-3);
                Graphics2D g2d = bufferedImage.createGraphics();
                g2d.setFont(new Font("微软雅黑", Font.PLAIN, 20));
                g2d.setColor(Color.RED);
                g2d.drawString(PLATE_COLOR[colorRResult[0].intValue()]+"-"+plateNo, (int)((bbox[0]-dw)/ratio), (int)((bbox[1]-dh)/ratio-3)); // text position just above the box
                g2d.dispose();

                try {
                    ImageIO.write(bufferedImage, "jpg", new File(outputPath));
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }

            }
            System.out.printf("time:%d ms.", (System.currentTimeMillis() - start_time));

            System.out.println();



            // show the image in a popup window
            HighGui.imshow("Display Image", Imgcodecs.imread(outputPath));
            // press any key to close the window and continue
            HighGui.waitKey();

        }

        HighGui.destroyAllWindows();
        //System.exit(0);
        System.out.println("Recognition finished, plates: " + plateNumberList);
        return plateNumberList;

    }

    public static void xywh2xyxy(float[] bbox) {
        float x = bbox[0];
        float y = bbox[1];
        float w = bbox[2];
        float h = bbox[3];

        bbox[0] = x - w * 0.5f;
        bbox[1] = y - h * 0.5f;
        bbox[2] = x + w * 0.5f;
        bbox[3] = y + h * 0.5f;
    }

    public static List<float[]> nonMaxSuppression(List<float[]> bboxes, float iouThreshold) {

        List<float[]> bestBboxes = new ArrayList<>();

        bboxes.sort(Comparator.comparing(a -> a[4]));

        while (!bboxes.isEmpty()) {
            float[] bestBbox = bboxes.remove(bboxes.size() - 1);
            bestBboxes.add(bestBbox);
            bboxes = bboxes.stream().filter(a -> computeIOU(a, bestBbox) < iouThreshold).collect(Collectors.toList());
        }

        return bestBboxes;
    }


    // only used here to render Chinese text for the demo; not needed in a real project
    public static BufferedImage matToBufferedImage(Mat mat) {
        int type = BufferedImage.TYPE_BYTE_GRAY;
        if (mat.channels() > 1) {
            type = BufferedImage.TYPE_3BYTE_BGR;
        }
        int bufferSize = mat.channels() * mat.cols() * mat.rows();
        byte[] b = new byte[bufferSize];
        mat.get(0, 0, b); // copy all pixel data
        BufferedImage image = new BufferedImage(mat.cols(), mat.rows(), type);
        final byte[] targetPixels = ((DataBufferByte) image.getRaster().getDataBuffer()).getData();
        System.arraycopy(b, 0, targetPixels, 0, b.length);
        return image;
    }

    private static int[] maxScoreIndex(float[][] result){
        int[] indexes = new int[result.length];
        for (int i = 0; i < result.length; i++){
            int index = 0;
            float max = -Float.MAX_VALUE; // note: Float.MIN_VALUE is the smallest positive float, not the most negative
            for (int j = 0; j < result[i].length; j++) {
                if (max < result[i][j]){
                    max = result[i][j];
                    index = j;
                }
            }
            indexes[i] = index;
        }
        return indexes;
    }

    public static float computeIOU(float[] box1, float[] box2) {

        float area1 = (box1[2] - box1[0]) * (box1[3] - box1[1]);
        float area2 = (box2[2] - box2[0]) * (box2[3] - box2[1]);

        float left = Math.max(box1[0], box2[0]);
        float top = Math.max(box1[1], box2[1]);
        float right = Math.min(box1[2], box2[2]);
        float bottom = Math.min(box1[3], box2[3]);

        float interArea = Math.max(right - left, 0) * Math.max(bottom - top, 0);
        float unionArea = area1 + area2 - interArea;
        return Math.max(interArea / unionArea, 1e-8f);

    }

    private static Double[] decodeColor(double[] indexes){
        double index = -1;
        double max = -Double.MAX_VALUE; // note: Double.MIN_VALUE is the smallest positive double, not the most negative
        for (int i = 0; i < indexes.length; i++) {
            if (max < indexes[i]){
                max = indexes[i];
                index = i;
            }
        }
        return new Double[]{index, max};
    }



    public static double [] floatToDouble(float[] input){
        if (input == null){
            return null;
        }
        double[] output = new double[input.length];
        for (int i = 0; i < input.length; i++){
            output[i] = input[i];
        }
        return output;
    }

    private static String decodePlate(int[] indexes){
        int pre = 0;
        StringBuffer sb = new StringBuffer();
        for(int index : indexes){
            if(index != 0 && pre != index){
                sb.append(PLATE_NAME.charAt(index));
            }
            pre = index;
        }
        return sb.toString();
    }

    // return the index of the maximum value
    public static int argmax(float[] a) {
        float re = -Float.MAX_VALUE;
        int arg = -1;
        for (int i = 0; i < a.length; i++) {
            if (a[i] >= re) {
                re = a[i];
                arg = i;
            }
        }
        return arg;
    }


    public static double[] softMax(double[] tensor){
        if(Arrays.stream(tensor).max().isPresent()){
            double maxValue = Arrays.stream(tensor).max().getAsDouble();
            double[] value = Arrays.stream(tensor).map(y-> Math.exp(y - maxValue)).toArray();
            double total = Arrays.stream(value).sum();
            return Arrays.stream(value).map(p -> p/total).toArray();
        }else{
            throw new NoSuchElementException("No value present");
        }
    }
    public static Map<String, String> getImagePathMap(String imagePath){
        Map<String, String> map = new TreeMap<>();
        File file = new File(imagePath);
        if(file.isFile()){
            map.put(file.getName(), file.getAbsolutePath());
        }else if(file.isDirectory()){
            for(File tmpFile : Objects.requireNonNull(file.listFiles())){
                map.putAll(getImagePathMap(tmpFile.getPath()));
            }
        }
        return map;
    }


}

The remaining utility classes can be found in the GitHub repo linked at the end of the article.

IV. Testing

Press any key to continue.

The final processed output image:

The final result:

Recognition complete!

GitHub: GitHub - bluefoxyu/yolo-study: learning YOLO + Java examples
