Calling the TensorFlow Library from Java to Predict Image Quality

Latest recommended article published 2025-02-24 18:11:31

springzzj

Tags: tensorflow java deep learning

Article link: https://blog.csdn.net/weixin_42566547/article/details/103988193

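Overview

This article follows the official example of using the TensorFlow library from Java: a deep learning model is saved as a pb file, then loaded in a Java environment and used for prediction.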

Environment setup

Install pip

yum -y install epel-release
yum install python-pip
pip install --upgrade pip

Install TensorFlow, Keras, and numpy

pip install tensorflow  # installs the latest release, TensorFlow 2.1 at the time of writing
pip install keras
pip install numpy

Maven configuration

Add the following dependency to pom.xml to pull in the TensorFlow library for Java:

<dependency>
    <groupId>org.tensorflow</groupId>
    <artifactId>tensorflow</artifactId>
    <version>1.15.0</version>
</dependency>

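Loading the model

// Read the pb model from the classpath (IOUtils is presumably Apache Commons IO)
InputStream inputStream = ImageRecognize.class.getResourceAsStream(MODEL_PATH);
Graph graph = new Graph();
graph.importGraphDef(IOUtils.toByteArray(inputStream));
// The session executes operations of the imported graph
Session session = new Session(graph);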

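Both graph and session are thread-safe, so they can be kept as singletons; there is no need to reload the model or recreate the Session for every request.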
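The load-once idea can be sketched without TensorFlow on the classpath. LoadOnce is a hypothetical helper, not part of TensorFlow; in a real service the supplier would build the Graph and Session as in the loading snippet, and callers on any thread would share the result.

```java
import java.util.function.Supplier;

/** Hypothetical holder: runs an expensive loader at most once and shares the result. */
final class LoadOnce<T> {
    private final Supplier<T> loader;
    private volatile T value; // volatile so a fully built value is visible to all threads

    LoadOnce(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Returns the shared instance, loading it on the first call only. */
    T get() {
        T local = value;
        if (local == null) {
            synchronized (this) {
                local = value; // re-check: another thread may have loaded it meanwhile
                if (local == null) {
                    local = loader.get();
                    value = local;
                }
            }
        }
        return local;
    }
}
```

A request handler would then call something like sessionHolder.get() instead of constructing a new Session, so the model file is parsed only once.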

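Image preprocessing

We use an Xception model that expects input images of size 480 * 480 * 3. The images also need preprocessing: every RGB value is normalized to the range [-1, 1]. Two preprocessing approaches are described below:

Preprocessing with BufferedImage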

BufferedImage bufferedImage = new BufferedImage(480, 480, BufferedImage.TYPE_INT_RGB);
Graphics graphics = bufferedImage.getGraphics();

InputStream in = new ByteArrayInputStream(imageData);
Image srcImage = ImageIO.read(in);
graphics.drawImage(srcImage, 0, 0, 480, 480, null); // scale the image to 480*480
graphics.dispose(); // release the graphics context once drawing is done

int w = bufferedImage.getWidth();
int h = bufferedImage.getHeight();
float[][][][] imgTensor = new float[1][h][w][3];
for (int i = 0; i < h; i++) {
    for (int j = 0; j < w; j++) {
        // getRGB packs the pixel into a single int; the next three lines
        // extract R, G and B and normalize each value to [-1, 1]
        int pixel = bufferedImage.getRGB(j, i);
        imgTensor[0][i][j][0] = (float) ((pixel & 0xff0000) >> 16) / 127.5f - 1;
        imgTensor[0][i][j][1] = (float) ((pixel & 0xff00) >> 8) / 127.5f - 1;
        imgTensor[0][i][j][2] = (float) ((pixel & 0xff)) / 127.5f - 1;
    }
}
return Tensors.create(imgTensor);
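The bit arithmetic in the loop can be checked in isolation. The following is a minimal sketch using only plain Java; PixelNormalizer and its method names are made up for illustration and do not come from the original code.

```java
/** Hypothetical helper mirroring the per-pixel arithmetic of the loop. */
final class PixelNormalizer {
    private PixelNormalizer() {}

    /** Maps one 8-bit channel value (0..255) to the range [-1, 1]. */
    static float normalize(int channel) {
        return channel / 127.5f - 1f;
    }

    /** Splits a packed 0xRRGGBB pixel into normalized {r, g, b} values. */
    static float[] toNormalizedRgb(int pixel) {
        int r = (pixel >> 16) & 0xff; // bits 16..23
        int g = (pixel >> 8) & 0xff;  // bits 8..15
        int b = pixel & 0xff;         // bits 0..7
        return new float[] {normalize(r), normalize(g), normalize(b)};
    }

    public static void main(String[] args) {
        float[] red = toNormalizedRgb(0xFF0000);
        // Pure red: R is at its maximum, G and B at their minimum
        System.out.println(red[0] + " " + red[1] + " " + red[2]); // prints "1.0 -1.0 -1.0"
    }
}
```

Dividing by 127.5 maps 0..255 onto 0..2, and subtracting 1 shifts that onto [-1, 1], so a channel value of 0 becomes -1 and 255 becomes 1.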

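Preprocessing with TensorFlow

The TensorFlow-based preprocessing follows the approach used in LabelImage.java: it preprocesses the image with operators that are predefined in the TensorFlow Graph.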

  private Tensor<Float> getImageTensor(byte[] imageBytes) {
    // The preprocessing graph is only needed for this call, so it is closed on exit
    try (Graph g = new Graph()) {
      GraphBuilder b = new GraphBuilder(g);

      final int H = IMAGE_HEIGTH;
      final int W = IMAGE_WIDTH;
      final float mean = 1f;
      final float scale = 127.5f;

      final Output<String> input = b.constant("input", imageBytes);
      final Output<Float> output =
              b.sub(
                      b.div(
                              b.resizeBilinear(
                                      b.expandDims(
                                              b.cast(b.decodeJpeg(input, 3), Float.class), // decode the JPEG bytes and cast to float
                                              b.constant("make_batch", 0) // add a batch dimension to get a 4-D tensor
                                      ),
                                      b.constant("size", new int[]{H, W}) // resize the image to [H, W]
                              ),
                              b.constant("scale", scale) // divide every value by 127.5f
                      ),
                      b.constant("mean", mean) // subtract 1 so the values fall in [-1, 1]
              );
      try (Session s = new Session(g)) {
        // run() may return several tensors, all of which must be closed to prevent resource leaks.
        // Only one is fetched here; the caller is responsible for closing it.
        return s.runner().fetch(output.op().name()).run().get(0).expect(Float.class);
      }
    }
  }

  static class GraphBuilder {
    GraphBuilder(Graph g) {
      this.g = g;
    }

    Output<Float> div(Output<Float> x, Output<Float> y) {
      return binaryOp("Div", x, y);
    }

    <T> Output<T> sub(Output<T> x, Output<T> y) {
      return binaryOp("Sub", x, y);
    }

    <T> Output<Float> resizeBilinear(Output<T> images, Output<Integer> size) {
      return binaryOp3("ResizeBilinear", images, size);
    }

    <T> Output<T> expandDims(Output<T> input, Output<Integer> dim) {
      return binaryOp3("ExpandDims", input, dim);
    }

    <T, U> Output<U> cast(Output<T> value, Class<U> type) {
      DataType dtype = DataType.fromClass(type);
      return g.opBuilder("Cast", "Cast")
              .addInput(value)
              .setAttr("DstT", dtype)
              .build()
              .<U>output(0);
    }

    Output<UInt8> decodeJpeg(Output<String> contents, long channels) {
      return g.opBuilder("DecodeJpeg", "DecodeJpeg")
              .addInput(contents)
              .setAttr("channels", channels)
              .build()
              .<UInt8>output(0);
    }

    <T> Output<T> constant(String name, Object value, Class<T> type) {
      try (Tensor<T> t = Tensor.<T>create(value, type)) {
        return g.opBuilder("Const", name)
                .setAttr("dtype", DataType.fromClass(type))
                .setAttr("value", t)
                .build()
                .<T>output(0);
      }
    }
    Output<String> constant(String name, byte[] value) {
      return this.constant(name, value, String.class);
    }

    Output<Integer> constant(String name, int value) {
      return this.constant(name, value, Integer.class);
    }

    Output<Integer> constant(String name, int[] value) {
      return this.constant(name, value, Integer.class);
    }

    Output<Float> constant(String name, float value) {
      return this.constant(name, value, Float.class);
    }

    private <T> Output<T> binaryOp(String type, Output<T> in1, Output<T> in2) {
      return g.opBuilder(type, type).addInput(in1).addInput(in2).build().<T>output(0);
    }

    private <T, U, V> Output<T> binaryOp3(String type, Output<U> in1, Output<V> in2) {
      return g.opBuilder(type, type).addInput(in1).addInput(in2).build().<T>output(0);
    }
    private Graph g;
  }

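We tried both approaches. With original images of 640 * 640 on a machine with a 2.4 GHz CPU (no GPU), the first approach took about 200 ms on average and the second about 30 ms. The main reason is that TensorFlow optimizes matrix operations internally, whereas the hand-written loop in the first approach is inefficient. We will measure the timings in a GPU environment later.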
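The article does not show how the timings were taken. One way to obtain comparable averages is a small harness like the sketch below; Timing and averageMillis are hypothetical names, and the warm-up count is an assumption intended to let the JIT compile the code path before measuring.

```java
/** Hypothetical timing harness for comparing the two preprocessing methods. */
final class Timing {
    private Timing() {}

    /**
     * Runs the task `warmup` times untimed, then `runs` times timed,
     * and returns the mean duration of a timed run in milliseconds.
     */
    static double averageMillis(Runnable task, int warmup, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmup; i++) {
            task.run(); // untimed warm-up iterations
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.run();
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / runs;
    }
}
```

Each preprocessing method would be wrapped in a Runnable and passed in with the same image bytes, for example averageMillis(task, 10, 100). Any tensor the task creates should be closed inside the task so that native memory does not accumulate across runs.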

Model prediction

In our xception model the input node is named input_1 and the output node is named output. The names used in the code must match these exactly.

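    float result = -1;
    // getImageTensor is the TensorFlow-based preprocessing method shown earlier;
    // try-with-resources closes the input tensor and tolerates a null value
    try (Tensor<Float> input = getImageTensor(imageData)) {
      if (input == null) {
        return result;
      }

      List<Tensor<?>> results = session.runner().feed("input_1", input).fetch("output").run();
      try {
        if (results.size() > 0 && results.get(0).shape().length == 2) {
          long[] rshape = results.get(0).shape();
          int rs = (int) rshape[0];
          int rt = (int) rshape[1];
          float[][] realResult = new float[rs][rt];

          results.get(0).copyTo(realResult);
          // The batch holds a single image, so the score is the first value of the first row
          if (rs > 0 && rt > 0) {
            result = realResult[0][0];
          }
        }
      } finally {
        // Output tensors hold native memory and must be closed explicitly
        for (Tensor<?> t : results) {
          t.close();
        }
      }
    }
    return result;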