[RPI.CM4] Coral TPU Accelerator

Article link: https://blog.csdn.net/chubohao1218/article/details/122645734

Official site: https://coral.ai/products/accelerator/

1. Edge TPU runtime

A. Add the Debian package repository to the system:

echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list

curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -

sudo apt update

B. Install the Edge TPU runtime:

sudo apt install libedgetpu1-std

sudo apt remove  python3-apt
sudo apt install python3-apt

2. PyCoral Library

PyCoral is a Python library built on top of the TensorFlow Lite library. It speeds up development and provides extra functionality for the Edge TPU.

A. Install PyCoral on Linux

sudo apt-get install python3-pycoral

# or with pip

python3 -m pip install --extra-index-url https://google-coral.github.io/py-repo/ pycoral~=2.0
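A quick way to confirm that both the runtime and PyCoral are in place is to ask PyCoral which Edge TPUs it can see. The sketch below assumes PyCoral's `list_edge_tpus()` helper from `pycoral.utils.edgetpu`; the function name `edge_tpu_status` is made up for illustration and is not part of any library.

```python
def edge_tpu_status():
    """Return a short human-readable summary of the Edge TPUs PyCoral can see."""
    try:
        # Imported lazily so the script still runs where PyCoral is missing.
        from pycoral.utils.edgetpu import list_edge_tpus
    except ImportError as err:
        return f"PyCoral is not importable: {err}"
    devices = list_edge_tpus()  # list of dicts with 'type' and 'path' keys
    if not devices:
        return "PyCoral is installed, but no Edge TPU was detected"
    return "Detected Edge TPU(s): " + ", ".join(
        f"{d['type']} at {d['path']}" for d in devices)


print(edge_tpu_status())
```

If the accelerator is plugged in but not detected, re-plugging it after installing the runtime is worth trying first.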

3. Testing

A. Download the example code

mkdir coral && cd coral

git clone https://github.com/google-coral/pycoral.git

cd pycoral

B. Download the model, labels, and image

bash examples/install_requirements.sh classify_image.py

C. Run the image classifier

python3 examples/classify_image.py \
--model test_data/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite \
--labels test_data/inat_bird_labels.txt \
--input test_data/parrot.jpg
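For orientation, the core of what the example script does can be sketched in a few lines of PyCoral. This is a simplified sketch, not the script itself: it assumes the `make_interpreter`, `common.set_input`, `classify.get_classes`, and `read_label_file` helpers from PyCoral plus Pillow for image loading, and it skips the input normalization and timing loop that the real example handles. The function name `classify_image` is chosen here for illustration.

```python
def classify_image(model_path, labels_path, image_path, top_k=1):
    """Classify one image on the Edge TPU; returns a list of (label, score)."""
    # Imported inside the function so the file can be loaded without PyCoral.
    from PIL import Image
    from pycoral.adapters import classify, common
    from pycoral.utils.dataset import read_label_file
    from pycoral.utils.edgetpu import make_interpreter

    interpreter = make_interpreter(model_path)  # attaches the Edge TPU delegate
    interpreter.allocate_tensors()

    # Resize the image to the model's input size and copy it into the input tensor.
    size = common.input_size(interpreter)
    image = Image.open(image_path).convert("RGB").resize(size, Image.LANCZOS)
    common.set_input(interpreter, image)

    interpreter.invoke()
    labels = read_label_file(labels_path)
    return [(labels.get(c.id, c.id), c.score)
            for c in classify.get_classes(interpreter, top_k=top_k)]
```

With the files downloaded in step B, it could be called as `classify_image("test_data/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite", "test_data/inat_bird_labels.txt", "test_data/parrot.jpg")`.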

4. Accelerating your own model

A. Convert a TensorFlow SavedModel to a .tflite model (full-integer quantization)

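import pathlib

import tensorflow as tf

# `reader` and `config` are the author's own project modules (not shown here).
train_dataset = reader.train_reader_tfrecord(
    data_path=config.trainpath,
    num_epochs=1,
    batch_Size=128)

# Take the first batch as calibration data, shaped (N, 128, 54, 1).
data = next(iter(train_dataset))['data'].numpy().reshape(-1, 128, 54, 1)

def representative_data_gen():
    # Yield up to 100 single-sample batches for the converter to calibrate on.
    for input_value in tf.data.Dataset.from_tensor_slices(data).batch(1).take(100):
        yield [input_value]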

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model/")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
# Ensure that if any ops can't be quantized, the converter throws an error
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
# Set the input and output tensors to uint8 (APIs added in r2.3)
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
tflite_model_quant = converter.convert()


# Save the model.
tflite_models_dir = pathlib.Path("tflite_models/")
tflite_models_dir.mkdir(exist_ok=True, parents=True)

# Save the quantized model:
tflite_model_quant_file = tflite_models_dir/"model_quant.tflite"
tflite_model_quant_file.write_bytes(tflite_model_quant)
print("OK")
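Because the converter above sets the input and output types to uint8, callers have to feed integer data. TensorFlow Lite relates real values to quantized integers through an affine mapping, real = scale * (q - zero_point). The sketch below illustrates that mapping with NumPy only; the `scale` and `zero_point` values are made up for the example, and in practice they would be read from the `quantization` entry of `interpreter.get_input_details()[0]`.

```python
import numpy as np


def quantize(real_values, scale, zero_point, dtype=np.uint8):
    """Map real values to integers using q = round(real / scale) + zero_point."""
    info = np.iinfo(dtype)
    q = np.round(np.asarray(real_values, dtype=np.float64) / scale) + zero_point
    # Values outside the representable range saturate at the dtype limits.
    return np.clip(q, info.min, info.max).astype(dtype)


def dequantize(q_values, scale, zero_point):
    """Inverse mapping: real = scale * (q - zero_point)."""
    return scale * (np.asarray(q_values, dtype=np.float64) - zero_point)


# Hypothetical quantization parameters, chosen only for illustration.
scale, zero_point = 0.5, 128
x = np.array([-64.0, 0.0, 1.0, 63.5, 1000.0])
q = quantize(x, scale, zero_point)
print(q)                                  # 1000.0 saturates at 255
print(dequantize(q, scale, zero_point))
```

A plain `astype(np.uint8)` cast only matches this mapping when scale is 1 and zero_point is 0, so checking the model's actual parameters before feeding data is worthwhile.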

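B. Compile the quantized .tflite model into an edgetpu.tflite model for the TPU

Use the edgetpu-compiler tool. It no longer supports ARM, so use the web-based compiler (a Colab notebook) instead: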

https://colab.research.google.com/github/google-coral/tutorials/blob/master/compile_for_edgetpu.ipynb#scrollTo=x47uW_lI1DoV

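C. Load the model and run inference

Sometimes the TPU cannot be used. One possible cause is a mismatched Edge TPU runtime version (the runtime is too new); see https://coral.ai/docs/edgetpu/compiler/#compiler-and-runtime-versions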

apt list | grep libedgetpu1-std

Download a different version of the Edge TPU runtime from the official site (the Software page on coral.ai) and install it.

import numpy as np
import tflite_runtime.interpreter as tflite  # assumed import; the original snippet does not show it

# Load the TFLite model with the Edge TPU delegate and allocate tensors.
interpreter = tflite.Interpreter(
    model_path="../tflite_model/model_quant_edgetpu.tflite",
    experimental_delegates=[tflite.load_delegate('libedgetpu.so.1.0')])
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# 260 ms. `tfLite.feature` and its arguments come from the author's own
# project code (not shown here).
merge_feature = tfLite.feature(audio_data, dis_data, mpu_x, mpu_y, mpu_z)

# Measured inference latency:
# RPI4, USB 3.0
#   uint8:   45 ms without TPU, 20 ms with TPU
#   float32: 55 ms without TPU
# CM4, USB 2.0
#   uint8:   44 ms without TPU, 77 ms with TPU
#   float:   48 ms without TPU

interpreter.set_tensor(input_details[0]['index'], merge_feature.astype(dtype=np.uint8).reshape(1, 128, 54, 1))
interpreter.invoke()
# `get_tensor()` returns a copy of the tensor data.
# Use `tensor()` to get a pointer to the tensor instead.
output_data = interpreter.get_tensor(output_details[0]['index'])

Run the code above. If the LED on the USB TPU blinks, the accelerator is working correctly.
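Latency figures like the ones in the comments of the inference code can be gathered with a small timing helper. The following is a minimal sketch using only the standard library; `time_calls` is a hypothetical name, and the lambda is a stand-in workload so the snippet runs anywhere. A few untimed warm-up calls are included because the first Edge TPU inference also loads the model onto the device and is slower than the rest.

```python
import statistics
import time


def time_calls(fn, runs=20, warmup=2):
    """Call fn repeatedly and return (mean_ms, median_ms) over the timed runs."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed
    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples_ms), statistics.median(samples_ms)


# Stand-in workload; on the device, pass interpreter.invoke instead.
mean_ms, median_ms = time_calls(lambda: sum(range(10000)))
print(f"mean {mean_ms:.3f} ms, median {median_ms:.3f} ms")
```

Running the same helper with and without the Edge TPU delegate gives a like-for-like comparison, which matters on USB 2.0 where transfer overhead can outweigh the compute savings.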