
GIE Build Phase

The GIE runtime needs three files to deploy a classification neural network:

  1. a network architecture file (deploy.prototxt),
  2. trained weights (net.caffemodel), and
  3. a label file to provide a name for each output class.

In addition, you must specify the batch size and the output layer. Code Listing 1 illustrates how to convert a Caffe model into a GIE object. The builder (lines 4-7) is responsible for reading the network information; alternatively, if you don't provide a network architecture file (deploy.prototxt), you can use the builder to define the network directly.

GIE supports the following layer types:

  • Convolution: 2D
  • Activation: ReLU, tanh and sigmoid
  • Pooling: max and average
  • ElementWise: sum, product or max of two tensors
  • LRN: cross-channel only
  • Fully-connected: with or without bias
  • SoftMax: cross-channel only
  • Deconvolution
Code Listing 1: converting a Caffe model to a GIE object.

   1.  IBuilder* builder = createInferBuilder(gLogger);
   2.
   3.  // parse the caffe model to populate the network, then set the outputs
   4.  INetworkDefinition* network = builder->createNetwork();
   5.
   6.  CaffeParser parser;
   7.  auto blob_name_to_tensor = parser.parse("deploy.prototxt",
   8.                                          trained_file.c_str(),
   9.                                          *network,
  10.                                          DataType::kFLOAT);
  11.
  12.  // specify which tensors are outputs
  13.  network->markOutput(*blob_name_to_tensor->find("prob"));
  14.
  15.  // Build the engine
  16.  builder->setMaxBatchSize(1);
  17.  builder->setMaxWorkspaceSize(1 << 30);
  18.  ICudaEngine* engine = builder->buildCudaEngine(*network);