1 TensorRT
https://developer.nvidia.com/tensorrt
developer guide & sample code
https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html
2 Caffe
code:
http://caffe.berkeleyvision.org/doxygen/hierarchy.html
3 Nvidia Cuda
http://www.nvidia.cn/object/cuda_education_cn_old.html
serial vs. parallel (CPU vs. GPU) comparison
https://www.nvidia.com/object/nvision08_gpu_v_cpu.html
4 TensorRT: Introducing the GPU Inference Engine
https://devblogs.nvidia.com/production-deep-learning-nvidia-gpu-inference-engine/
5 AI courses
https://www.nvidia.com/en-us/deep-learning-ai/education/
GIE Build Phase
The GIE runtime needs three files to deploy a classification neural network:
- a network architecture file (deploy.prototxt),
- trained weights (net.caffemodel), and
- a label file to provide a name for each output class.
In addition, you must define the batch size and the output layer. Code Listing 1 illustrates how to convert a Caffe model to a GIE object. The builder (lines 4-7) is responsible for reading the network information. Alternatively, you can use the builder to define the network information if you don’t provide a network architecture file (deploy.prototxt).
GIE supports the following layer types:
- Convolution: 2D
- Activation: ReLU, tanh and sigmoid
- Pooling: max and average
- ElementWise: sum, product or max of two tensors
- LRN: cross-channel only
- Fully-connected: with or without bias
- SoftMax: cross-channel only
- Deconvolution
Code Listing 1. Converting a Caffe model to a GIE object.

IBuilder* builder = createInferBuilder(gLogger);

// parse the caffe model to populate the network, then set the outputs
INetworkDefinition* network = builder->createNetwork();
CaffeParser parser;
auto blob_name_to_tensor = parser.parse("deploy.prototxt",
                                        trained_file.c_str(),
                                        *network,
                                        DataType::kFLOAT);

// specify which tensors are outputs
network->markOutput(*blob_name_to_tensor->find("prob"));

// Build the engine
builder->setMaxBatchSize(1);
builder->setMaxWorkspaceSize(1 << 30);
ICudaEngine* engine = builder->buildCudaEngine(*network);
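Once the engine is built, running inference follows a similar pattern: create an execution context from the engine and execute on device buffers. The sketch below is an assumption-laden illustration, not part of the original listing — the buffer layout, the input blob name "data", and the cudaMalloc/copy steps elided in comments are illustrative; "prob" is the output tensor marked above.

// Run-phase sketch (assumed usage, continuing from Code Listing 1).
IExecutionContext* context = engine->createExecutionContext();

// buffers[] holds device pointers ordered by binding index;
// getBindingIndex maps a tensor name to its slot in buffers[]
void* buffers[2];
int inputIndex  = engine->getBindingIndex("data");  // assumed input blob name
int outputIndex = engine->getBindingIndex("prob");
// ... cudaMalloc buffers[inputIndex] and buffers[outputIndex],
//     then copy the input batch to the device ...

// synchronous inference for a batch of 1 (matches setMaxBatchSize above)
context->execute(1, buffers);

// ... copy buffers[outputIndex] back to the host and read the
//     per-class scores, then release the context and engine ...
context->destroy();
engine->destroy();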