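Some key design ideas in Caffe2:
- Every computation is abstracted as an Operator.
- The Blob and Tensor concepts: a Blob is a generic typed container that can hold any object, most commonly a Tensor.
- Blobs and Nets are both stored in a Workspace. A Workspace can hold multiple Nets, and Blobs with the same name used by those Nets refer to the same Blob in that Workspace.
- A Net is converted into a DAG computation graph so that it can be executed in parallel across multiple threads.
- Each Operator is bound to a Context, and the memory for the Tensors that the Operator creates is allocated on that Context.
- Net, Blob, and Tensor are not bound to a specific Context; binding to a Context happens only at the Operator level.
- Memonger analyzes Blob reuse to save memory.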
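
The Workspace, Net, and DAG ideas above can be illustrated with a small toy model. This is a sketch in plain Python, not Caffe2's actual API: the `Workspace`, `Operator`, and `Net` classes here are hypothetical stand-ins, Contexts are not modeled, and the level-by-level scheduling is a simplification assumed for brevity (it also assumes operators are listed in topological order and that each blob is written once per net).

```python
from concurrent.futures import ThreadPoolExecutor


class Workspace:
    """Toy stand-in: owns every blob, keyed by name."""

    def __init__(self):
        self.blobs = {}


class Operator:
    """Toy operator: reads named input blobs, writes one named output blob."""

    def __init__(self, fn, inputs, output):
        self.fn, self.inputs, self.output = fn, inputs, output

    def run(self, ws):
        args = [ws.blobs[name] for name in self.inputs]
        ws.blobs[self.output] = self.fn(*args)


class Net:
    """Toy net: a list of operators; blobs live in the workspace, not the net."""

    def __init__(self, ops):
        self.ops = ops

    def levels(self):
        """Group operators into DAG levels; ops within a level are independent."""
        level_of_blob = {}  # blob name -> level of the op that produced it
        levels = []
        for op in self.ops:
            # Blobs not produced by this net (external inputs) impose no dependency.
            level = 1 + max(
                (level_of_blob[n] for n in op.inputs if n in level_of_blob),
                default=-1,
            )
            level_of_blob[op.output] = level
            if level == len(levels):
                levels.append([])
            levels[level].append(op)
        return levels

    def run(self, ws, num_workers=4):
        with ThreadPoolExecutor(max_workers=num_workers) as pool:
            for level in self.levels():
                # list() waits for the whole level and re-raises worker errors.
                list(pool.map(lambda op: op.run(ws), level))


ws = Workspace()
ws.blobs["x"] = 3

# The first two operators depend only on "x", so they can run in parallel.
net_a = Net([
    Operator(lambda x: x + 1, ["x"], "a"),
    Operator(lambda x: x * 2, ["x"], "b"),
    Operator(lambda a, b: a + b, ["a", "b"], "y"),
])
# A second net in the same workspace sees the blob "y" written by net_a.
net_b = Net([Operator(lambda y: y * 10, ["y"], "z")])

net_a.run(ws)
net_b.run(ws)
print(ws.blobs["y"], ws.blobs["z"])  # prints: 10 100
```

Because the nets hold only blob names, the name `"y"` in `net_b` resolves to the very blob that `net_a` produced. This is the sharing behaviour described in the list above.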
Detailed code walkthrough
caffe2/core
typeid.h, typeid.cc, types.h, types.cc
CaffeTypeId
TypeNameRegisterer, UninitializedTypeNameRegisterer
TypeMeta: PlacementNew, TypedCopy, TypedDestructor
CAFFE_KNOWN_TYPE
StorageOrder
allocator.h, allocator.cc
Allocator: CPUAllocator, MemoryDeleter, MemoryAllocationReporter, DefaultCPUAllocator
blob.h, blob_serializer_base.h, blob_serialization.h, blob_serialization.cc, blob_serialization_gpu.cc
Blob,
BlobSerializerBase, TensorSerializer
BlobDeserializerBase, TensorDeserializer,
BlobStatGetter, BlobStatRegistry
common.h, common.cc, common_gpu.h, common_gpu.cc, common_omp.h
TIndex, CaffeMap,
DeviceGuard
common_cudnn.h, common_cudnn.cc, cudnn_wrappers.h
cudnnTypeWrapper,
cudnnTensorDescWrapper, cudnnFilterDescWrapper,
CuDNNWrapper, CuDNNWorkspace, CuDNNState,
context.h, context_gpu.h, context_gpu.cu
CPUContext,
CudaMemoryPoolType, ThreadLocalCUDAObjects, CUDAContext
PinnedCPUAllocator
TensorCUDA
Caffe2CudaInitializerHelper
db.h, db.cc
Cursor, Transaction, DB, DBReader
event.h, event.cc, event_gpu.cc
Event
EventCreateFunction, EventRecordFunction, EventWaitFunction, EventFinishFunction
EventCreateFunctionRegisterer, EventRecordFunctionRegisterer, EventWaitFunctionRegisterer, EventFinishFunctionRegisterer
REGISTER_EVENT_CREATE_FUNCTION, REGISTER_EVENT_RECORD_FUNCTION, REGISTER_EVENT_WAIT_FUNCTION, REGISTER_EVENT_FINISH_FUNCTION
CudaEventWrapper
flags.h, flags.cc
graph.h, graph.cc, transform.h, transform.cc
Node, Graph,
Transform
init.h, init.cc, init_omp.cc
Caffe2InitializeRegistry, InitRegisterer
REGISTER_CAFFE2_INIT_FUNCTION, REGISTER_CAFFE2_EARLY_INIT_FUNCTION
logging.h, logging.cc, logging_is_google_glog.h, logging_is_not_google_glog.h
macros.h, macros.h.in
memonger.h, memonger.cc
optimize_inference_net
compute_blob_recycling_for_dag
module.h, module.cc
ModuleSchema
net.h, net.cc,
net_simple.h, net_simple.cc, net_simple_async.h, net_simple_async.cc,
net_dag.h, net_dag.cc, net_async_dag_gpu.h, net_async_dag_gpu.cc
NetBase, SimpleNet, AsyncSimpleNet,
OperatorNode, OpGraphNode,
DAGNetBase, DAGNet, AsyncDAGNet
observer.h
ObserverBase, Observable
operator.h, operator.cc, operator_schema.h, operator_schema.cc, operator_gradient.h
OperatorBase, Operator,
CPUOperatorRegistry, CUDAOperatorRegistry, GradientRegistry,
GradientWrapper, GradientOpsMeta, GradientMakerBase, NoGradient,
OpSchema, OpSchemaRegistry
plan_executor.h, plan_executor.cc
predictor.h, predictor.cc
tensor.h, tensor.cc, qtensor.h, qtensor.cc, qtensor_serialization.h, qtensor_serialization.cc
Tensor, TensorCPU, TensorPrinter, TensorCPUStatGetter
QTensor
registry.h
Registry, Registerer
scope_guard.h
static_tracepoint.h, static_tracepoint.elfx86.h
stats.h, stats.cc
timer.h
Timer
workspace.h, workspace.cc
StopOnSignal, Workspace