tensorflow超参数优化_tensorflow优化

最新推荐文章于 2024-10-08 14:46:34 发布

杨祥子

最新推荐文章于 2024-10-08 14:46:34 发布

阅读量525

点赞数

文章标签： tensorflow超参数优化

本文链接：https://blog.csdn.net/weixin_42226446/article/details/112293211

版权

本文探讨了TensorFlow模型的优化方法，包括增加并发量优化算子计算图，利用tf.data提高数据流处理效率，以及GPipe进行大规模模型训练。在部署方面，介绍了Graph Transform Tool (GTT)的常用优化选项，如删除训练特定操作，折叠Batch Normalization，以及TensorRT库带来的性能提升。此外，还提到了模型量化工具和TF-TRT解决不支持算子的问题。

摘要由CSDN通过智能技术生成

1 常用优化

　　官方提供的优化请参考官方文档:https://tensorflow.google.cn/guide/performance/overview?hl=zh-cn，列出的都是些最基本的常用优化方法，效果明显，更细的优化就不具体介绍了，像算子粒度的优化．

1.1 增加并发量提高算子计算图

　　提高算子和算子间的并发量对提高性能最明显，简单有效，是最常用的优化方法，尤其对CPU计算来说，下面给出三个修改并发量的参数，CPU计算时一定不要忘记这三个参数．

config.proto

// Map from device type name (e.g., "CPU" or "GPU" ) to maximum
// number of devices of that type to use. If a particular device
// type is not found in the map, the system picks an appropriate
// number.
map<string, int32> device_count = 1;

// The execution of an individual op (for some op types) can be
// parallelized on a pool of intra_op_parallelism_threads.
// 0 means the system picks an appropriate number.
int32 intra_op_parallelism_threads = 2;

// Nodes that perform blocking operations are enqueued on a pool of
// inter_op_parallelism_threads available in each process.
//
// 0 means the system picks an appropriate number.
//
// Note that the first Session created in the process sets the
// number of threads for all future sessions unless use_per_session_threads is
// true or session_inter_op_thread_pool is configured.
int32 inter_op_parallelism_threads = 5;
解释：
1)device_count, 告诉tf Session使用CPU数量上限，如果你的CPU数量较多，可以适当加大这个值;

2) inter_op_parallelism_threads和intra_op_parallelis