Resources (5)
Tensorflow XLA详解.pdf (a detailed explanation of TensorFlow XLA)
XLA (Accelerated Linear Algebra) is a domain-specific compiler for linear algebra that optimizes TensorFlow computations. Using JIT compilation, XLA analyzes the TensorFlow graph the user creates at run time, specializes it to the actual runtime shapes and types, fuses multiple operations together, and generates efficient native code for them, targeting devices such as CPUs and GPUs as well as custom accelerators (for example, Google's TPU).
2020-01-14
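As a minimal sketch of the JIT workflow described above (my own illustration, not taken from the PDF): in TensorFlow 2.x, `tf.function(jit_compile=True)` asks XLA to compile the traced function, specializing it to the observed shapes and dtypes and fusing the ops. The layer sizes below are arbitrary.

```python
import tensorflow as tf

# jit_compile=True (TF 2.5+; earlier releases spell it experimental_compile)
# hands the traced graph to XLA, which can fuse matmul + add + relu into a
# single compiled kernel specialized to these shapes and dtypes.
@tf.function(jit_compile=True)
def dense_layer(x, w, b):
    return tf.nn.relu(tf.matmul(x, w) + b)

x = tf.random.normal([128, 256])   # arbitrary batch and feature sizes
w = tf.random.normal([256, 512])
b = tf.zeros([512])
y = dense_layer(x, w, b)           # first call triggers XLA compilation
```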
虚拟与离散变量回归模型.pdf (regression models with dummy and discrete variables)
The regression models studied in the previous five chapters all involve variables that take actual numerical values, generally continuous ones. In practical work one often encounters variables that take discrete values, and their regression models require special treatment. In economic analysis the dependent variable is often not numeric at all, for example buy versus sell, rise versus fall, presence versus absence, profit versus loss. Such cases can be handled by introducing a dummy variable and assigning it numeric codes, which gives the regression a distinctive character. This chapter studies this class of regression models.
2020-01-10
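A minimal worked example of the dummy-variable device described above (synthetic data, my own sketch rather than anything from the PDF): a two-level category such as profit/loss is coded 0/1 and entered into an ordinary least-squares fit alongside a continuous regressor.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=100)                 # continuous regressor
group = rng.integers(0, 2, size=100)     # dummy: 0 = "loss", 1 = "profit"
y = 1.0 + 2.0 * x + 3.0 * group + rng.normal(scale=0.5, size=100)

# OLS with the dummy entered like any other column; its coefficient is the
# shift in the intercept attributable to being in the "profit" group.
X = sm.add_constant(np.column_stack([x, group]))
fit = sm.OLS(y, X).fit()
print(fit.params)                        # approx. [1.0, 2.0, 3.0]
```

The same 0/1 coding applied to the dependent variable itself gives a linear probability model, which is presumably the kind of situation the chapter goes on to treat.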
Training Deeper Models by GPU Memory Optimization on TensorFlow
With the advent of big data, readily available GPGPUs, and progress in neural-network modeling techniques, training deep learning models on GPUs has become a popular choice. However, due to the inherent complexity of deep learning models and the limited memory of modern GPUs, training deep models is still a nontrivial task, especially when the model is too big for a single GPU. In this paper, we propose a general dataflow-graph-based GPU memory optimization strategy, "swap-out/in", which uses host memory as a larger memory pool to overcome the limits of GPU memory. In addition, dedicated optimization strategies are proposed for the memory-hungry sequence-to-sequence (Seq2Seq) models. These strategies are integrated into TensorFlow seamlessly and incur no accuracy loss. In extensive experiments, significant reductions in memory usage are observed: the maximum training batch size can be increased by a factor of 2 to 30 for a fixed model and system configuration.
2020-01-10
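The abstract shows no code, but TensorFlow does expose a documented knob in the same spirit as swap-out/in: the `swap_memory` flag of `tf.while_loop`, which lets tensors retained for the backward pass spill to host RAM and be fetched back when gradients need them. A minimal sketch follows (loop length and sizes are made up; this illustrates the idea, not the paper's exact mechanism).

```python
import tensorflow as tf

x = tf.random.normal([64, 1024])                 # arbitrary batch/width
w = tf.Variable(tf.random.normal([1024, 1024]))

def cond(i, h):
    return i < 50                                # arbitrary depth

def body(i, h):
    return i + 1, tf.tanh(tf.matmul(h, w))

with tf.GradientTape() as tape:
    # swap_memory=True allows per-iteration activations kept for backprop
    # to be swapped out to host memory and swapped back in on the gradient
    # pass, trading PCIe traffic for GPU memory headroom.
    _, h = tf.while_loop(cond, body, [tf.constant(0), x], swap_memory=True)
    loss = tf.reduce_sum(h)

grad = tape.gradient(loss, w)
```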
CUDA优化2.pptx (CUDA optimization, part 2)
CUDA memory optimization: minimize CPU-GPU data transfers. If data transfers are not reduced, porting CPU code to the GPU may bring no performance gain. Batch many small transfers into larger ones, and overlap memory transfers with computation.
2020-01-09
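A small sketch of the batch-and-overlap pattern the slides describe, written in Python with Numba rather than the slides' own (unseen) code: pinned host memory makes copies asynchronous, and giving each chunk its own stream lets one chunk's transfers overlap another chunk's kernel.

```python
import numpy as np
from numba import cuda

@cuda.jit
def scale(a, factor):
    i = cuda.grid(1)
    if i < a.size:
        a[i] *= factor

n, chunks = 1 << 20, 4
host = cuda.pinned_array(n, dtype=np.float32)    # pinned => async-capable
host[:] = np.random.rand(n).astype(np.float32)

step = n // chunks
for k in range(chunks):
    s = cuda.stream()
    chunk = host[k * step:(k + 1) * step]
    d = cuda.to_device(chunk, stream=s)          # H2D copy on this stream
    scale[(step + 255) // 256, 256, s](d, 2.0)   # kernel queued behind it
    d.copy_to_host(chunk, stream=s)              # D2H copy back when done
cuda.synchronize()                               # wait for all streams
```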