Algorithm Level: PyTorch Wrapper
The algorithm we use to obtain the quantized DNN model for inference is WAGE from [13]. The PyTorch code is modified based on [14-16].
The same algorithm is implemented, except that we move the scale term from the weights to the output, which makes it more suitable for the hardware architecture.
(Note to self: too much terminology from here on, and I couldn't work through it... setting it aside for now.)
Algorithm Level: Inference Accuracy Estimation
(Note: "inference" appears frequently here. Looking it up: inference is the process of using a trained model to make predictions. In other words, running the model is called inference.)
In this framework, the neural network model (or weights) is assumed to be pre-trained off-chip and then mapped to the compute-in-memory (i.e., CIM) inference chip. Thus, the non-ideal effects of synaptic devices (such as nonlinearity, asymmetry, and endurance during the weight-update operation) are not considered in this inference version (V1.0-V1.3), but they are considered in the training version (V2.0-V2.2). In contrast, the main factors that we introduce into the accuracy estimation of the inference chip are: on/off ratio, ADC quantization effects, conductance variation, and retention.
(Note: this is where the expansion of the CIM acronym is given.)
(Note: the meaning of "retention" was unclear to me at this point.)
As Fig. 22 shows, to represent the floating-point weights from the algorithm on the CIM architecture, given the limited precision of synaptic devices, one practical way is to normalize the weights to decimal integers and then digitize those integers into conductance levels.
For example, as shown in Fig. 20, if we define the synaptic weight precision to be 4-bit (decimal integers 0 to 15) and represent it with 2-bit (conductance levels 0 to 3) synaptic devices, then the floating-point weight "+0.8906" from the algorithm will be normalized to 15 and thus mapped to two synaptic devices, one as the LSB and one as the MSB, each at conductance level 3 (i.e., 15/4 = 3 with integer division, and 15 % 4 = 3).
(Just a worked example; not hard, so I skipped taking detailed notes.)
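The MSB/LSB split in the example above can be sketched in code. This is a minimal illustration that assumes the weight has already been normalized to a decimal integer (the exact normalization scheme is not spelled out in the text); `split_to_cells` is a hypothetical helper name, not part of the framework.

```python
# Sketch: split a quantized weight integer into per-device conductance levels,
# e.g. a 4-bit weight (0..15) into two 2-bit cells (levels 0..3).
# split_to_cells is a hypothetical helper, not taken from the framework.

def split_to_cells(w_int, cell_bits=2, n_cells=2):
    base = 2 ** cell_bits           # number of conductance levels per device
    cells = []
    for _ in range(n_cells):
        cells.append(w_int % base)  # least-significant cell first
        w_int //= base
    return cells[::-1]              # return MSB first

print(split_to_cells(15))  # → [3, 3], matching 15 // 4 = 3 and 15 % 4 = 3
```

For the paper's example, the weight "+0.8906" is first normalized to the integer 15, and both devices end up at conductance level 3.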
1. Conductance On/Off Ratio (I'm not sure I fully understand this term yet.)
Ideally, the conductance levels of synaptic devices range from 0 to 2^N - 1, where N is the precision of the synaptic devices. However, the minimum conductance can be regarded as 0 only if the conductance on/off ratio (= maximum conductance / minimum conductance) of the synaptic devices is infinite, which is not feasible with current technology.
One approach to remedy this situation is to eliminate the effect of the OFF-state current in every weight element with the aid of a dummy column. In this framework, as Fig. 23 shows, we map the algorithm weights (range [-1, +1]) to synaptic devices (conductance range [Gmin, Gmax]) in the synaptic arrays, while placing a group of dummy columns beside each synaptic array, with the devices in the dummy columns set to the middle conductance (Gmin+Gmax)/2. Thus, by subtracting the dummy outputs from the real outputs, the effective conductance range becomes [-(Gmax-Gmin)/2, +(Gmax-Gmin)/2], which is zero-centered like [-1, +1], and the OFF-state current effects are perfectly removed.
(Essentially, a nonzero minimum conductance was being treated as 0; shifting the values by the dummy output removes this source of error.)
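As a sanity check on the dummy-column trick, here is a small sketch; the Gmin/Gmax values are made up for illustration and are not from the text.

```python
# Sketch of dummy-column subtraction. Gmin/Gmax are illustrative values only.

Gmin, Gmax = 1e-6, 5e-6            # assumed conductance window (siemens)
G_dummy = (Gmin + Gmax) / 2        # dummy devices sit at the middle conductance

def weight_to_conductance(w):
    # linear map: w = -1 -> Gmin, w = +1 -> Gmax
    return Gmin + (w + 1) / 2 * (Gmax - Gmin)

for w in (-1.0, 0.0, +1.0):
    g_eff = weight_to_conductance(w) - G_dummy
    print(w, g_eff)
# the effective range is [-(Gmax-Gmin)/2, +(Gmax-Gmin)/2], centered at zero,
# so the OFF-state (Gmin) offset no longer appears in the output
```

Note how the weight 0.0 maps exactly to the dummy conductance, so its effective contribution is zero, as the zero-centering argument requires.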
2. Conductance Variation
It is well known that synaptic devices involving drift and diffusion of ions/vacancies show considerable variation from device to device, and even from pulse to pulse within one device. Thus, in an inference chip, although the weight-update operation is not required, conductance variation is still a concern during the initialization or programming of the synaptic arrays.
(The principle here seems fairly easy to understand.)
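A minimal way to picture device-to-device programming variation is Gaussian noise on the target conductance; the 1% relative sigma below is an assumed number for illustration, not one given in the text.

```python
import random

random.seed(0)  # reproducible illustration

# Sketch: each cell is programmed to the same target conductance, but
# device-to-device variation makes the actual values scatter around it.

def program_with_variation(g_target, sigma_rel=0.01):
    # multiplicative Gaussian variation with an assumed 1% relative sigma
    return g_target * (1.0 + random.gauss(0.0, sigma_rel))

g_target = 2e-6
programmed = [program_with_variation(g_target) for _ in range(5)]
print(programmed)  # each cell lands near, but not exactly at, 2e-6
```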
3. Retention
Retention refers to the ability of a memory device to retain its programmed state over a long period of time.
However, there are no reported data for analog eNVM that show such retention, which can be attributed to the instability of intermediate conductance states.
(This probably means that analog eNVM cannot reliably hold analog values, because the intermediate conductance states are unstable.)
(The usage/configuration is also described here; I skipped it.)
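Although no concrete retention model is given here, the idea of a programmed state slowly decaying over time can be sketched; the power-law form and the drift coefficient nu below are pure assumptions for illustration, not taken from the text.

```python
# Sketch: a toy retention/drift model in which conductance decays over time,
#     G(t) = G0 * (t / t0) ** (-nu)   for t >= t0.
# The power-law form and nu = 0.005 are assumptions, not from the text.

def drifted_conductance(g0, t, t0=1.0, nu=0.005):
    return g0 * (t / t0) ** (-nu)

g0 = 2e-6
for t in (1.0, 1e3, 1e6):            # seconds after programming
    print(t, drifted_conductance(g0, t))
# the longer the chip sits, the further the state drifts from its target
```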
4. ADC Quantization Effects
For CIM architectures, there are mainly two read-out schemes.
(A trade-off has to be made in the ADC design; I'm skipping the details for now...)
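The quantization step itself can be sketched as a uniform N-bit ADC; the 5-bit precision and the full-scale clipping range below are assumed parameters, not values from the text.

```python
# Sketch: uniform N-bit ADC quantization of an analog partial sum.
# n_bits and the full-scale range x_max are assumed parameters.

def adc_quantize(x, n_bits=5, x_max=1.0):
    levels = 2 ** n_bits - 1                  # 31 output codes above zero
    x_clipped = max(0.0, min(x, x_max))       # clip to the ADC input range
    code = round(x_clipped / x_max * levels)  # digital output code
    return code, code / levels * x_max        # code and its analog equivalent

print(adc_quantize(0.37))  # the analog value 0.37 maps to a nearby code
```

Anything above the full-scale range saturates at the top code, which is one reason the ADC range/precision trade-off matters for accuracy.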