An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling
2018
Paper: https://arxiv.org/pdf/1803.01271.pdf
Code: https://github.com/LOCUSLAB/tcn
Reference (Chinese blog post): https://blog.csdn.net/qq_33331451/article/details/104810419
Temporal Convolutional Networks (TCN)
- Sequence Modeling
A sequence model is any function $f:\mathcal{X}^{T+1}\rightarrow\mathcal{Y}^{T+1}$ mapping an input sequence $(x_0,\dots,x_T)$ to an output sequence $(\hat{y}_0,\dots,\hat{y}_T)$, subject to the causal constraint that $\hat{y}_t$ depends only on $x_0,\dots,x_t$ and not on any future inputs.
- Causal Convolutions
TCN rests on two principles: the output of the network has the same length as its input, and nothing from the future can leak into the past. Both are satisfied by a 1-D fully-convolutional architecture with causal convolutions, where the output at time $t$ is convolved only with elements at time $t$ and earlier; a minimal sketch follows.
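A minimal sketch of a causal convolution in PyTorch (my own illustration, not the repository's code): left-padding the time axis by $k-1$ steps keeps the output the same length as the input and ensures that output $t$ never sees inputs after $t$.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Module):
    """1-D convolution that pads only on the left (the past)."""
    def __init__(self, in_channels, out_channels, kernel_size):
        super().__init__()
        self.pad = kernel_size - 1                # left padding of k - 1 steps
        self.conv = nn.Conv1d(in_channels, out_channels, kernel_size)

    def forward(self, x):                         # x: (batch, channels, time)
        x = F.pad(x, (self.pad, 0))               # (left, right) on the time axis
        return self.conv(x)                       # output length == input length

x = torch.randn(1, 3, 10)                         # batch=1, 3 channels, 10 steps
y = CausalConv1d(3, 8, kernel_size=3)(x)
assert y.shape[-1] == x.shape[-1]                 # same length, no future leakage
```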
- Dilated Convolutions
For a 1-D sequence input $\mathbf{x}\in\mathbb{R}^n$ and a filter $f:\{0,\dots,k-1\}\rightarrow\mathbb{R}$, the dilated convolution operation $F$ on element $s$ of the sequence is

$$F(s)=(\mathbf{x}*_{d}f)(s)=\sum_{i=0}^{k-1}f(i)\cdot\mathbf{x}_{s-d\cdot i},$$

where $d$ is the dilation factor, $k$ is the filter size, and $s-d\cdot i$ accounts for the direction of the past. With $d=1$ a dilated convolution reduces to a regular convolution; larger dilations let a top-level output represent a wider range of inputs, effectively expanding the receptive field of the ConvNet. There are thus two ways to enlarge a TCN's receptive field: choose a larger filter size $k$ or increase the dilation factor $d$ (see the sketch below).
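A small calculation (my own sketch, assuming the common TCN convention of doubling the dilation per layer, $d=2^i$) showing how $k$ and $d$ determine the receptive field: each layer adds $(k-1)\cdot d$ steps of history.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of a stack of dilated causal conv layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

k = 3
dilations = [2 ** i for i in range(4)]   # d = 1, 2, 4, 8
print(receptive_field(k, dilations))     # 1 + 2 * (1 + 2 + 4 + 8) = 31
```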
- Residual Connections
Each residual block adds its input to the transformation $\mathcal{F}$ learned by the block, so layers learn modifications to the identity mapping rather than the entire transformation:

$$o=\mathrm{Activation}(\mathbf{x}+\mathcal{F}(\mathbf{x}))$$
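A minimal PyTorch sketch of such a residual block (my own simplification: the paper's block also applies weight normalization and dropout after each convolution, omitted here; the 1x1 convolution on the shortcut, used when input and output widths differ, does follow the paper):

```python
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Two dilated causal convolutions plus a residual connection."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation   # causal left padding
        self.conv1 = nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)
        self.conv2 = nn.Conv1d(out_ch, out_ch, kernel_size, dilation=dilation)
        # 1x1 conv so the shortcut matches the width of F(x)
        self.downsample = nn.Conv1d(in_ch, out_ch, 1) if in_ch != out_ch else None

    def forward(self, x):                         # x: (batch, channels, time)
        out = F.relu(self.conv1(F.pad(x, (self.pad, 0))))
        out = self.conv2(F.pad(out, (self.pad, 0)))
        res = x if self.downsample is None else self.downsample(x)
        return F.relu(out + res)                  # o = Activation(x + F(x))
```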