Keras TimeDistributed Explained in Detail
- The official Keras API describes it rather tersely:
- keras.layers.TimeDistributed(layer)
This wrapper applies a layer to every temporal slice of an input.
The input should be at least 3D, and the first dimension (after the batch axis) is taken to be the temporal dimension.
Consider a batch of 32 samples, where each sample is a sequence of 10 vectors of 16 dimensions. The batch input shape is then (32, 10, 16), while input_shape, which excludes the samples dimension, is (10, 16).
You can then use TimeDistributed to apply a Dense layer to each of the 10 timesteps, independently:
In [2]: import keras
Using TensorFlow backend.
In [3]: from keras.models import Sequential
In [4]: from keras.layers import TimeDistributed, Dense
In [5]:
...: model = Sequential()
...: model.add(TimeDistributed(Dense(8), input_shape=(10, 16)))
In [6]: model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
time_distributed_1 (TimeDist (None, 10, 8) 136
=================================================================
Total params: 136
Trainable params: 136
Non-trainable params: 0
_________________________________________________________________
In [7]:
...: model = Sequential()
...: model.add(TimeDistributed(Dense(89), input_shape=(10, 16)))
In [8]: model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
time_distributed_2 (TimeDist (None, 10, 89) 1513
=================================================================
Total params: 1,513
Trainable params: 1,513
Non-trainable params: 0
_________________________________________________________________
In [9]:
...: model = Sequential()
...: model.add(TimeDistributed(Dense(89), input_shape=(None, 16)))
In [10]: model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
time_distributed_3 (TimeDist (None, None, 89) 1513
=================================================================
Total params: 1,513
Trainable params: 1,513
Non-trainable params: 0
_________________________________________________________________
In [11]:
...: model = Sequential()
...: model.add(TimeDistributed(Dense(89), input_shape=(None, 16, 150)))
In [12]: model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
time_distributed_4 (TimeDist (None, None, 16, 89) 13439
=================================================================
Total params: 13,439
Trainable params: 13,439
Non-trainable params: 0
_________________________________________________________________
In [13]: 89 * 150 + 89
Out[13]: 13439
In [14]:
...: model = Sequential()
...: model.add(TimeDistributed(Dense(89), input_shape=(None, 16, 150, 100)))
...:
In [15]: model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
time_distributed_5 (TimeDist (None, None, 16, 150, 89) 8989
=================================================================
Total params: 8,989
Trainable params: 8,989
Non-trainable params: 0
_________________________________________________________________
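A quick way to confirm the Dense parameter counts in the summaries above: Dense acts only on the last input axis, so its parameter count is last_dim * units + units no matter how many leading axes the input has. A minimal pure-Python check (no Keras needed; `dense_params` is a helper written for this post):

```python
def dense_params(last_dim, units):
    """Parameter count of Dense(units) on an input whose last axis has
    size last_dim: one weight per (input, output) pair plus one bias
    per output unit. Leading axes (batch, time, ...) don't matter."""
    return last_dim * units + units

# matches the model summaries above
print(dense_params(16, 8))    # time_distributed_1: 136
print(dense_params(16, 89))   # time_distributed_2/3: 1513
print(dense_params(150, 89))  # time_distributed_4: 13439
print(dense_params(100, 89))  # time_distributed_5: 8989
```

Note that time_distributed_4 and time_distributed_5 ignore everything except the last input dimension (150 and 100 respectively), which is exactly what the summaries show.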
In [16]: from keras.layers import Conv2D
In [17]:
...: model = Sequential()
...: model.add(TimeDistributed(Conv2D(64, (3, 3)), input_shape=(None, 299, 299, 3)))
In [18]: model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
time_distributed_6 (TimeDist (None, None, 297, 297, 64 1792
=================================================================
Total params: 1,792
Trainable params: 1,792
Non-trainable params: 0
_________________________________________________________________
In [19]: 3*3*3*64 + 64
Out[19]: 1792
In [20]:
...: model = Sequential()
...: model.add(TimeDistributed(Conv2D(64, (3, 3)), input_shape=(None, 299, 299, 5)))
In [21]: model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
time_distributed_7 (TimeDist (None, None, 297, 297, 64 2944
=================================================================
Total params: 2,944
Trainable params: 2,944
Non-trainable params: 0
_________________________________________________________________
In [22]: 5*3*3*64 + 64
Out[22]: 2944
- As a layer wrapper: this wrapper lets you apply a layer to every time step of the input, with the layer's parameters shared across time steps.
- What it operates on: each time step. The number of time steps is preserved (the same before and after the operation); in a sense, the time axis behaves like a second batch dimension (call it the time step axis, or a "time batch").
# As the first layer in a model
# full input shape = (None, 10, 16)
model = Sequential()
model.add(TimeDistributed(Dense(8), input_shape=(10, 16)))
# now model.output_shape == (None, 10, 8)

# full input shape = (None, 10, 299, 299, 3)
model = Sequential()
model.add(TimeDistributed(Conv2D(64, (3, 3)),
                          input_shape=(10, 299, 299, 3)))
# now model.output_shape == (None, 10, 297, 297, 64)
- Parameter count: like an LSTM, this layer shares its parameters across the different time steps.
In [21]: #coding:utf-8
...: from keras.models import Input,Model
...: from keras.layers import Dense,Conv2D,TimeDistributed
...:
...: input_ = Input(shape=(12,8))
...: out = TimeDistributed(Dense(units=10))(input_)
...: #out = Dense(units=10)(input_)
...: model = Model(inputs=input_,outputs=out)
...: model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 12, 8) 0
_________________________________________________________________
time_distributed_6 (TimeDist (None, 12, 10) 90
=================================================================
Total params: 90
Trainable params: 90
Non-trainable params: 0
_________________________________________________________________
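The weight sharing can be seen directly: applying one shared weight matrix and bias to every time step in a loop gives the same result as one broadcast matrix multiply over the whole sequence, which is what TimeDistributed(Dense) computes. A NumPy sketch (the random W, b, and shapes below are illustrative, not the model's actual weights):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 10))      # ONE shared weight matrix: 8 inputs -> 10 units
b = rng.normal(size=10)           # ONE shared bias vector
x = rng.normal(size=(4, 12, 8))   # (batch, timesteps=12, features=8)

# what TimeDistributed(Dense(10)) computes: the SAME W, b at every step
per_step = np.stack([x[:, t] @ W + b for t in range(12)], axis=1)

# equivalently, one matmul broadcast over the time axis
batched = x @ W + b

assert per_step.shape == (4, 12, 10)
assert np.allclose(per_step, batched)
```

The shared parameters number 8 * 10 + 10 = 90, matching the 90 in the summary above, regardless of the 12 time steps.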
In [24]: from keras.models import Input,Model
...: from keras.layers import Dense,Conv2D,TimeDistributed
...:
...: input_ = Input(shape=(12,32,32,3))
...: out = TimeDistributed(Conv2D(filters=32,kernel_size=(3,3),padding='same'))(input_)
...: model = Model(inputs=input_,outputs=out);model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_4 (InputLayer) (None, 12, 32, 32, 3) 0
_________________________________________________________________
time_distributed_9 (TimeDist (None, 12, 32, 32, 32) 896
=================================================================
Total params: 896
Trainable params: 896
Non-trainable params: 0
_________________________________________________________________
In [31]: from keras.models import Input,Model
...: from keras.layers import Dense,Conv2D,TimeDistributed
...:
...: input_ = Input(shape=(32,32,2))
...: out = Conv2D(filters=32,kernel_size=(3,3),padding='same')(input_)
...: model = Model(inputs=input_,outputs=out);model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_11 (InputLayer) (None, 32, 32, 2) 0
_________________________________________________________________
conv2d_12 (Conv2D) (None, 32, 32, 32) 608
=================================================================
Total params: 608
Trainable params: 608
Non-trainable params: 0
_________________________________________________________________
totalParameters = filters * kernelSize_width * kernelSize_height * inputChannels + bias

32 * 3 * 3 * 2 + 32 = 608
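The formula above can be checked against every Conv2D summary in this post; a minimal helper (written for this post, not a Keras API):

```python
def conv2d_params(in_channels, kernel_h, kernel_w, filters):
    """totalParameters = filters * kernel_h * kernel_w * in_channels + bias,
    with one bias per filter. Spatial input size and the number of
    time steps don't appear in the count."""
    return filters * kernel_h * kernel_w * in_channels + filters

print(conv2d_params(2, 3, 3, 32))  # 608  (conv2d_12 above)
print(conv2d_params(3, 3, 3, 32))  # 896  (time_distributed_9)
print(conv2d_params(3, 3, 3, 64))  # 1792 (time_distributed_6)
print(conv2d_params(5, 3, 3, 64))  # 2944 (time_distributed_7)
```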
- TimeDistributed pulls off this trick by applying the same Dense layer (the same weights) to the LSTM output one time step at a time. The output layer therefore needs only one connection to each LSTM unit (plus one bias).
This uses far fewer parameters than feeding the sequence into a plain fully connected layer.
For example:
With a plain fully connected layer, weights cannot be shared across time steps, so:
| shape | operation | parameters |
| --- | --- | --- |
| (None, timestep, input_dim) | input | 0 |
| (None, timestep * input_dim) | reshape | 0 |
| (None, output_dim) | Dense(output_dim) | timestep × input_dim × output_dim + output_dim |
With TimeDistributed:

| shape | operation | parameters |
| --- | --- | --- |
| (None, timestep, input_dim) | input | 0 |
| (None, timestep, output_dim) | TimeDistributed(Dense(output_dim)) | input_dim × output_dim + output_dim |
Overall, this reduces the parameter count by roughly a factor of timestep.
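Plugging in the numbers from the first example in this post (timestep=10, input_dim=16, output_dim=8) makes the saving concrete:

```python
timestep, input_dim, output_dim = 10, 16, 8

# plain fully connected: flatten time into features, no weight sharing
flat_params = timestep * input_dim * output_dim + output_dim  # 1288

# TimeDistributed(Dense): one weight matrix shared by all time steps
td_params = input_dim * output_dim + output_dim               # 136

print(flat_params, td_params)  # 1288 136
```

136 matches the 136 parameters of time_distributed_1 in the first summary, and 1288 / 136 ≈ 9.5, i.e. roughly the factor of timestep = 10 claimed above.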