Keras TimeDistributed


What the Keras TimeDistributed wrapper does, in detail

  • As for what it does, the official Keras API documentation puts it rather bluntly:
  • keras.layers.TimeDistributed(layer)
    This wrapper applies a layer to every temporal slice of an input.
    The input should be at least 3D, and the first dimension is the one representing time.
    Consider a batch of 32 samples, where each sample is a sequence of 10 vectors of 16 dimensions. The batch input shape is then (32, 10, 16); input_shape, which excludes the samples dimension, is (10, 16).
    You can then use TimeDistributed to apply a Dense layer to each of the 10 time steps independently:
In [2]: import keras                                                            
Using TensorFlow backend.

In [3]: from keras.models import Sequential                                     

In [4]: from keras.layers import TimeDistributed, Dense                         

In [5]:  
   ...: model = Sequential() 
   ...: model.add(TimeDistributed(Dense(8), input_shape=(10, 16)))              

In [6]: model.summary()                                                         
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
time_distributed_1 (TimeDist (None, 10, 8)             136       
=================================================================
Total params: 136
Trainable params: 136
Non-trainable params: 0
_________________________________________________________________
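The 136 parameters are one Dense kernel plus its bias, shared by all 10 time steps: 16 * 8 + 8 = 136. A minimal sketch (not part of the original transcript) that confirms the sharing by applying the wrapped layer's weights to a single time slice by hand:

import numpy as np
from keras.models import Sequential
from keras.layers import TimeDistributed, Dense

model = Sequential()
model.add(TimeDistributed(Dense(8), input_shape=(10, 16)))

x = np.random.rand(32, 10, 16).astype("float32")
y = model.predict(x)  # shape (32, 10, 8)

# TimeDistributed holds a single (16, 8) kernel and an (8,) bias;
# applying them manually to time step 0 reproduces that slice of y.
kernel, bias = model.layers[0].get_weights()
print(np.allclose(y[:, 0, :], x[:, 0, :].dot(kernel) + bias, atol=1e-5))  # True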

In [7]:  
   ...: model = Sequential() 
   ...: model.add(TimeDistributed(Dense(89), input_shape=(10, 16)))             

In [8]: model.summary()                                                         
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
time_distributed_2 (TimeDist (None, 10, 89)            1513      
=================================================================
Total params: 1,513
Trainable params: 1,513
Non-trainable params: 0
_________________________________________________________________

In [9]:  
   ...: model = Sequential() 
   ...: model.add(TimeDistributed(Dense(89), input_shape=(None, 16)))           

In [10]: model.summary()                                                        
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
time_distributed_3 (TimeDist (None, None, 89)          1513      
=================================================================
Total params: 1,513
Trainable params: 1,513
Non-trainable params: 0
_________________________________________________________________
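Note that declaring the time dimension as None (variable-length sequences) leaves the parameter count at 1,513: the same shared weights serve any number of time steps.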

In [11]:  
    ...: model = Sequential() 
    ...: model.add(TimeDistributed(Dense(89), input_shape=(None, 16, 150)))     

In [12]: model.summary()                                                        
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
time_distributed_4 (TimeDist (None, None, 16, 89)      13439     
=================================================================
Total params: 13,439
Trainable params: 13,439
Non-trainable params: 0
_________________________________________________________________

In [13]: 89 * 150 + 89                                                          
Out[13]: 13439
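In other words, with a 4D input the wrapped Dense still acts only on the last axis (150 here); the axis of size 16 is carried through unchanged, hence 150 * 89 + 89 = 13439.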

In [14]:  
    ...: model = Sequential() 
    ...: model.add(TimeDistributed(Dense(89), input_shape=(None, 16, 150, 100)))
    ...:                                                                        

In [15]: model.summary()                                                        
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
time_distributed_5 (TimeDist (None, None, 16, 150, 89) 8989      
=================================================================
Total params: 8,989
Trainable params: 8,989
Non-trainable params: 0
_________________________________________________________________
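The same rule yields the 8,989 above: the Dense sees only the final axis of size 100, so 100 * 89 + 89 = 8989, no matter how many axes precede it.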

In [16]: from keras.layers import Conv2D                                        

In [17]:  
    ...: model = Sequential() 
    ...: model.add(TimeDistributed(Conv2D(64, (3, 3)), input_shape=(None, 299, 299, 3)))                                           

In [18]: model.summary()                                                                                                           
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
time_distributed_6 (TimeDist (None, None, 297, 297, 64 1792      
=================================================================
Total params: 1,792
Trainable params: 1,792
Non-trainable params: 0
_________________________________________________________________

In [19]: 3* 3*3*64 + 64                                                                                                            
Out[19]: 1792

In [20]:  
    ...: model = Sequential() 
    ...: model.add(TimeDistributed(Conv2D(64, (3, 3)), input_shape=(None, 299, 299, 5)))                                           

In [21]: model.summary()                                                                                                           
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
time_distributed_7 (TimeDist (None, None, 297, 297, 64 2944      
=================================================================
Total params: 2,944
Trainable params: 2,944
Non-trainable params: 0
_________________________________________________________________

In [22]: 5*3*3*64 + 64                                                                                                             
Out[22]: 2944
  • As a layer wrapper: it lets you apply a layer to every time step of the input, with the parameters shared across all time steps.[1]
  • What it operates on: each time step. The number of time steps is unchanged before and after the operation; the time-step axis behaves, to some extent, like a second batch axis (call it the time step, or the time batch).[2]
# As the first layer in a model
# input shape: (None, 10, 16)
model = Sequential()
model.add(TimeDistributed(Dense(8), input_shape=(10, 16)))
# now model.output_shape == (None, 10, 8)

# input shape: (None, 10, 299, 299, 3)
model = Sequential()
model.add(TimeDistributed(Conv2D(64, (3, 3)),
                          input_shape=(10, 299, 299, 3)))
# now model.output_shape == (None, 10, 297, 297, 64)
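A sketch of the Conv2D case (random data standing in for a 10-frame video clip; not from the original post): the wrapper reuses one set of filters per frame, so a plain single-frame Conv2D given the same weights reproduces any one frame's output.

import numpy as np
from keras.models import Sequential
from keras.layers import TimeDistributed, Conv2D

td = Sequential()
td.add(TimeDistributed(Conv2D(64, (3, 3)), input_shape=(10, 299, 299, 3)))

clip = np.random.rand(1, 10, 299, 299, 3).astype("float32")  # one 10-frame clip
features = td.predict(clip)                                  # (1, 10, 297, 297, 64)

# A plain Conv2D given the SAME weights must match frame 0 exactly.
frame = Sequential()
frame.add(Conv2D(64, (3, 3), input_shape=(299, 299, 3)))
frame.set_weights(td.get_weights())
print(np.allclose(frame.predict(clip[:, 0]), features[:, 0], atol=1e-4))  # True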
  • Parameter counting: like an LSTM, this layer shares its parameters across the different time steps.[3]
In [21]: #coding:utf-8 
    ...: from keras.models import Input,Model 
    ...: from keras.layers import Dense,Conv2D,TimeDistributed 
    ...:   
    ...: input_ = Input(shape=(12,8)) 
    ...: out = TimeDistributed(Dense(units=10))(input_) 
    ...: #out = Dense(units=10)(input_) 
    ...: model = Model(inputs=input_,outputs=out) 
    ...: model.summary()                                                        
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 12, 8)             0         
_________________________________________________________________
time_distributed_6 (TimeDist (None, 12, 10)            90        
=================================================================
Total params: 90
Trainable params: 90
Non-trainable params: 0
_________________________________________________________________
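The 90 parameters are 8 * 10 weights plus 10 biases. Note that the commented-out line, a bare Dense(units=10) applied to the 3D input, would produce the same shapes and the same 90 parameters: in Keras, Dense already operates on the last axis only, so here TimeDistributed mainly makes the per-time-step intent explicit.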
In [24]: from keras.models import Input,Model 
    ...: from keras.layers import Dense,Conv2D,TimeDistributed 
    ...:   
    ...: input_ = Input(shape=(12,32,32,3)) 
    ...: out = TimeDistributed(Conv2D(filters=32,kernel_size=(3,3),padding='same'))(input_) 
    ...: model = Model(inputs=input_,outputs=out);model.summary()                                                                                                                                           
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_4 (InputLayer)         (None, 12, 32, 32, 3)     0         
_________________________________________________________________
time_distributed_9 (TimeDist (None, 12, 32, 32, 32)    896       
=================================================================
Total params: 896
Trainable params: 896
Non-trainable params: 0
_________________________________________________________________
In [31]: from keras.models import Input,Model 
    ...: from keras.layers import Dense,Conv2D,TimeDistributed 
    ...:   
    ...: input_ = Input(shape=(32,32,2)) 
    ...: out = Conv2D(filters=32,kernel_size=(3,3),padding='same')(input_) 
    ...: model = Model(inputs=input_,outputs=out);model.summary()                                                                                                                                           
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_11 (InputLayer)        (None, 32, 32, 2)         0         
_________________________________________________________________
conv2d_12 (Conv2D)           (None, 32, 32, 32)        608       
=================================================================
Total params: 608
Trainable params: 608
Non-trainable params: 0
_________________________________________________________________

$totalParameters = filters \times kernelSize_{width} \times kernelSize_{height} \times inputChannels + bias$

$32 \times 3 \times 3 \times 2 + 32 = 608$
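As a check, a small hypothetical helper (the function name is ours, not from the post) reproduces every Conv2D count shown above:

def conv2d_params(filters, kernel_h, kernel_w, in_channels):
    # one (kernel_h x kernel_w x in_channels) kernel per filter, plus one bias each
    return filters * kernel_h * kernel_w * in_channels + filters

print(conv2d_params(32, 3, 3, 2))  # 608  (In [31])
print(conv2d_params(32, 3, 3, 3))  # 896  (In [24])
print(conv2d_params(64, 3, 3, 3))  # 1792 (In [17])
print(conv2d_params(64, 3, 3, 5))  # 2944 (In [20])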

  • TimeDistributed achieves this trick by applying the same Dense layer (the same weights) to the LSTM output one time step at a time. That way, the output layer needs only one connection to each LSTM unit (plus one bias).[4]
    Compared with a plain fully connected layer, this saves a large number of parameters.
    For example:

With a fully connected layer, the weights cannot be shared across time steps, so:

shape                          operation           parameters
(None, timestep, input_dim)    input               0
(None, timestep * input_dim)   reshape             0
(None, output_dim)             Dense(output_dim)   timestep * input_dim * output_dim + output_dim

With TimeDistributed:

shape                          operation                            parameters
(None, timestep, input_dim)    input                                0
(None, timestep, output_dim)   TimeDistributed(Dense(output_dim))   input_dim * output_dim + output_dim

Comparing the two: the parameter count shrinks by roughly a factor of timestep, as the sketch below confirms.
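A concrete comparison for timestep = 10, input_dim = 16, output_dim = 8 (the Flatten-based model is our stand-in for the fully connected alternative):

from keras.models import Sequential
from keras.layers import Dense, Flatten, TimeDistributed

# Fully connected over the flattened sequence: no weight sharing in time.
fc = Sequential()
fc.add(Flatten(input_shape=(10, 16)))
fc.add(Dense(8))
print(fc.count_params())  # 10*16*8 + 8 = 1288

# One Dense shared across all 10 time steps.
td = Sequential()
td.add(TimeDistributed(Dense(8), input_shape=(10, 16)))
print(td.count_params())  # 16*8 + 8 = 136

(The two models' output shapes differ, (None, 8) versus (None, 10, 8); the comparison here is about the weight count.)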


[1] https://blog.csdn.net/LaoChengZier/article/details/88706642
[2] https://keras.io/zh/layers/wrappers/
[3] https://blog.csdn.net/zh_JNU/article/details/85160379
[4] https://machinelearningmastery.com/timedistributed-layer-for-long-short-term-memory-networks-in-python/
