11 classes: building, tree, sky, car, sign, road, pedestrian, fence, pole, sidewalk, bicycle
Time-step 3: feed a sequence of 3 frames to the spatio-temporal networks.
FCN: defined on an image grid Ω of size W × H
Output: of size
Every point (i, j) in the output is a descriptor of size m for a region in the frame I_t.
Total number of LSTMs:
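
A minimal sketch of how the pieces in these notes could fit together, assuming one LSTM per point of the FCN output grid and weights shared across points (both are assumptions of this sketch, not statements from the notes). The sizes T, H_OUT, W_OUT, M and HIDDEN, and the helpers per_point_sequences and lstm_step, are hypothetical names chosen for illustration. Random arrays stand in for the FCN output of each of the 3 frames.

```python
import numpy as np


def per_point_sequences(feature_maps):
    """Turn T feature maps of shape (H_out, W_out, m) into one sequence per grid point.

    Returns an array of shape (H_out * W_out, T, m); row k holds the length-T
    sequence of m-dimensional descriptors observed at grid point k.
    """
    stacked = np.stack(feature_maps, axis=0)           # (T, H_out, W_out, m)
    t, h, w, m = stacked.shape
    return stacked.reshape(t, h * w, m).transpose(1, 0, 2)


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def lstm_step(x, h_prev, c_prev, w_in, w_rec, bias):
    """One standard LSTM step, applied to all grid points at once (shared weights)."""
    z = x @ w_in + h_prev @ w_rec + bias               # (N, 4 * hidden)
    i, f, o, g = np.split(z, 4, axis=1)                # input, forget, output gates, candidate
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    return h, c


# Hypothetical sizes: 3 frames (time-step 3), a 4 x 5 output grid, descriptors of size m = 8.
T, H_OUT, W_OUT, M, HIDDEN = 3, 4, 5, 8, 6
rng = np.random.default_rng(0)
frames = [rng.standard_normal((H_OUT, W_OUT, M)) for _ in range(T)]  # stand-in FCN outputs

seqs = per_point_sequences(frames)                     # (H_OUT * W_OUT, T, M)

w_in = rng.standard_normal((M, 4 * HIDDEN)) * 0.1
w_rec = rng.standard_normal((HIDDEN, 4 * HIDDEN)) * 0.1
bias = np.zeros(4 * HIDDEN)

h = np.zeros((seqs.shape[0], HIDDEN))
c = np.zeros_like(h)
for t in range(T):                                     # run every per-point LSTM over the 3 frames
    h, c = lstm_step(seqs[:, t, :], h, c, w_in, w_rec, bias)

spatio_temporal = h.reshape(H_OUT, W_OUT, HIDDEN)      # back onto the output grid
```

Under these assumptions the number of per-point sequences, and therefore of LSTM instances, equals the number of points in the output grid (seqs.shape[0] in the sketch).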