How do we track objects over time?
For example, the more saturated the green of the car, the faster the car is moving toward the lower left (see the color map on the right for reference).
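To make the color coding concrete, here is a minimal sketch of how one motion vector could be turned into a color, with hue encoding direction and saturation encoding speed. The function name `flow_to_rgb`, the `max_speed` parameter, and the particular assignment of hues to directions are assumptions made for illustration; they will not necessarily match the color map in the figure.

```python
import colorsys
import math

def flow_to_rgb(dx, dy, max_speed):
    """Map one motion vector to RGB: hue = direction, saturation = speed.

    Hypothetical helper. Assumes image coordinates (x to the right, y downward),
    so "down-left" motion means dx < 0 and dy > 0.
    """
    angle = math.atan2(dy, dx)                      # direction in radians
    hue = (angle % (2 * math.pi)) / (2 * math.pi)   # wrap direction into [0, 1]
    speed = math.hypot(dx, dy)                      # length of the motion vector
    saturation = min(speed / max_speed, 1.0)        # faster motion -> more saturated
    return colorsys.hsv_to_rgb(hue, saturation, 1.0)

# A pixel that does not move comes out white (zero saturation), while two
# pixels moving in the same direction share a hue and differ only in saturation.
still = flow_to_rgb(0, 0, max_speed=10)
slow = flow_to_rgb(-1, 1, max_speed=10)
fast = flow_to_rgb(-5, 5, max_speed=10)
```

With this kind of mapping, direction is read from the hue and speed from how vivid the color is, which is the reading described for the car above.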
However, our perception of motion is often an illusion, for example:
If someone removes the color from the right frame, can we still tell what color the balloon would be? Yes, it is still red. We know this because we know it is the same object on the same track: there is a correspondence between the balloon pixels in the left frame and those in the right frame.
It turns out that this basic idea gives us a signal we can use for a machine learning problem: a model has to fill in the colors of a video, and by doing that, it learns to track automatically. The mechanism for tracking emerges inside the model.
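One way to see why filling in colors can lead to tracking is to imagine a model that colors each grayscale pixel by copying from the color frame. The sketch below assumes exactly that: each target pixel takes a softmax-weighted average of the reference colors, weighted by feature similarity. This copying mechanism, the function name `propagate_colors`, and the `temperature` parameter are assumptions made for illustration, not details given in the text, and the features here are toy vectors rather than anything learned.

```python
import numpy as np

def propagate_colors(ref_feats, tgt_feats, ref_colors, temperature=1.0):
    """Copy colors from a reference frame to a target frame (hypothetical sketch).

    ref_feats:  (N, D) features of the N reference-frame pixels
    tgt_feats:  (M, D) features of the M target-frame pixels
    ref_colors: (N, C) colors of the reference-frame pixels
    Returns the predicted target colors (M, C) and the weights (M, N).
    """
    sim = tgt_feats @ ref_feats.T / temperature    # similarity of every target/reference pair
    sim = sim - sim.max(axis=1, keepdims=True)     # subtract row max for numerical stability
    weights = np.exp(sim)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over reference pixels
    return weights @ ref_colors, weights

# Toy data: three reference pixels that reappear in a different order in the target frame.
ref_feats = 10.0 * np.eye(3)
ref_colors = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
order = [2, 0, 1]
tgt_feats = ref_feats[order]

pred_colors, weights = propagate_colors(ref_feats, tgt_feats, ref_colors)
matches = weights.argmax(axis=1)  # for each target pixel, the reference pixel it copies from
```

Under this assumption, getting the colors right requires pointing each pixel at its true counterpart in the color frame, so `matches` acts as a correspondence between frames even though only colors were ever supervised.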
We basically make this model watch a bunch of videos in which it sees one frame in color while the rest of the frames have no color (they are converted to grayscale). We then train the model to predict all of the colors for the grayscale frames. What you can then