The attention mechanism was first proposed to address a weakness of "Sequence to Sequence" models: every input token has the same influence on each target token, and long sequences easily lose long-term information.
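As a concrete sketch of the idea (a minimal NumPy illustration; the function and variable names are my own, not from any particular paper), scaled dot-product attention lets each target step compute its own weighted combination of the input states, instead of every input contributing equally:

```python
import numpy as np

def softmax(x, axis=-1):
    # subtract the max for numerical stability before exponentiating
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(queries, keys, values):
    """Scaled dot-product attention (illustrative sketch).

    Each query (target step) scores every key (input step), so
    different inputs can influence different target steps by
    different amounts -- unlike a single fixed context vector.
    """
    d = keys.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)  # (T_q, T_k) alignment scores
    weights = softmax(scores, axis=-1)      # each row sums to 1
    context = weights @ values              # one context vector per target step
    return context, weights

# toy example: 2 decoder steps attending over 4 encoder hidden states
rng = np.random.default_rng(0)
enc = rng.normal(size=(4, 8))  # encoder states act as both keys and values
dec = rng.normal(size=(2, 8))  # decoder queries
ctx, w = attention(dec, enc, enc)
```

Because the weights are recomputed per target step, a distant but relevant input can still receive a large weight, which is exactly what a single compressed context vector cannot do.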
How should one understand this loss of long-term information? Insight from knowledgeable readers would be much appreciated!