Intro
Existing work on automatic sleep staging can be categorized by the type of input signal fed to the network.
There are two main categories: the first processes 1-dimensional raw signals directly [15], [18], [20], [22], [24], [25], [26], and the second takes 2-dimensional time-frequency images as input [8], [9], [16], [17].
A time-frequency image is usually derived from a raw signal via a transformation, for example the short-time Fourier transform (STFT), and is generally considered a higher-level representation of the raw signal.
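To make the relation between the two input types concrete, the following is a minimal sketch of how a time-frequency image could be obtained from a raw signal with an STFT. It is an illustration only, not the preprocessing of any cited work: the function name `log_power_spectrogram`, the 100 Hz sampling rate, the 30-second epoch length, the 2-second Hamming window with 1-second hop, and the synthetic 10 Hz test tone are all assumptions.

```python
import numpy as np

def log_power_spectrogram(signal, fs, win_sec=2.0, hop_sec=1.0, eps=1e-10):
    """Return a (frames, freq_bins) log-power time-frequency image.

    All default parameters are illustrative assumptions.
    """
    signal = np.asarray(signal, dtype=float)
    win_len = int(round(win_sec * fs))   # samples per analysis window
    hop_len = int(round(hop_sec * fs))   # samples between window starts
    if len(signal) < win_len:
        raise ValueError("signal is shorter than one analysis window")

    window = np.hamming(win_len)
    n_frames = 1 + (len(signal) - win_len) // hop_len

    # Slice the 1-D signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop_len : i * hop_len + win_len] * window
        for i in range(n_frames)
    ])

    # One-sided FFT of each frame gives the short-time spectrum.
    spectrum = np.fft.rfft(frames, axis=1)

    # Log-power scaling; eps avoids log(0).
    return 10.0 * np.log10(np.abs(spectrum) ** 2 + eps)

# Synthetic stand-in for one raw epoch (assumed: 30 s at 100 Hz, 10 Hz tone).
fs = 100
t = np.arange(30 * fs) / fs
epoch = np.sin(2 * np.pi * 10.0 * t)

image = log_power_spectrogram(epoch, fs)             # 2-D input view
freqs = np.fft.rfftfreq(int(2.0 * fs), d=1.0 / fs)   # frequency axis in Hz
```

Under these assumptions, the 1-dimensional array `epoch` is what a raw-signal network would consume, while the 2-dimensional array `image` (time frames by frequency bins) is what a time-frequency network would consume.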
However, one cannot conclude that the raw input is better than the time-frequency one, because the performance of an automatic sleep staging system depends on many other factors, such as the amount of training data and the network architecture.
Rather, they should be considered as two different views regardi