STFT完美重建原始信号

叶夏隐

已于 2024-01-03 11:34:16 修改

阅读量540

点赞数

文章标签： python 开发语言

于 2024-01-03 10:44:14 首次发布

原文链接：https://www.zhihu.com/question/280648561/answer/2521849999

版权

在进行ISTFT时，先对处理完的每帧频域信号做傅里叶逆变换，而后对逆变换的结果加窗（和分帧时加的窗类型、窗长、overlap保持一致），最后对加窗后的每帧信号重叠相加再除以每帧窗函数的平方重叠相加的结果，即可重建原始信号。

推导如下：

在这里插入图片描述

补：
我常用的做法是：帧长取为帧移的两倍；在正变换之前，对每帧信号乘以正弦窗；在逆变换之后、重叠相加之前，对每帧信号再乘以正弦窗。这样做的好处是：正变换之前加了窗，可以抑制频谱上的旁瓣；逆变换之后、重叠相加之前加了窗，可以避免合成后信号中有突变；正弦窗乘以正弦窗恰好等于汉宁窗，而汉宁窗重叠一半相加恰好恒等于 1，所以如果在正变换之后不做任何处理直接逆变换合成，可以得到原始信号。

作者：王赟 Maigo
链接：https://www.zhihu.com/question/280648561/answer/416245412
来源：知乎
著作权归作者所有。商业转载请联系作者获得授权，非商业转载请注明出处。

STFT：

def stft_basic(x, w, H=8, only_positive_frequencies=False):
    """Compute a basic version of the discrete short-time Fourier transform (STFT)
    Args:
        x (np.ndarray): Signal to be transformed
        w (np.ndarray): Window function
        H (int): Hopsize
        only_positive_frequencies (bool): Return only positive frequency part of spectrum (non-invertible)
            (Default value = False)
    Returns:
        X (np.ndarray): The discrete short-time Fourier transform
    """
    N = len(w)
    L = len(x)
    M = np.floor((L - N) / H).astype(int) + 1
    X = np.zeros((N, M), dtype='complex')
    for m in range(M):
        x_win = x[m * H:m * H + N] * w
        X_win = np.fft.fft(x_win)
        X[:, m] = X_win

    if only_positive_frequencies:
        K = 1 + N // 2
        X = X[0:K, :]
    return X

ISTFT：

def istft_basic(X, w, H, L):
    """Compute the inverse of the basic discrete short-time Fourier transform (ISTFT)
    Args:
        X (np.ndarray): The discrete short-time Fourier transform
        w (np.ndarray): Window function
        H (int): Hopsize
        L (int): Length of time signal
    Returns:
        x (np.ndarray): Time signal
    """
    N = len(w)
    M = X.shape[1]
    x_win_sum = np.zeros(L)
    w_sum = np.zeros(L)
    for m in range(M):
        x_win = np.fft.ifft(X[:, m])
        x_win = np.real(x_win) * w
        x_win_sum[m * H:m * H + N] = x_win_sum[m * H:m * H + N] + x_win
        w_shifted = np.zeros(L)
        w_shifted[m * H:m * H + N] = w * w
        w_sum = w_sum + w_shifted
    # Avoid division by zero
    w_sum[w_sum == 0] = np.finfo(np.float32).eps
    x_rec = x_win_sum / w_sum
    return x_rec, x_win_sum, w_sum

验证：

L = 256
t = np.arange(L) / L
omega = 4
x = np.sin(2 * np.pi * omega * t * t)

N = 64
H = 3 * N // 8
w_type = 'hann'
w = scipy.signal.get_window(w_type, N)
X = stft_basic(x, w=w, H=H)
x_rec, x_win_sum, w_sum = istft_basic(X, w=w, H=H, L=L)

plt.figure(figsize=(8, 3))
plt.plot(x, color=[0, 0, 0], linewidth=4, label='Original signal')
plt.plot(x_rec, color=[1, 0, 0], linewidth=4, label='Reconstructed signal')
plt.xlim([0,L-1])
plt.legend(loc='lower left')
plt.show()

在这里插入图片描述
可以看出原始信号与重建信号是完美重合的。

Ref:

[1].Invertibility of overlap-add processing
[2]. D. Griffin and Jae Lim, “Signal estimation from modified short-time Fourier transform,” in IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 32, no. 2, pp. 236-243, April 1984, doi: 10.1109/TASSP.1984.1164317.

作者：Bear-boy
链接：https://www.zhihu.com/question/280648561/answer/2521849999
来源：知乎
著作权归作者所有。商业转载请联系作者获得授权，非商业转载请注明出处。