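An audio signal can be processed by obtaining its short-time Fourier transform (STFT) matrix with librosa.ifgram, modifying or shifting that matrix, and then applying the inverse transform (istft) to get the processed audio back. Note that librosa.ifgram exists only in older librosa releases; where it is unavailable, librosa.stft returns the same kind of matrix, without the frequencies array.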
import librosa
import numpy as np

y, sr = librosa.load(path)
frequencies, D = librosa.ifgram(y, sr=sr)
# Modify D here as needed.
y = librosa.istft(D)
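D is the STFT matrix: the x axis (columns, axis 1) is the sequence of time frames, the y axis (rows, axis 0) is the sequence of frequency bins, whose coordinates correspond to frequencies, and each value is a complex number whose magnitude is the amplitude.
Because D is a numpy.ndarray, it is easy to manipulate the matrix with the numpy library.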
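Since the entries of D are complex, an amplitude change is easiest to reason about after splitting D into magnitude and phase. The following is a minimal sketch on a hand-made toy matrix, not on real audio; the names `magnitude`, `phase` and `D_quiet` are illustrative assumptions, not part of the original code.

```python
import numpy as np

# Toy complex matrix standing in for D: 2 frequency bins x 3 time frames.
D = np.array([[3 + 4j, 1 + 0j, 0 + 2j],
              [1 - 1j, -2 + 0j, 0 + 0j]])

magnitude = np.abs(D)    # amplitude of each time-frequency bin
phase = np.angle(D)      # phase of each bin, in radians

# Halve the amplitude, keep the phase, and rebuild a complex matrix
# that could be passed on to the inverse transform.
D_quiet = (0.5 * magnitude) * np.exp(1j * phase)

print(magnitude[0])  # [5. 1. 2.]
```

Keeping the phase untouched means only the loudness of each bin changes; the rebuilt matrix here is simply `0.5 * D`.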
- Echo (repeat every frame twice)
D = np.repeat(D, 2, axis=1)
- Gaps (zero out every other frame)
D[:,::2] = 0
- Timbre (shift every frequency bin by 50 rows; np.roll wraps around at the edges)
D = np.roll(D, 50, axis=0)
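- Frequency compression (sum-pool neighbouring bins)
def _pool(D, poolsize):
    # Sum each run of `poolsize` adjacent columns into a single column.
    x = D.shape[1] // poolsize
    restsize = D.shape[1] % poolsize
    if restsize > 0:
        # Zero-pad on the right so the column count is a multiple of poolsize.
        x += 1
        rightlist = np.zeros([D.shape[0], poolsize - restsize])
        D = np.c_[D, rightlist]
    D = D.reshape((-1, poolsize))
    D = D.sum(axis=1).reshape(-1, x)
    return D

def rewardshape(D, shape):
    # Zero-pad D at the bottom and on the right until it reaches `shape`.
    x = shape[0] - D.shape[0]
    y = shape[1] - D.shape[1]
    if x > 0:
        bottomlist = np.zeros([x, D.shape[1]])
        D = np.r_[D, bottomlist]
    if y > 0:
        rightlist = np.zeros([D.shape[0], y])
        D = np.c_[D, rightlist]
    return D

def pool(D, size=(3, 3), shapeed=False):
    # size = (rows per block, columns per block).
    _shape = D.shape
    if size[1] > 1:
        D = _pool(D, size[1])
    if size[0] > 1:
        # Pool the rows by transposing, pooling columns, and transposing back.
        D = _pool(D.T, size[0]).T
    if shapeed:
        # Restore the original shape by zero-padding.
        D = rewardshape(D, _shape)
    return D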
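To see what each of the operations above does to the matrix, it can help to run them on a small toy array before touching real audio. This is a sketch under stated assumptions: the 4 x 6 real-valued matrix stands in for D, the shift is 1 row instead of 50, and `block_sum` is a hypothetical helper that reproduces the sum pooling only for the case where both dimensions divide evenly by the block size.

```python
import numpy as np

# Toy stand-in for an STFT matrix: 4 frequency bins (rows) x 6 frames (columns).
D = np.arange(24, dtype=float).reshape(4, 6)

# Echo: repeat every frame twice along the time axis (twice as many columns).
repeated = np.repeat(D, 2, axis=1)

# Gaps: zero every other frame, starting with frame 0 (on a copy).
gapped = D.copy()
gapped[:, ::2] = 0

# Timbre: shift the rows by one bin; np.roll wraps the last row to the top.
rolled = np.roll(D, 1, axis=0)

def block_sum(M, size):
    """Hypothetical helper: sum non-overlapping size[0] x size[1] blocks.

    Assumes both dimensions of M are exact multiples of the block size.
    """
    rows, cols = M.shape
    blocks = M.reshape(rows // size[0], size[0], cols // size[1], size[1])
    return blocks.sum(axis=(1, 3))

# Pooling: every 2 x 3 block collapses to the sum of its entries.
pooled = block_sum(D, (2, 3))

print(repeated.shape)  # (4, 12)
print(gapped[0])       # [0. 1. 0. 3. 0. 5.]
print(rolled[0])       # [18. 19. 20. 21. 22. 23.]
print(pooled)          # [[ 24.  42.] [ 96. 114.]]
```

Two things are worth noticing: repeating frames doubles the number of columns, so the resynthesised audio becomes twice as long, and the roll is circular, so rows pushed past one edge reappear at the other.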