This blog post records my understanding of the CQT and perceptual_weighting() functions in librosa.
1. CQT
def cqt(
    y,
    sr=22050,
    hop_length=512,
    fmin=None,
    n_bins=84,
    bins_per_octave=12,
    tuning=0.0,
    filter_scale=1,
    norm=1,
    sparsity=0.01,
    window="hann",
    scale=True,
    pad_mode="reflect",
    res_type=None,
    dtype=None,
):
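The function signature is shown above, where:
fmin: the lowest (starting) frequency; when left as None, librosa uses C1, about 32.70 Hz;
n_bins: the total number of frequency bins, counted upward from fmin; the default is 84;
bins_per_octave: the number of bins each octave is divided into, spaced geometrically; the default is 12.
So 84 / 12 = 7 octaves, bounded by 8 octave-edge frequencies (edges 0 through 7, counting the starting one).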
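To make the relationship between fmin, n_bins and bins_per_octave concrete, here is a minimal sketch that computes the bin centre frequencies by hand. It assumes the usual geometric spacing fmin * 2^(k / bins_per_octave) and rounds fmin to 32 Hz. The helper name cqt_bin_frequencies is made up for this post; librosa provides librosa.cqt_frequencies for the same purpose.

```python
import math


def cqt_bin_frequencies(fmin=32.0, n_bins=84, bins_per_octave=12):
    """Centre frequency of each bin, assuming geometric spacing from fmin."""
    return [fmin * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]


freqs = cqt_bin_frequencies()
n_octaves = math.ceil(84 / 12)  # 84 bins at 12 bins per octave

print(len(freqs), n_octaves)
print(freqs[0], freqs[12], freqs[-1])  # first bin, one octave up, last bin
```

Note that the last bin centre sits one bin below the top octave edge, because bin indices run from 0 to n_bins - 1.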
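So how is the highest frequency computed?
Take the lowest frequency as fmin = 32 Hz = $2^{5}$ (rounding the default of about 32.70 Hz).
Each octave doubles the frequency, so the 7 octaves are bounded by the following 8 edge frequencies, counting the starting one:
$2^{5}$ = 32 Hz,
$2^{6}$ = 64 Hz,
$2^{7}$ = 128 Hz,
$2^{8}$ = 256 Hz,
$2^{9}$ = 512 Hz,
$2^{10}$ = 1024 Hz,
$2^{11}$ = 2048 Hz,
$2^{12}$ = 4096 Hz.
So the highest frequency is $2^{12}$ = 4096 Hz (about 4186 Hz with the unrounded default fmin).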
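The doubling can be reproduced in a few lines. This is only a sketch: octave_edges is a hypothetical helper, and it assumes n_bins is a whole number of octaves, as it is with the defaults.

```python
def octave_edges(fmin, n_bins=84, bins_per_octave=12):
    """Edge frequency of every octave: fmin doubled once per octave."""
    n_octaves = n_bins // bins_per_octave
    return [fmin * 2 ** k for k in range(n_octaves + 1)]


edges = octave_edges(32)        # fmin rounded to 32 Hz, as in the text
edges_c1 = octave_edges(32.70)  # closer to librosa's default fmin (C1)

print(edges)
print(edges_c1[-1])
```

With fmin = 32 the top edge is 4096 Hz; with fmin = 32.70 it comes out near the 4186 Hz (C8) figure quoted in the next section.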
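1.1 Parameter settings
From the calculation above, the sampling rate must be more than twice the highest frequency covered by the transform.
Otherwise the sampling rate is too low for the chosen fmin and number of filters.
With the default parameters, a sampling rate below 4186 Hz x 2 = 8372 Hz produces the error shown in the following Stack Overflow question: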
How can I extract CQT from audio with sampling rate 8000 Hz (librosa)
I wrote following codes.
sound_clip, s = librosa.load(fn, sr=8000)
cqtpec = librosa.cqt(y=sound_clip, sr=s)
But there was an error.
librosa.util.exceptions.ParameterError: Filter pass-band lies beyond Nyquist
Use a lower n_bins or a lower fmin. With the default fmin of 32.7Hz (musical C1), n_bins = 84, and bins_per_octave = 12, the highest bin falls 7 octaves higher, at 4186Hz (C8), but with a sampling rate of 8000Hz you can only deal with frequencies up to 4000Hz, so if you keep fmin the same, n_bins needs to be no more than 83.
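The rule of thumb in the quoted answer can be sketched as a small calculation. It assumes the simplified criterion from that answer, namely that the top edge fmin * 2^(n_bins / bins_per_octave) must not exceed the Nyquist frequency sr / 2. librosa's own check is based on the filter pass-band, so its exact limit may differ; max_n_bins is a hypothetical helper, not a librosa function.

```python
import math


def max_n_bins(sr, fmin=32.70, bins_per_octave=12):
    """Largest n_bins whose top edge stays at or below Nyquist (simplified rule)."""
    nyquist = sr / 2.0
    return math.floor(bins_per_octave * math.log2(nyquist / fmin))


limit_8k = max_n_bins(8000)    # the 8000 Hz case from the question
limit_22k = max_n_bins(22050)  # librosa's default sampling rate

print(limit_8k, limit_22k)
```

For sr = 8000 this reproduces the limit of 83 bins given in the answer, so passing n_bins=83 to librosa.cqt (or raising the sampling rate) is the adjustment it suggests.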
1.2 Setting hop_length
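hop_length is the frame shift, in samples.
In the librosa version used here, hop_length must be a positive integer multiple of 2^(n_octaves - 1), where n_octaves is the number of octaves the transform covers.
For a 6-octave transform that means a multiple of 2^5 = 32. This happens to be the same number as fmin = 32 Hz = 2^5, but the requirement comes from the octave count, not from fmin.
Only then does librosa accept the call and return the expected number of frames.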
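For example, the following call uses hop_length=188 = 4 x 47, which contains only two factors of 2:
spect = librosa.cqt(waveform, sr=9000, hop_length=188, fmin=32, filter_scale=1)
It fails with:
librosa.util.exceptions.ParameterError: hop_length must be a positive integer multiple of 2^5 for 6-octave CQT/VQT
The two numbers in the message are filled in from n_octaves - 1 and n_octaves.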
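The rule stated in the error message can be checked before calling librosa. The following is a sketch of that rule only, not librosa's own code; all three helper names are made up for this post, and the check applies to librosa versions that enforce this constraint.

```python
def num_two_factors(x):
    """How many times x can be divided evenly by 2."""
    count = 0
    while x > 0 and x % 2 == 0:
        x //= 2
        count += 1
    return count


def is_valid_hop(hop_length, n_octaves):
    """True if hop_length is a positive multiple of 2^(n_octaves - 1)."""
    return hop_length > 0 and num_two_factors(hop_length) >= n_octaves - 1


def nearest_valid_hop(hop_length, n_octaves):
    """Round hop_length to the closest positive multiple of 2^(n_octaves - 1)."""
    step = 2 ** (n_octaves - 1)
    return max(step, round(hop_length / step) * step)


print(is_valid_hop(188, 6))       # the failing hop from the example
print(nearest_valid_hop(188, 6))  # closest multiple of 32
print(is_valid_hop(512, 7))       # librosa's default hop, 7 octaves
```

For the failing example, the closest valid hop for a 6-octave transform is 192 samples.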
References:
- https://blog.csdn.net/qq_44250700/article/details/119956311#t6
- https://stackoverflow.com/questions/43838718/how-can-i-extract-cqt-from-audio-with-sampling-rate-8000hz-librosa