信号处理琴声项目Signal Acquisition and Processing Project

最新推荐文章于 2022-03-18 09:56:31 发布

海底捞淡水鱼

最新推荐文章于 2022-03-18 09:56:31 发布

阅读量1.1k

点赞数

分类专栏：信号处理 MATLAB 文章标签：信号处理 matlab 算法傅立叶分析

本文链接：https://blog.csdn.net/qq_37030400/article/details/108248557

版权

信号处理同时被 2 个专栏收录

5 篇文章 1 订阅

订阅专栏

MATLAB

4 篇文章 0 订阅

订阅专栏

Project description

The project is structured in three parts:

Part I: generate a full scale consisting of 12 semitones

For the first part of your project, you are asked to generate a range of music, starting from note La3 at 440 Hz. To do that, you can address the following steps:
• Start from 𝑓0=440 𝐻𝑧 and multiply this frequency by r = 1.05946 to go to the next frequency (semitone),
• Generate the full scale made up of 12 semitones, each of duration equal to few hundreds of 𝑚𝑠, and listen to the signal,
• Up / down two ranges,
• Play two consecutive scales with both hands, …

Matlab plays music by the sound (Y, fs, bits) function. The three parameters of this function represent the input signal, sampling rate, and bit rate. Let me talk about the setting of the sampling rate fs, the sound range that the human ear can hear is 20 ~ 20000Hz. According to the sampling theorem, fs only needs to be greater than 40,000. The sampling rate is set here using the MP3 standard, that is, fs = 44.1k. Besides, the input signal Y, Y is generally a sine wave, such as A * sin (2 * pi * w * t). Among them, A controls the size of the sound, w controls the level of the sound, and the range of t controls the length of the sound. Therefore, in theory, any sound can be produced using this formula, but the tone and quality cannot be controlled. The default bit rate is sufficient, and this parameter is omitted.
Therefore, the standard sound la can be played using the following formula:

fs=44100;
t=0: 1/fs: 0.5;
la = sin(2*pi*440*t);
sound(la, fs)

Based on the above, we can get all the tones of the piano.

clc
clear
fs=44100;
t=0:1/fs:0.5;
c4_2=key(60, 2, fs);  %make 12 semitones
cup4_2=key(61, 2, fs);
d4_2=key(62, 2, fs);
dup4_2=key(63, 2, fs);
e4_2=key(64, 2, fs);
f4_2=key(65, 2, fs);
fup4_2=key(66, 2, fs);
g4_2=key(67, 2, fs);
gup4_2=key(68, 2, fs);
a4_2=key(69, 2, fs);
aup4_2=key(70, 2, fs);
b4_2=key(71, 2, fs);
part1=[c4_2 cup4_2]; % Make two consecutive scales into a group
part2=[d4_2 dup4_2];
part3=[e4_2 f4_2];
part4=[fup4_2 g4_2];
part5=[gup4_2 a4_2];
part6=[aup4_2 b4_2]
legend1=[part1];
sound(legend1,fs) % Play two consecutive scales

function g=key(p, n, fs)
t=0 : 1/fs : 2/n;
g=sin(2*pi* fre(p) *t);
end
function f = fre(p)
f=440*2^((p-69)/12);
end

Part II: transcript the music notes from some audio input recording

This part aims to transcript piano music at the basic level. That is, one note is played at a certain time. The piano audio samples provided come from a real source (piano recording), therefore the audio file itself is not perfectly sound in tune. Some of the problems needed to solve are overtone analysis, noise reduction and frequency separation.
Fast Fourier Transform (spectrogram) or wavelet analysis (scalogram) should be used for time-frequency analysis (a good explanation of these two analysis methods can be found at: http://mudasir.hubpages.com/hub/wavelets1).
For example, using Matlab, you will be able to create a spectrogram/scalogram of the audio signal.
HINT: you may need to perform low-pass filtering before the time-frequency analysis.
You will need to write algorithms for frequency quantization and de-noising, and finally plot the extracted notes against their duration. In Figure 5 you have an example of such transcription obtained for the sample audio file piano.wav.
Perform spectrum analysis (take ech1 as an example)

clear sound
[audio, fs] = audioread ('ech1.wav');% sound reading
audio = audio (:, 1);% dual channel to single channel
n = length (audio);
T = 1 / fs;% sampling interval
t = (0: n-1) * T;% time axis
f = (0: n-1) / n * fs;% frequency axis
% Fast Fourier Transform
audio_fft = fft (audio, n) * T;
% sound (audio, fs);
% Design IIR low-pass filter
rp = 1;
rs = 60;
Ft = fs;
Fp = 250;
Fs = 500;
wp = 2 * pi * Fp / Ft;
ws = 2 * pi * Fs / Ft;% find the boundary frequency of the analog filter to be designed
[N, wn] = buttord (wp, ws, rp, rs, 's');% low-pass filter order and cutoff frequency
[b, a] = butter (N, wn, 's'); %The parameters of the frequency response in the% S domain are: the transfer function of the filter
[bz, az] = bilinear (b, a, 0.5);% Using bilinear transformation to realize the transformation from frequency domain S domain to Z domain
figure (2);% low-pass filter characteristics
[h, w] = freqz (bz, az);
title ('IIR low-pass filter');
plot (w * fs / (2 * pi), abs (h));
grid;
% Filtering
z = filter (bz, az, s);
z_fft = fft (z);% filtered signal spectrum
figure (1);
% Draw the original audio time-domain wave
subplot (2,3,1);
plot (t, audio);
xlabel ('time / s');
ylabel ('amplitude');
title ('Initial signal waveform');
grid;
% Draw the original audio frequency domain spectrum
subplot (2,3,4);
audiof = abs (audio_fft);
plot (f (1: (n-1) / 2), audiof (1: (n-1) / 2));
title ('Initial signal spectrum');
xlabel ('frequency / Hz');
ylabel ('amplitude');
grid;
% Plot the filtered audio time-domain wave
subplot (2,3,3);
plot (t, z);
title ('Low-pass filtered signal waveform');
xlabel ('time / s');
ylabel ('amplitude');
grid;
% Plot the filtered audio frequency wave
subplot (2,3,6);
zf = abs (z_fft);
plot (f (1: (n-1) / 2), zf (1: (n-1) / 2));
title ('Spectrum of low-pass filtered signal');
xlabel ('frequency / Hz');
ylabel ('amplitude');
y = z;
% info = audioinfo ('ech1.wav');% get the audio format, number of bits, frequency, etc.
% sound (y, fs);
T = 1 / fs;% sampling time
t = (0: length (y) -1) * T;% time
f = (0: length (y) -1) * fs / length (y);
yz = y (:, 1);% left channel
n = length (yz);% points to transform
y1 = fft (yz, n);% Fourier transform n points to frequency domain
F = fs / length (yz);% spectral resolution, spectral interval
grid on
figure (2)
subplot (211);
Pyy = y1 (1: 1100 + 1);% according to the way of processing vector to get the range of the required sound signal
f = (1: 1100 + 1);
plot (f, 20 * log10 (abs (Pyy)))% energy spectrum
xlabel ('Frequency (Hz)')
ylabel ('Power (dB)')
subplot (212)
bar (f, abs (Pyy))% energy graph

LPF
在这里插入图片描述
The cutoff frequency is around 200 Hz.

The filtering effect is very obvious.

According to the way of processing vector to get the range of the required sound signal

Time-frequency analysis of audio after noise reduction (base on “piano.wav”)

clear sound
timestart = 0;
[audio, fs] = audioread ('ech7.wav');% sound reading
audio = audio (:, 1);% dual channel to single channel
n = length (audio);
T = 1 / fs;% sampling interval
t = (0: n-1) * T;% time axis
f = (0: n-1) / n * fs;% frequency axis
% Fast Fourier Transform
audio_fft = fft (audio, n) * T;
timeend = t;
wlen = 20480;% sets the window length. The longer the window, the worse the resolution, and the better the frequency resolution.
hop = 2000;% The step size of each translation, the minimum is 1. The smaller the image, the better the time accuracy, but the amount of calculation is large.
audio_fft = wkeep1 (audio_fft, n + 1 * wlen);% intermediate truncation
% Design IIR low-pass filter
rp = 1;
rs = 60;
Ft = fs;
Fp = 500;
Fs = 900;
wp = 2 * pi * Fp / Ft;
ws = 2 * pi * Fs / Ft;% find the boundary frequency of the analog filter to be designed
[N, wn] = buttord (wp, ws, rp, rs, 's');% low-pass filter order and cutoff frequency
[b, a] = butter (N, wn, 's'); %The parameters of the frequency response in the S domain are: the transfer function of the filter
[bz, az] = bilinear (b, a, 0.5);% Using bilinear transformation to realize the transformation from frequency domain S domain to Z domain
figure (2);% low-pass filter characteristics
[h, w] = freqz (bz, az);
title ('IIR low-pass filter');
plot (w * fs / (2 * pi), abs (h));
grid;

% In order to make the sound without high-frequency noise in the subsequent processing, I designed a high-pass filter and added it behind the low-pass filter.
% Design IIR high-pass filter
Fp = 150;
Fs1 = 240;
Ft = fs;
As = 100;
Ap = 1;
wp = 2 * pi * Fp / Ft;
ws = 2 * pi * Fs1 / Ft;
[n, wn] = ellipord (wp, ws, Ap, As, 's');
[b, a] = ellip (n, Ap, As, wn, 'high', 's');
[B, A] = bilinear (b, a, 1);
[h, w] = freqz (B, A);
figure (4);
plot (w * Ft / pi / 2, abs (h));
title ('IIR high-pass filter');
xlabel ('frequency');
ylabel ('amplitude');
grid on;
% Filtering
z = filter (bz, az, audio);
z = filter (B, A, z);
z_fft = fft (z);% filtered signal spectrum
audiowrite('ech7_filter.wav',z,fs);

sound (z, fs);% playback
% Do Short Time Fourier
h = hamming (wlen);% set the window length of Hamming window
f = 220: 1: 500;% set frequency scale
[tfr2, f, t2] = spectrogram (z, h, wlen-hop, f, fs, 'MinThreshold',-1000, 'reassigned', 'yaxis');% spectrum
axis xy;
tfr2 = tfr2 * 2 / wlen * 2;
figure
imagesc (t2 + timestart-wlen / fs / 2, f, abs (tfr2))% change of each frequency
shi = abs (tfr2);
clims = [50 100];
shi = shi.* 10000;
shi (shi <3800) = 0;
plot (shi');
imagesc (t2 + timestart-wlen / fs / 2, f, shi, clims);
axis xy;
set (gca, 'ytick', [262 277 294 311 330 349 370 392 415 440 466 494]);
set (gca, 'yticklabel', {'C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', ' A # ',' B '});% annotation

HPF
在这里插入图片描述
The cutoff frequency is around 650.

Time-frequency:
在这里插入图片描述

Energy curve with frequency varying with time (after multiplying 10000):
在这里插入图片描述
According to the curve, the maximum value of the energy at each frequency is obtained. Their maximum value is the frequency corresponding to the note.

Piano sheet music:
在这里插入图片描述
The yellow area in the figure is the note corresponding to the sound played by the piano keys. The size of the area reflects the strength of the sound and the range of overtones.
Compared with the shortcomings that the overtone can not be displayed in the example, this can express the characteristics of the piano sound more.

According to part1, you can get all the simulated keys sound.
Add a command on the basis of Part1 to convert the analog signal into a sound file and store it.

audiowrite('z.wav',legend1,fs);

Put this file in the Part2 program for analysis to get its sound score.
As shown in the figure, they can hardly make overtones, and after a delay, the sound will not change.

From C to B (simulation sound):
在这里插入图片描述
More examples:
ech25:

ech15:

ech7:
在这里插入图片描述

Part III: watermark the audio input with some text information

Once the notes are extracted, propose a method for inaudibly inserting a text message (for example, the name of the project developer) within the wav file. Therefore, in this part you are requested to perform the watermarking (either by LBS, or frequency-based) of your wav file. Finally, and if you still have time, estimate the SNR after the watermarking and propose some directions for improving it [1,2].

First, I wrote a program to measure the signal-to-noise ratio to check whether the SNR has changed after modifying the audio file. (Take ech7 as an example)

%Test program
[X, fs] = audioread ('ech7_filter.wav');
[Y, NOISE] = noisegen (X, 10);
mn = mean (NOISE);
snr = SNR_singlech (X, Y)

function snr = SNR_singlech (I, In)
% Calculate the signal to noise ratio function
% I: original signal
% In: noisy signal (ie. Original signal + noise signal)
snr = 0;
Ps = sum (sum ((I-mean (mean (I))). ^ 2));% signal power
Pn = sum (sum ((I-In). ^ 2));% noise power
snr = 10 * log10 (Ps / Pn);
end

function [Y, NOISE] = noisegen (X, SNR)
% noisegen add white Gaussian noise to a signal.
% [Y, NOISE] = NOISEGEN (X, SNR) adds white Gaussian NOISE to X. The SNR is in dB.
NOISE = randn (size (X));
NOISE = NOISE-mean (NOISE);
signal_power = 1 / length (X) * sum (X. * X);
noise_variance = signal_power / (10 ^ (SNR / 10));
NOISE = sqrt (noise_variance) / std (NOISE) * NOISE;
Y = X + NOISE;
end

Result:
在这里插入图片描述
In matlab, we can edit the header information of the file to add a watermark. This is actually a frequency-based method. The voice signal is decomposed into time-domain waveforms or frequency-domain graphics. For example, the spectrogram generated by Fourier transform. Restore them to the specified audio format, which can be mp3, wav and other audio formats, this is the reverse conversion process. Here, I still restore to wav file. Use the audio writing function to restore the processed sound frequency to sound and store it. We modify its header information will not affect the original sound signal.

[y,fs1]=audioread('ech7_filter.wav');
audioinfo('ech7_filter.wav')
audiowrite('ech7_filter_watermark2.wav',y,fs1,'BitsPerSample',16,'Comment','This is my new audio file.','Title','Final','Artist','zhouyu');
sound(y,fs1);
audioinfo('ech7_filter_watermark2.wav')

Result:
在这里插入图片描述
Test its signal-to-noise ratio:

Based on the results above, it seems that I changed the ‘comment’ ‘artist’ ‘title’ of the file, and neither the SNR nor ‘totalsamples’ of the sound changed. The effect met expectations and performed very well.

References

[1]https://blog.csdn.net/haoji007/article/details/81155482?utm_medium=distribute.pc_relevant.none-task-blog-baidujs-6
[2]https://www.mathworks.com/help/matlab/
[3]https://blog.csdn.net/qq_36836639/article/details/78662613?utm_medium=distribute.pc_relevant.none-task-blog-baidujs-3
[4] https://mathworld.wolfram.com/FourierTransform.html
[5]https://www.mathworks.com/help/aerotbx/ug/convmass.html
[6]https://blog.csdn.net/qq_33438733/article/details/79324763?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522159066430719725219960349%2522%252C%2522scm%2522%253A%252220140713.130102334.pc%255Fall.%2522%257D&request_id=159066430719725219960349&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2_allfirst_rank_ecpm_v3~pc_rank_v3-3-79324763.first_rank_ecpm_v3_pc_rank_v3&utm_term=matlab+%E7%94%A8LBS%E5%A4%84%E7%90%86%E5%A3%B0%E9%9F%B3