I am trying to process an audio file in Python using modules such as numpy and struct, but I am having a hard time detecting where silence occurs in the file. One of the methods I came across is to slide a window of a fixed time interval over the audio signal and record the sum of squared elements. I am new to Python and not familiar enough with it to implement this method.
Solution
If you are open to outside libraries, one quick way to do this is with pydub.
pydub has a module called silence with the functions detect_silence and detect_nonsilent, which may be useful in your case.
However, the only caveat is that the silence needs to be at least half a second long.
Below is a sample implementation that I tried on an audio file. However, since the silences in my file were shorter than half a second, only a few of the detected silent ranges were correct.
You may want to try this and see if it works for you by tweaking min_silence_len (in milliseconds) and silence_thresh (in dBFS).
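When tweaking silence_thresh, it can help to see what a dBFS value means in raw sample terms. The sketch below is plain Python rather than pydub code: the helper name dbfs_to_amplitude is made up for illustration, and it assumes signed PCM audio (16-bit by default, so full scale is 32768).

```python
def dbfs_to_amplitude(dbfs, sample_width_bytes=2):
    """Convert a dBFS level to a linear amplitude (hypothetical helper).

    Assumes signed PCM, where full scale is 2 ** (bits - 1).
    """
    full_scale = 2 ** (8 * sample_width_bytes - 1)  # 32768 for 16-bit audio
    return full_scale * 10 ** (dbfs / 20.0)


# Print the linear amplitude that a few candidate thresholds correspond to
for level in (-16, -30, -50):
    print(level, round(dbfs_to_amplitude(level)))
```

Read this way, -16 dBFS is a fairly high threshold, so quiet passages may be reported as silence; a lower value such as -40 should make the detection stricter.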
Program
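from pydub import AudioSegment, silence

# Load the WAV file to analyse
myaudio = AudioSegment.from_wav("a-z-vowels.wav")
# Find ranges of at least min_silence_len ms that are quieter than silence_thresh dBFS
silent_ranges = silence.detect_silence(myaudio, min_silence_len=1000, silence_thresh=-16)
# Convert milliseconds to whole seconds (floor division behaves the same on Python 2 and 3)
silent_ranges = [(start // 1000, stop // 1000) for start, stop in silent_ranges]
print(silent_ranges)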
Result
Python 2.7.9 (default, Dec 10 2014, 12:24:55) [MSC v.1500 32 bit (Intel)] on win32
[(0, 1), (1, 14), (14, 20), (19, 26), (26, 27), (28, 30), (29, 32), (32, 34), (33, 37), (37, 41), (42, 46), (46, 47), (48, 52)]
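For the sliding-window idea from the question, here is a minimal sketch using only the standard library. It is an illustration under stated assumptions, not what pydub does internally: the function name find_silence, the 50 ms window, and the RMS threshold of 500 are all arbitrary choices, and the window hops by its own length rather than sliding one sample at a time.

```python
import math


def find_silence(samples, sample_rate, window_sec=0.05, rms_thresh=500.0):
    """Return (start_sec, end_sec) ranges whose windowed RMS is below rms_thresh.

    samples is a sequence of integer PCM values. The window advances in
    non-overlapping steps of window_sec seconds.
    """
    win = max(1, int(window_sec * sample_rate))
    ranges = []
    start = None  # sample index where the current silent run began
    for i in range(0, len(samples), win):
        chunk = samples[i:i + win]
        energy = sum(s * s for s in chunk)  # sum of squared elements
        rms = math.sqrt(energy / float(len(chunk)))
        if rms < rms_thresh:
            if start is None:
                start = i
        elif start is not None:
            ranges.append((start / float(sample_rate), i / float(sample_rate)))
            start = None
    if start is not None:  # the signal ended while still silent
        ranges.append((start / float(sample_rate), len(samples) / float(sample_rate)))
    return ranges


# Synthetic test signal: 1 s of tone, 1 s of digital silence, 1 s of tone
rate = 1000
tone = [int(10000 * math.sin(2 * math.pi * 50 * n / rate)) for n in range(rate)]
signal = tone + [0] * rate + tone
print(find_silence(signal, rate))
```

For a real file, the samples could be read with the wave module and unpacked with struct.unpack or numpy.frombuffer; with numpy the per-window energy can also be computed without the Python-level loop.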