Some Basic Audio Features
---------------------------------------
Theodoros Giannakopoulos
http://www.di.uoa.gr/~tyiannak
---------------------------------------
Feature extraction is, as in most pattern recognition problems, arguably the most important step in audio classification tasks. The provided Matlab code computes some basic audio features for groups of sounds stored in WAV files. In addition, a simple class separability measure, based on feature histograms, quantifies how well each feature discriminates between the given classes. You can therefore use the provided m-files to compute the features for an audio classification problem (i.e. for specific audio classes) and to understand "how good" those features are for that classification task.
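The features are calculated in two steps: first, each feature is computed on successive short-term windows of the signal, which gives one sequence of values per feature; second, a single statistic is computed over each of those sequences.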
In particular, the following audio features and respective statistics are extracted for each audio segment:
Feature               Statistic
Energy Entropy        Standard Deviation (std)
Signal Energy         Std by Mean (average) Ratio
Zero Crossing Rate    Std
Spectral Rolloff      Std
Spectral Centroid     Std
Spectral Flux         Std by Mean Ratio
To compute the 6 feature statistics for a specific .wav file, use the function computeAllStatistics(fileName, win, step), where win and step are the short-term window length and the window step.
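The windowed computation behind these statistics can be sketched as follows. This is a minimal illustration on a synthetic signal, not the provided implementation: the sampling rate, the test signal, the values of win and step, the assumption that they are given in seconds, and the exact energy and zero crossing rate formulas are all assumptions made for the example. Only two of the six features are shown.

```matlab
% Synthetic input: half a second of a 440 Hz tone followed by silence (assumed).
fs = 16000;                          % sampling rate in Hz (assumed)
t  = (0:fs-1)' / fs;                 % one second of sample times
x  = sin(2*pi*440*t) .* (t < 0.5);

win  = 0.050;                        % window length, assumed to be in seconds
step = 0.025;                        % window step, assumed to be in seconds
winLen    = round(win * fs);         % window length in samples
stepLen   = round(step * fs);        % window step in samples
numFrames = floor((length(x) - winLen) / stepLen) + 1;

% Step 1: one feature value per short-term window.
energy = zeros(numFrames, 1);
zcr    = zeros(numFrames, 1);
for i = 1:numFrames
    frame     = x((i-1)*stepLen + (1:winLen));
    energy(i) = sum(frame .^ 2) / winLen;                     % mean signal energy
    zcr(i)    = sum(abs(diff(sign(frame)))) / (2 * winLen);   % zero crossing rate
end

% Step 2: one statistic per feature sequence.
energyStat = std(energy) / mean(energy);   % std by mean ratio
zcrStat    = std(zcr);                     % standard deviation
```

A signal whose character changes over time, as here, yields larger statistics than a stationary one, which is what makes such values useful for telling classes apart.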
After the features are calculated:
a) the histograms of each feature are estimated for all classes;
b) a simple algorithm estimates the separability of the audio classes, i.e. a measure of how "easily" the classes can be told apart using each feature. In a multi-class classification problem, the measure is calculated for each class against all other classes, so one value per class is computed. The algorithm is described in detail at http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=18791&objectType=FILE#.
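To give an idea of what a histogram-based, one-against-all separability value can look like, here is a small sketch. It is not the algorithm from the link above: it uses a plain histogram-overlap measure as a stand-in, and the feature values, the number of bins and all variable names are made up for the example.

```matlab
% Made-up values of one feature for the class of interest and for all other classes.
featA = [0.10 0.12 0.15 0.11 0.14 0.13];   % class of interest
featB = [0.30 0.28 0.14 0.35 0.33 0.31];   % all other classes pooled together

numBins  = 10;                             % assumed number of histogram bins
lo       = min([featA featB]);
hi       = max([featA featB]);
binWidth = (hi - lo) / numBins;

% Map each value to a bin index in 1..numBins (the maximum falls in the last bin).
binA = min(floor((featA - lo) / binWidth) + 1, numBins);
binB = min(floor((featB - lo) / binWidth) + 1, numBins);

% Normalised histograms: each sums to 1.
pA = accumarray(binA(:), 1, [numBins 1]) / numel(featA);
pB = accumarray(binB(:), 1, [numBins 1]) / numel(featB);

% Overlap of the two histograms: 0 means disjoint, 1 means identical.
overlap      = sum(min(pA, pB));
separability = 1 - overlap;                % higher means easier to separate
```

In a multi-class setting, this computation would be repeated once per class, each time pooling the remaining classes into featB.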
EXAMPLE:
The main function of this demo is computeFeaturesDirectory(). The only required argument is a cell array with the names of the directories in which the .wav files of the respective classes are stored. For example, suppose you have three folders named music, speech and noise, each containing wav files with the corresponding audio content (i.e. wav files of segments containing music, speech and noise). To compute the audio features of those files, simply write:
>> F = computeFeaturesDirectory({'music','speech','noise'});
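For orientation, the kind of per-directory loop such a call implies might look like the sketch below. It is not the shipped computeFeaturesDirectory(): the variable featuresPerClass, the win and step values, and the assumption that computeAllStatistics returns the six statistics as a vector are all hypothetical.

```matlab
classDirs = {'music', 'speech', 'noise'};   % one directory per class
win  = 0.050;                               % assumed window length
step = 0.025;                               % assumed window step

featuresPerClass = cell(1, numel(classDirs));
for c = 1:numel(classDirs)
    % List the wav files stored in the directory of class c.
    files = dir(fullfile(classDirs{c}, '*.wav'));
    featuresPerClass{c} = zeros(numel(files), 6);
    for k = 1:numel(files)
        fileName = fullfile(classDirs{c}, files(k).name);
        % Assumed to return the 6 statistics of one file as a vector.
        stats = computeAllStatistics(fileName, win, step);
        featuresPerClass{c}(k, :) = stats(:)';
    end
end
```

Each cell then holds one row of six statistics per file of the corresponding class, which is a convenient layout for building the per-class histograms.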