About Marsyas

Even though I have used this tool for a long time, my usage has been restricted to the command-line tools it provides, such as bextract. According to the famous paper, 19 commonly used feature sets can be extracted by Marsyas, which I do not think bextract can do. bextract can only extract the timbre-related information; the 6 rhythmic content features and the 5 pitch content features cannot be extracted with it...


After reading the discussions on the Marsyas mailing list, I got some ideas:

http://sourceforge.net/mailarchive/forum.php?thread_name=4DC2285E.20105%40cs.uvic.ca&forum_name=marsyas-users

http://sourceforge.net/mailarchive/forum.php?thread_name=4A707F45.5010404%40cs.uvic.ca&forum_name=marsyas-users

About the Marsyas 0.1 version:

http://sourceforge.net/mailarchive/forum.php?thread_name=4CFD5CD2.6040505%40cs.uvic.ca&forum_name=marsyas-users

About the pitch extractor:

http://sourceforge.net/mailarchive/forum.php?thread_name=AANLkTi%3Db1eefSFypEztuLaPY90Nq-wx3_JsYG6%3D722cs%40mail.gmail.com&forum_name=marsyas-users

Feature extraction on the whole file instead of using -sv (which only operates on 30 s):

http://sourceforge.net/mailarchive/forum.php?thread_name=4BCCD659.80905%40cs.uvic.ca&forum_name=marsyas-users


========================================================

To solve the problem stated above, I read many of the discussions on that forum and finally got the 0.1 version.

It really took me a lot of time to get the source code to compile and build (many errors appeared because of missing header files; they can be solved by googling the error messages).

Finally, when everything was done, I found that version 0.1 is too simple to be useful...

(1) The supported feature extractors:

 3.3 Supported features extractors

MARSYAS has been designed to provide a flexible architecture that makes writing and combining new features easy. I use the term feature and feature vector somewhat interchangeably and that reflects how MARSYAS is designed. Features can either be single numbers or vectors. Standard features that have been used in computer audition and are supported by MARSYAS are:

  1. FFT This set of features is based on the Short Time Fourier Transform (STFT). They have been designed for Music/Speech classification. More specifically they are the means and variances of the spectral centroid, rolloff, flux and zerocrossings calculated every 20 msec (512 window at 22050 Hz sampling rate). The means and variances are calculated over a 1 second window. In addition another feature called low energy is used resulting in a total of 9 (4 * 2 + 1) features.

    The centroid is the balancing point of the spectrum (the frequency where the energy of all frequencies below that frequency is equal to the energy of all frequencies above that frequency) and is a measure of brightness and general spectral shape. Another measure of spectral shape is the rolloff which is the 90 percentile of the power spectral distribution (Centroid would be the 50 percentile). This is a measure of the "skewness" of the spectral shape. Flux is the 2-norm of the difference between the magnitude of the Short Time Fourier Transform (STFT) spectrum evaluated at two successive sound frames. The STFT is normalized in energy. Low energy is the percentage of 20 msec windows that have energy less than the average energy of the 1 second window (speech tends to have more silent frames than music).

  2. FFT_SEGM This set of features is similar to the FFT set with the addition of the mean and variance of the RMS (root-mean-squared) energy of the signal resulting in a total of 10 (5 * 2) features. It is used for segmentation. RMS can not be used in classification because we want the classification decision to be invariant to loudness; however, for segmentation it can be a very useful source of information.

  3. MFCC Mel-Frequency cepstral coefficients are perceptually motivated features used in Speech Recognition research.

  4. LPC Linear-prediction coefficients are features used in Speech Recognition research. They are good for modelling voice signals.

  5. MPEG This is a set of features based on the filterbank used in the MPEG audio compression standard.

  6. SFX Set of features used for classification and clustering of sound effects.

  7. INSTR Set of features used for classification and clustering of isolated musical instrument tones.

  8. DWTC Set of features calculated using the Discrete Wavelet Transform.

  9. BEAT Set of features for representing the beat structure of music calculated using a beat detection algorithm based on the Discrete Wavelet Transform.

  10. SVFFT Single vector version of the FFT feature extractor. Basically the mean across the whole file of the feature vectors (the mean of all the rows of the Feature Matrix).

  11. MPITCH Set of features for representing harmonic content based on a multiple pitch analysis algorithm.

  12. GENRE Big set of features for representing musical genre. Single feature vector for the whole file. Consists of SVFFT, SVMFCC, MPITCH and BEAT appended together.
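
To make the FFT feature set in item 1 more concrete, here is a minimal Python sketch of the per-frame measures and the final 9-dimensional vector. This is not Marsyas code: all function names are hypothetical, the inputs are assumed to be magnitude spectra and per-frame series computed elsewhere, and the centroid is computed as the usual magnitude-weighted mean bin (center of gravity), which is my assumption about what "balancing point" means.

```python
import math

def spectral_centroid(mag):
    """Magnitude-weighted mean bin index (center of gravity; an assumption)."""
    total = sum(mag)
    if total == 0:
        return 0.0
    return sum(k * m for k, m in enumerate(mag)) / total

def spectral_rolloff(mag, fraction=0.9):
    """Smallest bin index at which `fraction` of the total power is reached."""
    power = [m * m for m in mag]
    target = fraction * sum(power)
    running = 0.0
    for k, p in enumerate(power):
        running += p
        if running >= target:
            return k
    return len(mag) - 1

def spectral_flux(mag_prev, mag_cur):
    """2-norm of the difference between two energy-normalized spectra."""
    def normalize(mag):
        norm = math.sqrt(sum(m * m for m in mag))
        return [m / norm for m in mag] if norm > 0 else list(mag)
    a, b = normalize(mag_prev), normalize(mag_cur)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def zero_crossings(frame):
    """Number of sign changes between consecutive time-domain samples."""
    return sum(1 for x, y in zip(frame, frame[1:]) if (x >= 0) != (y >= 0))

def low_energy(frame_energies):
    """Fraction of frames whose energy is below the window average."""
    avg = sum(frame_energies) / len(frame_energies)
    return sum(1 for e in frame_energies if e < avg) / len(frame_energies)

def mean_and_variance(values):
    """Mean and population variance of one per-frame series."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var

def fft_feature_vector(centroids, rolloffs, fluxes, zcrs, frame_energies):
    """9 features: mean and variance of four series, plus low energy."""
    features = []
    for series in (centroids, rolloffs, fluxes, zcrs):
        features.extend(mean_and_variance(series))
    features.append(low_energy(frame_energies))
    return features
```

Each series passed to `fft_feature_vector` would hold one value per 20 msec frame inside a 1 second window. Averaging the resulting vectors over the whole file would give something like the single-vector SVFFT variant described in item 10.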
(2) The output file format is not ARFF at all...


So I decided to give up on version 0.1.


Actually, version 0.4.1 should be reinstalled. To extract the beat histogram features, just append -bf to the bextract command. As for the pitch content features, there is still no way to produce them...

