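A Question About the Spectral Envelope

Many of the articles about MFCC cite the same foreign slide deck (PPT).

The slides are available here.

I have long had some doubts about how they explain the Spectral Envelope.

Question

Why is the following assumption made?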

Spectrum = Spectral Envelope * Spectral Details

All of the later processing depends on this formula, yet it is exactly the part I am curious about.
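One way to see why the factorization is convenient, sketched under the usual assumption that it applies to magnitude spectra. The symbols $X$, $H$ and $E$ (spectrum, envelope, details) are introduced here for illustration and are not taken from the slides. A logarithm turns the product into a sum, and any further linear transform, such as a DFT or inverse DFT, keeps the two terms separate:

```latex
\begin{aligned}
|X(k)| &= |H(k)| \cdot |E(k)| \\
\log |X(k)| &= \log |H(k)| + \log |E(k)| \\
\mathrm{IDFT}\{\log |X(k)|\} &= \mathrm{IDFT}\{\log |H(k)|\} + \mathrm{IDFT}\{\log |E(k)|\}
\end{aligned}
```

If $|H(k)|$ varies slowly with $k$ and $|E(k)|$ varies quickly, their transforms would be expected to occupy different index ranges of the result, which is what makes them separable. This only shows that the assumption is useful, not that it holds for any particular signal.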

To look into it, I ran a few simple experiments.

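Experiment

Step 1. Find an audio clip with a single sound source.

Step 2. Apply a DFT to the audio to get frequency-domain data.

Step 3. Apply a DFT again to the frequency-domain data.
If the Spectral Envelope and Spectral Details really exist, they should show up in this result.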

Pseudocode for printing the result:

for (int i = 0; i < N; ++i) {
    // m1[i][0] and m1[i][1] are the real and imaginary parts of bin i
    double realVal = m1[i][0] / N;
    double imagVal = m1[i][1] / N;
    double powVal  = 2 * (realVal * realVal + imagVal * imagVal);
    double absVal  = sqrt(powVal / 2) * 2;
    // Print only the bins whose absVal exceeds 1.25
    if (absVal > 1.25) {
        fprintf(stdout, "%10i (%10.4lf %10.4lf) %10.4lf %10.4lf\n", i,
                realVal, imagVal, absVal, powVal);
    }
}

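Because the sample audio has low energy, the pow and abs values are both small, so the output is filtered with a threshold of 1.25 on the abs value.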
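The relation between the two printed columns follows directly from the loop. Writing $a$ for `realVal` and $b$ for `imagVal` (symbols introduced here only for illustration):

```latex
\begin{aligned}
\text{powVal} &= 2\,(a^2 + b^2) \\
\text{absVal} &= 2\sqrt{\text{powVal}/2} = 2\sqrt{a^2 + b^2} \\
\text{absVal}^2 &= 2\,\text{powVal}
\end{aligned}
```

So a threshold of 1.25 on `absVal` is the same as keeping the bins with `powVal` above $1.25^2 / 2 \approx 0.78$. As a check against the output, bin 1605 has $1.6187^2 \approx 2.6202 = 2 \times 1.3101$.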

The Step 3 output for one example frame:

 Frequency  (Real       Imag)        Abs       Power
      1605 (    0.5469     0.5966)     1.6187     1.3101
      1607 (   -0.3830    -0.6633)     1.5319     1.1734
      1608 (   -0.8168    -0.8465)     2.3527     2.7676
      1609 (   -0.6892    -0.2346)     1.4560     1.0600
      1610 (    0.3351     0.8297)     1.7896     1.6013
      1611 (    1.0922     0.9707)     2.9224     4.2701
      1614 (   -0.6581    -0.6849)     1.8997     1.8045
      1616 (    0.2837     0.6177)     1.3595     0.9242
      1617 (    0.6710     0.4794)     1.6494     1.3602
      1620 (   -0.4764    -0.4920)     1.3697     0.9381
      1622 (    0.7372     0.9301)     2.3736     2.8170
      1623 (    0.8836     0.5938)     2.1291     2.2666
      1625 (   -0.8374    -1.0777)     2.7296     3.7254
      1626 (   -1.1240    -0.8214)     2.7843     3.8762
      1628 (    0.8786     1.1128)     2.8357     4.0205
      1629 (    0.9656     0.6244)     2.2998     2.6446
      1631 (   -0.5870    -0.7584)     1.9180     1.8394
      1632 (   -0.7730    -0.5451)     1.8917     1.7893
      1634 (    0.6053     0.6120)     1.7215     1.4818
      1637 (   -0.6775    -0.7938)     2.0872     2.1782
      1638 (   -0.6324    -0.1233)     1.2886     0.8303
      1639 (    0.4665     0.8667)     1.9684     1.9374
      1640 (    1.0342     0.9270)     2.7777     3.8579
      1642 (   -0.4208    -0.8041)     1.8152     1.6474
      1643 (   -1.0795    -1.0763)     3.0488     4.6476
      1644 (   -0.7512    -0.2024)     1.5560     1.2106
      1645 (    0.4262     0.8774)     1.9509     1.9030
      1646 (    0.9605     0.8118)     2.5152     3.1630
      1649 (   -0.7451    -0.6711)     2.0055     2.0111
      1651 (    0.4292     0.5677)     1.4235     1.0131
      1654 (   -0.4299    -0.5846)     1.4513     1.0531
      1656 (    0.3248     0.7072)     1.5564     1.2112
      1657 (    0.8936     0.8522)     2.4697     3.0496
      1658 (    0.6013     0.2079)     1.2724     0.8095
      1660 (   -0.9155    -1.0261)     2.7503     3.7821
      1661 (   -0.8391    -0.4314)     1.8870     1.7803
      1663 (    0.7710     0.7510)     2.1526     2.3169
      1664 (    0.6116     0.3559)     1.4153     1.0015
      1666 (   -0.6445    -0.7511)     1.9795     1.9592
      1671 (   -0.4366    -0.7228)     1.6888     1.4261
      1672 (   -0.8308    -0.5387)     1.9803     1.9609
      1674 (    0.8466     0.9201)     2.5007     3.1268
      1675 (    0.7521     0.3795)     1.6849     1.4195
      1677 (   -0.8065    -0.9729)     2.5274     3.1938
      1678 (   -0.8567    -0.4725)     1.9567     1.9143
      1679 (    0.1235     0.6154)     1.2554     0.7880
      1680 (    0.7764     0.7284)     2.1292     2.2667
      1681 (    0.5659     0.3056)     1.2863     0.8272
      1683 (   -0.6042    -0.7033)     1.8545     1.7196
      1685 (    0.3171     0.5617)     1.2900     0.8321
      1686 (    0.6701     0.5375)     1.7181     1.4759
      1689 (   -0.7254    -0.5887)     1.8685     1.7456
      1691 (    0.7279     0.9073)     2.3265     2.7062
      1692 (    0.8275     0.5309)     1.9662     1.9330
      1694 (   -0.6795    -0.9210)     2.2890     2.6198
      1695 (   -0.9242    -0.6791)     2.2937     2.6306
      1697 (    0.7296     0.7948)     2.1578     2.3280
      1698 (    0.6813     0.4235)     1.6044     1.2870
      1700 (   -0.5638    -0.7306)     1.8457     1.7033
      1701 (   -0.6597    -0.3087)     1.4568     1.0611
      1703 (    0.5065     0.4439)     1.3470     0.9072
      1706 (   -0.7258    -0.6583)     1.9597     1.9203
      1708 (    0.4901     0.7264)     1.7525     1.5356
      1709 (    0.7369     0.5527)     1.8423     1.6971
      1711 (   -0.4683    -0.7648)     1.7936     1.6085
      1712 (   -0.8818    -0.7439)     2.3074     2.6620
      1714 (    0.6359     0.7758)     2.0063     2.0126
      1715 (    0.7287     0.5066)     1.7751     1.5755
      1717 (   -0.4906    -0.7183)     1.7397     1.5134
      1718 (   -0.6872    -0.3352)     1.5291     1.1691
      1720 (    0.6665     0.6527)     1.8656     1.7403
      1723 (   -0.6042    -0.6591)     1.7882     1.5988
      1725 (    0.3617     0.6106)     1.4193     1.0072
      1726 (    0.6514     0.5098)     1.6543     1.3684
      1728 (   -0.3398    -0.6308)     1.4331     1.0269
      1729 (   -0.8178    -0.7883)     2.2718     2.5805
      1731 (    0.5277     0.7073)     1.7648     1.5573
      1732 (    0.7133     0.5763)     1.8341     1.6819
      1734 (   -0.3499    -0.6897)     1.5467     1.1962
      1735 (   -0.8273    -0.6361)     2.0872     2.1782
      1737 (    0.4145     0.5149)     1.3220     0.8738
      1740 (   -0.5147    -0.6539)     1.6643     1.3849
      1742 (    0.2933     0.6117)     1.3568     0.9204
      1743 (    0.7094     0.5838)     1.8374     1.6879
      1745 (   -0.3245    -0.5805)     1.3300     0.8845
      1746 (   -0.7712    -0.7880)     2.2051     2.4313
      1748 (    0.4123     0.6631)     1.5617     1.2194
      1749 (    0.7214     0.6162)     1.8976     1.8004
      1752 (   -0.7097    -0.6012)     1.8603     1.7303
      1754 (    0.4279     0.6011)     1.4757     1.0888
      1755 (    0.5290     0.4104)     1.3390     0.8965
      1757 (   -0.3994    -0.6132)     1.4636     1.0711
      1758 (   -0.5697    -0.2795)     1.2691     0.8053
      1760 (    0.6547     0.6175)     1.7999     1.6198
      1763 (   -0.7711    -0.8903)     2.3557     2.7747
      1764 (   -0.6615    -0.1941)     1.3787     0.9504
      1765 (    0.2597     0.5811)     1.2731     0.8104
      1766 (    0.7059     0.6787)     1.9586     1.9181
      1769 (   -0.7353    -0.6805)     2.0038     2.0076
      1774 (   -0.3456    -0.5799)     1.3501     0.9114
      1775 (   -0.5955    -0.3409)     1.3724     0.9418
      1777 (    0.6860     0.7067)     1.9697     1.9399
      1778 (    0.6015     0.3305)     1.3726     0.9420
      1780 (   -0.7709    -0.9791)     2.4923     3.1058
      1781 (   -0.8193    -0.3706)     1.7984     1.6171
      1783 (    0.7863     0.7897)     2.2287     2.4836
      1784 (    0.6386     0.3798)     1.4859     1.1040
      1786 (   -0.6527    -0.6889)     1.8979     1.8011
      1794 (    0.6092     0.6963)     1.8502     1.7117
      1795 (    0.6403     0.4078)     1.5183     1.1527
      1797 (   -0.6073    -0.9037)     2.1776     2.3709
      1798 (   -0.8947    -0.5811)     2.1336     2.2761
      1800 (    0.6402     0.7746)     2.0097     2.0195
      1801 (    0.7210     0.5212)     1.7793     1.5829
      1803 (   -0.5467    -0.6950)     1.7685     1.5638
      1804 (   -0.6232    -0.3690)     1.4485     1.0490
      1809 (   -0.6457    -0.5318)     1.6731     1.3996
      1811 (    0.4876     0.6466)     1.6196     1.3115
      1812 (    0.6688     0.4906)     1.6589     1.3759
      1814 (   -0.4376    -0.7728)     1.7762     1.5774
      1815 (   -0.8517    -0.6448)     2.1364     2.2822
      1817 (    0.4681     0.6593)     1.6172     1.3076
      1818 (    0.6824     0.5767)     1.7870     1.5966
      1820 (   -0.3808    -0.5756)     1.3804     0.9527
      1821 (   -0.5773    -0.3722)     1.3739     0.9438
      1826 (   -0.5235    -0.5277)     1.4866     1.1050
      1828 (    0.3961     0.5850)     1.4130     0.9982
      1829 (    0.6447     0.5126)     1.6473     1.3568
      1831 (   -0.3013    -0.6583)     1.4480     1.0484
      1832 (   -0.8076    -0.6817)     2.1138     2.2341
      1834 (    0.3556     0.5394)     1.2922     0.8349
      1835 (    0.5755     0.5113)     1.5396     1.1851
      1838 (   -0.5621    -0.4479)     1.4374     1.0330
      1843 (   -0.5577    -0.6293)     1.6817     1.4140
      1846 (    0.6783     0.5931)     1.8021     1.6238
      1849 (   -0.7589    -0.7348)     2.1127     2.2317
      1852 (    0.5611     0.4844)     1.4826     1.0990
      1855 (   -0.5268    -0.4049)     1.3287     0.8828
      1857 (    0.4479     0.5029)     1.3469     0.9071
      1860 (   -0.4040    -0.5545)     1.3723     0.9415
      1863 (    0.6084     0.5940)     1.7006     1.4461
      1866 (   -0.7135    -0.7572)     2.0808     2.1648
      1869 (    0.6235     0.5931)     1.7211     1.4810
      1872 (   -0.6399    -0.5927)     1.7444     1.5214
      1877 (   -0.3922    -0.5688)     1.3818     0.9547
      1880 (    0.5493     0.5693)     1.5822     1.2517
      1883 (   -0.6106    -0.7427)     1.9230     1.8490
      1884 (   -0.6148    -0.3051)     1.3727     0.9422
      1886 (    0.5917     0.6357)     1.7368     1.5083
      1887 (    0.5719     0.3247)     1.3153     0.8650
      1889 (   -0.5916    -0.6428)     1.7472     1.5264
      1892 (    0.5014     0.4698)     1.3743     0.9443
      1895 (   -0.5455    -0.3918)     1.3432     0.9021
      1897 (    0.4741     0.5314)     1.4243     1.0144
      1900 (   -0.4978    -0.6858)     1.6949     1.4364
      1901 (   -0.6572    -0.4270)     1.5675     1.2285
      1903 (    0.4999     0.6011)     1.5636     1.2224
      1904 (    0.6008     0.4289)     1.4763     1.0898
      1906 (   -0.4765    -0.6516)     1.6144     1.3032
      1907 (   -0.5718    -0.3313)     1.3216     0.8733
      1912 (   -0.5328    -0.4542)     1.4002     0.9803
      1914 (    0.4262     0.5720)     1.4267     1.0177
      1915 (    0.5523     0.4458)     1.4195     1.0075
      1917 (   -0.4583    -0.7025)     1.6776     1.4071
      1918 (   -0.7217    -0.5466)     1.8107     1.6393
      1920 (    0.4318     0.5975)     1.4744     1.0869
      1921 (    0.6397     0.4978)     1.6212     1.3142
      1932 (    0.4983     0.4477)     1.3398     0.8975
      1934 (   -0.3503    -0.6434)     1.4650     1.0732
      1935 (   -0.7401    -0.6257)     1.9382     1.8784
      1937 (    0.3682     0.5738)     1.3636     0.9297
      1938 (    0.6561     0.5568)     1.7210     1.4809
      1941 (   -0.5630    -0.4198)     1.4046     0.9864
      1949 (    0.5201     0.4733)     1.4065     0.9892
      1951 (   -0.2489    -0.5846)     1.2707     0.8074
      1952 (   -0.7649    -0.7377)     2.1253     2.2584
      1955 (    0.6854     0.6165)     1.8438     1.6997
      1958 (   -0.5426    -0.4192)     1.3713     0.9403
      1966 (    0.5802     0.5520)     1.6017     1.2826
      1969 (   -0.6955    -0.7515)     2.0479     2.0969
      1972 (    0.6026     0.6063)     1.7097     1.4615
      1975 (   -0.5296    -0.4935)     1.4477     1.0479
      1980 (   -0.4691    -0.5739)     1.4825     1.0989
      1983 (    0.5817     0.6276)     1.7116     1.4647
      1984 (    0.5541     0.3526)     1.3135     0.8627
      1986 (   -0.6485    -0.7723)     2.0169     2.0338
      1987 (   -0.6461    -0.3541)     1.4735     1.0856
      1989 (    0.5044     0.5461)     1.4868     1.1054
      1992 (   -0.4451    -0.4639)     1.2859     0.8267
      1998 (   -0.5347    -0.3992)     1.3346     0.8906
      2000 (    0.5005     0.6019)     1.5656     1.2255
      2001 (    0.5902     0.4642)     1.5018     1.1276
      2003 (   -0.5160    -0.7321)     1.7913     1.6043
      2004 (   -0.6948    -0.4559)     1.6621     1.3813
      2006 (    0.4467     0.5232)     1.3758     0.9464
      2007 (    0.5014     0.3821)     1.2608     0.7948
      2009 (   -0.3997    -0.4930)     1.2694     0.8057
      2015 (   -0.5653    -0.5043)     1.5151     1.1478
      2018 (    0.5651     0.5139)     1.5276     1.1668
      2020 (   -0.3460    -0.6168)     1.4144     1.0002
      2021 (   -0.6781    -0.5294)     1.7206     1.4802
      2023 (    0.3921     0.5192)     1.3013     0.8467
      2024 (    0.5370     0.4306)     1.3767     0.9476
      2035 (    0.4505     0.4546)     1.2799     0.8191
      2038 (   -0.6117    -0.5310)     1.6200     1.3123
      2041 (    0.5376     0.4939)     1.4600     1.0658
      2044 (   -0.5229    -0.3653)     1.2757     0.8137

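The energy turns out to be concentrated mainly in the "high-frequency" part, and most frames that contain sound look much the same.

It looks a bit as if the "high-frequency" part is the spectral details and the "low-frequency" part is the spectral envelope.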
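If that reading holds, the two parts could be separated simply by index. The sketch below shows one way to do so. The function name `split_by_index` and the `cutoff` parameter are hypothetical, and choosing a sensible cutoff is left open. One caveat applies: in a length-N DFT, index N-k mirrors index k, so the sketch measures distance from index 0 modulo N. Whether the indices printed above count as "high" in that sense depends on N, which the post does not state.

```c
#include <stddef.h>

/* Split c[0..n-1] into a low-index part and a high-index part so that
 * low[i] + high[i] == c[i] for every i.
 * An index counts as "low" when its distance from 0, taken modulo n
 * (so index n-k is treated like index k), is smaller than cutoff. */
void split_by_index(const double *c, size_t n, size_t cutoff,
                    double *low, double *high)
{
    for (size_t i = 0; i < n; ++i) {
        size_t mirrored = n - i;                       /* mirror image of index i */
        size_t dist = (i <= mirrored) ? i : mirrored;  /* distance from index 0 */
        if (dist < cutoff) {
            low[i]  = c[i];
            high[i] = 0.0;
        } else {
            low[i]  = 0.0;
            high[i] = c[i];
        }
    }
}
```

With n = 8 and cutoff = 2, indices 0, 1 and 7 go to `low` and indices 2 through 6 go to `high`. Under the post's interpretation, `low` would be the candidate for the spectral envelope and `high` the candidate for the spectral details.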
