How to implement linear frequency modulation with DDS
The sound of birdsong is varied, beautiful, and relaxing. In the pre-Covid times, I made a focus timer which would play some recorded bird sounds during breaks, and I always wondered whether such sounds could be generated. After some trial and error, I landed on a proof-of-concept architecture which can both successfully reproduce a single chirp and has parameters which can be adjusted to alter the generated sound.
Since generating bird sounds seems like a somewhat novel application, I think it is worth sharing this approach. Along the way, I also learned how to take TensorFlow models apart and graft parts of them together. The code blocks below show how this is done. The full code can be found here.
The approach in theory
The generator will be composed of two parts. The first part will take the entire sound and encode key pieces of information about its overall shape in a small number of parameters.
The second part will take a small bit of sound, along with the information about the overall shape, and predict the next little bit of sound.
The second part can be called iteratively on itself with adjusted parameters to produce an entirely new chirp!
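To make the iterative idea concrete, here is a minimal sketch in plain NumPy. It is not the model from this post: `predict_next` is a hypothetical stand-in for the trained second part (a damped-sinusoid recurrence rather than a neural network), and `WINDOW`, `STEP`, and the two shape parameters are assumptions chosen for illustration. Only the loop structure, feeding the most recent sound plus the shape parameters back into the predictor, reflects the approach described above.

```python
import numpy as np

WINDOW = 32  # assumed: number of recent samples shown to the predictor
STEP = 8     # assumed: number of new samples predicted per call


def predict_next(window, shape_params):
    """Hypothetical stand-in for the trained predictor.

    Continues a damped sinusoid from the last two samples of `window`,
    using x[n] = 2*r*cos(w)*x[n-1] - r^2*x[n-2].
    """
    omega, damping = shape_params
    prev2, prev1 = window[-2], window[-1]
    out = []
    for _ in range(STEP):
        nxt = 2.0 * damping * np.cos(omega) * prev1 - damping**2 * prev2
        out.append(nxt)
        prev2, prev1 = prev1, nxt
    return np.array(out)


def generate(seed, shape_params, n_samples):
    """Call the predictor on its own output until n_samples exist."""
    sound = list(seed)
    while len(sound) < n_samples:
        window = np.array(sound[-WINDOW:])
        sound.extend(predict_next(window, shape_params))
    return np.array(sound[:n_samples])


# A short seed of sound to start from.
seed = np.sin(0.3 * np.arange(WINDOW))

# Same seed, two parameter settings: the second gives a different sound.
original = generate(seed, (0.3, 1.0), 400)
altered = generate(seed, (0.45, 0.995), 400)
```

In the real generator the stand-in would be replaced by the trained network, and the shape parameters would come from the encoder rather than being typed in by hand.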
Encoding the parameters
An autoencoder structure is used for deriving the key parameters of the sound. This structure takes the entire soundwave and reduces it, through a series of (encoding) layers, down to a small number of components (the waist), before reproducing the sound in full from a series of expanding (decoding) layers. Once trained, the autoencoder model is cut off at the waist so that all it does is reduce the full sound down to the key parameters.
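The "train, then cut at the waist" step can be sketched with the Keras functional API. Everything specific here is an assumption made for illustration: the layer sizes, the 4-unit waist, the layer name `"waist"`, the synthetic chirp used as training data, and the handful of training epochs. The full code linked above may be organised quite differently.

```python
import numpy as np
import tensorflow as tf

N_SAMPLES = 256  # assumed length of the sound, in samples
N_PARAMS = 4     # assumed number of components at the waist

# Synthetic rising chirp standing in for the recorded one (assumption).
t = np.linspace(0.0, 1.0, N_SAMPLES)
wave = np.sin(2.0 * np.pi * (5.0 * t + 10.0 * t**2)).astype("float32")
wave = wave[np.newaxis, :]  # shape (1, N_SAMPLES): a batch of one sound

# Encoding layers narrow down to the waist, decoding layers expand again.
inputs = tf.keras.Input(shape=(N_SAMPLES,))
x = tf.keras.layers.Dense(64, activation="tanh")(inputs)
waist = tf.keras.layers.Dense(N_PARAMS, activation="tanh", name="waist")(x)
x = tf.keras.layers.Dense(64, activation="tanh")(waist)
outputs = tf.keras.layers.Dense(N_SAMPLES)(x)

# Train the full autoencoder to reproduce its own input.
autoencoder = tf.keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(wave, wave, epochs=5, verbose=0)

# Cut at the waist: a second model that reuses the trained encoding layers.
encoder = tf.keras.Model(inputs, waist)
params = encoder.predict(wave, verbose=0)  # shape (1, N_PARAMS)
```

Because `encoder` is built from the same layer objects as `autoencoder`, it shares the trained weights and needs no further training.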
For the proof of concept, a single chirp was used; this chirp: