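In this article, we will build a simple machine learning algorithm in Python that diagnoses, from a recording of a person's voice, whether that person is a patient.
We will use a library of audio files from healthy people and people with Parkinson's disease, and build our machine learning dataset by taking a number of measurements on each recording.
Once the dataset is built, we will train a linear regression model with the SciKit Learn library. Finally, we will build a Python library that can easily be integrated into other applications.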
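Dataset
First, we need to convert the audio files into a table that holds the audio measurements together with a flag indicating whether the speaker is healthy.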
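To make the target layout concrete, here is a minimal sketch of such a table: one row per recording, a few measurement columns, and a 0/1 label. The column subset, the `parkinson` label name, the file names, and the helpers `make_row` and `rows_to_csv` are assumptions made for illustration, and the numbers are placeholders rather than real measurements.

```python
import csv
import io

# Hypothetical column names; the full feature set comes from the measurement step.
COLUMNS = ["voiceID", "localJitter", "localShimmer", "hnr05", "parkinson"]


def make_row(voice_id, local_jitter, local_shimmer, hnr05, has_parkinson):
    """Build one table row: a recording, its measurements, and a 0/1 label."""
    return {
        "voiceID": voice_id,
        "localJitter": local_jitter,
        "localShimmer": local_shimmer,
        "hnr05": hnr05,
        "parkinson": 1 if has_parkinson else 0,
    }


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder values for illustration only, not real measurements.
rows = [
    make_row("healthy_01.wav", 0.004, 0.03, 20.0, False),
    make_row("patient_01.wav", 0.012, 0.09, 12.0, True),
]
table_csv = rows_to_csv(rows)
```

The same structure maps directly onto a pandas DataFrame, with each dictionary key becoming a column.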
The audio files we will use are available here (https://zenodo.org/record/2867216#.Xp4kVsgzaUl).
Let's start by importing the necessary Python libraries.
import glob
import numpy as np
import pandas as pd
import parselmouth
from parselmouth.praat import call
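Next, we will create a function that takes a range of measurements on an input audio file. The measurements are made with the parselmouth library, which lets you use Praat from Python code (https://parselmouth.readthedocs.io/en/stable/).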
def measurePitch(voiceID, f0min, f0max, unit):
    sound = parselmouth.Sound(voiceID)  # read the sound
    pitch = call(sound, "To Pitch", 0.0, f0min, f0max)  # create a praat pitch object
    # point process of glottal pulses, used for the jitter and shimmer measurements
    pointProcess = call(sound, "To PointProcess (periodic, cc)", f0min, f0max)
    # jitter: cycle-to-cycle variation of the period
    localJitter = call(pointProcess, "Get jitter (local)", 0, 0, 0.0001, 0.02, 1.3)
    localabsoluteJitter = call(pointProcess, "Get jitter (local, absolute)", 0, 0, 0.0001, 0.02, 1.3)
    rapJitter = call(pointProcess, "Get jitter (rap)", 0, 0, 0.0001, 0.02, 1.3)
    ppq5Jitter = call(pointProcess, "Get jitter (ppq5)", 0, 0, 0.0001, 0.02, 1.3)
    # shimmer: cycle-to-cycle variation of the amplitude
    localShimmer = call([sound, pointProcess], "Get shimmer (local)", 0, 0, 0.0001, 0.02, 1.3, 1.6)
    localdbShimmer = call([sound, pointProcess], "Get shimmer (local_dB)", 0, 0, 0.0001, 0.02, 1.3, 1.6)
    apq3Shimmer = call([sound, pointProcess], "Get shimmer (apq3)", 0, 0, 0.0001, 0.02, 1.3, 1.6)
    aqpq5Shimmer = call([sound, pointProcess], "Get shimmer (apq5)", 0, 0, 0.0001, 0.02, 1.3, 1.6)
    apq11Shimmer = call([sound, pointProcess], "Get shimmer (apq11)", 0, 0, 0.0001, 0.02, 1.3, 1.6)
    # harmonicity (harmonics-to-noise ratio), averaged over the whole sound
    harmonicity05 = call(sound, "To Harmonicity (cc)", 0.01, 500, 0.1, 1.0)
    hnr05 = call(harmonicity05, "Get mean", 0, 0)
    harmonicity15 = call(sound, "To Harmonicity (cc)", 0.01, 1500, 0.1, 1.0)
    hnr15 = call(harmonicity15, "Get mean", 0, 0)
    harmonicity25 = call(sound, "To Harmonicity (cc)", 0.01, 2500, 0.1, 1.0)
    hnr25 = call(harmonicity25, "Get mean", 0, 0)
    harmonicity35 = call(sound, "To Harmonicity (cc)", 0.01, 3500, 0.1, 1.0)
    hnr35 = call(harmonicity35, "Get mean", 0, 0)
    harmonicity38 = call(sound, "To Harmonicity (cc)", 0.01, 3800, 0.1, 1.0)
    hnr38 = call(harmonicity38, "Get mean", 0, 0)
    return localJitter, localabsoluteJitter, rapJitter, ppq5Jitter, localShimmer, localdbShimmer, apq3Shimmer, aqpq5Shimmer, apq11Shimmer, hnr05, hnr15, hnr25, hnr35, hnr38
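For intuition about what the local jitter and shimmer values represent, the following pure-Python sketch computes the same kind of ratio on a plain list of numbers: the mean absolute difference between consecutive cycles, divided by the mean over all cycles. It is a simplification and not Praat's implementation; in particular it ignores the period floor, period ceiling, and maximum factor arguments passed above. The function names and the sample values are hypothetical.

```python
def relative_perturbation(values):
    """Mean absolute difference between consecutive values, divided by the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


def local_jitter(periods):
    """Simplified local jitter: perturbation of the cycle durations (in seconds)."""
    return relative_perturbation(periods)


def local_shimmer(amplitudes):
    """Simplified local shimmer: perturbation of the per-cycle peak amplitudes."""
    return relative_perturbation(amplitudes)


# Hypothetical cycle durations: a perfectly regular voice and an irregular one.
steady = [0.010, 0.010, 0.010, 0.010]
unsteady = [0.010, 0.012, 0.010, 0.012]

steady_jitter = local_jitter(steady)      # no cycle-to-cycle variation
unsteady_jitter = local_jitter(unsteady)  # larger value, less regular voice
```

A perfectly regular sequence gives a value of zero, and the value grows as consecutive cycles differ more from one another, which is why these measurements are useful as features.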
Then, we create a list for each type of measurement,