Audio Signal Processing (1): Recording Speech

Author: 午夜零时

Tags: speech recognition, linux

Article link: https://blog.csdn.net/qq_55796594/article/details/120343559

1.1 Opening an audio input stream

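import pyaudio

CHUNK = 1024              # frames read per buffer
FORMAT = pyaudio.paInt16  # 16-bit integer samples
CHANNELS = 2              # number of channels
RATE = 16000              # sampling rate in Hz

p = pyaudio.PyAudio()

# Open an input (recording) stream on the default input device
stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK)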

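1.2 Code analysis

p = pyaudio.PyAudio() creates a PyAudio instance.

stream = p.open(format=FORMAT, channels=CHANNELS, rate=RATE, input=True, output=False, frames_per_buffer=CHUNK)

The open function opens an audio input stream.

The documentation describes each parameter as follows: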
def __init__(self,
             PA_manager,
             rate,                      # Sampling rate
             channels,                  # Number of channels
             format,                    # Sampling size and format. See PaSampleFormat.
             input=False,               # Whether this is an input stream. Defaults to False.
             output=False,              # Whether this is an output stream. Defaults to False.
             input_device_index=None,   # Index of the input device to use. Unspecified (or None) uses the default device. Ignored if input is False.
             output_device_index=None,  # Index of the output device to use. Unspecified (or None) uses the default device. Ignored if output is False.
             frames_per_buffer=1024,    # Number of frames per buffer.
             start=True,                # Start the stream running immediately.
             input_host_api_specific_stream_info=None,   # Host API specific stream information data structure for input. See PaMacCoreStreamInfo.
             output_host_api_specific_stream_info=None,  # Host API specific stream information data structure for output. See PaMacCoreStreamInfo.
             stream_callback=None):     # Callback function for non-blocking (callback) operation. Defaults to None, which indicates blocking operation (i.e., Stream.read and Stream.write).
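The stream_callback parameter above switches the stream to non-blocking mode. As a rough sketch of what such a callback looks like: PyAudio calls it with four arguments and expects a (data, flag) tuple back. To keep the sketch runnable without PyAudio or a microphone, PA_CONTINUE is a stand-in that I assume matches pyaudio.paContinue (0), and the callback is invoked by hand. The names on_audio and captured are made up for illustration.

```python
# Stand-in for pyaudio.paContinue (assumed to be 0) so this runs without PyAudio.
PA_CONTINUE = 0

captured = []  # raw byte buffers handed to the callback


def on_audio(in_data, frame_count, time_info, status):
    """Callback in the shape PyAudio uses for non-blocking input."""
    captured.append(in_data)
    # An input-only stream has nothing to play back, so the data slot is None.
    return (None, PA_CONTINUE)


# Simulate one delivery by hand: 1024 frames of 16-bit stereo silence.
result = on_audio(b"\x00" * (1024 * 2 * 2), 1024, {}, 0)
print(result, len(captured[0]))
```

With a real stream you would pass stream_callback=on_audio to p.open and replace PA_CONTINUE with pyaudio.paContinue. The rest of this article stays with blocking reads.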

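In short, the parameters we need are the ones used in the example:

format: the format of the sample data. The documentation provides predefined format constants: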

##### PaSampleFormat Sample Formats #####

paFloat32       #: 32 bit float
paInt32         #: 32 bit int
paInt24         #: 24 bit int
paInt16         #: 16 bit int
paInt8          #: 8 bit int
paUInt8         #: 8 bit unsigned int
paCustomFormat  #: a custom data format
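To make the format list more concrete, here is a small sketch that maps each predefined format name to its sample width in bytes and derives the size of one frame. The table simply restates the bit widths listed above. SAMPLE_WIDTH_BYTES and bytes_per_frame are illustrative names, not part of PyAudio.

```python
# Bytes per sample for each predefined format (custom formats are left out).
SAMPLE_WIDTH_BYTES = {
    "paFloat32": 4,
    "paInt32": 4,
    "paInt24": 3,
    "paInt16": 2,
    "paInt8": 1,
    "paUInt8": 1,
}


def bytes_per_frame(format_name, channels):
    """One frame holds one sample for every channel."""
    return SAMPLE_WIDTH_BYTES[format_name] * channels


print(bytes_per_frame("paInt16", 2))  # 2 bytes * 2 channels = 4
```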

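channels: the number of channels (1 for mono, 2 for stereo); this example uses 2.

rate: the sampling rate, i.e. how many samples are taken per second.

input=True: True means this is an input stream, False means it is not.

frames_per_buffer: the number of frames in each buffer.

With these parameters set, we have opened an audio stream that can be used for input.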
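As a quick check on what these settings mean in practice, the arithmetic below works out how long one buffer lasts and how much data the stream produces. It uses the article's values; SAMPLE_WIDTH = 2 is my assumption for paInt16 (16 bits = 2 bytes).

```python
CHUNK = 1024      # frames per buffer
RATE = 16000      # frames per second
CHANNELS = 2
SAMPLE_WIDTH = 2  # bytes per sample, assuming paInt16

buffer_seconds = CHUNK / RATE                      # duration of one buffer
buffer_bytes = CHUNK * CHANNELS * SAMPLE_WIDTH     # size of one buffer
bytes_per_second = RATE * CHANNELS * SAMPLE_WIDTH  # raw data rate

print(buffer_seconds, buffer_bytes, bytes_per_second)
```

So each buffer covers 64 ms of audio and occupies 4096 bytes, and the stream delivers 64000 bytes per second.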

2.1 Recording

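RECORD_SECONDS = 2
print("start recording......")

frames = []  # raw byte buffers, one per read

# Read enough CHUNK-sized buffers to cover RECORD_SECONDS
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS) + 1):
    data = stream.read(CHUNK)  # blocks until CHUNK frames are available
    frames.append(data)

print("end!")

stream.stop_stream()  # stop the input stream
stream.close()        # close the input stream
p.terminate()         # shut down PortAudio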
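Each element of frames is a bytes object holding interleaved samples. If you want to look at the actual sample values, a minimal sketch for the 16-bit stereo case could look like this. It assumes native byte order and paInt16 data, split_stereo_int16 is a made-up helper name, and the input here is hand-built test data rather than a real recording.

```python
from array import array


def split_stereo_int16(data):
    """Decode interleaved 16-bit stereo bytes into (left, right) sample lists."""
    samples = array("h")     # signed 16-bit integers, native byte order
    samples.frombytes(data)
    left = samples[0::2].tolist()   # even positions: first channel
    right = samples[1::2].tolist()  # odd positions: second channel
    return left, right


# Hand-built stand-in for one buffer: three frames of (left, right) pairs.
fake_buffer = array("h", [1, -1, 2, -2, 3, -3]).tobytes()
left, right = split_stereo_int16(fake_buffer)
print(left, right)
```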

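2.2 Code analysis

stream.read(CHUNK) reads CHUNK frames on each call.

int(RATE / CHUNK * RECORD_SECONDS) works out how many reads are needed: 2 seconds * sampling rate / frames per read = number of reads. Here that is 16000 / 1024 * 2 = 31.25, truncated to 31. The code adds 1 so that the final partial chunk is covered as well, giving 32 reads.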
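The read count is easy to check by hand. The sketch below compares the formula used in the loop with a math.ceil variant, which is my own alternative rather than something the article uses. The two agree for the values here but differ when the total frame count is an exact multiple of CHUNK, where the "+ 1" form reads one extra buffer.

```python
import math


def reads_plus_one(rate, chunk, seconds):
    """The count used in the recording loop: truncate, then add one."""
    return int(rate / chunk * seconds) + 1


def reads_ceil(rate, chunk, seconds):
    """Alternative: round up only when there is a partial chunk."""
    return math.ceil(rate * seconds / chunk)


print(reads_plus_one(16000, 1024, 2), reads_ceil(16000, 1024, 2))  # 32 32
print(reads_plus_one(16000, 1000, 2), reads_ceil(16000, 1000, 2))  # 33 32

# Actual recorded duration with 32 reads of 1024 frames at 16000 Hz
actual_seconds = reads_plus_one(16000, 1024, 2) * 1024 / 16000
print(actual_seconds)
```

The recording therefore ends up slightly longer than requested, about 2.048 seconds instead of 2.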

frames.append(data) stores the data that was read in a list.

stream.stop_stream() stops the input stream.
stream.close() closes the input stream.
p.terminate() terminates PortAudio.
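One thing worth considering is what happens if a read fails halfway through: the stream would never be stopped or closed. A possible way to guard against that is to wrap the loop in try/finally. The sketch below is only an illustration. FakeStream is a hypothetical stand-in that returns silence so the code runs without audio hardware, and record is a made-up helper name. A real PyAudio stream offers the same three methods used here.

```python
class FakeStream:
    """Hypothetical stand-in with the three stream methods used in the article."""

    def __init__(self, bytes_per_frame):
        self.bytes_per_frame = bytes_per_frame
        self.stopped = False
        self.closed = False

    def read(self, num_frames):
        # Return silence of the right size instead of real audio.
        return b"\x00" * (num_frames * self.bytes_per_frame)

    def stop_stream(self):
        self.stopped = True

    def close(self):
        self.closed = True


def record(stream, rate, chunk, seconds):
    """Read buffers for roughly `seconds`, always releasing the stream."""
    frames = []
    try:
        for _ in range(int(rate / chunk * seconds) + 1):
            frames.append(stream.read(chunk))
    finally:
        stream.stop_stream()
        stream.close()
    return frames


stream = FakeStream(bytes_per_frame=4)  # 16-bit stereo: 2 bytes * 2 channels
frames = record(stream, 16000, 1024, 2)
print(len(frames), len(b"".join(frames)))
```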

3.1 Saving

import wave

WAVE_OUTPUT_FILENAME = "Oldboy.wav"

wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)                   # number of channels
wf.setsampwidth(p.get_sample_size(FORMAT))  # sample width in bytes
wf.setframerate(RATE)                       # sampling rate in Hz
wf.writeframes(b''.join(frames))            # write all recorded buffers
wf.close()

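3.2 Code explanation

wave.open(WAVE_OUTPUT_FILENAME, 'wb') opens a file for writing in binary mode ('wb').

wf.setnchannels(CHANNELS) sets the number of channels.
wf.setsampwidth(p.get_sample_size(FORMAT)) sets the sample width in bytes, which must be consistent with FORMAT.
wf.setframerate(RATE) sets the sampling rate, which must match RATE.

wf.writeframes(b''.join(frames)) writes the audio data to the file.

wf.close() closes the file and releases the handle.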
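To verify that the header settings end up in the file, you can write a WAV and read its parameters back. The sketch below does this with the standard library only. Since no microphone is assumed, it generates a short 440 Hz tone as a stand-in for the recorded frames and writes to an in-memory buffer instead of Oldboy.wav. The tone, its amplitude and the 0.1 second length are arbitrary choices for the example.

```python
import io
import math
import struct
import wave

CHANNELS = 2
SAMPLE_WIDTH = 2  # bytes per sample, assuming 16-bit data
RATE = 16000

# Stand-in for recorded data: 0.1 s of a 440 Hz tone on both channels.
num_frames = RATE // 10
frames = []
for i in range(num_frames):
    value = int(10000 * math.sin(2 * math.pi * 440 * i / RATE))
    frames.append(struct.pack("<hh", value, value))  # left, right

# Write the WAV into memory using the same calls as the article.
buf = io.BytesIO()
with wave.open(buf, "wb") as wf:
    wf.setnchannels(CHANNELS)
    wf.setsampwidth(SAMPLE_WIDTH)
    wf.setframerate(RATE)
    wf.writeframes(b"".join(frames))

# Read it back and inspect the header.
buf.seek(0)
with wave.open(buf, "rb") as rf:
    info = (rf.getnchannels(), rf.getsampwidth(),
            rf.getframerate(), rf.getnframes())

print(info)  # (2, 2, 16000, 1600)
```

The same read-back works on the file written in 3.1 by passing the file name to wave.open instead of the buffer.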

4.1

pyaudio_01.py - codec/documentation resources - CSDN download
