Linux Audio Programming


weixin_39943547


Hello. This text is copied from another blog; I have deleted some parts and added a few of my own. The original blog is: http://blog.csdn.net/heanyu/article/details/6339783 (thanks to its author).

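Audio devices (sound card): description and differences

/dev/dsp, /dev/dspW, /dev/audio: reading from one of these devices records sound, and writing to it plays sound.

The devices differ in how the samples are encoded:

/dev/audio uses μ-law encoding,

/dev/dsp uses 8-bit (unsigned) linear encoding,

/dev/dspW uses 16-bit (signed) linear encoding.

/dev/audio exists mainly for compatibility with SunOS, so avoid it where possible.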
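To make the difference between the two linear encodings concrete, here is a minimal sketch of converting between 8-bit unsigned samples (as used by /dev/dsp) and 16-bit signed samples (as used by /dev/dspW). The function names are hypothetical and not part of any sound API, and μ-law is not covered.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of any sound API. */

/* 8-bit unsigned (silence = 128) to 16-bit signed (silence = 0). */
int16_t u8_to_s16(uint8_t sample)
{
    return (int16_t)((sample - 128) * 256);
}

/* 16-bit signed to 8-bit unsigned, keeping only the top 8 bits. */
uint8_t s16_to_u8(int16_t sample)
{
    return (uint8_t)((sample + 32768) / 256);
}
```

The only real difference between the two formats is the offset of the zero level and the resolution, which is why the conversion is a shift of origin plus a scale.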

/dev/dsp is the digital sampling and digital recording

device, and probably the most important for multimedia

applications. Writing to the device accesses the D/A converter to

produce sound. Reading the device activates the A/D converter for

sound recording and analysis.

The name DSP comes from the term digital signal

processor, a specialized processor chip optimized for digital

signal analysis. Sound cards may use a dedicated DSP chip, or may

implement the functions with a number of discrete devices. Other

terms that may be used for this device are digitized voice

and PCM.

Some sound cards provide more than one digital sampling device;

in this case a second device is available as /dev/dsp1.

Unless noted otherwise, this device operates in the same manner as

/dev/dsp.

The DSP device is really two devices in one. Opening for

read-only access allows you to use the A/D converter for sound

input. Opening for write only will access the D/A converter for

sound output. Generally speaking you should open the device either

for read only or for write only. It is possible to perform both

read and write on the device, albeit with some restrictions; this

will be covered in a later section.

Only one process can have the DSP device open at a time.

Attempts by another process to open it will fail with an error code

of EBUSY.
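Because the open can fail simply because another program is using the sound card, it may help to report that case separately. The following is a minimal sketch; the helper name open_audio_device is an assumption made for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: open an audio device and report why it failed.
 * Returns the file descriptor, or -1 on failure (errno is preserved).
 */
int open_audio_device(const char *path, int flags)
{
    int fd = open(path, flags);
    if (fd < 0) {
        int saved = errno;
        if (saved == EBUSY)
            fprintf(stderr, "%s is in use by another process\n", path);
        else
            fprintf(stderr, "cannot open %s: %s\n", path, strerror(saved));
        errno = saved;   /* fprintf may have changed errno */
    }
    return fd;
}
```

A recording program would call open_audio_device("/dev/dsp", O_RDONLY), and a playback program would pass O_WRONLY instead.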

Reading from the DSP device returns digital sound samples

obtained from the A/D converter. Figure 14-2(a)

shows a conceptual diagram of this process. Analog data is

converted to digital samples by the analog to digital converter

under control of the kernel sound driver and stored in a buffer

internal to the kernel. When an application program invokes the

read system call, the data is transferred to the calling program's

data buffer. It is important to understand that the sampling rate

is dependent on the kernel driver, and not the speed at which the

application program reads it.

When accessing /dev/dsp:

1. If you read too slowly (below the sampling rate), the excess data is discarded, leaving gaps. If you read too fast, the process blocks.

2. Writing a sequence of digital samples produces sound, and the data format can be changed through ioctl system calls. If you write too slowly, the sound output will have gaps. If you write too fast, the process blocks; unlike some devices, there is no non-blocking I/O.

When reading from /dev/dsp you will never encounter an

end-of-file condition. If data is read too slowly (less than the

sampling rate), the excess data will be discarded, resulting in

gaps in the digitized sound. If you read the device too quickly,

the kernel sound driver will block your process until the required

amount of data is available.

The input source depends on the mixer setting (which I will look

at shortly); the default is the microphone input. The format of the

digitized data depends on which ioctl calls have been used to set

up the device. Each time the device is opened, its parameters are

set to default values. The default is 8-bit unsigned samples, using

one channel (mono), and an 8 kHz sampling rate.
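Since the driver fixes the sampling rate, the amount of data moved per second follows directly from the format. The sketch below computes it; the function names are hypothetical. With the defaults (8 kHz, 8 bits, mono) the device delivers 8000 bytes per second, so a read of 24000 bytes blocks for roughly three seconds.

```c
/* Hypothetical helpers for sizing buffers. */

/* Bytes produced or consumed per second of audio. */
long bytes_per_second(long rate_hz, int bits, int channels)
{
    return rate_hz * (bits / 8) * channels;
}

/* Buffer size in bytes needed to hold the given number of seconds. */
long buffer_bytes(int seconds, long rate_hz, int bits, int channels)
{
    return seconds * bytes_per_second(rate_hz, bits, channels);
}
```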

Writing a sequence of digital sample values to the DSP device

produces sound output. This process is illustrated in Figure 14-2(b).

Again, the format can be defined using ioctl calls, but defaults to

the values given above for the read system call (8-bit unsigned

data, mono, 8 kHz sampling).

If the data are written too slowly, there will be dropouts or

pauses in the sound output. Writing the data faster than the

sampling rate will simply cause the kernel sound driver to block

the calling process until the sound card hardware is ready to

process the new data. Unlike some devices, there is no support for

non-blocking I/O.

If you don't like the defaults, you can change them through

ioctl calls. In general you should set the parameters after

opening the device, and before any calls to read or write.

You should also set the parameters in the order in which they are

described below.

All DSP ioctl calls take a third argument that is a pointer to

an integer. Don't try to pass a constant; you must use a variable.

The call will return -1 if an error occurs, and set the global

variable errno.

If the hardware doesn't support the exact value you call for,

the sound driver will try to set the parameter to the closest

allowable value. For example, with my sound card, selecting a

sampling rate of 9000 Hz will result in an actual rate of 9009 Hz

being used.

If a parameter is out of range, the driver will set it to the

closest value (i.e., the upper or lower limit). For example,

attempting to use 16-bit sampling with an 8-bit sound card will

result in the driver selecting 8 bits, but no error will be

returned. It is up to you, the programmer, to verify that the value

returned is acceptable to your application.
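The rules in the paragraphs above (pass a pointer to a variable, check for -1, then compare the value the driver wrote back) can be wrapped in one small function. This is only a sketch; set_int_param is a hypothetical name, and the request code is passed in by the caller.

```c
#include <sys/ioctl.h>

/*
 * Hypothetical helper: pass `wanted` to an ioctl that takes a pointer to
 * int and report what the driver selected.
 * Returns -1 if the ioctl failed (errno is set), 0 if the driver accepted
 * the value unchanged, 1 if it substituted a different value.
 */
int set_int_param(int fd, unsigned long request, int wanted, int *actual)
{
    int arg = wanted;           /* must be a variable, not a constant */

    if (ioctl(fd, request, &arg) == -1)
        return -1;
    *actual = arg;              /* value the driver actually selected */
    return (arg == wanted) ? 0 : 1;
}
```

With linux/soundcard.h included, a call such as set_int_param(fd, SOUND_PCM_WRITE_BITS, 16, &bits) would, going by the description above, return 1 and leave bits at 8 on an 8-bit card.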

All of the ioctl calls for the DSP device have names starting

with SOUND_PCM. Calls in the form

SOUND_PCM_READ_XXX are used to return just the

current value of a parameter. To change the values, the ioctl calls

are named like SOUND_PCM_WRITE_XXX. As discussed

above, these calls also return the selected value, which is not

necessarily the same as the value passed to the sound driver.

The ioctl constants are defined in the header file

linux/soundcard.h. Let's examine each of them in detail.

SOUND_PCM_WRITE_BITS

Sets the sample size, in bits. Valid choices are 8 and 16, but

some cards do not support 16.

SOUND_PCM_READ_BITS

Returns the current sample size, which should be either 8 or 16

bits.

SOUND_PCM_WRITE_CHANNELS

Sets the number of channels--1 for mono, 2 for stereo. When

running in stereo mode, the data is interleaved when read or

written, in the format left-right-left-right.... Remember that some

sound cards do not support stereo; check the actual number of

channels returned in the argument.

SOUND_PCM_READ_CHANNELS

Returns the current number of channels, either 1 or 2.

SOUND_PCM_WRITE_RATE

Sets the sampling rate in samples per second. Remember that all

sound cards have a limit on the range; the driver will round the

rate to the nearest speed supported by the hardware, returning the

actual (rounded) rate in the argument. Typical lower limits are 4

kHz; upper limits are 13, 15, 22, or 44 kHz.

SOUND_PCM_READ_RATE

Returns just the current sampling rate. This is the rate used by

the kernel, which may not be exactly the rate given in a previous

call to SOUND_PCM_WRITE_RATE, because of the previously discussed

rounding.
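The left-right-left-right interleaving described for SOUND_PCM_WRITE_CHANNELS means that a stereo buffer has to be split before each channel can be processed on its own. A minimal sketch for 8-bit data follows; the function name is hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical helper: split interleaved 8-bit stereo data
 * (left, right, left, right, ...) into separate channel buffers.
 * `frames` is the number of left/right pairs.
 */
void deinterleave_u8(const unsigned char *stereo, size_t frames,
                     unsigned char *left, unsigned char *right)
{
    for (size_t i = 0; i < frames; i++) {
        left[i]  = stereo[2 * i];
        right[i] = stereo[2 * i + 1];
    }
}
```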

To summarize: you control the device through ioctl (naturally, since any device the system recognizes has to provide this interface). If a parameter you set is not supported, the driver automatically picks the closest supported value; if it is out of range, the driver picks the nearest limit. No error is returned.

Use SOUND_PCM_READ_XXX to read the current device parameters and SOUND_PCM_WRITE_XXX to set them. The most important ones are:

SOUND_PCM_WRITE_BITS sets the sample size, 8 or 16 bits.

SOUND_PCM_WRITE_CHANNELS sets the number of channels.

SOUND_PCM_WRITE_RATE sets the sampling rate.

I will now illustrate programming of the DSP device with a short

example. I call the program in Example 14-2

parrot. It records a few seconds of audio, saving it to an

array in memory, then plays it back.

Example 14-2: Reading and Writing the /dev/dsp Device

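#include <unistd.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <stdlib.h>
#include <stdio.h>
#include <linux/soundcard.h>

#define LENGTH 3      /* seconds of sound to store */
#define RATE 8000     /* sampling rate in Hz */
#define SIZE 8        /* sample size: 8 or 16 bits */
#define CHANNELS 1    /* 1 = mono, 2 = stereo */

/* buffer that holds the digitized sound */
unsigned char buf[LENGTH*RATE*SIZE*CHANNELS/8];

int main()
{
    int fd;       /* sound device file descriptor */
    int arg;      /* argument for ioctl calls */
    int status;   /* return status of system calls */

    /* open the sound device for both recording and playback */
    fd = open("/dev/dsp", O_RDWR);
    if (fd < 0) {
        perror("open of /dev/dsp failed");
        exit(1);
    }

    /* set the sampling parameters: size, channels, then rate */
    arg = SIZE;
    status = ioctl(fd, SOUND_PCM_WRITE_BITS, &arg);
    if (status == -1)
        perror("SOUND_PCM_WRITE_BITS ioctl failed");
    if (arg != SIZE)
        fprintf(stderr, "unable to set sample size\n");

    arg = CHANNELS;
    status = ioctl(fd, SOUND_PCM_WRITE_CHANNELS, &arg);
    if (status == -1)
        perror("SOUND_PCM_WRITE_CHANNELS ioctl failed");
    if (arg != CHANNELS)
        fprintf(stderr, "unable to set number of channels\n");

    arg = RATE;
    status = ioctl(fd, SOUND_PCM_WRITE_RATE, &arg);
    if (status == -1)
        perror("SOUND_PCM_WRITE_RATE ioctl failed");

    while (1) {   /* loop until interrupted with Control-C */
        printf("Say something:\n");
        status = read(fd, buf, sizeof(buf));    /* record */
        if (status != sizeof(buf))
            perror("read wrong number of bytes");

        printf("You said:\n");
        status = write(fd, buf, sizeof(buf));   /* play back */
        if (status != sizeof(buf))
            perror("wrote wrong number of bytes");

        /* wait for playback to finish before recording again */
        status = ioctl(fd, SOUND_PCM_SYNC, 0);
        if (status == -1)
            perror("SOUND_PCM_SYNC ioctl failed");
    }
}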

The source file starts by including a number of standard header

files, including linux/soundcard.h. Then some constants are

defined for the sound card settings used in the program, which

makes it easy to change the values used. A static buffer is defined

to hold the sound data.

I first open the DSP device for both read and write and check

that the open was successful. Next I set the sampling parameters

using ioctl calls. Notice that a variable must be used because the

driver expects a pointer. In each case I check for an error from

the ioctl call (a return value of -1), and that the values actually

used are within range. This programming may appear to be overly

cautious, but I consider it good coding practice that pays off when

trying to debug the code. Note that I do not check that the actual

sampling rate returned matches the selected rate because of the

sampling rate rounding previously described.
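If you do want to validate the rate despite the rounding, one option is to accept any value within a small tolerance of the request instead of demanding an exact match. The sketch below is an assumption about how such a check might look, not something the example program does; the function name is hypothetical.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: return 1 if `actual` is within `percent` percent
 * of `requested`, otherwise 0.
 */
int rate_within_tolerance(int requested, int actual, int percent)
{
    long diff = labs((long)actual - (long)requested);
    return diff * 100 <= (long)requested * percent;
}
```

With a tolerance of 1 percent, the 9009 Hz that the driver selected for a 9000 Hz request in the earlier example would be accepted.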

I then run in a loop, first prompting the user to speak, then

reading the sound data into the buffer. Once the data is received,

I warn the user, then write the same data back to the DSP device,

where it should be heard. This repeats until the program is

interrupted with Control-C.

The SOUND_PCM_SYNC ioctl has not yet been mentioned. I'll show

what this is used for in the section titled "Advanced Sound

Programming," later in this chapter.

Try compiling and running this program. Then make some

enhancements:

1. Make the parameters selectable using command-line options (sample rate, size, time). See the effect on sound quality with different sampling rates.

2. Reverse the sound samples (and listen for hidden messages), or play them back at a different sampling rate from the one at which they were recorded.

3. Automatically start recording when the voice starts and stop when silence occurs (or a maximum time is reached). Hints: for 8-bit unsigned data the zero value is 0x80, but you will likely see values that vary around this level due to noise. Set a noise threshold (or better yet, measure the background noise level at the start of the program).

4. Bonus question: modify the program so that it can recognize the words that are spoken.
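For the automatic start and stop exercise, the hint about the 0x80 zero level suggests a simple test: treat a block as silent when no sample strays further than a threshold from 0x80. This is one possible starting point rather than a complete solution; the function name and the idea of testing whole blocks are assumptions.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if every 8-bit unsigned sample lies
 * within `threshold` of the zero level 0x80, otherwise 0.
 */
int is_silent_u8(const unsigned char *buf, size_t n, int threshold)
{
    for (size_t i = 0; i < n; i++) {
        int deviation = (int)buf[i] - 0x80;
        if (deviation < 0)
            deviation = -deviation;
        if (deviation > threshold)
            return 0;
    }
    return 1;
}
```

The threshold could be fixed, or derived from the largest deviation seen in a short recording taken at startup, as the hint proposes.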
