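Original source: https://blog.csdn.net/lincaig/article/details/79633771
pjsip is an excellent open-source project for implementing VoIP. It handles the complex SIP signaling exchange as well as setting up audio transport.
1. How media streams are passed around
Let's walk through the code to see how media data flows.
The conference.c module bridges the audio device and the media streams. It exchanges data with both the streams and the audio device through the pjmedia_port interface, which is defined as follows (other fields omitted):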
- typedef struct pjmedia_port
- {
- pj_status_t (*put_frame)(struct pjmedia_port *this_port,
- pjmedia_frame *frame);
- pj_status_t (*get_frame)(struct pjmedia_port *this_port,
- pjmedia_frame *frame);
- } pjmedia_port;
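A media stream object implements the pjmedia_port methods and is handed to the conference as an interface. It is passive: the conference calls into it. The conference calls get_frame to obtain the decoded PCM data from the stream, and calls put_frame to hand PCM to the stream for encoding and transmission.
Internally the conference has a port at index 0 whose pjmedia_port is called master_port. master_port is the interface to the audio device and is called passively by the sound device. PCM captured by the audio device is passed to the conference through put_frame, and the conference then forwards it to every stream that listens to it. For playback, the audio device calls get_frame to fetch PCM from the conference; that PCM is the mix of all the streams this port listens to.
The conference also acts as the mixer. It mixes the PCM from multiple input streams before handing the result to the audio device for playback. It can also mix the captured PCM with some stream A and pass the result to stream B for encoding and sending.
2. Analysis of the audio mixing
The master_port mentioned above has to implement the put_frame and get_frame callbacks.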
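The passive callback model described above can be illustrated with a small sketch. Everything here (demo_frame, demo_port, demo_stream, demo_bridge_tick) is a hypothetical, simplified stand-in written for this illustration; it is not the pjmedia API, and the real pjmedia_port and pjmedia_frame carry more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SAMPLES_PER_FRAME 4

/* Hypothetical, simplified stand-in for pjmedia_frame. */
typedef struct demo_frame {
    int16_t buf[DEMO_SAMPLES_PER_FRAME];
    size_t  count;                 /* number of valid samples in buf */
} demo_frame;

/* Hypothetical, simplified stand-in for pjmedia_port. */
typedef struct demo_port demo_port;
struct demo_port {
    int (*put_frame)(demo_port *this_port, const demo_frame *frame);
    int (*get_frame)(demo_port *this_port, demo_frame *frame);
    void *user_data;
};

/* A toy "stream": it never acts on its own, it only answers callbacks. */
typedef struct demo_stream {
    demo_frame last_tx;   /* last frame the bridge asked us to send */
    int16_t    rx_value;  /* constant standing in for decoded PCM */
} demo_stream;

/* Called by the bridge to hand PCM to the stream (encode + send side). */
static int stream_put_frame(demo_port *this_port, const demo_frame *frame)
{
    demo_stream *s = (demo_stream *)this_port->user_data;
    assert(frame->count <= DEMO_SAMPLES_PER_FRAME);
    s->last_tx = *frame;
    return 0;
}

/* Called by the bridge to pull decoded PCM from the stream. */
static int stream_get_frame(demo_port *this_port, demo_frame *frame)
{
    demo_stream *s = (demo_stream *)this_port->user_data;
    size_t i;
    for (i = 0; i < DEMO_SAMPLES_PER_FRAME; ++i)
        frame->buf[i] = s->rx_value;
    frame->count = DEMO_SAMPLES_PER_FRAME;
    return 0;
}

static void demo_port_init(demo_port *port, demo_stream *s)
{
    port->put_frame = stream_put_frame;
    port->get_frame = stream_get_frame;
    port->user_data = s;
}

/* The bridge side: pull one frame from src and push it to dst. */
static int demo_bridge_tick(demo_port *src, demo_port *dst)
{
    demo_frame frame;
    int rc = src->get_frame(src, &frame);
    if (rc != 0)
        return rc;
    return dst->put_frame(dst, &frame);
}
```

The point of the sketch is the direction of control: the streams only fill in the two callbacks, and the bridge decides when each one runs.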
- /*
- * Recorder (or passive port) callback.
- */
- static pj_status_t put_frame(pjmedia_port *this_port,
- pjmedia_frame *frame)
- {
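- pjmedia_conf *conf = (pjmedia_conf*) this_port->port_data.pdata;
- struct conf_port *port = conf->ports[this_port->port_data.ldata];
- pj_status_t status;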
- status = pjmedia_delay_buf_put(port->delay_buf, (pj_int16_t*)frame->buf);
- return status;
- }
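Code that is not relevant to this analysis has been removed.
We can see that put_frame stores the data in a delay_buf. From section 1 we know this callback is invoked by the sound device, yet here it only stores the data and does not send it to the listening streams. Why?
The actual sending of media data happens in get_frame; why it is done this way is discussed later.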
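To make the "store now, consume in get_frame" split concrete, here is a minimal frame FIFO sketch. It is only an illustration under simplifying assumptions: demo_fifo and its functions are hypothetical names, and the real pjmedia_delay_buf does considerably more than a fixed-size queue.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FIFO_SAMPLES_PER_FRAME 4
#define FIFO_MAX_FRAMES        4

/* Hypothetical fixed-size frame queue (not pjmedia_delay_buf). */
typedef struct demo_fifo {
    int16_t  data[FIFO_MAX_FRAMES][FIFO_SAMPLES_PER_FRAME];
    unsigned head;   /* index of the oldest stored frame */
    unsigned count;  /* number of frames currently stored */
} demo_fifo;

static void demo_fifo_init(demo_fifo *f)
{
    f->head = 0;
    f->count = 0;
}

/* Recorder side: store one captured frame. Returns -1 when full. */
static int demo_fifo_put(demo_fifo *f, const int16_t *frame)
{
    unsigned tail;
    assert(frame != NULL);
    if (f->count == FIFO_MAX_FRAMES)
        return -1;
    tail = (f->head + f->count) % FIFO_MAX_FRAMES;
    memcpy(f->data[tail], frame, sizeof(f->data[tail]));
    f->count++;
    return 0;
}

/* Player side: fetch the oldest frame. Returns -1 when empty. */
static int demo_fifo_get(demo_fifo *f, int16_t *frame)
{
    assert(frame != NULL);
    if (f->count == 0)
        return -1;
    memcpy(frame, f->data[f->head], sizeof(f->data[f->head]));
    f->head = (f->head + 1) % FIFO_MAX_FRAMES;
    f->count--;
    return 0;
}
```

In this picture the recorder callback only ever calls the put side, and the player callback drains the queue when it is ready to mix.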
- /*
- * Player callback.
- */
- static pj_status_t get_frame(pjmedia_port *this_port,
- pjmedia_frame *frame)
- {
- pjmedia_conf *conf = (pjmedia_conf*) this_port->port_data.pdata;
- pjmedia_frame_type speaker_frame_type = PJMEDIA_FRAME_TYPE_NONE;
- unsigned ci, cj, i, j;
- pj_int16_t *p_in;
- TRACE_((THIS_FILE, "- clock -"));
- /* Check that correct size is specified. */
- pj_assert(frame->size == conf->samples_per_frame *
- conf->bits_per_sample / 8);
- /* Must lock mutex */
- pj_mutex_lock(conf->mutex);
- /* Reset port source count. We will only reset port's mix
- * buffer when we have someone transmitting to it.
- */
- for (i=0, ci=0; i<conf->max_ports && ci < conf->port_cnt; ++i) {
- struct conf_port *conf_port = conf->ports[i];
- /* Skip empty port. */
- if (!conf_port)
- continue;
- /* Var "ci" is to count how many ports have been visited so far. */
- ++ci;
- /* Reset buffer (only necessary if the port has transmitter) and
- * reset auto adjustment level for mixed signal.
- */
- conf_port->mix_adj = NORMAL_LEVEL;
- if (conf_port->transmitter_cnt) {
- pj_bzero(conf_port->mix_buf,
- conf->samples_per_frame*sizeof(conf_port->mix_buf[0]));
- }
- }
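The code above resets each port's mix adjustment value mix_adj to NORMAL_LEVEL, which is 128. When mix_adj equals NORMAL_LEVEL, the mixed audio is left unchanged. The mixing code further down only ever lowers mix_adj below NORMAL_LEVEL, but the formula is the same for any value. For example, with mix_adj at 200, each sample in mix_buf is scaled as:
mix_buf[i] = mix_buf[i] * 200 / 128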
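As a rough sketch of that scaling step, assuming a 32-bit mix buffer as in the excerpt (demo_apply_mix_adj and DEMO_NORMAL_LEVEL are names made up for this example, not pjmedia identifiers):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NORMAL_LEVEL 128   /* same value the article gives for NORMAL_LEVEL */

/* Scale every mixed sample by mix_adj / 128.
 * A mix_adj equal to DEMO_NORMAL_LEVEL leaves the buffer untouched. */
static void demo_apply_mix_adj(int32_t *mix_buf, size_t count, int mix_adj)
{
    size_t i;
    assert(mix_adj >= 0);
    if (mix_adj == DEMO_NORMAL_LEVEL)
        return;
    for (i = 0; i < count; ++i)
        mix_buf[i] = mix_buf[i] * mix_adj / DEMO_NORMAL_LEVEL;
}
```

With mix_adj at 200, a sample of 128 becomes 200; with mix_adj at 64, it becomes 64.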
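Note that mix_buf does not hold the port's own data; it holds the data of the ports it listens to.
Suppose there are three stream objects, streamA, streamB and streamC. If streamA listens to streamB and streamC, then streamA's transmitter_cnt is 2, and streamB and streamC each have a listener_cnt of 1. The conference mixes the data of streamB and streamC into streamA's mix_buf, and the result is finally sent out through streamA.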
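The bookkeeping in the streamA/streamB/streamC example can be sketched as follows. The struct and demo_connect are hypothetical simplifications for illustration; only the field names listener_cnt, listener_slots and transmitter_cnt are taken from the excerpt.

```c
#include <assert.h>

#define DEMO_MAX_PORTS 8

/* Hypothetical, trimmed-down conference port with only the routing fields. */
typedef struct demo_conf_port {
    unsigned listener_cnt;                    /* how many ports listen to us */
    unsigned listener_slots[DEMO_MAX_PORTS];  /* their slot numbers */
    unsigned transmitter_cnt;                 /* how many ports feed us */
} demo_conf_port;

/* Route audio from ports[src] to ports[sink]:
 * the source gains a listener, the sink gains a transmitter. */
static void demo_connect(demo_conf_port *ports, unsigned src, unsigned sink)
{
    assert(src < DEMO_MAX_PORTS && sink < DEMO_MAX_PORTS);
    assert(ports[src].listener_cnt < DEMO_MAX_PORTS);
    ports[src].listener_slots[ports[src].listener_cnt++] = sink;
    ports[sink].transmitter_cnt++;
}
```

Connecting streamB to streamA and streamC to streamA leaves streamA with a transmitter_cnt of 2 and gives streamB and streamC one listener each, matching the description above.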
- /* Get frames from all ports, and "mix" the signal
- * to mix_buf of all listeners of the port.
- */
- for (i=0, ci=0; i < conf->max_ports && ci < conf->port_cnt; ++i) {
- struct conf_port *conf_port = conf->ports[i];
- pj_int32_t level = 0;
- /* Skip empty port. */
- if (!conf_port)
- continue;
- /* Var "ci" is to count how many ports have been visited so far. */
- ++ci;
- /* Skip if we're not allowed to receive from this port. */
- if (conf_port->rx_setting == PJMEDIA_PORT_DISABLE) {
- conf_port->rx_level = 0;
- continue;
- }
- /* Also skip if this port doesn't have listeners. */
- if (conf_port->listener_cnt == 0) {
- conf_port->rx_level = 0;
- continue;
- }
- /* Get frame from this port.
- * For passive ports, get the frame from the delay_buf.
- * For other ports, get the frame from the port.
- */
- if (conf_port->delay_buf != NULL) {
- pj_status_t status;
- status = pjmedia_delay_buf_get(conf_port->delay_buf,
- (pj_int16_t*)frame->buf);
- if (status != PJ_SUCCESS) {
- conf_port->rx_level = 0;
- continue;
- }
- } else {
- pj_status_t status;
- pjmedia_frame_type frame_type;
- status = read_port(conf, conf_port, (pj_int16_t*)frame->buf,
- conf->samples_per_frame, &frame_type);
- if (status != PJ_SUCCESS) {
- /* bennylp: why do we need this????
- * Also see comments on similar issue with write_port().
- PJ_LOG(4,(THIS_FILE, "Port %.*s get_frame() returned %d. "
- "Port is now disabled",
- (int)conf_port->name.slen,
- conf_port->name.ptr,
- status));
- conf_port->rx_setting = PJMEDIA_PORT_DISABLE;
- */
- conf_port->rx_level = 0;
- continue;
- }
- /* Check that the port is not removed when we call get_frame() */
- if (conf->ports[i] == NULL) {
- conf_port->rx_level = 0;
- continue;
- }
- /* Ignore if we didn't get any frame */
- if (frame_type != PJMEDIA_FRAME_TYPE_AUDIO) {
- conf_port->rx_level = 0;
- continue;
- }
- }
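This loop visits every port and checks whether any other port is listening to it. If listener_cnt is 0, it simply continues; otherwise it reads PCM data from the port.
There are two ways to read the PCM data. One is from delay_buf, which is exactly what the recorder callback mentioned in section 1 fills; that port is the special media port called master_port, at index 0. All other, ordinary ports are read through read_port, which calls the get_frame of the corresponding stream object.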
- p_in = (pj_int16_t*) frame->buf;
- /* Adjust the RX level from this port
- * and calculate the average level at the same time.
- */
- if (conf_port->rx_adj_level != NORMAL_LEVEL) {
- for (j=0; j<conf->samples_per_frame; ++j) {
- /* For the level adjustment, we need to store the sample to
- * a temporary 32bit integer value to avoid overflowing the
- * 16bit sample storage.
- */
- pj_int32_t itemp;
- itemp = p_in[j];
- /*itemp = itemp * adj / NORMAL_LEVEL;*/
- /* bad code (signed/unsigned badness):
- * itemp = (itemp * conf_port->rx_adj_level) >> 7;
- */
- itemp *= conf_port->rx_adj_level;
- itemp >>= 7;
- /* Clip the signal if it's too loud */
- if (itemp > MAX_LEVEL) itemp = MAX_LEVEL;
- else if (itemp < MIN_LEVEL) itemp = MIN_LEVEL;
- p_in[j] = (pj_int16_t) itemp;
- level += (p_in[j]>=0? p_in[j] : -p_in[j]);
- }
- } else {
- for (j=0; j<conf->samples_per_frame; ++j) {
- level += (p_in[j]>=0? p_in[j] : -p_in[j]);
- }
- }
- level /= conf->samples_per_frame;
- /* Convert level to 8bit complement ulaw */
- level = pjmedia_linear2ulaw(level) ^ 0xff;
- /* Put this level to port's last RX level. */
- conf_port->rx_level = level;
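The code above adjusts every sample according to the configured rx_adj_level. It accumulates the absolute values of the adjusted samples and divides by the number of samples to get the average level. That level is then converted to an 8-bit u-law value and stored in rx_level.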
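The adjust, clip and average steps can be condensed into one small function. This is a sketch under assumptions: the names are made up, the limits assume signed 16-bit PCM, it divides by 128 where the excerpt shifts right by 7 (the two can differ by one for negative samples), and the final u-law conversion is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NORMAL_LEVEL 128
#define DEMO_MAX_LEVEL    32767      /* assumed 16-bit PCM upper bound */
#define DEMO_MIN_LEVEL    (-32768)   /* assumed 16-bit PCM lower bound */

/* Scale samples in place by adj_level / 128 with clipping, and
 * return the mean absolute sample value of the adjusted frame. */
static int32_t demo_adjust_and_measure(int16_t *samples, size_t count,
                                       int adj_level)
{
    int32_t sum = 0;
    size_t i;
    assert(count > 0);
    for (i = 0; i < count; ++i) {
        /* Work in 32 bits so the multiplication cannot overflow int16. */
        int32_t v = samples[i];
        if (adj_level != DEMO_NORMAL_LEVEL) {
            v = v * adj_level / DEMO_NORMAL_LEVEL;
            if (v > DEMO_MAX_LEVEL)
                v = DEMO_MAX_LEVEL;
            else if (v < DEMO_MIN_LEVEL)
                v = DEMO_MIN_LEVEL;
            samples[i] = (int16_t)v;
        }
        sum += (v >= 0) ? v : -v;
    }
    return sum / (int32_t)count;
}
```

Doubling the gain (adj_level of 256) on a frame that already contains samples near full scale shows the clipping: anything beyond the 16-bit range is pinned to the limit before the average is taken.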
- // Ticket #671: Skipping very low audio signal may cause noise
- // to be generated in the remote end by some hardphones.
- /* Skip processing frame if level is zero */
- //if (level == 0)
- // continue;
- /* Add the signal to all listeners. */
- for (cj=0; cj < conf_port->listener_cnt; ++cj)
- {
- struct conf_port *listener;
- pj_int32_t *mix_buf;
- listener = conf->ports[conf_port->listener_slots[cj]];
- /* Skip if this listener doesn't want to receive audio */
- if (listener->tx_setting != PJMEDIA_PORT_ENABLE)
- continue;
- mix_buf = listener->mix_buf;
- if (listener->transmitter_cnt > 1) {
- /* Mixing signals,
- * and calculate appropriate level adjustment if there is
- * any overflowed level in the mixed signal.
- */
- unsigned k, samples_per_frame = conf->samples_per_frame;
- pj_int32_t mix_buf_min = 0;
- pj_int32_t mix_buf_max = 0;
- for (k = 0; k < samples_per_frame; ++k) {
- mix_buf[k] += p_in[k];
- if (mix_buf[k] < mix_buf_min)
- mix_buf_min = mix_buf[k];
- if (mix_buf[k] > mix_buf_max)
- mix_buf_max = mix_buf[k];
- }
- /* Check if normalization adjustment needed. */
- if (mix_buf_min < MIN_LEVEL || mix_buf_max > MAX_LEVEL) {
- int tmp_adj;
- if (-mix_buf_min > mix_buf_max)
- mix_buf_max = -mix_buf_min;
- /* NORMAL_LEVEL * MAX_LEVEL / mix_buf_max; */
- tmp_adj = (MAX_LEVEL<<7) / mix_buf_max;
- if (tmp_adj < listener->mix_adj)
- listener->mix_adj = tmp_adj;
- }
- } else {
- /* Only 1 transmitter:
- * just copy the samples to the mix buffer
- * no mixing and level adjustment needed
- */
- unsigned k, samples_per_frame = conf->samples_per_frame;
- for (k = 0; k < samples_per_frame; ++k) {
- mix_buf[k] = p_in[k];
- }
- }
- } /* loop the listeners of conf port */
- } /* loop of all conf ports */
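The code above copies this port's PCM data into the mix_buf of each of its listener ports.
1. If the listener port listens to only one port, namely the current one, the PCM data is simply copied into mix_buf.
2. If the listener port listens to several ports, the current port's data is added to mix_buf, and the maximum (mix_buf_max) and minimum (mix_buf_min) of the accumulated values are tracked. When MAX(-mix_buf_min, mix_buf_max) exceeds MAX_LEVEL, tmp_adj is computed as MAX_LEVEL * 128 / mix_buf_max, where mix_buf_max has first been set to the larger of the two magnitudes. The listener's mix_adj is updated to tmp_adj only if tmp_adj is smaller.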
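The multi-transmitter case above can be sketched as a standalone function. The names are hypothetical and the limits assume signed 16-bit PCM; the logic follows the excerpt: accumulate into a 32-bit buffer, track the peak, and lower the adjustment when the sum leaves the 16-bit range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NORMAL_LEVEL 128
#define DEMO_MAX_LEVEL    32767      /* assumed 16-bit PCM upper bound */
#define DEMO_MIN_LEVEL    (-32768)   /* assumed 16-bit PCM lower bound */

/* Add one transmitter's frame into a listener's 32-bit mix buffer.
 * If the accumulated signal overflows 16 bits, lower *mix_adj so that
 * scaling by *mix_adj / 128 brings the peak back into range. */
static void demo_mix_into(int32_t *mix_buf, const int16_t *in,
                          size_t count, int *mix_adj)
{
    int32_t lo = 0, hi = 0;
    size_t i;
    assert(mix_adj != NULL);
    for (i = 0; i < count; ++i) {
        mix_buf[i] += in[i];
        if (mix_buf[i] < lo) lo = mix_buf[i];
        if (mix_buf[i] > hi) hi = mix_buf[i];
    }
    if (lo < DEMO_MIN_LEVEL || hi > DEMO_MAX_LEVEL) {
        int32_t peak = (-lo > hi) ? -lo : hi;
        int tmp_adj = (int)((DEMO_MAX_LEVEL * DEMO_NORMAL_LEVEL) / peak);
        /* Only ever lower the adjustment within one mixing pass. */
        if (tmp_adj < *mix_adj)
            *mix_adj = tmp_adj;
    }
}
```

For instance, if the buffer already holds 30000 and another 30000 is added, the peak is 60000, tmp_adj works out to 69, and scaling 60000 by 69/128 gives about 32343, which fits in 16 bits again.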
- /* Time for all ports to transmit whetever they have in their
- * buffer.
- */
- for (i=0, ci=0; i<conf->max_ports && ci<conf->port_cnt; ++i) {
- struct conf_port *conf_port = conf->ports[i];
- pjmedia_frame_type frm_type;
- pj_status_t status;
- if (!conf_port)
- continue;
- /* Var "ci" is to count how many ports have been visited. */
- ++ci;
- status = write_port( conf, conf_port, &frame->timestamp,
- &frm_type);
- if (status != PJ_SUCCESS) {
- /* bennylp: why do we need this????
- One thing for sure, put_frame()/write_port() may return
- non-successfull status on Win32 if there's temporary glitch
- on network interface, so disabling the port here does not
- sound like a good idea.
- PJ_LOG(4,(THIS_FILE, "Port %.*s put_frame() returned %d. "
- "Port is now disabled",
- (int)conf_port->name.slen,
- conf_port->name.ptr,
- status));
- conf_port->tx_setting = PJMEDIA_PORT_DISABLE;
- */
- continue;
- }
- /* Set the type of frame to be returned to sound playback
- * device.
- */
- if (i == 0)
- speaker_frame_type = frm_type;
- }
This loop visits every port and uses write_port to put_frame data into the streams. write_port() is analyzed later.
- /* Return sound playback frame. */
- if (conf->ports[0]->tx_level) {
- TRACE_((THIS_FILE, "write to audio, count=%d",
- conf->samples_per_frame));
- pjmedia_copy_samples( (pj_int16_t*)frame->buf,
- (const pj_int16_t*)conf->ports[0]->mix_buf,
- conf->samples_per_frame);
- } else {
- /* Force frame type NONE */
- speaker_frame_type = PJMEDIA_FRAME_TYPE_NONE;
- }
- /* MUST set frame type */
- frame->type = speaker_frame_type;
- pj_mutex_unlock(conf->mutex);
- #ifdef REC_FILE
- if (fhnd_rec == NULL)
- fhnd_rec = fopen(REC_FILE, "wb");
- if (fhnd_rec)
- fwrite(frame->buf, frame->size, 1, fhnd_rec);
- #endif
- return PJ_SUCCESS;
- }
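Finally the data is returned. As noted earlier, get_frame is called by the audio device, and the conference port at index 0 supplies data to the audio device, so the frame is copied directly from that port's mix_buf.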