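I have recently been researching and developing a C++ network layer that supports the ws and wss protocols, and unexpectedly ran into an infinite loop inside OpenSSL. The network layer uses boost::asio together with OpenSSL to support wss.
In normal use everything worked fine, but during recent stress testing an infinite loop would appear after the program had been running for a while. After some investigation it turned out that the loop was inside OpenSSL itself. The call stack of the spinning thread is as follows: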
19:40:03.441 d:\tddownload\code\openssl-1.1.0i_test\crypto\bio\bss_bio.c (325): bio_write
19:40:03.441 d:\tddownload\code\openssl-1.1.0i_test\crypto\bio\bio_lib.c (232): BIO_write
19:40:03.441 d:\tddownload\code\openssl-1.1.0i_test\ssl\record\rec_layer_s3.c (930): ssl3_write_pending
19:40:03.441 d:\tddownload\code\openssl-1.1.0i_test\ssl\record\rec_layer_s3.c (881): do_ssl3_write
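I then wrote some test code and found that the cause is actually an unsigned integer overflow (wraparound). The function in which it occurs is bio_write.
Its source is as follows: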
static int bio_write(BIO *bio, const char *buf, int num_)
{
    size_t num = num_;
    size_t rest;
    struct bio_bio_st *b;

    BIO_clear_retry_flags(bio);

    if (!bio->init || buf == NULL || num == 0)
        return 0;

    b = bio->ptr;
    assert(b != NULL);
    assert(b->peer != NULL);
    assert(b->buf != NULL);

    b->request = 0;
    if (b->closed) {
        /* we already closed */
        BIOerr(BIO_F_BIO_WRITE, BIO_R_BROKEN_PIPE);
        return -1;
    }

    assert(b->len <= b->size);

    if (b->len == b->size) {
        BIO_set_retry_write(bio); /* buffer is full */
        return -1;
    }

    /* we can write */
    if (num > b->size - b->len)
        num = b->size - b->len;

    /* now write "num" bytes */

    rest = num;

    assert(rest > 0);
    do { /* one or two iterations */
        size_t write_offset;
        size_t chunk;

        assert(b->len + rest <= b->size);

        write_offset = b->offset + b->len;
        if (write_offset >= b->size)
            write_offset -= b->size;
        /* b->buf[write_offset] is the first byte we can write to. */

        if (write_offset + rest <= b->size)
            chunk = rest;
        else
            /* wrap around ring buffer */
            chunk = b->size - write_offset;

        memcpy(b->buf + write_offset, buf, chunk);

        b->len += chunk;

        assert(b->len <= b->size);

        rest -= chunk;
        buf += chunk;
    }
    while (rest);

    return num;
}
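The infinite loop arises because chunk can end up greater than rest. Since rest is a size_t, rest -= chunk then wraps around instead of going negative, after which chunk stays 0 on every iteration, so the while (rest) condition never becomes false.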
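To make the failure mode concrete, here is a minimal sketch in C. It is not OpenSSL code and not the upstream fix: struct ring, sub_wraps and ring_write_checked are hypothetical names that only mirror the size, offset and len fields used by bio_write. The sketch assumes the ring-buffer state has somehow become inconsistent (for example, offset no longer below size) and shows one way a copy loop of this shape could refuse to spin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the ring-buffer fields read by bio_write. */
struct ring {
    char  *buf;
    size_t size;    /* capacity of buf */
    size_t offset;  /* index of the first readable byte */
    size_t len;     /* number of readable bytes */
};

/* size_t arithmetic wraps modulo SIZE_MAX + 1 instead of going negative. */
static size_t sub_wraps(size_t rest, size_t chunk)
{
    return rest - chunk;
}

/*
 * Same copy loop as above, with explicit runtime checks (an assumption of
 * this sketch). Returns the number of bytes written, 0 if nothing can be
 * written, or -1 if the state is inconsistent.
 */
static int ring_write_checked(struct ring *rb, const char *src, size_t num)
{
    size_t rest;

    assert(rb != NULL && src != NULL);

    if (rb->len > rb->size)
        return -1;                      /* invariant already broken */
    if (num == 0 || rb->len == rb->size)
        return 0;                       /* nothing to do, or buffer full */

    if (num > rb->size - rb->len)
        num = rb->size - rb->len;
    rest = num;

    do {
        size_t write_offset = rb->offset + rb->len;
        size_t chunk;

        if (write_offset >= rb->size)
            write_offset -= rb->size;
        if (write_offset >= rb->size)
            return -1;                  /* offset out of range: chunk would be 0 or wrap */

        if (write_offset + rest <= rb->size)
            chunk = rest;
        else
            chunk = rb->size - write_offset;

        /* chunk == 0 makes no progress; chunk > rest would wrap rest. */
        if (chunk == 0 || chunk > rest)
            return -1;

        memcpy(rb->buf + write_offset, src, chunk);
        rb->len += chunk;
        rest -= chunk;
        src += chunk;
    } while (rest);

    return (int)num;
}
```

With a consistent state (size 16, offset 12, len 2) a 6-byte write completes in two iterations. With a state such as size 16, offset 20, len 12, the unguarded loop would compute chunk as 0 on every pass and never leave the loop; the sketch returns -1 instead. Whether such a state can be reached without misuse of the library (for example, touching the same BIO pair from several threads at once) is not something this sketch settles.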
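I am new to OpenSSL and am not sure whether this is caused by the way I am using it, but the code is indeed not as robust as it could be.
The problem has been reported as an issue on OpenSSL's GitHub, and I hope the maintainers can fix it soon.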