QEMU live migration optimization 1 (compress and xbzrle)

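Copyright notice: this is an original article by the blogger, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.
Article link: https://blog.csdn.net/yadehuiyin/article/details/81010018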

QEMU itself offers a rich set of optimization options for live migration. They can be listed from the QEMU monitor:

(qemu) info migrate_capabilities
xbzrle: off
rdma-pin-all: off
auto-converge: off
zero-blocks: off
compress: off
events: off
postcopy-ram: off
x-colo: off
release-ram: off
block: off
return-path: off
pause-before-switchover: off
x-multifd: off
dirty-bitmaps: off
postcopy-blocktime: off
late-block-activate: off

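Some of these options are fairly well known, such as postcopy-ram and compress. When I searched for the others, some had no documentation at all, so the only way to understand what they do is to read the source code. Next I will try these options out, study how they work, and see how much each one improves migration.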


1. compress

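With the compress capability enabled, the source host compresses RAM data before sending it. In my tests the compression ratio varied widely, from 10% to 80%, which helps migration efficiency for virtual machines on links with insufficient bandwidth. However, compression costs extra CPU and takes time, so when bandwidth is plentiful, compressing before sending can actually make live migration take longer.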

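Compression and decompression can each run on multiple threads to speed them up. The defaults are 8 compress threads and 2 decompress threads. The algorithm is zlib, whose decompression is about 4 times faster than its compression, so when configuring threads it is recommended to use four times as many compression threads as decompression threads.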

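In addition, the balance between compression ratio and compression speed can be tuned. Level 0 means no compression, level 1 is the fastest but compresses the least, and level 9 compresses the most but is the slowest.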

The following script simulates memory usage inside the virtual machine:

[root@localhost ~]# cat make_cache_2.2G.sh
#!/bin/bash -x
mount -t ramfs ramfs z/
cp 800m_file z/1
cp 800m_file z/2
[root@localhost ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:           4.9G        180M        2.5G         11M        2.2G        4.0G
Swap:          4.0G          0B        4.0G

After the two 800 MB files are copied into ramfs, buff/cache has risen to 2.2G.


Configure the following in the QEMU monitor.

Source host

(qemu) migrate_set_capability compress on
(qemu) migrate_set_parameter compress-threads 1
(qemu) migrate_set_parameter compress-level 1

Destination host

(qemu) migrate_set_capability compress on
(qemu) migrate_set_parameter decompress-threads 2

Comparison of migration results

                        no compress    compress threads: 8
                                       decompress threads: 2
                                       compression level: 1
total time (msec)       33172          37697 (13.6%↑)
downtime (msec)         131            17 (87%↓)
transferred ram (kB)    3728602        3464038 (7.1%↓)
throughput (mbps)       921.04         752.96 (18.2%↓)
total ram (kB)          5374400        5374400
P.S. Green marks values that reduce migration efficiency; red marks values that improve it.


2. xbzrle

Xor Based Zero Run Length Encoding

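Put simply, this technique uses XOR to find the parts of a memory page that have changed (without xbzrle, the whole memory page is transferred), then compresses that delta and sends it to the destination host. This reduces the amount of data sent, so the number of dirty pages drops quickly to the point where the VM can be paused and the remainder migrated in one pass. It helps most with live migration of VMs running memory-write-intensive workloads. More precisely, without this option such a VM writes memory so frequently that the number of dirty pages stays at the same order of magnitude, so the migration keeps running and never reaches the stage where the VM is paused and the remaining dirty pages are transferred.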

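To find the changed parts, the previous contents of memory are kept in a cache on the source host for comparison. The cache holding these pages affects the cache hit rate, which I have not yet studied in depth.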

To run this test, I wrote a program that reads and writes memory.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>   /* strcmp, strcspn */

int main(void)
{
    long int buf_length = 4096;   /* bytes per block, one 4 KiB page */
    long int buf_num;             /* number of blocks to allocate */
    char c[100];
    int compare_result;

    printf("Enter memory you want to test: ");
    /* fgets replaces gets(), which is unsafe and was removed in C11 */
    if (fgets(c, sizeof(c), stdin) == NULL)
        return 1;
    c[strcspn(c, "\r\n")] = '\0';   /* strip the trailing newline */

    if (strcmp(c,"128m")==0)
        compare_result = 1;
    else if(strcmp(c,"256m")==0)
        compare_result = 2;
    else if(strcmp(c,"512m")==0)
        compare_result = 3;
    else if(strcmp(c,"1g")==0)
        compare_result = 4;
    else if(strcmp(c,"2g")==0)
        compare_result = 5;
    else
        compare_result = 0;

    switch(compare_result)
        {
        case 1:
                buf_num = 32768;
                printf("start generate 128M memory r/w load, ctrl+c to quit. \n");
                break;
        case 2:
                buf_num = 65536;
                printf("start generate 256M memory r/w load, ctrl+c to quit. \n");
                break;
        case 3:
                buf_num = 131072;
                printf("start generate 512M memory r/w load, ctrl+c to quit. \n");
                break;
        case 4:
                buf_num = 262144;
                printf("start generate 1G memory r/w load, ctrl+c to quit. \n");
                break;
        case 5:
                buf_num = 524288;
                printf("start generate 2G memory r/w load, ctrl+c to quit. \n");
                break;
        default:
                buf_num = 0;
                printf("please input following size 128m, 256m, 512m, 1g, 2g. ");
                break;
        }

    if(buf_num!=0){
        printf("use free -h to monitor ");
        char *buf = (char *) calloc(buf_num, buf_length);
        if (buf == NULL) {
                perror("calloc");
                return 1;
        }
        while (1) {
                long int i;
                /* touch one byte every 1 KiB so that every page is dirtied */
                for (i = 0; i < buf_num * 4 ; i++) {
                 buf[i * buf_length / 4 ]++;
                }
                printf(".");
                fflush(stdout);   /* show progress without waiting for a newline */
        }
    }
    else{
        printf("please input correct size. \n");
    }
    return 0;
}

Depending on the input, the program occupies between 128M and 2G of memory and keeps reading and writing it.

Configure the following in the QEMU monitor.

Source host

(qemu) migrate_set_capability xbzrle on
(qemu) migrate_set_cache_size 256m

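The default cache size is 64M. It is best to set the cache size larger than the memory being read and written, otherwise the xbzrle cache miss count will be high and the migration may never finish.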

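With xbzrle off, start the 128M memory load in the virtual machine and then begin the live migration. As the output shows, 11G of memory has already been transferred and the migration still has not completed.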

(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: off
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: off x-colo: off release-ram: off block: off return-path: off pause-before-switchover: off x-multifd: off dirty-bitmaps: off postcopy-blocktime: off late-block-activate: off
Migration status: active
total time: 99833 milliseconds
expected downtime: 4872 milliseconds
setup: 4 milliseconds
transferred ram: 11202151 kbytes
throughput: 891.65 mbps
remaining ram: 240188 kbytes
total ram: 5374400 kbytes
duplicate: 1258520 pages
skipped: 0 pages
normal: 2792316 pages
normal bytes: 11169264 kbytes
dirty sync count: 23
page size: 4 kbytes
multifd bytes: 0 kbytes
dirty pages rate: 31494 pages

With xbzrle on, the migration completes quickly:

(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: off
capabilities: xbzrle: on rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: off x-colo: off release-ram: off block: off return-path: off pause-before-switchover: off x-multifd: off dirty-bitmaps: off postcopy-blocktime: off late-block-activate: off
Migration status: completed
total time: 6405 milliseconds
downtime: 16 milliseconds
setup: 4 milliseconds
transferred ram: 601813 kbytes
throughput: 770.35 mbps
remaining ram: 0 kbytes
total ram: 5374400 kbytes
duplicate: 1246173 pages
skipped: 0 pages
normal: 146870 pages
normal bytes: 587480 kbytes
dirty sync count: 8
page size: 4 kbytes
multifd bytes: 0 kbytes
cache size: 268435456 bytes
xbzrle transferred: 2232 kbytes
xbzrle pages: 87614 pages
xbzrle cache miss: 46297
xbzrle cache miss rate: 15.33
xbzrle overflow : 0

info migrate now shows several additional xbzrle-related fields. Their names are self-explanatory, so they are not described here.

