StarRocks Source Code Reading Series (3): The Compaction Mechanism

Preface

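This article summarizes my reading of the StarRocks 2.3.3 source code. The code may differ considerably between versions, so treat it as a reference only.
The StarRocks BE is written in C++, and my C++ is only average, so parts of my interpretation may be wrong. Corrections are welcome.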

The compaction mechanism

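Before diving into the source code, here is a brief introduction to how compaction works in StarRocks.
To keep writes fast, StarRocks never writes new data directly into the existing data files. Each write instead puts the new data into a separate new file, which this article calls a single file (the configuration parameters call it a singleton delta).
If too many single files accumulate in the data directory, query performance drops sharply, so StarRocks has two compaction mechanisms for handling these newly written files.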

Cumulative compaction

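This is a lightweight compaction mechanism for the too-many-small-files problem. It does not merge the single files with the base file in the data directory, because the base file is so large that doing so could cause an I/O spike and hurt cluster performance. Instead, it merges several single files into one cumulative file.
By default, a cumulative compaction is triggered once 5 single files have been generated.
It is controlled by the following configuration parameters: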

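Parameter | Default | Notes
cumulative_compaction_check_interval_seconds | 1 | Check interval of the compaction thread, 1s by default
min_cumulative_compaction_num_singleton_deltas | 5 | Minimum number of singleton files needed to trigger a cumulative compaction
max_cumulative_compaction_num_singleton_deltas | 1000 | Maximum number of files merged in one compaction
cumulative_compaction_num_threads_per_disk | 1 | Number of cumulative compaction threads per disk
cumulative_compaction_skip_window_seconds | 30 | Time window during which the newest single files are skipped; freshly written data may be queried right away, so it is not compacted yet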
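
How these settings interact can be sketched roughly as follows. This is a simplified illustration and not StarRocks code: SingleFile, CumulativeConfig and pick_cumulative_candidates are hypothetical names, and examining the files from oldest to newest is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one newly written single file.
struct SingleFile {
    int64_t version;        // increasing load version
    int64_t creation_time;  // seconds since epoch
};

// Settings mirroring the defaults in the table above.
struct CumulativeConfig {
    std::size_t min_singleton_deltas = 5;
    std::size_t max_singleton_deltas = 1000;
    int64_t skip_window_seconds = 30;
};

// Returns the files one cumulative compaction would merge, or an empty
// vector when fewer than min_singleton_deltas files are eligible.
// `files` is assumed to be sorted from oldest to newest.
std::vector<SingleFile> pick_cumulative_candidates(const std::vector<SingleFile>& files, int64_t now,
                                                   const CumulativeConfig& cfg) {
    assert(cfg.min_singleton_deltas <= cfg.max_singleton_deltas);
    std::vector<SingleFile> picked;
    for (const SingleFile& f : files) {
        // Files inside the skip window are too fresh; stop at the first one.
        if (now - f.creation_time < cfg.skip_window_seconds) break;
        // Never merge more than the configured maximum in one round.
        if (picked.size() == cfg.max_singleton_deltas) break;
        picked.push_back(f);
    }
    // Too few eligible files: no compaction this round.
    if (picked.size() < cfg.min_singleton_deltas) picked.clear();
    return picked;
}
```

In this sketch, a tablet with seven files of which only four are older than the skip window yields no compaction, while five old-enough files are merged in one pass.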

Base compaction

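After several rounds of cumulative compaction, the fine-grained small-file problem is eased, but a new one appears: there are now too many cumulative files, which still hurts queries.
So once certain conditions are met, the system runs a base compaction, which merges all the cumulative files into a single file.
Base compaction is a heavy operation with a high disk I/O cost, so it generally does not run very often.
It is controlled by the following configuration parameters: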

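Parameter | Default | Notes
base_compaction_check_interval_seconds | 60 | Check interval of the compaction thread, 60s
min_base_compaction_num_singleton_deltas | 5 | Minimum number of single files, counted after cumulative compaction has merged them
max_base_compaction_num_singleton_deltas | 100 | Maximum number of segments merged in one base compaction
base_compaction_num_threads_per_disk | 1 | Number of base compaction threads per disk
base_cumulative_delta_ratio | 0.3 | Ratio of cumulative file size to base file size at which base compaction is triggered
base_compaction_interval_seconds_since_last_operation | 86400 | Time elapsed since the last base compaction; one of the conditions that trigger base compaction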
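
As a rough mental model, the trigger conditions in the table can be written as a single predicate. This is only a sketch: TabletStats, BaseConfig and should_run_base_compaction are hypothetical names, and treating the three conditions as alternatives (any one of them is enough) is my assumption, not something the table states.

```cpp
#include <cassert>
#include <cstdint>

// Thresholds mirroring the defaults in the table above.
struct BaseConfig {
    int64_t min_num_cumulative_files = 5;         // min_base_compaction_num_singleton_deltas
    double cumulative_delta_ratio = 0.3;          // base_cumulative_delta_ratio
    int64_t interval_since_last_seconds = 86400;  // base_compaction_interval_seconds_since_last_operation
};

// Hypothetical per-tablet statistics used by the check.
struct TabletStats {
    int64_t num_cumulative_files;
    int64_t cumulative_bytes;
    int64_t base_bytes;
    int64_t seconds_since_last_base_compaction;
};

// Returns true when any one of the three conditions holds (assumed OR semantics).
bool should_run_base_compaction(const TabletStats& s, const BaseConfig& cfg) {
    assert(s.base_bytes >= 0 && s.cumulative_bytes >= 0);
    // Condition 1: enough cumulative files have piled up.
    if (s.num_cumulative_files >= cfg.min_num_cumulative_files) return true;
    // Condition 2: cumulative data is large relative to the base file.
    if (s.base_bytes > 0 &&
        static_cast<double>(s.cumulative_bytes) / static_cast<double>(s.base_bytes) >= cfg.cumulative_delta_ratio) {
        return true;
    }
    // Condition 3: the last base compaction was long enough ago.
    return s.seconds_since_last_base_compaction >= cfg.interval_since_last_seconds;
}
```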

Reading the base compaction source code

Status StorageEngine::start_bg_threads() {
    _update_cache_expire_thread = std::thread([this] { _update_cache_expire_thread_callback(nullptr); });
    Thread::set_thread_name(_update_cache_expire_thread, "cache_expire");
    LOG(INFO) << "update cache expire thread started";

    _unused_rowset_monitor_thread = std::thread([this] { _unused_rowset_monitor_thread_callback(nullptr); });
    Thread::set_thread_name(_unused_rowset_monitor_thread, "rowset_monitor");
    LOG(INFO) << "unused rowset monitor thread started";

    // start thread for monitoring the snapshot and trash folder
    _garbage_sweeper_thread = std::thread([this] { _garbage_sweeper_thread_callback(nullptr); });
    Thread::set_thread_name(_garbage_sweeper_thread, "garbage_sweeper");
    LOG(INFO) << "garbage sweeper thread started";

    // start thread for monitoring the tablet with io error
    _disk_stat_monitor_thread = std::thread([this] { _disk_stat_monitor_thread_callback(nullptr); });
    Thread::set_thread_name(_disk_stat_monitor_thread, "disk_monitor");
    LOG(INFO) << "disk stat monitor thread started";

    // convert store map to vector
    std::vector<DataDir*> data_dirs;
    for (auto& tmp_store : _store_map) {
        data_dirs.push_back(tmp_store.second);
    }
    int32_t data_dir_num = data_dirs.size();

    if (!config::enable_event_based_compaction_framework) {
        // base and cumulative compaction threads
        int32_t base_compaction_num_threads_per_disk =
                std::max<int32_t>(1, config::base_compaction_num_threads_per_disk);
        int32_t cumulative_compaction_num_threads_per_disk =
                std::max<int32_t>(1, config::cumulative_compaction_num_threads_per_disk);
        int32_t base_compaction_num_threads = base_compaction_num_threads_per_disk * data_dir_num;
        int32_t cumulative_compaction_num_threads = cumulative_compaction_num_threads_per_disk * data_dir_num;

        // calc the max concurrency of compaction tasks
        int32_t max_compaction_concurrency = config::max_compaction_concurrency;
        if (max_compaction_concurrency < 0 ||
            max_compaction_concurrency > base_compaction_num_threads + cumulative_compaction_num_threads) {
            max_compaction_concurrency = base_compaction_num_threads + cumulative_compaction_num_threads;
        }
        vectorized::Compaction::init(max_compaction_concurrency);

        _base_compaction_threads.reserve(base_compaction_num_threads);
        for (uint32_t i = 0; i < base_compaction_num_threads; ++i) {
            _base_compaction_threads.emplace_back([this, data_dir_num, data_dirs, i] {
                _base_compaction_thread_callback(nullptr, data_dirs[i % data_dir_num]);
            });
            Thread::set_thread_name(_base_compaction_threads.back(), "base_compact");
        }
        LOG(INFO) << "base compaction threads started. number: " << base_compaction_num_threads;

        _cumulative_compaction_threads.reserve(cumulative_compaction_num_threads);
        for (uint32_t i = 0; i < cumulative_compaction_num_threads; ++i) {
            _cumulative_compaction_threads.emplace_back([this, data_dir_num, data_dirs, i] {
                _cumulative_compaction_thread_callback(nullptr, data_dirs[i % data_dir_num]);
            });
            Thread::set_thread_name(_cumulative_compaction_threads.back(), "cumulat_compact");
        }
        LOG(INFO) << "cumulative compaction threads started. number: " << cumulative_compaction_num_threads;
    } else {
        // new compaction framework

        // compaction_manager must init_max_task_num() before any comapction_scheduler starts
        _compaction_manager->init_max_task_num();
        _compaction_scheduler = std::thread([] {
            CompactionScheduler compaction_scheduler;
            compaction_scheduler.schedule();
        });
        Thread::set_thread_name(_compaction_scheduler, "compact_sched");
        LOG(INFO) << "compaction scheduler started";

        _compaction_checker_thread = std::thread([this] { compaction_check(); });
        Thread::set_thread_name(_compaction_checker_thread, "compact_check");
        LOG(INFO) << "compaction checker started";
    }

    int32_t update_compaction_num_threads_per_disk =
            std::max<int32_t>(1, config::update_compaction_num_threads_per_disk);
    int32_t update_compaction_num_threads = update_compaction_num_threads_per_disk * data_dir_num;
    _update_compaction_threads.reserve(update_compaction_num_threads);
    for (uint32_t i = 0; i < update_compaction_num_threads; ++i) {
        _update_compaction_threads.emplace_back([this, data_dir_num, data_dirs, i] {
            _update_compaction_thread_callback(nullptr, data_dirs[i % data_dir_num]);
        });
        Thread::set_thread_name(_update_compaction_threads.back(), "update_compact");
    }
    LOG(INFO) << "update compaction threads started. number: " << update_compaction_num_threads;

    // tablet checkpoint thread
    for (auto data_dir : data_dirs) {
        _tablet_checkpoint_threads.emplace_back([this, data_dir] { _tablet_checkpoint_callback((void*)data_dir); });
        Thread::set_thread_name(_tablet_checkpoint_threads.back(), "tablet_check_pt");
    }
    LOG(INFO) << "tablet checkpoint thread started";

    // fd cache clean thread
    _fd_cache_clean_thread = std::thread([this] { _fd_cache_clean_callback(nullptr); });
    Thread::set_thread_name(_fd_cache_clean_thread, "fd_cache_clean");
    LOG(INFO) << "fd cache clean thread started";

    // path scan and gc thread
    if (config::path_gc_check) {
        for (auto data_dir : get_stores()) {
            _path_scan_threads.emplace_back([this, data_dir] { _path_scan_thread_callback((void*)data_dir); });
            _path_gc_threads.emplace_back([this, data_dir] { _path_gc_thread_callback((void*)data_dir); });
            Thread::set_thread_name(_path_scan_threads.back(), "path_scan");
            Thread::set_thread_name(_path_gc_threads.back(), "path_gc");
        }
        LOG(INFO) << "path scan/gc threads started. number:" << get_stores().size();
    }

    LOG(INFO) << "all storage engine's background threads are started.";
    return Status::OK();
}

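Start with the function that creates the background threads when the BE starts.
Begin reading at the line if (!config::enable_event_based_compaction_framework) {
The first thing we see is a check on enable_event_based_compaction_framework, whose default value in the source code is false.
As the name suggests, StarRocks is probably moving to a new compaction framework, so we will read the source of the old and the new framework in turn.

The old compaction framework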

        // base and cumulative compaction threads
        int32_t base_compaction_num_threads_per_disk =
                std::max<int32_t>(1, config::base_compaction_num_threads_per_disk);
        int32_t cumulative_compaction_num_threads_per_disk =
                std::max<int32_t>(1, config::cumulative_compaction_num_threads_per_disk);
        int32_t base_compaction_num_threads = base_compaction_num_threads_per_disk * data_dir_num;
        int32_t cumulative_compaction_num_threads = cumulative_compaction_num_threads_per_disk * data_dir_num;

        // calc the max concurrency of compaction tasks
        int32_t max_compaction_concurrency = config::max_compaction_concurrency;
        if (max_compaction_concurrency < 0 ||
            max_compaction_concurrency > base_compaction_num_threads + cumulative_compaction_num_threads) {
            max_compaction_concurrency = base_compaction_num_threads + cumulative_compaction_num_threads;
        }

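These lines do little: they just read the configuration to get the number of base compaction and cumulative compaction threads,
then derive a maximum concurrency, max_compaction_concurrency, from the config value and the sum of those two thread counts.
vectorized::Compaction::init(max_compaction_concurrency);
This line stores the maximum number of concurrent compactions in a variable. Every time a new compaction task is about to run, it checks whether the current concurrency has already reached this value, and if so the task does not run yet.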
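
The clamp rule and the "do not run if the limit is reached" behaviour can be illustrated with a small sketch. The function effective_max_concurrency restates the if-condition from the excerpt. CompactionGate is a hypothetical class: how StarRocks enforces the limit inside vectorized::Compaction is not shown in this article, and an atomic counter is just one possible way to do it.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Clamp rule from the excerpt above: a negative or too-large setting
// falls back to the total number of compaction threads.
int32_t effective_max_concurrency(int32_t configured, int32_t base_threads, int32_t cumulative_threads) {
    const int32_t total = base_threads + cumulative_threads;
    if (configured < 0 || configured > total) return total;
    return configured;
}

// Hypothetical gate: a task may start only while a slot is free.
class CompactionGate {
public:
    explicit CompactionGate(int32_t max_concurrency) : _max(max_concurrency) {}

    // Tries to take one slot; returns false when the limit is already reached.
    bool try_acquire() {
        int32_t cur = _running.load();
        while (cur < _max) {
            // On failure, cur is reloaded and the limit is checked again.
            if (_running.compare_exchange_weak(cur, cur + 1)) return true;
        }
        return false;
    }

    // Gives the slot back once the task has finished.
    void release() {
        const int32_t prev = _running.fetch_sub(1);
        assert(prev > 0);
        (void)prev;
    }

private:
    const int32_t _max;
    std::atomic<int32_t> _running{0};
};
```

With one disk and one thread of each kind, a configured value of -1 resolves to 2, so a third concurrent task would be refused until one of the first two releases its slot.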

        for (uint32_t i = 0; i < base_compaction_num_threads; ++i) {
            _base_compaction_threads.emplace_back([this, data_dir_num, data_dirs, i] {
                _base_compaction_thread_callback(nullptr, data_dirs[i % data_dir_num]);
            });
            Thread::set_thread_name(_base_compaction_threads.back(), "base_compact");
        }
        LOG(INFO) << "base compaction threads started. number: " << base_compaction_num_threads;

        _cumulative_compaction_threads.reserve(cumulative_compaction_num_threads);
        for (uint32_t i = 0; i < cumulative_compaction_num_threads; ++i) {
            _cumulative_compaction_threads.emplace_back([this, data_dir_num, data_dirs, i] {
                _cumulative_compaction_thread_callback(nullptr, data_dirs[i % data_dir_num]);
            });
            Thread::set_thread_name(_cumulative_compaction_threads.back(), "cumulat_compact");
        }
        LOG(INFO) << "cumulative compaction threads started. number: " << cumulative_compaction_num_threads;

These loops create the actual compaction threads and assign each one the data directory it scans.
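
The expression data_dirs[i % data_dir_num] spreads the threads over the disks in round-robin order. The standalone sketch below shows the resulting mapping. assign_dirs is a hypothetical helper written for this article, and plain strings stand in for the DataDir pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Returns, for each thread index, the data directory that thread scans,
// using the same `i % data_dir_num` rule as the loops above.
std::vector<std::string> assign_dirs(const std::vector<std::string>& data_dirs, int32_t threads_per_disk) {
    assert(!data_dirs.empty() && threads_per_disk > 0);
    const std::size_t data_dir_num = data_dirs.size();
    const std::size_t num_threads = data_dir_num * static_cast<std::size_t>(threads_per_disk);
    std::vector<std::string> assignment;
    assignment.reserve(num_threads);
    for (std::size_t i = 0; i < num_threads; ++i) {
        // Thread i handles directory i modulo the number of directories.
        assignment.push_back(data_dirs[i % data_dir_num]);
    }
    return assignment;
}
```

With two directories and two threads per disk, threads 0 and 2 scan the first directory and threads 1 and 3 scan the second, so every disk gets exactly threads_per_disk threads.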

The new compaction framework

        // compaction_manager must init_max_task_num() before any comapction_scheduler starts
        _compaction_manager->init_max_task_num();
        _compaction_scheduler = std::thread([] {
            CompactionScheduler compaction_scheduler;
            compaction_scheduler.schedule();
        });
        Thread::set_thread_name(_compaction_scheduler, "compact_sched");
        LOG(INFO) << "compaction scheduler started";

        _compaction_checker_thread = std::thread([this] { compaction_check(); });
        Thread::set_thread_name(_compaction_checker_thread, "compact_check");
        LOG(INFO) << "compaction checker started";

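The new framework first calls init_max_task_num to initialize the maximum number of compaction tasks.
It does not create all the compaction threads up front. Instead, it creates one scheduler thread and one checker thread.
The scheduler thread periodically evaluates the compaction conditions. Whenever a new compaction task qualifies and the number of running compaction tasks is below the maximum, it starts a temporary thread to handle the compaction.
The benefit is that idle compaction threads are eliminated when compaction is infrequent. When compaction is very frequent, however, this design may reduce compaction throughput, because only one thread schedules compaction tasks.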
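
To make the task cap concrete, here is a small sketch. max_task_num reproduces only the branch of init_max_task_num() that handles non-negative per-disk settings, that is store_num * (cumulative + base). tasks_to_start is a hypothetical helper that models how many pending tasks one scheduling round could start. Neither function is taken from StarRocks.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Task cap when both per-disk settings are non-negative:
// store_num * (cumulative_per_disk + base_per_disk).
int32_t max_task_num(int32_t store_num, int32_t cumulative_per_disk, int32_t base_per_disk) {
    assert(store_num >= 0 && cumulative_per_disk >= 0 && base_per_disk >= 0);
    return store_num * (cumulative_per_disk + base_per_disk);
}

// Hypothetical helper: how many pending tasks one scheduling round may start
// without exceeding the cap.
int32_t tasks_to_start(int32_t pending, int32_t running, int32_t max_tasks) {
    const int32_t free_slots = std::max<int32_t>(0, max_tasks - running);
    return std::min(pending, free_slots);
}
```

For example, with 4 disks and the default of 1 thread per disk for each compaction type, the cap is 8. If 6 tasks are already running and 5 more qualify, only 2 of them start in that round.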

void CompactionManager::init_max_task_num() {
    if (config::base_compaction_num_threads_per_disk >= 0 && config::cumulative_compaction_num_threads_per_disk >= 0) {
        _max_task_num = static_cast<int32_t>(
                StorageEngine::instance()->get_store_num() *
                (config::cumulative_compaction_num_threads_per_disk + config::base_compaction_num_threads_per_disk));
    } else {
        // When cumulative_compaction_num_threads_per_disk or config::base_compaction_num_threads_per_disk is less than 0,
        // there is no limit to _max_task_num if max_compaction_concurrency is also less than 0, and here we set maximum value to be 20.
        _max_task_num = std::