OSD shard queue

Two members of the OSD class matter for this topic.

  • Member 1: ShardedThreadPool osd_op_tp

    • Initialization

      osd_op_tp(cct, "OSD::osd_op_tp", "tp_osd_tp", get_num_op_threads())
      

      This member is initialized in the OSD constructor. get_num_op_threads, implemented as follows, returns the number of worker threads in the pool.

      int OSD::get_num_op_threads()
      {
        if (cct->_conf->osd_op_num_threads_per_shard)
          return get_num_op_shards() * cct->_conf->osd_op_num_threads_per_shard;
        if (store_is_rotational)
          return get_num_op_shards() * cct->_conf->osd_op_num_threads_per_shard_hdd;
        else
          return get_num_op_shards() * cct->_conf->osd_op_num_threads_per_shard_ssd;
      }
      
    • Key functions

      void ShardedThreadPool::start()
      {
        ldout(cct, 10) << "start" << dendl;
      
        shardedpool_lock.lock();
        start_threads();
        shardedpool_lock.unlock();
        ldout(cct, 15) << "started" << dendl;
      }
      

      start_threads spawns num_threads worker threads.

      void ShardedThreadPool::shardedthreadpool_worker(uint32_t thread_index)
      {
                       ...
          wq->_process(thread_index, hb);
      }
      

      shardedthreadpool_worker is the thread entry function. It ultimately calls the processing function of the work queue, which here is void OSD::ShardedOpWQ::_process(uint32_t thread_index, heartbeat_handle_d *hb).
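
The thread-count rule in get_num_op_threads can be tried in isolation. The following is a minimal sketch, not Ceph code: OpThreadConf and num_op_threads are hypothetical names, and the three fields only stand in for the osd_op_num_threads_per_shard, osd_op_num_threads_per_shard_hdd and osd_op_num_threads_per_shard_ssd options read above.

```cpp
#include <cassert>

// Hypothetical stand-in for the three config options that
// OSD::get_num_op_threads() reads from cct->_conf.
struct OpThreadConf {
  int threads_per_shard;      // 0 means "not set, decide by media type"
  int threads_per_shard_hdd;  // default used for rotational stores
  int threads_per_shard_ssd;  // default used for non-rotational stores
};

// Same decision order as the function shown above: an explicit
// per-shard value wins, otherwise the HDD or SSD default is used.
// The result is always num_shards * threads-per-shard.
int num_op_threads(const OpThreadConf& conf, int num_shards, bool rotational) {
  if (conf.threads_per_shard)
    return num_shards * conf.threads_per_shard;
  if (rotational)
    return num_shards * conf.threads_per_shard_hdd;
  return num_shards * conf.threads_per_shard_ssd;
}
```

With the illustrative values {0, 1, 2}, five shards on a rotational store give five threads and eight shards on a non-rotational store give sixteen. These numbers are chosen for the example and are not claimed to be Ceph defaults.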

  • Member 2: ShardedOpWQ op_shardedwq, which is a nested class

    • Initialization:
      ①op_shardedwq(
        this,
        cct->_conf->osd_op_thread_timeout,
        cct->_conf->osd_op_thread_suicide_timeout,
        &osd_op_tp)
        
        ②ShardedWQ(time_t ti, time_t sti, ShardedThreadPool* tp): BaseShardedWQ(ti, sti), 
          sharded_pool(tp) {
          tp->set_wq(this);
        }
    

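    ① The fourth argument is member 1 described above, the ShardedThreadPool, which determines how many threads service this queue.

    ② Registers the work queue with the thread pool.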

    • Key functions
    void OSD::ShardedOpWQ::_enqueue(OpQueueItem&& item) {
      uint32_t shard_index =
        item.get_ordering_token().hash_to_shard(osd->shards.size());
    
      OSDShard* sdata = osd->shards[shard_index];
      assert (NULL != sdata);
      unsigned priority = item.get_priority();
      unsigned cost = item.get_cost();
      sdata->shard_lock.lock();
    
      dout(20) << __func__ << " " << item << dendl;
      if (priority >= osd->op_prio_cutoff)
        sdata->pqueue->enqueue_strict(
          item.get_owner(), priority, std::move(item));
      else
        sdata->pqueue->enqueue(
          item.get_owner(), priority, cost, std::move(item));
      sdata->shard_lock.unlock();
    
      std::lock_guard l{sdata->sdata_wait_lock};
      sdata->sdata_cond.notify_one();
    }
    

    An OSDShard can be thought of as one shard of the queue; each shard implements its own queue.
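
The enqueue path can be sketched without any Ceph types. This is a simplified illustration under stated assumptions: Item, Shard and ShardedQueue are hypothetical, std::hash stands in for hash_to_shard, and two plain deques stand in for the strict and weighted halves of pqueue. Cost and the condition-variable wakeup are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical queue item: the token plays the role of the ordering
// token (the PG), so all ops for one PG land in the same shard.
struct Item {
  std::string ordering_token;
  unsigned priority;
};

// One shard: its own lock and its own queues.
struct Shard {
  std::mutex lock;
  std::deque<Item> strict;    // priority >= cutoff
  std::deque<Item> weighted;  // everything else
};

class ShardedQueue {
 public:
  ShardedQueue(std::size_t num_shards, unsigned prio_cutoff)
      : shards_(num_shards), prio_cutoff_(prio_cutoff) {}

  // Stand-in for hash_to_shard(): same token, same shard.
  std::size_t shard_of(const std::string& token) const {
    return std::hash<std::string>{}(token) % shards_.size();
  }

  // Pick the shard, then choose the strict or weighted queue under
  // that shard's lock. Returns the shard index that was used.
  std::size_t enqueue(Item item) {
    const std::size_t idx = shard_of(item.ordering_token);
    Shard& s = shards_[idx];
    std::lock_guard<std::mutex> guard(s.lock);
    if (item.priority >= prio_cutoff_)
      s.strict.push_back(std::move(item));
    else
      s.weighted.push_back(std::move(item));
    return idx;
  }

  Shard& shard(std::size_t i) { return shards_[i]; }

 private:
  std::vector<Shard> shards_;
  unsigned prio_cutoff_;
};
```

Because the shard is derived only from the token, two ops for the same PG are always queued in the same shard, which is what lets a shard preserve per-PG ordering.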

    void OSD::ShardedOpWQ::_process(uint32_t thread_index, heartbeat_handle_d *hb) {
          uint32_t shard_index = thread_index % osd->num_shards;
          auto& sdata = osd->shards[shard_index];
              ...
          OpQueueItem item = sdata->pqueue->dequeue();
             ...
          auto r = sdata->pg_slots.emplace(token, nullptr);
          if (r.second) {
            r.first->second = make_unique<OSDShardPGSlot>();
          }
          OSDShardPGSlot *slot = r.first->second.get();
          pg->lock();
          auto qi = std::move(slot->to_process.front());
          qi.run(osd, sdata, pg, tp_handle);
    }
    

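    This function takes the PG's big lock. As shown above, it ultimately calls the run function of the enqueued op; for an ordinary PG op, that is the following function: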

    void PGOpItem::run(
      OSD *osd,
      OSDShard *sdata,
      PGRef& pg,
      ThreadPool::TPHandle &handle)
    {
      osd->dequeue_op(pg, op, handle);
      pg->unlock();
    }
    
    

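The PG lock is released as soon as dequeue_op returns.
Taking a write as an example, dequeue_op returns once the local transaction has been submitted to BlueStore, and only then is the PG lock released.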
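
The dequeue side can be sketched the same way. This is a single-threaded illustration, not the real _process: PG, Op, ShardQueue, shard_for_thread and process_one are hypothetical names, the shard lock and the pg_slots bookkeeping are omitted, and a std::function stands in for dequeue_op. It shows only two points from the text: a worker maps to a shard by thread_index % num_shards, and the PG lock taken by the worker is released inside run after the work returns.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical PG: a lock plus a counter so the effect of an op is visible.
struct PG {
  std::mutex lock;
  int applied = 0;
};

// Hypothetical op item. run() mirrors the shape of PGOpItem::run:
// do the work, then release the PG lock the caller acquired.
struct Op {
  PG* pg;
  std::function<void(PG&)> work;  // stands in for dequeue_op
  void run() {
    work(*pg);
    pg->lock.unlock();
  }
};

struct ShardQueue {
  std::deque<Op> ops;
};

// Worker threads are spread over shards by a simple modulo.
uint32_t shard_for_thread(uint32_t thread_index, uint32_t num_shards) {
  return thread_index % num_shards;
}

// Process at most one op from the shard owned by this thread index.
// Returns false if that shard's queue is empty.
bool process_one(uint32_t thread_index, std::vector<ShardQueue>& shards) {
  const uint32_t n = static_cast<uint32_t>(shards.size());
  ShardQueue& sdata = shards[shard_for_thread(thread_index, n)];
  if (sdata.ops.empty())
    return false;
  Op op = std::move(sdata.ops.front());
  sdata.ops.pop_front();
  op.pg->lock.lock();  // held for the whole duration of the op
  op.run();            // unlocks the PG before returning
  return true;
}
```

In this sketch, with two shards, thread index 3 serves shard 1, and after process_one returns the PG lock can be taken again because run has already released it.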
