C++ Programming: Simulating CPU-Intensive Work with C++ Multithreading and the POSIX Library

Author: 橘色的喵

Last modified: 2024-10-01 13:10:48

Categories: Performance Optimization and Feature Optimization; C++. Tags: c++, pthread, CPU-intensive multithreading, BusyWait, CPU usage

First published: 2024-08-25 21:24:58

Article link: https://blog.csdn.net/stallion5632/article/details/141535247

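0. Introduction

This article uses C++ and the POSIX threads library (pthread) to build a multithreaded program that simulates CPU usage under different loads.
The tool is used in the related article "Linux Programming: A Small Tool for Monitoring and Analyzing Per-Thread CPU Usage of C++ Programs".

Running the tool prints output like the following: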

$ sudo ./dummp_worker 2 0.3
Process Name: dummp_worker
Process ID: 23636
Number of Workers: 2
Occupation Rate: 0.30
Thread Name: worker_0, Thread ID: 23637, Priority: -2
Thread Name: worker_1, Thread ID: 23638, Priority: -3
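
The Priority column above is not the raw sched_priority value. The complete listing at the end of this article converts it with -(sched_priority + 1) so that it matches the PR column of top. A minimal sketch of that conversion follows; the helper name DisplayPriority is hypothetical, and the sample values assume the Linux minimum SCHED_FIFO priority of 1:

```cpp
#include <cassert>

// Hypothetical helper: convert a real-time sched_priority into the value
// this tool prints, using the formula from the complete listing
// (PR = -(sched_priority + 1)).
constexpr int DisplayPriority(int sched_priority) {
  return -(sched_priority + 1);
}

// Assuming the minimum SCHED_FIFO priority is 1 (as on Linux), worker_0 runs
// at priority 1 and worker_1 at priority 2.
static_assert(DisplayPriority(1) == -2, "worker_0 is displayed as -2");
static_assert(DisplayPriority(2) == -3, "worker_1 is displayed as -3");
```

This is why the two workers in the sample run appear with priorities -2 and -3.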

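1. Design Approach

The code creates a multithreaded worker pool in which each thread simulates CPU-intensive work at a specified occupation rate for as long as it runs. The core technical points of the implementation are:

Thread naming and management: each thread gets a unique name, which makes it easy to identify individual threads when debugging and monitoring.
CPU affinity: setting each thread's CPU affinity binds it to a specific core, which avoids migration between cores and keeps that core's caches warm.
Scheduling policy and priority: the threads use the real-time policy SCHED_FIFO, each with a different priority, for tighter control over execution order and response time.
Busy-waiting plus system calls: each thread combines a spin loop (busy-waiting) with system calls, so the simulated load consumes CPU time in both user mode and kernel mode.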
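
To check that naming and affinity actually took effect, both can be read back from the calling thread. The following is a minimal sketch, not part of the original tool: the helper names SetAndGetThreadName and PinToFirstAllowedCpu are hypothetical, and the sketch pins to the first CPU the thread is already allowed to use rather than to a CPU derived from the thread ID.

```cpp
#include <pthread.h>
#include <sched.h>

#include <cassert>
#include <string>

// Hypothetical helper: name the calling thread and read the name back.
// Linux limits thread names to 15 characters plus the terminating NUL.
// Returns an empty string on failure.
std::string SetAndGetThreadName(const std::string& name) {
  assert(name.size() <= 15);
  if (pthread_setname_np(pthread_self(), name.c_str()) != 0) return {};
  char buf[16] = {0};
  if (pthread_getname_np(pthread_self(), buf, sizeof(buf)) != 0) return {};
  return buf;
}

// Hypothetical helper: pin the calling thread to the first CPU in its
// current affinity mask, then return how many CPUs remain in the mask.
// Returns -1 on failure.
int PinToFirstAllowedCpu() {
  cpu_set_t allowed;
  CPU_ZERO(&allowed);
  if (pthread_getaffinity_np(pthread_self(), sizeof(allowed), &allowed) != 0) {
    return -1;
  }
  for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
    if (!CPU_ISSET(cpu, &allowed)) continue;
    cpu_set_t one;
    CPU_ZERO(&one);
    CPU_SET(cpu, &one);
    if (pthread_setaffinity_np(pthread_self(), sizeof(one), &one) != 0) {
      return -1;
    }
    cpu_set_t now;
    CPU_ZERO(&now);
    if (pthread_getaffinity_np(pthread_self(), sizeof(now), &now) != 0) {
      return -1;
    }
    return CPU_COUNT(&now);  // number of CPUs left in the mask after pinning
  }
  return -1;  // the mask was empty, which should not happen
}
```

Starting from the current mask matters inside containers or cpusets, where CPU 0 is not necessarily available to the process.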

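2. Implementation and Walkthrough

2.1 Busy-Wait Mechanism: the `BusyWait` Function

Busy-waiting is a common way to occupy the CPU. In this example, the BusyWait function implements a simple busy-wait loop.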

void BusyWait(std::size_t nanosec) {
  const auto t0 = std::chrono::high_resolution_clock::now();

  while (std::chrono::duration_cast<std::chrono::nanoseconds>(
           std::chrono::high_resolution_clock::now() - t0).count() < nanosec) {
    getpid();       // simple system call; switches into kernel mode
    sched_yield();  // yields the processor to other threads; another kernel round trip
  }
}

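Function breakdown:

The getpid() and sched_yield() system calls stand in for the thread's real workload.
- getpid(): a trivial system call, but it forces the thread into kernel mode and so adds to the kernel CPU time consumed.
- sched_yield(): asks the kernel scheduler to give the CPU to another thread, which further increases how often the kernel scheduler is involved.

This design keeps the thread's CPU occupancy high while, through the repeated sched_yield() calls, giving other runnable threads a chance to run instead of monopolizing the core for the whole busy-wait phase.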
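
One detail worth noting: the standard allows std::chrono::high_resolution_clock to be an alias of std::chrono::system_clock, which is not guaranteed to be monotonic, so a wall-clock adjustment could stretch or shorten the wait. A possible variant, sketched here under the assumption that only elapsed time matters, uses std::chrono::steady_clock and compares durations directly. The name BusyWaitSteady is hypothetical and is not part of the original tool.

```cpp
#include <sched.h>
#include <unistd.h>

#include <cassert>
#include <chrono>

// Sketch of a BusyWait variant on a monotonic clock. It takes a duration
// rather than a raw nanosecond count, so no signed/unsigned comparison of
// tick counts is needed.
void BusyWaitSteady(std::chrono::nanoseconds budget) {
  assert(budget.count() >= 0);
  const auto deadline = std::chrono::steady_clock::now() + budget;
  while (std::chrono::steady_clock::now() < deadline) {
    (void)getpid();       // cheap system call: one user/kernel round trip
    (void)sched_yield();  // lets another runnable thread use the core
  }
}
```

A call such as BusyWaitSteady(std::chrono::microseconds(300)) would spin for at least 300 microseconds of elapsed time.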

2.2 Core Work Function: `Work`

The Work function defines each thread's core behavior: thread naming, CPU affinity, scheduling policy and priority, and the work loop.

[[noreturn]] void Work(float percentage, int thread_id) {
  assert(percentage >= 0.0f && percentage <= 1.0f);
  constexpr float kPeriod = 1'000'000.0f;

  // Set the thread name
  const std::string thread_name = "worker_" + std::to_string(thread_id);
  (void)pthread_setname_np(pthread_self(), thread_name.c_str());

  // Set the CPU affinity
  cpu_set_t cpuset;
  CPU_ZERO(&cpuset);
  CPU_SET(static_cast<int>(thread_id % std::thread::hardware_concurrency()), &cpuset);
  (void)pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);

  // Set the scheduling policy and priority
  struct sched_param param;
  param.sched_priority = sched_get_priority_min(SCHED_FIFO) + thread_id;
  if (pthread_setschedparam(pthread_self(), SCHED_FIFO, &param) != 0) {
    std::cerr << "Failed to set thread scheduling policy and priority for thread " << thread_id << "\n";
  }

  while (true) {
    BusyWait(static_cast<std::size_t>(kPeriod * percentage));
    std::this_thread::sleep_for(std::chrono::nanoseconds(static_cast<std::size_t>(kPeriod * (1.0f - percentage))));
  }
}

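Key steps:

Thread naming: pthread_setname_np gives each thread a unique name (for example worker_0, worker_1) for easier debugging and monitoring.
CPU affinity: pthread_setaffinity_np binds the thread to one CPU core, chosen as thread_id modulo the number of hardware threads, which keeps the thread from migrating between cores and improves cache hit rates.
Scheduling policy and priority:
- The SCHED_FIFO policy is used: a runnable thread keeps the CPU until it blocks, yields, or is preempted by a higher-priority thread, and threads of equal priority run in first-in, first-out order.
- pthread_setschedparam sets the thread priority to the minimum SCHED_FIFO priority plus the thread ID, so that each worker has a different priority and response time. Switching to SCHED_FIFO normally requires elevated privileges, which is why the sample run uses sudo.
Work loop:
- In every period (kPeriod, 1,000,000 ns), the thread first busy-waits for the specified fraction of the period (simulating a CPU-intensive task), then sleeps to release the CPU.
- As a result, the thread occupies the CPU for roughly the requested share of each period and uses no CPU for the remainder.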
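
The split of one period into a busy phase and a sleep phase is plain arithmetic. As a rough sketch (the DutyCycle type and SplitPeriod helper are hypothetical, and unlike Work this version computes the idle part as the remainder so the two phases always add up to exactly one period):

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper type: the two phases of one period.
struct DutyCycle {
  std::chrono::nanoseconds busy;
  std::chrono::nanoseconds idle;
};

// Split one 1,000,000 ns period into busy and idle time for a given
// occupation rate. The idle phase is whatever is left of the period.
DutyCycle SplitPeriod(float percentage) {
  assert(percentage >= 0.0f && percentage <= 1.0f);
  constexpr std::int64_t kPeriodNs = 1'000'000;  // same length as kPeriod
  const auto busy_ns = static_cast<std::int64_t>(
      static_cast<double>(kPeriodNs) * percentage);
  return {std::chrono::nanoseconds(busy_ns),
          std::chrono::nanoseconds(kPeriodNs - busy_ns)};
}
```

With the occupation rate 0.3 from the sample run, each 1 ms period is about 0.3 ms of busy-waiting followed by about 0.7 ms of sleep. The measured CPU share can differ somewhat from the requested one, because sleep_for may return later than requested.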

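2.3 Main Function: `main`

The main function parses and validates the arguments, starts the worker threads, and then joins them. Because Work never returns, the join calls block until the process is terminated.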

int main(int argc, char* argv[]) {
  if (argc < 3) {
    std::cout << "Args: worker_num occupation_rate.\n";
    return 0;
  }
  const int num = std::stoi(argv[1]);
  const float percentage = std::stof(argv[2]);
  if (num < 1) {
    std::cout << "Error: num of workers less than 1.\n";
    return 0;
  }
  if (percentage < 0.0f || percentage > 1.0f) {
    std::cout << "Error: occupation rate should be between [0.0, 1.0].\n";
    return 0;
  }
  std::cout << "num of workers: " << num << "\n"
            << "occupation rate: " << percentage << "\n";

  // Create and start the worker threads
  std::vector<std::unique_ptr<std::thread>> threads;
  threads.reserve(num);
  for (int i = 0; i < num; ++i) {
    threads.push_back(std::make_unique<std::thread>(worker_app::Work, percentage, i));
  }

  // Wait for all threads (the workers never return, so this blocks)
  for (auto& td : threads) {
    if (td->joinable()) {
      td->join();
    }
  }

  return 0;
}
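
One check that main does not perform: each worker requests the priority sched_get_priority_min(SCHED_FIFO) + thread_id, so a large enough worker count asks for a priority above the allowed maximum (on Linux the SCHED_FIFO range is 1 to 99), and pthread_setschedparam would then fail for those threads. A minimal sketch of how the limit could be computed or the priority clamped is shown below; both helper names are hypothetical and not part of the original tool.

```cpp
#include <sched.h>

#include <cassert>

// Hypothetical helper: the largest worker count for which
// "minimum priority + thread_id" stays inside the SCHED_FIFO range.
// Returns -1 if the range cannot be queried.
int MaxFifoWorkers() {
  const int lo = sched_get_priority_min(SCHED_FIFO);
  const int hi = sched_get_priority_max(SCHED_FIFO);
  if (lo == -1 || hi == -1) return -1;
  return hi - lo + 1;
}

// Hypothetical helper: the priority for a worker, clamped to the maximum
// instead of running past it. Returns -1 if the range cannot be queried.
int ClampedFifoPriority(int thread_id) {
  assert(thread_id >= 0);
  const int lo = sched_get_priority_min(SCHED_FIFO);
  const int hi = sched_get_priority_max(SCHED_FIFO);
  if (lo == -1 || hi == -1) return -1;
  const long long wanted = static_cast<long long>(lo) + thread_id;
  return wanted > hi ? hi : static_cast<int>(wanted);
}
```

Either approach would fit next to the existing range checks on num and percentage: reject a worker count above MaxFifoWorkers(), or let the extra workers share the highest priority.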

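3. CPU Usage Pattern Analysis

User-mode CPU usage (User CPU):
- In the main loop of Work, the thread spends its CPU time inside BusyWait. The loop control and time comparisons run in user mode, spinning continuously in the way a typical CPU-intensive task would.
Kernel-mode CPU usage (Kernel CPU):
- The getpid() and sched_yield() system calls in BusyWait switch the thread from user mode to kernel mode, which adds kernel CPU load.
- sched_yield() in particular explicitly asks the kernel to reschedule, which leads to comparatively high kernel CPU usage.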
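
The user/kernel split described above can be measured rather than assumed. The sketch below is one way to do it, using getrusage with the Linux-specific RUSAGE_THREAD selector; the CpuTimes type and both function names are hypothetical, and the spin loop simply repeats the two system calls used by BusyWait.

```cpp
#include <sched.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <unistd.h>

#include <cassert>
#include <chrono>

// Hypothetical result type: CPU time consumed by the calling thread.
struct CpuTimes {
  std::chrono::microseconds user;
  std::chrono::microseconds kernel;
};

// Read the calling thread's accumulated user and kernel CPU time.
// RUSAGE_THREAD is Linux-specific and needs _GNU_SOURCE, which g++ defines
// by default.
CpuTimes ThreadCpuTimes() {
  struct rusage ru {};
  const int rc = getrusage(RUSAGE_THREAD, &ru);
  assert(rc == 0);
  (void)rc;
  auto to_us = [](const timeval& tv) {
    return std::chrono::seconds(tv.tv_sec) +
           std::chrono::microseconds(tv.tv_usec);
  };
  return {to_us(ru.ru_utime), to_us(ru.ru_stime)};
}

// Spin with the same two system calls as BusyWait and return the CPU time
// this thread spent in each mode while spinning.
CpuTimes MeasureSpin(std::chrono::milliseconds budget) {
  const CpuTimes before = ThreadCpuTimes();
  const auto deadline = std::chrono::steady_clock::now() + budget;
  while (std::chrono::steady_clock::now() < deadline) {
    (void)getpid();
    (void)sched_yield();
  }
  const CpuTimes after = ThreadCpuTimes();
  return {after.user - before.user, after.kernel - before.kernel};
}
```

Calling MeasureSpin(std::chrono::milliseconds(100)) and printing the two fields shows how the spin time divides between user and kernel mode on a given machine; the ratio depends on the kernel and hardware.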

4. Complete Code

//  g++ -o dummp_worker dummp_worker.cc -O2 -pthread
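#include <cassert>
#include <cerrno>  // For errno
#include <chrono>
#include <cstdio>  // For fprintf
#include <cstring> // For strerror
#include <pthread.h>
#include <sched.h>
#include <stdexcept> // For std::invalid_argument and std::out_of_range
#include <string>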
#include <sys/prctl.h>   // For prctl()
#include <sys/syscall.h> // For SYS_gettid
#include <system_error>  // For std::system_error
#include <thread>
#include <unistd.h> // For getpid() and syscall()
#include <vector>

namespace worker_app {

// Get the current thread's system thread ID
pid_t GetThreadID() { return static_cast<pid_t>(syscall(SYS_gettid)); }

// Get the priority of the thread
int GetThreadPriority(pthread_t thread) {
  int policy;
  struct sched_param param;
  if (pthread_getschedparam(thread, &policy, &param) != 0) {
    return -1; // Failed to get priority
  }
  return param.sched_priority;
}

// Busy-wait function, adding system calls to avoid excessive compiler
// optimizations
void BusyWait(std::size_t nanosec) {
  const auto t0 = std::chrono::high_resolution_clock::now();

  while (std::chrono::duration_cast<std::chrono::nanoseconds>(
             std::chrono::high_resolution_clock::now() - t0)
             .count() < nanosec) {
    // Simple system calls to ensure switching to kernel mode
    getpid();
    sched_yield();
  }
}

// Worker thread function
[[noreturn]] void Work(float percentage, int thread_id) {
  assert(percentage >= 0.0f && percentage <= 1.0f);
  constexpr float kPeriod = 1'000'000.0f;

  // Set thread name
  std::string thread_name = "worker_" + std::to_string(thread_id);
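  // pthread_setname_np returns an error number; it does not set errno
  const int name_rc = pthread_setname_np(pthread_self(), thread_name.c_str());
  if (name_rc != 0) {
    fprintf(stderr, "Failed to set thread name for thread %d: %s\n", thread_id,
            strerror(name_rc));
  }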

  // Set CPU affinity to ensure the thread runs on a specific CPU core
  cpu_set_t cpuset;
  CPU_ZERO(&cpuset);
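  // hardware_concurrency() may return 0 when the count is unknown; use 1 then
  const unsigned hw_threads = std::thread::hardware_concurrency();
  const int num_cpus = hw_threads > 0 ? static_cast<int>(hw_threads) : 1;
  CPU_SET(thread_id % num_cpus, &cpuset);
  // pthread_setaffinity_np returns an error number; it does not set errno
  const int affinity_rc =
      pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
  if (affinity_rc != 0) {
    fprintf(stderr, "Failed to set CPU affinity for thread %d: %s\n", thread_id,
            strerror(affinity_rc));
  }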

  // Set thread scheduling policy and priority
  struct sched_param param;
  int min_priority = sched_get_priority_min(SCHED_FIFO);
  if (min_priority == -1) {
    fprintf(stderr, "Failed to get minimum priority: %s\n", strerror(errno));
  }
  param.sched_priority = min_priority + thread_id;
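  // pthread_setschedparam returns an error number; it does not set errno
  const int sched_rc = pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
  if (sched_rc != 0) {
    fprintf(stderr, "Failed to set thread scheduling for thread %d: %s\n",
            thread_id, strerror(sched_rc));
  }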

  // Print thread information with adjusted priority
  pid_t tid = GetThreadID();
  int sched_priority = GetThreadPriority(pthread_self());
  if (sched_priority == -1) {
    fprintf(stderr, "Failed to get thread priority for thread %d.\n",
            thread_id);
  }

  // Adjust priority to match 'top' display (PR = -(sched_priority + 1))
  int display_priority = -(sched_priority + 1);
  fprintf(stdout, "Thread Name: %s, Thread ID: %d, Priority: %d\n",
          thread_name.c_str(), tid, display_priority);

  // Main loop for the thread
  while (true) {
    BusyWait(static_cast<std::size_t>(kPeriod * percentage));
    std::this_thread::sleep_for(std::chrono::nanoseconds(
        static_cast<std::size_t>(kPeriod * (1.0f - percentage))));
  }
}

} // namespace worker_app

// Get process name
std::string GetProcessName() {
  char name[256] = {0};
  if (prctl(PR_GET_NAME, name, 0, 0, 0) != 0) {
    return "Unknown";
  }
  return std::string(name);
}

int main(int argc, char *argv[]) {
  if (argc < 3) {
    fprintf(stdout, "Usage: %s <worker_num> <occupation_rate>\n", argv[0]);
    return 0;
  }

  int num = 0;
  float percentage = 0.0f;

  try {
    num = std::stoi(argv[1]);
    percentage = std::stof(argv[2]);
  } catch (const std::invalid_argument &e) {
    fprintf(stderr,
            "Invalid arguments. Please provide valid integers and floats.\n");
    return 1;
  } catch (const std::out_of_range &e) {
    fprintf(stderr, "Arguments out of range.\n");
    return 1;
  }

  if (num < 1) {
    fprintf(stderr, "Error: Number of workers must be at least 1.\n");
    return 1;
  }

  if (percentage < 0.0f || percentage > 1.0f) {
    fprintf(stderr, "Error: Occupation rate should be between [0.0, 1.0].\n");
    return 1;
  }

  // Print process information
  std::string process_name = GetProcessName();
  pid_t pid = getpid();
  fprintf(stdout,
          "Process Name: %s\nProcess ID: %d\nNumber of Workers: %d\nOccupation "
          "Rate: %.2f\n",
          process_name.c_str(), pid, num, percentage);

  // Create and start worker threads
  std::vector<std::thread> threads;
  threads.reserve(num);
  for (int i = 0; i < num; ++i) {
    try {
      threads.emplace_back(worker_app::Work, percentage, i);
    } catch (const std::system_error &e) {
      fprintf(stderr, "Failed to create thread %d: %s\n", i, e.what());
    }
  }

  // Main thread waits for all worker threads (since worker threads run
  // indefinitely, the main thread will block here)
  for (auto &th : threads) {
    if (th.joinable()) {
      th.join();
    }
  }

  return 0;
}