Running the C++17 STL Parallel Algorithms on Ubuntu 16.04 with GCC 9.1 and Intel TBB

Notice: this is an original article by davidhopper. Do not repost it without permission.

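The most attractive part of the C++17 standard is the STL parallel algorithms library. As auto_parallel.cpp below shows, adding a single execution policy argument, std::execution::par, to an existing STL algorithm call is enough to let it run in parallel: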

#include <algorithm>
#include <execution>
#include <iostream>
#include <random>
#include <vector>

bool Odd(int n) { return n % 2; }

int main() {
  std::vector<int> numbers(500);

  std::random_device rd;
  std::mt19937 gen(rd());
  std::uniform_int_distribution<int> dis(0, 100000);

  auto rand_num([&dis, &gen]() mutable { return dis(gen); });

  std::generate(std::execution::seq, std::begin(numbers), std::end(numbers),
                rand_num);
  std::sort(std::execution::par, std::begin(numbers), std::end(numbers));

  std::reverse(std::execution::par, std::begin(numbers), std::end(numbers));

  auto odds(std::count_if(std::execution::par, std::begin(numbers),
                          std::end(numbers), Odd));

  std::cout << "Reverse sorted numbers: " << std::endl;
  size_t i = 1;
  for (const auto& number : numbers) {
    std::cout << number << ", ";
    if (i % 10 == 0) {
      std::cout << std::endl;
    }
    ++i;
  }
  std::cout << std::endl;

  std::cout << (100.0 * odds / numbers.size()) << "% of the numbers are odd.\n";
  
  return 0;
}

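GCC is the most widely used C++ compiler on Ubuntu 16.04 and other Linux systems. For anyone who wants to use the C++17 STL parallel algorithms there, I have bad news and good news. The bad news is that GCC versions older than 9.1.0 (released in May 2019) do not support the C++17 STL parallel algorithms. The good news is that GCC 9.1.0 does support them, provided that Intel TBB 2018 or newer is installed, as shown in the figure below:
[Figure 1]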
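Before installing anything, it can help to see what the current toolchain reports. The sketch below is not from the original post: ToolchainSummary is a hypothetical helper that relies only on standard preprocessor facilities (__GNUC__, __has_include, and the feature-test macro __cpp_lib_parallel_algorithm). Whether the macro is defined depends on the standard library in use, so treat the output as a hint rather than a guarantee.

```cpp
#include <sstream>
#include <string>

// <version> collects the library feature-test macros on newer toolchains;
// include it only if it exists.
#if defined(__has_include)
#  if __has_include(<version>)
#    include <version>
#  endif
#endif

// Hypothetical helper (not part of the original post): describes the
// compiler, the parallel-algorithms feature macro, and TBB header presence.
std::string ToolchainSummary() {
  std::ostringstream out;

#if defined(__GNUC__) && !defined(__clang__)
  out << "GCC " << __GNUC__ << '.' << __GNUC_MINOR__ << '\n';
#else
  out << "Compiler: not GCC\n";
#endif

#if defined(__cpp_lib_parallel_algorithm)
  out << "__cpp_lib_parallel_algorithm = " << __cpp_lib_parallel_algorithm
      << '\n';
#else
  out << "__cpp_lib_parallel_algorithm is not defined\n";
#endif

#if defined(__has_include)
#  if __has_include(<tbb/tbb.h>)
  out << "TBB headers: found\n";
#  else
  out << "TBB headers: not found\n";
#  endif
#else
  out << "TBB headers: unknown\n";
#endif

  return out.str();
}
```

Printing ToolchainSummary() from main with std::cout, built with -std=c++17, shows at a glance whether the compiler is new enough and whether the TBB headers are on the include path.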

1. Installing GCC 9.1

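For how to install GCC 9.1, see my other post, "GCC 9.1 Installation on Ubuntu 16.04 and C++17 Test Examples". I wrote that post in March 2018, when the latest GCC release was 7.3. A few days ago I set out to update it for GCC 9.1 and found that, apart from changing the version number from 7.3.0 to 9.1.0, no other change was needed to install GCC 9.1 successfully.
Suppose that, after installing GCC 9.1, you ignore the requirement in the figure above to install Intel TBB 2018 or newer and compile auto_parallel.cpp directly: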

g++ -g -Wall -std=c++17 auto_parallel.cpp -o auto_parallel

The build then fails with the error shown in the figure below:
[Figure 2]
The message is clear: Intel TBB 2018 or newer must be installed.

2. Installing the Intel TBB library

Go to the Releases page of TBB on GitHub and find the latest release (2019_U8 as of August 4, 2019), then download, extract, and build it with the following commands:

# 1. Download the source code
wget https://github.com/intel/tbb/archive/2019_U8.tar.gz

# 2. Extract the source code
tar xzvf 2019_U8.tar.gz
rm 2019_U8.tar.gz

# 3. Build the Intel TBB library
cd tbb-2019_U8
# Note: the system default gcc must first be switched to GCC 9.1; see my other post:
# https://blog.csdn.net/davidhopper/article/details/79681695
make compiler=gcc stdver=c++17 tbb_build_prefix=my_tbb_build

TBB does not provide an install script, so I wrote one. It installs tbb-2019_U8 under /usr/local (change MY_LOCAL_DIR to install elsewhere):

#!/bin/bash
#

MY_LOCAL_DIR="/usr/local"
TBB_ROOT_DIR="${MY_LOCAL_DIR}/tbb-2019_U8"

if [ ! -d ${TBB_ROOT_DIR} ]; then  
  sudo mkdir ${TBB_ROOT_DIR}
else
  sudo rm -rf ${TBB_ROOT_DIR}/*
fi

sudo cp -r include ${TBB_ROOT_DIR}/include
sudo mkdir ${TBB_ROOT_DIR}/lib
sudo cp build/my_tbb_build_release/*so* ${TBB_ROOT_DIR}/lib

if [ -e ${MY_LOCAL_DIR}/include/tbb ]; then  
  sudo rm -f ${MY_LOCAL_DIR}/include/tbb
fi
if [ -e ${MY_LOCAL_DIR}/lib/libtbb.so ]; then  
  sudo rm -f ${MY_LOCAL_DIR}/lib/libtbb.so
fi
if [ -e ${MY_LOCAL_DIR}/lib/libtbbmalloc.so ]; then  
  sudo rm -f ${MY_LOCAL_DIR}/lib/libtbbmalloc.so
fi
if [ -e ${MY_LOCAL_DIR}/lib/libtbbmalloc_proxy.so ]; then  
  sudo rm -f ${MY_LOCAL_DIR}/lib/libtbbmalloc_proxy.so
fi

sudo ln -s ${TBB_ROOT_DIR}/include/tbb ${MY_LOCAL_DIR}/include/tbb
sudo ln -s ${TBB_ROOT_DIR}/lib/libtbb.so.2 ${MY_LOCAL_DIR}/lib/libtbb.so
sudo ln -s ${TBB_ROOT_DIR}/lib/libtbbmalloc.so.2 ${MY_LOCAL_DIR}/lib/libtbbmalloc.so
sudo ln -s ${TBB_ROOT_DIR}/lib/libtbbmalloc_proxy.so.2 ${MY_LOCAL_DIR}/lib/libtbbmalloc_proxy.so

if [ -z "${LD_LIBRARY_PATH}" ]; then
  echo "export LD_LIBRARY_PATH=${TBB_ROOT_DIR}/lib" >> ~/.bashrc  
else  
  tbb_result=$(echo ${LD_LIBRARY_PATH} | grep "${TBB_ROOT_DIR}/lib")
  if [ -z "${tbb_result}" ]; then    
    echo "export LD_LIBRARY_PATH=${TBB_ROOT_DIR}/lib:${LD_LIBRARY_PATH}" >> ~/.bashrc    
  fi
fi
source ~/.bashrc

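In the top-level directory of the TBB source tree, tbb-2019_U8, create a file named install_tbb_2019_U8.sh with a text editor such as vim or gedit, paste the script above into it, and run the following command to install: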

bash install_tbb_2019_U8.sh

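Note: the source ~/.bashrc line inside the script does not refresh the environment of the terminal. The reason is that bash install_tbb_2019_U8.sh runs the script in a child process, and changes made to a child's environment never propagate back to the parent shell. The approach described in the article "shell脚本无法使用source的原因及解决方法" (why a shell script cannot use source, and how to fix it) did not help either, so I run the command directly in the terminal:

source ~/.bashrc

so that the environment change takes effect.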
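To confirm from a program that the new directory really is on the library search path, a small check can mirror what the install script does with grep. This is only a sketch: PathListContains and LdLibraryPathContains are hypothetical helpers, not part of TBB or of the original script. Unlike grep, which matches substrings, the comparison here is per colon-separated entry.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>

// Hypothetical helper: returns true if `dir` is exactly one of the
// colon-separated entries of `path_list`.
bool PathListContains(const std::string& path_list, const std::string& dir) {
  std::istringstream entries(path_list);
  std::string entry;
  while (std::getline(entries, entry, ':')) {
    if (entry == dir) {
      return true;
    }
  }
  return false;
}

// Hypothetical helper: checks LD_LIBRARY_PATH as seen by the current process.
bool LdLibraryPathContains(const std::string& dir) {
  const char* value = std::getenv("LD_LIBRARY_PATH");
  return value != nullptr && PathListContains(value, dir);
}
```

A program started from a terminal where source ~/.bashrc has been run should get true from LdLibraryPathContains("/usr/local/tbb-2019_U8/lib"); one started from a terminal that was open before the install should get false.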

Update (February 14, 2021)

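TBB has been renamed oneTBB and can be downloaded from https://github.com/oneapi-src/oneTBB/releases; the latest release at the moment is oneTBB 2021.5.0. However, the latest oneTBB does not work with GCC 9.1 or 9.3 (GCC 9.3 is the default on Ubuntu 20.04 releases from 2021 onward). For details, see the issue "Can't run C++17 parallel algorithms with GCC on Linux".
The good news is that there is a simpler way to install TBB: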

sudo apt update
sudo apt install libtbb-dev

These commands install the TBB library under the system directory /usr.
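The TBB version that apt provides depends on the Ubuntu release, so it may be worth confirming which headers the compiler actually sees before relying on them. The sketch below assumes the version macros TBB_VERSION_MAJOR and TBB_VERSION_MINOR, which classic TBB defines in tbb/tbb_stddef.h and oneTBB in tbb/version.h; TbbHeaderVersion is a hypothetical helper name.

```cpp
#include <string>

// Include whichever TBB version header is installed, if any.
#if defined(__has_include)
#  if __has_include(<tbb/version.h>)       // oneTBB layout
#    include <tbb/version.h>
#  elif __has_include(<tbb/tbb_stddef.h>)  // classic TBB layout
#    include <tbb/tbb_stddef.h>
#  endif
#endif

// Hypothetical helper: reports the version of the TBB headers found at
// compile time, or a fixed message if no TBB headers are on the include path.
std::string TbbHeaderVersion() {
#if defined(TBB_VERSION_MAJOR) && defined(TBB_VERSION_MINOR)
  return std::to_string(TBB_VERSION_MAJOR) + "." +
         std::to_string(TBB_VERSION_MINOR);
#else
  return "TBB headers not found";
#endif
}
```

This reads only header macros, so it needs no -ltbb. It reports the headers found at compile time, not the shared library loaded at run time.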

oneTBB supports GCC 11.0 and is built with CMake. To build and install it from source:

# 1. Download the source code
wget https://github.com/oneapi-src/oneTBB/archive/refs/tags/v2021.5.0.tar.gz

# 2. Extract the source code
tar xzvf v2021.5.0.tar.gz
rm v2021.5.0.tar.gz

# 3. Build the oneTBB library
# The archive extracts to a directory named oneTBB-2021.5.0
cd oneTBB-2021.5.0
# Note: the system default gcc must be GCC 9.1 or newer (GCC 9.3 is the default on Ubuntu 20.04 releases from 2021 onward); see my other post:
# https://blog.csdn.net/davidhopper/article/details/79681695
mkdir build && cd build
# Configure: customize CMAKE_INSTALL_PREFIX and disable TBB_TEST to avoid building the tests
cmake -DCMAKE_INSTALL_PREFIX=/usr/local -DTBB_TEST=OFF ..
# Build
cmake --build .
# Install (writing to /usr/local normally requires root)
sudo cmake --install .
# oneTBB is now in /usr/local

Create a file named setup_tbb.sh with a text editor such as vim or gedit, with the following content:

MY_LOCAL_DIR="/usr/local"
# With CMAKE_INSTALL_PREFIX=/usr/local the libraries are installed in /usr/local/lib
TBB_ROOT_DIR="${MY_LOCAL_DIR}"
if [ -z "${LD_LIBRARY_PATH}" ]; then
  echo "export LD_LIBRARY_PATH=${TBB_ROOT_DIR}/lib" >> ~/.bashrc  
else  
  tbb_result=$(echo ${LD_LIBRARY_PATH} | grep "${TBB_ROOT_DIR}/lib")
  if [ -z "${tbb_result}" ]; then    
    echo "export LD_LIBRARY_PATH=${TBB_ROOT_DIR}/lib:${LD_LIBRARY_PATH}" >> ~/.bashrc    
  fi
fi
source ~/.bashrc

Run the following command to apply the path setting:

bash setup_tbb.sh

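Note: the source ~/.bashrc line inside the script does not refresh the environment of the terminal, because bash setup_tbb.sh runs the script in a child process whose environment changes never reach the parent shell. I therefore run the command directly in the terminal:

source ~/.bashrc

so that the environment variable change takes effect.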

3. Examples: building C++17 STL parallel algorithm programs with GCC 9.1

3.1 auto_parallel.cpp

The first example is, naturally, the auto_parallel.cpp program shown earlier. Here is the code again:

#include <algorithm>
#include <execution>
#include <iostream>
#include <random>
#include <vector>

bool Odd(int n) { return n % 2; }

int main() {
  std::vector<int> numbers(500);

  std::random_device rd;
  std::mt19937 gen(rd());
  std::uniform_int_distribution<int> dis(0, 100000);

  auto rand_num([&dis, &gen]() mutable { return dis(gen); });

  std::generate(std::execution::seq, std::begin(numbers), std::end(numbers),
                rand_num);
  std::sort(std::execution::par, std::begin(numbers), std::end(numbers));

  std::reverse(std::execution::par, std::begin(numbers), std::end(numbers));

  auto odds(std::count_if(std::execution::par, std::begin(numbers),
                          std::end(numbers), Odd));

  std::cout << "Reverse sorted numbers: " << std::endl;
  size_t i = 1;
  for (const auto& number : numbers) {
    std::cout << number << ", ";
    if (i % 10 == 0) {
      std::cout << std::endl;
    }
    ++i;
  }
  std::cout << std::endl;

  std::cout << (100.0 * odds / numbers.size()) << "% of the numbers are odd.\n";
  
  return 0;
}

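Build commands:

# -ltbb follows the source file so that the linker resolves the TBB symbols the program references
g++ -g -Wall -std=c++17 auto_parallel.cpp -o auto_parallel -L/usr/local/lib -ltbb
# If TBB was installed with sudo apt install libtbb-dev, the link directory does not need to be given explicitly
g++ -g -Wall -std=c++17 auto_parallel.cpp -o auto_parallel -ltbb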

Run the program:

./auto_parallel

Output:

Reverse sorted numbers: 
99991, 99956, 99370, 99011, 98993, 98981, 98836, 98576, 98464, 98411, 
97285, 96689, 96660, 96100, 95775, 95528, 94881, 94839, 94465, 93932, 
93867, 93640, 93387, 93375, 93306, 93302, 93103, 92928, 92872, 92859, 
92765, 92710, 92519, 92383, 92305, 92232, 92200, 92137, 92077, 92072, 
92001, 91957, 91705, 91522, 91489, 91437, 91315, 91229, 91197, 91179, 
91150, 91112, 91098, 91059, 90982, 90817, 90418, 90291, 90071, 89910, 
89855, 89602, 88569, 88497, 88282, 88273, 88108, 87806, 87787, 87620, 
87279, 86355, 86054, 85923, 85733, 85608, 85345, 85130, 84761, 84312, 
84197, 83980, 83852, 83755, 83335, 83151, 83139, 82632, 82220, 81650, 
81485, 81381, 80833, 80563, 80562, 80491, 80470, 80125, 80083, 80036, 
79975, 79939, 79927, 79914, 79798, 79699, 79682, 79612, 79400, 79296, 
78798, 78797, 78712, 78499, 78161, 78074, 78065, 77930, 77732, 77617, 
77427, 77366, 77302, 76982, 76818, 76781, 75824, 75807, 75610, 75443, 
75263, 75048, 74507, 74184, 74112, 73989, 73884, 73508, 73332, 72878, 
72798, 72711, 72639, 72570, 72565, 72316, 72316, 71951, 71942, 71916, 
71829, 70775, 70696, 70426, 70311, 70283, 70161, 70116, 69820, 69720, 
68564, 68360, 68059, 68001, 67812, 67711, 67474, 67410, 67308, 67114, 
67024, 67003, 66834, 66634, 66505, 65919, 65851, 65816, 65735, 65628, 
65400, 65196, 65121, 65065, 64987, 64472, 63859, 63843, 63753, 63741, 
63726, 63128, 62964, 62869, 62684, 62501, 62441, 62414, 62361, 62296, 
62210, 62188, 61771, 61509, 60815, 60789, 60601, 60510, 60393, 60226, 
59992, 59981, 59828, 59613, 59264, 59173, 58870, 57837, 57258, 56758, 
56740, 56270, 55979, 55892, 55859, 55805, 55617, 55546, 55274, 54997, 
54772, 54524, 54507, 54140, 53961, 53906, 53633, 52991, 52822, 52714, 
52693, 52522, 52314, 52141, 51854, 51812, 51799, 51387, 51249, 51229, 
51105, 51053, 51026, 50633, 50435, 50270, 49542, 49511, 49392, 49279, 
48932, 48788, 48658, 48564, 48434, 48401, 48383, 48328, 48278, 47334, 
47114, 46873, 46859, 46801, 46686, 46582, 46371, 46076, 45859, 45846, 
45701, 45419, 44464, 43838, 43693, 43653, 43470, 42733, 42552, 42397, 
42367, 42039, 41890, 41801, 41773, 41565, 41489, 41437, 41101, 40909, 
40330, 40167, 39949, 39833, 39798, 39268, 38811, 38786, 38522, 38305, 
38284, 38027, 37926, 37750, 37706, 37575, 37540, 37194, 37059, 36782, 
36633, 36376, 36270, 35649, 35551, 35476, 35343, 35104, 34856, 34607, 
34409, 34270, 34177, 34063, 33730, 33691, 33476, 33010, 32897, 32766, 
32388, 31735, 31733, 31504, 31308, 31223, 30920, 30844, 30822, 30300, 
30162, 30122, 29802, 29737, 29118, 28531, 28427, 28089, 27986, 27770, 
27737, 27715, 27704, 27651, 27510, 27462, 27298, 26792, 26663, 26581, 
26563, 26404, 26343, 26032, 25898, 25763, 25658, 25608, 25579, 25471, 
25083, 24818, 24722, 24680, 24621, 24396, 23675, 23270, 23004, 22565, 
22291, 22093, 21966, 21793, 21785, 21405, 21373, 21238, 21111, 20981, 
20957, 20887, 20852, 20800, 20683, 20382, 20246, 20023, 19488, 19418, 
19307, 19156, 19144, 19011, 18833, 18819, 17824, 17631, 17303, 17209, 
17179, 16434, 16379, 16291, 16081, 15966, 15877, 15791, 15754, 15693, 
15219, 14438, 14434, 14072, 13609, 13413, 12932, 12863, 12717, 12472, 
12238, 11919, 11598, 11427, 11204, 11119, 10149, 9763, 9628, 9489, 
9392, 9291, 9245, 9222, 8918, 8756, 8739, 8269, 8069, 7987, 
7960, 7839, 7771, 7667, 7589, 7467, 7427, 7425, 7276, 6582, 
6577, 6523, 6379, 6277, 6039, 5674, 5591, 5459, 5434, 5148, 
4910, 4668, 4104, 4056, 3792, 3621, 3284, 2560, 2080, 1832, 
1792, 1632, 1445, 1169, 959, 719, 640, 243, 150, 22, 

51.4% of the numbers are odd.
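The listing sorts in ascending order and then reverses the range. The same descending order can also be produced by a single sort with std::greater<>, and checked with std::is_sorted. The sketch below shows that alternative; SortDescending and IsDescending are hypothetical helper names, and the policy is a template parameter so the same code can be called with std::execution::seq or std::execution::par.

```cpp
#include <algorithm>
#include <execution>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper: sorts `v` from largest to smallest in one pass,
// using whichever execution policy the caller supplies.
template <typename Policy>
void SortDescending(Policy&& policy, std::vector<int>& v) {
  std::sort(std::forward<Policy>(policy), v.begin(), v.end(),
            std::greater<>());
}

// Hypothetical helper: true if `v` is in non-increasing order.
bool IsDescending(const std::vector<int>& v) {
  return std::is_sorted(v.begin(), v.end(), std::greater<>());
}
```

Calling SortDescending(std::execution::par, numbers) in place of the sort and reverse pair should leave numbers in the same order as above, and IsDescending(numbers) gives a cheap check that avoids inspecting 500 values by eye. With std::execution::par the program needs the same -ltbb link flag.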

Note 1: if running the program produces the following error, see section 4.1 of my post "GCC 9.1 Installation on Ubuntu 16.04 and C++17 Test Examples" for a fix:

./auto_parallel: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.11' not found (required by /usr/local/tbb-2019_U8/lib/libtbb.so.2)

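Note 2: if the TBB link directory is not explicitly set to our install location with -L/usr/local/lib, GCC uses the system default TBB library directory (/usr/lib/x86_64-linux-gnu/ on my machine), which causes the following link error: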

...
/usr/local/include/tbb/task_arena.h:157: undefined reference to `tbb::interface7::internal::isolate_within_arena(tbb::interface7::internal::delegate_base&, long)'
collect2: error: ld returned 1 exit status

If you really do not want to specify the TBB link directory explicitly, run the following commands:

cd /usr/lib/x86_64-linux-gnu
# Back up the following three files if they exist
sudo mv libtbb.so libtbb.so.bk
sudo mv libtbbmalloc_proxy.so libtbbmalloc_proxy.so.bk
sudo mv libtbbmalloc.so libtbbmalloc.so.bk
# Create the correct symbolic links
sudo ln -s /usr/local/tbb-2019_U8/lib/libtbb.so.2 libtbb.so
sudo ln -s /usr/local/tbb-2019_U8/lib/libtbbmalloc.so.2 libtbbmalloc.so
sudo ln -s /usr/local/tbb-2019_U8/lib/libtbbmalloc_proxy.so.2 libtbbmalloc_proxy.so
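One detail of auto_parallel.cpp is worth spelling out: std::generate is the only call that keeps std::execution::seq. Its lambda mutates a shared std::mt19937, which is not safe to call from several threads at once, and a callable passed with std::execution::par must not introduce data races. As a minimal sketch of shared state that stays correct under any policy, the hypothetical helper below counts odd values through a std::atomic counter. std::count_if, as used in the example, remains the simpler choice for this particular task.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <execution>
#include <utility>
#include <vector>

// Hypothetical helper: counts odd values with for_each and a shared counter.
// The counter is atomic, so concurrent increments under a parallel policy
// do not race.
template <typename Policy>
std::size_t CountOddsWithForEach(Policy&& policy, const std::vector<int>& v) {
  std::atomic<std::size_t> odds{0};
  std::for_each(std::forward<Policy>(policy), v.begin(), v.end(),
                [&odds](int n) {
                  if (n % 2 != 0) {
                    odds.fetch_add(1, std::memory_order_relaxed);
                  }
                });
  return odds.load();
}
```

CountOddsWithForEach(std::execution::par, numbers) should agree with the std::count_if result in the example. Replacing the atomic with a plain std::size_t would be a data race under std::execution::par, even though it works with std::execution::seq.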

3.2 parallel_sort_test.cpp

The second example is a sorting benchmark, parallel_sort_test.cpp:

#include <algorithm>
#include <chrono>
#include <execution>
#include <iostream>
#include <random>
#include <vector>

void PrintDuration(std::chrono::steady_clock::time_point start,
                   std::chrono::steady_clock::time_point end,
                   const char *message) {
  auto diff = end - start;
  std::cout << message << ' '
            << std::chrono::duration<double, std::milli>(diff).count()
            << " ms\n";
}

template <typename T>
void Test(const T &policy, const std::vector<double> &data, const int repeat,
          const char *message) {
  for (int i = 0; i < repeat; ++i) {
    std::vector<double> curr_data(data);

    const auto start = std::chrono::steady_clock::now();
    std::sort(policy, curr_data.begin(), curr_data.end());
    const auto end = std::chrono::steady_clock::now();
    PrintDuration(start, end, message);
  }
  std::cout << '\n';
}

int main() {
  // Test samples and repeat factor
  constexpr size_t samples{5'000'000};
  constexpr int repeat{10};

  // Fill a vector with samples numbers
  std::random_device rd;
  std::mt19937_64 mre(rd());
  std::uniform_real_distribution<double> urd(0.0, 1.0);

  std::vector<double> data(samples);
  for (auto &e : data) {
    e = urd(mre);
  }

  // Sort data using different execution policies
  std::cout << "std::execution::seq\n";
  Test(std::execution::seq, data, repeat, "Elapsed time");

  std::cout << "std::execution::par\n";
  Test(std::execution::par, data, repeat, "Elapsed time");

  return 0;
}

Build command:

g++ -g -Wall -std=c++17 parallel_sort_test.cpp -o parallel_sort_test -L/usr/local/lib -ltbb

Run the program:

./parallel_sort_test

Output:

std::execution::seq
Elapsed time 2553.66 ms
Elapsed time 2586.73 ms
Elapsed time 2619.7 ms
Elapsed time 2561.58 ms
Elapsed time 2555.84 ms
Elapsed time 2589.82 ms
Elapsed time 2572.41 ms
Elapsed time 2547.43 ms
Elapsed time 2550.76 ms
Elapsed time 2592.11 ms

std::execution::par
Elapsed time 1045.65 ms
Elapsed time 1064.65 ms
Elapsed time 1072.01 ms
Elapsed time 1069.37 ms
Elapsed time 1160.24 ms
Elapsed time 1275.98 ms
Elapsed time 1432.63 ms
Elapsed time 1306.78 ms
Elapsed time 1052.09 ms
Elapsed time 1038.25 ms

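The parallel version is a little more than twice as fast as the sequential one (about 1.15 s versus 2.57 s on average).
Now enable GCC's -O2 optimization and rebuild:

g++ -g -Wall -std=c++17 -O2 parallel_sort_test.cpp -o parallel_sort_test -L/usr/local/lib -ltbb

Run the program:

./parallel_sort_test

Output: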

std::execution::seq
Elapsed time 469.323 ms
Elapsed time 469.551 ms
Elapsed time 467.699 ms
Elapsed time 472.385 ms
Elapsed time 476.423 ms
Elapsed time 476.304 ms
Elapsed time 477.908 ms
Elapsed time 471.515 ms
Elapsed time 474.121 ms
Elapsed time 503.281 ms

std::execution::par
Elapsed time 195.322 ms
Elapsed time 191.604 ms
Elapsed time 189.077 ms
Elapsed time 185.162 ms
Elapsed time 183.868 ms
Elapsed time 184.853 ms
Elapsed time 222.734 ms
Elapsed time 188.112 ms
Elapsed time 189.43 ms
Elapsed time 198.443 ms

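With -O2, both versions run roughly 5 to 6 times faster than the unoptimized builds, and the parallel sort is still about 2.5 times as fast as the sequential one (about 193 ms versus 476 ms on average).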
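The comparison between the two policies comes down to a ratio of average run times. A small sketch of that arithmetic follows, with hypothetical helper names MeanMs and Speedup; it assumes each list holds at least one measurement in milliseconds.

```cpp
#include <numeric>
#include <vector>

// Arithmetic mean of a non-empty list of timings in milliseconds.
double MeanMs(const std::vector<double>& timings_ms) {
  return std::accumulate(timings_ms.begin(), timings_ms.end(), 0.0) /
         static_cast<double>(timings_ms.size());
}

// Speedup of the parallel runs over the sequential runs:
// mean sequential time divided by mean parallel time.
double Speedup(const std::vector<double>& seq_ms,
               const std::vector<double>& par_ms) {
  return MeanMs(seq_ms) / MeanMs(par_ms);
}
```

Fed the ten -O2 timings listed above for each policy, Speedup returns about 2.47. Given the occasional outlier in the measurements (for example 222.734 ms among the parallel runs), the median or minimum would be a reasonable alternative to the mean.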
