Notice: this article is an original work by davidhopper and may not be reproduced without permission.
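The most compelling feature of the C++17 standard is the parallel algorithms library in the STL. As the following code, auto_parallel.cpp, shows, simply passing an execution policy argument, std::execution::par, to an existing STL algorithm is enough to let it run in parallel: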
#include <algorithm>
#include <execution>
#include <iostream>
#include <random>
#include <vector>
bool Odd(int n) { return n % 2; }
int main() {
std::vector<int> numbers(500);
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_int_distribution<int> dis(0, 100000);
auto rand_num([&dis, &gen]() mutable { return dis(gen); });
std::generate(std::execution::seq, std::begin(numbers), std::end(numbers),
rand_num);
std::sort(std::execution::par, std::begin(numbers), std::end(numbers));
std::reverse(std::execution::par, std::begin(numbers), std::end(numbers));
auto odds(std::count_if(std::execution::par, std::begin(numbers),
std::end(numbers), Odd));
std::cout << "Reverse sorted numbers: " << std::endl;
size_t i = 1;
for (const auto& number : numbers) {
std::cout << number << ", ";
if (i % 10 == 0) {
std::cout << std::endl;
}
++i;
}
std::cout << std::endl;
std::cout << (100.0 * odds / numbers.size()) << "% of the numbers are odd.\n";
return 0;
}
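GCC is the most widely used C++ compiler on Ubuntu 16.04 and other Linux systems. For anyone who wants to use the C++17 STL parallel algorithms on these systems, there is bad news and good news. The bad news is that GCC versions earlier than 9.1.0 (released in May 2019) do not support the C++17 STL parallel algorithms. The good news is that GCC 9.1.0 does support them, but it requires Intel TBB 2018 or newer to be installed, as shown in the figure below: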
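1. Installing GCC 9.1
For how to install GCC 9.1, see my other blog post, "Installing the GCC 9.1 Compiler on Ubuntu 16.04 and Testing C++17 Support". I wrote that post in March 2018, when the latest GCC release was 7.3. A few days ago I set out to update it for GCC 9.1 and found that, apart from changing the version number from 7.3.0 to 9.1.0, nothing else had to change for GCC 9.1 to install successfully.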
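Once GCC 9.1 is installed, if you ignore the advice in the figure above that Intel TBB 2018 or newer is required and compile auto_parallel.cpp directly with the following command: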
g++ -g -Wall -std=c++17 auto_parallel.cpp -o auto_parallel
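you will get the error shown in the figure below:
The message is clear: Intel TBB 2018 or newer must be installed.
2. Installing the Intel TBB Library
Go to the Releases page of TBB on GitHub and find the latest release (2019_U8 as of August 4, 2019), then download, extract, and build it with the following commands: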
# 1. Download the source code
wget https://github.com/intel/tbb/archive/2019_U8.tar.gz
# 2. Extract the source code
tar xzvf 2019_U8.tar.gz
rm 2019_U8.tar.gz
# 3. Build the Intel TBB library
cd tbb-2019_U8
# Note: the system's default gcc must first be switched to GCC 9.1. See my other blog post for how:
# https://blog.csdn.net/davidhopper/article/details/79681695
make compiler=gcc stdver=c++17 tbb_build_prefix=my_tbb_build
TBB does not ship an install script, so I wrote one myself. It installs tbb-2019_U8 under /usr/local (change MY_LOCAL_DIR to install somewhere else). The script is as follows:
#!/bin/bash
#
MY_LOCAL_DIR="/usr/local"
TBB_ROOT_DIR="${MY_LOCAL_DIR}/tbb-2019_U8"
if [ ! -d ${TBB_ROOT_DIR} ]; then
sudo mkdir ${TBB_ROOT_DIR}
else
sudo rm -rf ${TBB_ROOT_DIR}/*
fi
sudo cp -r include ${TBB_ROOT_DIR}/include
sudo mkdir ${TBB_ROOT_DIR}/lib
sudo cp build/my_tbb_build_release/*so* ${TBB_ROOT_DIR}/lib
if [ -e ${MY_LOCAL_DIR}/include/tbb ]; then
sudo rm -f ${MY_LOCAL_DIR}/include/tbb
fi
if [ -e ${MY_LOCAL_DIR}/lib/libtbb.so ]; then
sudo rm -f ${MY_LOCAL_DIR}/lib/libtbb.so
fi
if [ -e ${MY_LOCAL_DIR}/lib/libtbbmalloc.so ]; then
sudo rm -f ${MY_LOCAL_DIR}/lib/libtbbmalloc.so
fi
if [ -e ${MY_LOCAL_DIR}/lib/libtbbmalloc_proxy.so ]; then
sudo rm -f ${MY_LOCAL_DIR}/lib/libtbbmalloc_proxy.so
fi
sudo ln -s ${TBB_ROOT_DIR}/include/tbb ${MY_LOCAL_DIR}/include/tbb
sudo ln -s ${TBB_ROOT_DIR}/lib/libtbb.so.2 ${MY_LOCAL_DIR}/lib/libtbb.so
sudo ln -s ${TBB_ROOT_DIR}/lib/libtbbmalloc.so.2 ${MY_LOCAL_DIR}/lib/libtbbmalloc.so
sudo ln -s ${TBB_ROOT_DIR}/lib/libtbbmalloc_proxy.so.2 ${MY_LOCAL_DIR}/lib/libtbbmalloc_proxy.so
if [ -z "${LD_LIBRARY_PATH}" ]; then
echo "export LD_LIBRARY_PATH=${TBB_ROOT_DIR}/lib" >> ~/.bashrc
else
tbb_result=$(echo ${LD_LIBRARY_PATH} | grep "${TBB_ROOT_DIR}/lib")
if [ -z "${tbb_result}" ]; then
echo "export LD_LIBRARY_PATH=${TBB_ROOT_DIR}/lib:${LD_LIBRARY_PATH}" >> ~/.bashrc
fi
fi
source ~/.bashrc
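In the top-level directory of the TBB source tree, tbb-2019_U8, use a text editor such as vim or gedit to create a file named install_tbb_2019_U8.sh, paste the script above into it, and then run the following command to install: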
bash install_tbb_2019_U8.sh
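Note: running source ~/.bashrc inside the script does not refresh the environment of the terminal that launched it. A script started with bash runs in a child process, and changes made in a child process are not visible to the parent shell. Following the approach in "Why source Does Not Work in Shell Scripts and How to Fix It" did not help either, so I simply run the command directly on the command line: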
source ~/.bashrc
so that the environment changes take effect.
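Update, February 14, 2021
TBB has been renamed oneTBB and can be downloaded from https://github.com/oneapi-src/oneTBB/releases. The latest version at the time of writing is oneTBB 2021.5.0. However, the latest oneTBB does not work with GCC 9.1 or 9.3 (GCC 9.3 is the default on Ubuntu 20.04 releases from 2021 onward). For details, see "Can’t run C++17 parallel algorithms with GCC on Linux".
The good news is that there is a simpler way to install TBB: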
sudo apt update
sudo apt install libtbb-dev
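These commands install the TBB library under the system directory /usr.
oneTBB supports GCC 11.0 and is built with CMake. The steps for building and installing it from source are: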
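# 1. Download the source code
wget https://github.com/oneapi-src/oneTBB/archive/refs/tags/v2021.5.0.tar.gz
# 2. Extract the source code
tar xzvf v2021.5.0.tar.gz
rm v2021.5.0.tar.gz
# 3. Build the TBB library
# The GitHub tag archive extracts into a directory named oneTBB-2021.5.0
cd oneTBB-2021.5.0
# Note: the system's default gcc must be GCC 9.1 or newer (GCC 9.3 is the default on Ubuntu 20.04 releases from 2021 onward). See my other blog post for how to switch:
# https://blog.csdn.net/davidhopper/article/details/79681695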
mkdir build && cd build
# Configure: customize CMAKE_INSTALL_PREFIX and disable TBB_TEST to avoid tests build
cmake -DCMAKE_INSTALL_PREFIX=/usr/local -DTBB_TEST=OFF ..
# Build
cmake --build .
# Install (writing to /usr/local normally requires root privileges)
sudo cmake --install .
# oneTBB is now installed under /usr/local
Use a text editor such as vim or gedit to create a file named setup_tbb.sh with the following contents:
MY_LOCAL_DIR="/usr/local"
# Same as CMAKE_INSTALL_PREFIX above, so the libraries are in ${MY_LOCAL_DIR}/lib
TBB_ROOT_DIR="${MY_LOCAL_DIR}"
if [ -z "${LD_LIBRARY_PATH}" ]; then
echo "export LD_LIBRARY_PATH=${TBB_ROOT_DIR}/lib" >> ~/.bashrc
else
tbb_result=$(echo ${LD_LIBRARY_PATH} | grep "${TBB_ROOT_DIR}/lib")
if [ -z "${tbb_result}" ]; then
echo "export LD_LIBRARY_PATH=${TBB_ROOT_DIR}/lib:${LD_LIBRARY_PATH}" >> ~/.bashrc
fi
fi
source ~/.bashrc
Run the following command to apply the path:
bash setup_tbb.sh
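Note: as before, source ~/.bashrc inside the script does not refresh the environment of the calling terminal, because the script runs in a child process. I therefore run it separately on the command line: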
source ~/.bashrc
so that the environment variable changes take effect.
3. Building C++17 STL Parallel Algorithm Examples with GCC 9.1
3.1 auto_parallel.cpp
The first example is, naturally, the auto_parallel.cpp shown earlier. The code is listed again here:
#include <algorithm>
#include <execution>
#include <iostream>
#include <random>
#include <vector>
bool Odd(int n) { return n % 2; }
int main() {
std::vector<int> numbers(500);
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_int_distribution<int> dis(0, 100000);
auto rand_num([&dis, &gen]() mutable { return dis(gen); });
std::generate(std::execution::seq, std::begin(numbers), std::end(numbers),
rand_num);
std::sort(std::execution::par, std::begin(numbers), std::end(numbers));
std::reverse(std::execution::par, std::begin(numbers), std::end(numbers));
auto odds(std::count_if(std::execution::par, std::begin(numbers),
std::end(numbers), Odd));
std::cout << "Reverse sorted numbers: " << std::endl;
size_t i = 1;
for (const auto& number : numbers) {
std::cout << number << ", ";
if (i % 10 == 0) {
std::cout << std::endl;
}
++i;
}
std::cout << std::endl;
std::cout << (100.0 * odds / numbers.size()) << "% of the numbers are odd.\n";
return 0;
}
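The build commands are:
# -ltbb is placed after the source file, because some linker configurations resolve libraries in command-line order
g++ -g -Wall -std=c++17 auto_parallel.cpp -o auto_parallel -L/usr/local/lib -ltbb
# If TBB was installed with sudo apt install libtbb-dev, the library directory does not need to be given explicitly
g++ -g -Wall -std=c++17 auto_parallel.cpp -o auto_parallel -ltbb
Run the program:
./auto_parallel
The output is: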
Reverse sorted numbers:
99991, 99956, 99370, 99011, 98993, 98981, 98836, 98576, 98464, 98411,
97285, 96689, 96660, 96100, 95775, 95528, 94881, 94839, 94465, 93932,
93867, 93640, 93387, 93375, 93306, 93302, 93103, 92928, 92872, 92859,
92765, 92710, 92519, 92383, 92305, 92232, 92200, 92137, 92077, 92072,
92001, 91957, 91705, 91522, 91489, 91437, 91315, 91229, 91197, 91179,
91150, 91112, 91098, 91059, 90982, 90817, 90418, 90291, 90071, 89910,
89855, 89602, 88569, 88497, 88282, 88273, 88108, 87806, 87787, 87620,
87279, 86355, 86054, 85923, 85733, 85608, 85345, 85130, 84761, 84312,
84197, 83980, 83852, 83755, 83335, 83151, 83139, 82632, 82220, 81650,
81485, 81381, 80833, 80563, 80562, 80491, 80470, 80125, 80083, 80036,
79975, 79939, 79927, 79914, 79798, 79699, 79682, 79612, 79400, 79296,
78798, 78797, 78712, 78499, 78161, 78074, 78065, 77930, 77732, 77617,
77427, 77366, 77302, 76982, 76818, 76781, 75824, 75807, 75610, 75443,
75263, 75048, 74507, 74184, 74112, 73989, 73884, 73508, 73332, 72878,
72798, 72711, 72639, 72570, 72565, 72316, 72316, 71951, 71942, 71916,
71829, 70775, 70696, 70426, 70311, 70283, 70161, 70116, 69820, 69720,
68564, 68360, 68059, 68001, 67812, 67711, 67474, 67410, 67308, 67114,
67024, 67003, 66834, 66634, 66505, 65919, 65851, 65816, 65735, 65628,
65400, 65196, 65121, 65065, 64987, 64472, 63859, 63843, 63753, 63741,
63726, 63128, 62964, 62869, 62684, 62501, 62441, 62414, 62361, 62296,
62210, 62188, 61771, 61509, 60815, 60789, 60601, 60510, 60393, 60226,
59992, 59981, 59828, 59613, 59264, 59173, 58870, 57837, 57258, 56758,
56740, 56270, 55979, 55892, 55859, 55805, 55617, 55546, 55274, 54997,
54772, 54524, 54507, 54140, 53961, 53906, 53633, 52991, 52822, 52714,
52693, 52522, 52314, 52141, 51854, 51812, 51799, 51387, 51249, 51229,
51105, 51053, 51026, 50633, 50435, 50270, 49542, 49511, 49392, 49279,
48932, 48788, 48658, 48564, 48434, 48401, 48383, 48328, 48278, 47334,
47114, 46873, 46859, 46801, 46686, 46582, 46371, 46076, 45859, 45846,
45701, 45419, 44464, 43838, 43693, 43653, 43470, 42733, 42552, 42397,
42367, 42039, 41890, 41801, 41773, 41565, 41489, 41437, 41101, 40909,
40330, 40167, 39949, 39833, 39798, 39268, 38811, 38786, 38522, 38305,
38284, 38027, 37926, 37750, 37706, 37575, 37540, 37194, 37059, 36782,
36633, 36376, 36270, 35649, 35551, 35476, 35343, 35104, 34856, 34607,
34409, 34270, 34177, 34063, 33730, 33691, 33476, 33010, 32897, 32766,
32388, 31735, 31733, 31504, 31308, 31223, 30920, 30844, 30822, 30300,
30162, 30122, 29802, 29737, 29118, 28531, 28427, 28089, 27986, 27770,
27737, 27715, 27704, 27651, 27510, 27462, 27298, 26792, 26663, 26581,
26563, 26404, 26343, 26032, 25898, 25763, 25658, 25608, 25579, 25471,
25083, 24818, 24722, 24680, 24621, 24396, 23675, 23270, 23004, 22565,
22291, 22093, 21966, 21793, 21785, 21405, 21373, 21238, 21111, 20981,
20957, 20887, 20852, 20800, 20683, 20382, 20246, 20023, 19488, 19418,
19307, 19156, 19144, 19011, 18833, 18819, 17824, 17631, 17303, 17209,
17179, 16434, 16379, 16291, 16081, 15966, 15877, 15791, 15754, 15693,
15219, 14438, 14434, 14072, 13609, 13413, 12932, 12863, 12717, 12472,
12238, 11919, 11598, 11427, 11204, 11119, 10149, 9763, 9628, 9489,
9392, 9291, 9245, 9222, 8918, 8756, 8739, 8269, 8069, 7987,
7960, 7839, 7771, 7667, 7589, 7467, 7427, 7425, 7276, 6582,
6577, 6523, 6379, 6277, 6039, 5674, 5591, 5459, 5434, 5148,
4910, 4668, 4104, 4056, 3792, 3621, 3284, 2560, 2080, 1832,
1792, 1632, 1445, 1169, 959, 719, 640, 243, 150, 22,
51.4% of the numbers are odd.
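Note 1: If running the program produces the following error, see Section 4.1 of my blog post "Installing the GCC 9.1 Compiler on Ubuntu 16.04 and Testing C++17 Support" for a fix: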
./auto_parallel: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.11' not found (required by /usr/local/tbb-2019_U8/lib/libtbb.so.2)
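Note 2: If you do not explicitly point the linker at the directory of the TBB library we installed, using -L/usr/local/lib, GCC uses the system default TBB library directory (/usr/lib/x86_64-linux-gnu/ on my machine), which leads to the following link error: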
...
/usr/local/include/tbb/task_arena.h:157: undefined reference to `tbb::interface7::internal::isolate_within_arena(tbb::interface7::internal::delegate_base&, long)'
collect2: error: ld returned 1 exit status
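If you really do not want to specify the TBB library directory explicitly, run the following commands:
cd /usr/lib/x86_64-linux-gnu
# Back up the following three files if they exist
sudo mv libtbb.so libtbb.so.bk
sudo mv libtbbmalloc_proxy.so libtbbmalloc_proxy.so.bk
sudo mv libtbbmalloc.so libtbbmalloc.so.bk
# Create the correct symbolic links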
sudo ln -s /usr/local/tbb-2019_U8/lib/libtbb.so.2 libtbb.so
sudo ln -s /usr/local/tbb-2019_U8/lib/libtbbmalloc.so.2 libtbbmalloc.so
sudo ln -s /usr/local/tbb-2019_U8/lib/libtbbmalloc_proxy.so.2 libtbbmalloc_proxy.so
3.2 parallel_sort_test.cpp
The second example is a sorting benchmark, parallel_sort_test.cpp. The code is as follows:
#include <algorithm>
#include <chrono>
#include <execution>
#include <iostream>
#include <random>
#include <vector>
void PrintDuration(std::chrono::steady_clock::time_point start,
std::chrono::steady_clock::time_point end,
const char *message) {
auto diff = end - start;
std::cout << message << ' '
<< std::chrono::duration<double, std::milli>(diff).count()
<< " ms\n";
}
template <typename T>
void Test(const T &policy, const std::vector<double> &data, const int repeat,
const char *message) {
for (int i = 0; i < repeat; ++i) {
std::vector<double> curr_data(data);
const auto start = std::chrono::steady_clock::now();
std::sort(policy, curr_data.begin(), curr_data.end());
const auto end = std::chrono::steady_clock::now();
PrintDuration(start, end, message);
}
std::cout << '\n';
}
int main() {
// Test samples and repeat factor
constexpr size_t samples{5'000'000};
constexpr int repeat{10};
// Fill a vector with samples numbers
std::random_device rd;
std::mt19937_64 mre(rd());
std::uniform_real_distribution<double> urd(0.0, 1.0);
std::vector<double> data(samples);
for (auto &e : data) {
e = urd(mre);
}
// Sort data using different execution policies
std::cout << "std::execution::seq\n";
Test(std::execution::seq, data, repeat, "Elapsed time");
std::cout << "std::execution::par\n";
Test(std::execution::par, data, repeat, "Elapsed time");
return 0;
}
The build command is:
g++ -g -Wall -std=c++17 parallel_sort_test.cpp -o parallel_sort_test -L/usr/local/lib -ltbb
Run the program:
./parallel_sort_test
The output is:
std::execution::seq
Elapsed time 2553.66 ms
Elapsed time 2586.73 ms
Elapsed time 2619.7 ms
Elapsed time 2561.58 ms
Elapsed time 2555.84 ms
Elapsed time 2589.82 ms
Elapsed time 2572.41 ms
Elapsed time 2547.43 ms
Elapsed time 2550.76 ms
Elapsed time 2592.11 ms
std::execution::par
Elapsed time 1045.65 ms
Elapsed time 1064.65 ms
Elapsed time 1072.01 ms
Elapsed time 1069.37 ms
Elapsed time 1160.24 ms
Elapsed time 1275.98 ms
Elapsed time 1432.63 ms
Elapsed time 1306.78 ms
Elapsed time 1052.09 ms
Elapsed time 1038.25 ms
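The parallel version is a little more than twice as fast, about 2.2 times on average.
Next, enable GCC's -O2 optimization option and build again:
g++ -g -Wall -std=c++17 -O2 parallel_sort_test.cpp -o parallel_sort_test -L/usr/local/lib -ltbb
Run the program:
./parallel_sort_test
The output is: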
std::execution::seq
Elapsed time 469.323 ms
Elapsed time 469.551 ms
Elapsed time 467.699 ms
Elapsed time 472.385 ms
Elapsed time 476.423 ms
Elapsed time 476.304 ms
Elapsed time 477.908 ms
Elapsed time 471.515 ms
Elapsed time 474.121 ms
Elapsed time 503.281 ms
std::execution::par
Elapsed time 195.322 ms
Elapsed time 191.604 ms
Elapsed time 189.077 ms
Elapsed time 185.162 ms
Elapsed time 183.868 ms
Elapsed time 184.853 ms
Elapsed time 222.734 ms
Elapsed time 188.112 ms
Elapsed time 189.43 ms
Elapsed time 198.443 ms
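With -O2, both versions run roughly 5 to 6 times faster than the unoptimized build, and the parallel sort is still about 2.5 times faster than the sequential one.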