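Introduction to Multithreading
A few concepts related to multithreading:
Process: a process is a program in execution. More precisely, once a program is loaded into memory and starts running it becomes a process: a running instance of a program that carries out some independent function.
Thread: a thread is an execution unit inside a process and is responsible for executing the program code of that process. Every process has at least one thread. A process may have several threads, in which case the application is called a multithreaded program.
In short: a running program has at least one process, and a process can contain multiple threads.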
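How programs run
Time-sharing scheduling
All threads take turns using the CPU, and each thread is given an equal share of CPU time.
Preemptive scheduling
Threads with higher priority get the CPU first. If several threads have the same priority, one of them is picked at random (thread randomness). Java uses preemptive scheduling.
In practice the CPU (central processing unit) switches between threads at high speed under preemptive scheduling. A single CPU core can execute only one thread at any given moment, but the switching is far faster than we can perceive, so the threads appear to run at the same time.
On a single core, multithreading does not make the individual computations any faster. What it improves is overall efficiency: the CPU is kept busy (for example while one thread is waiting), so its utilization is higher. On a multi-core machine, threads can additionally run truly in parallel on different cores.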
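C++11 introduced the std::thread class, which greatly reduces the complexity of using multiple threads. It requires the header <thread>.
In the full program further below, two child threads run in parallel. join() blocks the calling thread, so the main thread continues only after both child threads have finished. detach() can instead be used to separate a child thread from the main flow so that it runs independently and does not block the main thread.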
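The difference between join() and detach() can be sketched in isolation as follows. This is a minimal sketch, not part of the original program: the names work, run_joined and run_detached are hypothetical, and the state touched by the detached thread is held in a std::shared_ptr on the assumption that the thread may outlive the function that started it.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <memory>
#include <thread>

// Hypothetical worker: sleeps for `ms` milliseconds, then bumps a counter.
void work(std::atomic<int>& finished, int ms) {
  std::this_thread::sleep_for(std::chrono::milliseconds(ms));
  ++finished;
}

// Starts two workers and joins both. join() blocks the caller, so both
// workers have finished by the time the counter is read.
int run_joined() {
  std::atomic<int> finished{0};
  std::thread a(work, std::ref(finished), 20);
  std::thread b(work, std::ref(finished), 10);
  a.join();
  b.join();
  return finished.load();
}

// Starts a detached worker and returns immediately. The flag lives on the
// heap and is shared with the thread, so the thread never touches a
// destroyed local variable.
std::shared_ptr<std::atomic<bool>> run_detached() {
  auto done = std::make_shared<std::atomic<bool>>(false);
  std::thread t([done] {
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    done->store(true);
  });
  t.detach();  // t is no longer joinable; the thread runs on its own
  return done;
}
```

A caller of run_detached() has to poll or otherwise synchronize on the returned flag if it needs the result, because a detached thread cannot be joined later.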
Arguments for a thread function that takes parameters can be passed at the same time the function is bound to the thread.
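As a minimal sketch of argument passing (append_n and build_in_thread are hypothetical names used only for illustration): the arguments that follow the callable are copied into the new thread, so a parameter that must refer to the caller's object has to be wrapped in std::ref.

```cpp
#include <functional>
#include <string>
#include <thread>

// Hypothetical worker: appends `piece` to `out` n times.
void append_n(std::string piece, int n, std::string& out) {
  for (int i = 0; i < n; ++i) out += piece;
}

std::string build_in_thread() {
  std::string result;
  // "ab" and 3 are copied into the thread; std::ref passes result itself
  // rather than a copy of it.
  std::thread t(append_n, "ab", 3, std::ref(result));
  t.join();  // result is only read after the thread has finished
  return result;
}
```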
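When several threads operate on the same variable at the same time and the variable is not protected, the result may be wrong.
A mutual-exclusion object, std::mutex, can be used to keep the data consistent. Using the mutex class requires the header <mutex>.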
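A minimal sketch of protecting a shared counter, assuming a hypothetical helper count_with_mutex. std::lock_guard is used instead of explicit lock()/unlock() calls so that the mutex is released even if the guarded code throws.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Each of `threads` threads increments a shared counter `per_thread` times.
long count_with_mutex(int threads, int per_thread) {
  long counter = 0;
  std::mutex m;
  std::vector<std::thread> workers;
  for (int t = 0; t < threads; ++t) {
    workers.emplace_back([&counter, &m, per_thread] {
      for (int i = 0; i < per_thread; ++i) {
        std::lock_guard<std::mutex> lock(m);  // unlocked at end of scope
        ++counter;
      }
    });
  }
  for (auto& w : workers) w.join();
  return counter;
}
```

Without the lock_guard line the increments from different threads could interleave and the final value would typically be smaller than threads * per_thread.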
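OpenMP parallelism
When OpenMP encounters a parallel directive, the number of threads it creates is determined by:
1. the result of the if clause
2. the num_threads clause
3. the omp_set_num_threads() library function
4. the OMP_NUM_THREADS environment variable
5. the compiler's default (usually the total number of threads equals the number of processor cores)
Note that these five items are listed in decreasing order of priority.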
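The priority between items 2 and 3 can be checked with a small sketch. team_size is a hypothetical helper; the #ifdef guards are an assumption so that the code still builds when OpenMP is not enabled, in which case it simply reports one thread. The runtime is allowed to provide fewer threads than requested, so the result is an upper-bounded value rather than an exact one.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// Returns the team size of a parallel region that requests `requested`
// threads through the num_threads clause.
int team_size(int requested) {
  int size = 1;  // value reported when built without OpenMP
  (void)requested;
#ifdef _OPENMP
  // Lower-priority setting; note that it changes the default for later
  // parallel regions as well. The clause below overrides it.
  omp_set_num_threads(1);
  #pragma omp parallel num_threads(requested)
  {
    #pragma omp single
    size = omp_get_num_threads();
  }
#endif
  return size;
}
```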
main.cc:
#include <algorithm>
#include <chrono>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>
#include <omp.h>

std::mutex mutex;

void foo() {
  std::cout << "foo is started\n";
  std::this_thread::sleep_for(std::chrono::seconds(2));
  std::cout << "foo is done\n";
}

void bar() {
  std::cout << "bar is started\n";
  std::this_thread::sleep_for(std::chrono::seconds(1));
  std::cout << "bar is done\n";
}

void Process(size_t i, std::vector<size_t>& datas) {
  {
    // protect the shared vector; the lock is released at the end of this scope
    std::lock_guard<std::mutex> lock(mutex);
    datas.emplace_back(i);
  }
  std::this_thread::sleep_for(std::chrono::milliseconds(3000 + i));
  std::cout << i << ",\tthread id: " << std::hash<std::thread::id>{}(std::this_thread::get_id()) << " <--> " << omp_get_thread_num() << std::endl;
}

int main() {
  std::cout << "main thread id: " << std::hash<std::thread::id>{}(std::this_thread::get_id()) << " <--> " << omp_get_thread_num() << std::endl;
  // std::thread::id this_id = std::this_thread::get_id();
  // std::cout << "main thread id: 0x" << std::hex << this_id << std::endl;
  unsigned int max_thread = std::thread::hardware_concurrency();
  if (max_thread == 0) max_thread = 1;  // hardware_concurrency() may return 0 when unknown
  unsigned int loop_times = 20;
  unsigned int num_thread = std::min(max_thread, loop_times);
  std::cout << "use_thread: " << num_thread << std::endl;

  std::chrono::steady_clock::time_point t1 = std::chrono::steady_clock::now();
  std::vector<size_t> datas;
  // omp_set_num_threads(num_thread);
  // #pragma omp parallel for
  #pragma omp parallel for schedule(dynamic) num_threads(num_thread)
  for (size_t i = 0; i < loop_times; ++i) {
    Process(i, datas);
  }
  std::chrono::steady_clock::time_point t2 = std::chrono::steady_clock::now();
  std::chrono::duration<double> time_used = std::chrono::duration_cast<std::chrono::duration<double>>(t2 - t1);
  std::cout << "use time: " << time_used.count() << std::endl;
  for (const auto& data : datas) {
    std::cout << data << " ";
  }
  std::cout << std::endl;

  // two threads run in parallel; join() makes main wait for both
  std::thread parallel_thread_1(foo);
  std::thread parallel_thread_2(bar);
  parallel_thread_1.join();
  parallel_thread_2.join();
  std::cout << "done!\n";
  std::chrono::steady_clock::time_point t3 = std::chrono::steady_clock::now();
  time_used = std::chrono::duration_cast<std::chrono::duration<double>>(t3 - t2);
  std::cout << "use time: " << time_used.count() << std::endl;

  // detached thread: main does not wait for it, so the process may exit
  // before foo prints "foo is done"
  std::thread independent_thread(foo);
  independent_thread.detach();
  // std::this_thread::sleep_for(std::chrono::seconds(3));
  std::cout << "exit!\n";
  return 0;
}
CMakeLists.txt:
cmake_minimum_required(VERSION 2.8)
project(test)
FIND_PACKAGE(OpenMP REQUIRED)
if(OPENMP_FOUND)
  message("OPENMP FOUND")
  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS}")
  #set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${OpenMP_C_FLAGS}")
  #set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${OpenMP_EXE_LINKER_FLAGS}")
endif()
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11") # enable C++11
set(EXECUTABLE_OUTPUT_PATH ${PROJECT_SOURCE_DIR}/bin) # output directory for the executable
add_executable(test main.cc)
Output:
main thread id: 9106343501778466207 <--> 0
use_thread: 8
0, thread id: 13530764319357977496 <--> 6
1, thread id: 14911349865183214116 <--> 5
2, thread id: 93270682753825348 <--> 3
3, thread id: 17853822041977219895 <--> 4
4, thread id: 1602013600966058895 <--> 7
5, thread id: 9106343501778466207 <--> 0
6, thread id: 292010929987813385 <--> 2
7, thread id: 14396549945646696371 <--> 1
8, thread id: 13530764319357977496 <--> 6
9, thread id: 14911349865183214116 <--> 5
10, thread id: 93270682753825348 <--> 3
11, thread id: 17853822041977219895 <--> 4
12, thread id: 1602013600966058895 <--> 7
13, thread id: 9106343501778466207 <--> 0
14, thread id: 292010929987813385 <--> 2
15, thread id: 14396549945646696371 <--> 1
16, thread id: 13530764319357977496 <--> 6
17, thread id: 14911349865183214116 <--> 5
18, thread id: 93270682753825348 <--> 3
19, thread id: 17853822041977219895 <--> 4
use time: 9.05392
6 5 1 4 0 2 3 7 8 9 10 11 12 13 14 15 16 17 18 19
foo is started
bar is started
bar is done
foo is done
done!
use time: 2.00125
foo is started
exit!
#pragma omp critical // this directive lets at most one thread at a time execute the block that follows, i.e. a critical section
{ ++cnt; }
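For context, a minimal sketch of how such a critical section might sit inside a parallel loop. count_iterations is a hypothetical helper; when the code is built without OpenMP the pragmas are ignored and the loop runs serially with the same result.

```cpp
// Counts loop iterations from inside a parallel loop. The critical
// section serializes the increment so that no update is lost.
int count_iterations(int n) {
  int cnt = 0;
  #pragma omp parallel for
  for (int i = 0; i < n; ++i) {
    #pragma omp critical
    { ++cnt; }
  }
  return cnt;
}
```

For a plain counter like this, a reduction(+:cnt) clause on the loop is usually the cheaper choice, since a critical section forces the threads to take turns.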
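To run a class member function on a std::thread, pass an object of that class (or a pointer to one) as the argument that follows the member function pointer.
#include <thread>
#include <iostream>

class bar {
 public:
  void foo() {
    std::cout << "hello from member function" << std::endl;
  }
};

int main()
{
  std::thread t(&bar::foo, bar());
  t.join();
}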
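When the thread is created inside a member function of the class, simply pass this:
std::thread spawn() {
  return std::thread(&blub::test, this);
}
class A {
 public:
  void foo(int i);
};
A a;
// pass &a so that the thread works on a itself; passing a would copy the object
std::thread tt(&A::foo, &a, 5);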
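#include <functional>
#include <iostream>
#include <string>
#include <boost/function.hpp>
#include <boost/thread.hpp>

void func1(const std::string &str) {
  boost::this_thread::sleep(boost::posix_time::seconds(3));
  std::cout << "call func1()..." << str << std::endl;
}

int main() {
  // wrap the bound call in a boost::function
  boost::function<void()> f(std::bind(func1, "alan"));
  boost::thread t(f);
  // t.interrupt(); // request interruption of the thread
  // wait at most 2 s for the thread; on timeout this returns false and the
  // thread itself keeps running, it is not terminated
  t.timed_join(boost::posix_time::seconds(2));
  return 0;
}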
Parallel threads (each batch has to wait for its slowest thread, so time is wasted)
#include <algorithm>
#include <chrono>
#include <functional>
#include <iostream>
#include <thread>
#include <vector>

class Test {
 public:
  void Process(int &i) {
    std::cout << "process [" << i << "] in thread " << std::this_thread::get_id() << std::endl;
    std::this_thread::sleep_for(std::chrono::seconds(i));
    i = i * i; // square
  }
};

int main() {
  std::vector<int> datas;
  std::cout << "input data: ";
  for (int i = 0; i < 10; ++i) {
    datas.push_back(i);
    std::cout << datas.at(i) << " ";
  }
  std::cout << std::endl;

  unsigned int max_thread = std::thread::hardware_concurrency();
  std::cout << "max thread: " << max_thread << std::endl;
  size_t thread_num = std::min(datas.size(), static_cast<size_t>(max_thread));
  std::cout << "use thread: " << thread_num << std::endl;

  Test test;
  std::chrono::steady_clock::time_point t1 = std::chrono::steady_clock::now();
  // process the data in batches of thread_num threads
  for (size_t i = 0; i < datas.size();) {
    std::vector<std::thread> threads;
    for (size_t j = 0; i < datas.size() && j < thread_num; ++i, ++j) {
      std::thread thread(&Test::Process, &test, std::ref(datas.at(i)));
      threads.push_back(std::move(thread));
      // threads.emplace_back(std::thread(&Test::Process, &test, std::ref(datas.at(i))));
    }
    // the next batch starts only after every thread of this batch has finished
    for (auto &thread : threads) {
      thread.join();
    }
  }
  std::chrono::steady_clock::time_point t2 = std::chrono::steady_clock::now();
  std::chrono::duration<double> time_used =
      std::chrono::duration_cast<std::chrono::duration<double>>(t2 - t1);
  std::cout << "total use time: " << time_used.count() << std::endl;

  std::cout << "output data: ";
  for (int i = 0; i < 10; ++i) {
    std::cout << datas.at(i) << " ";
  }
  std::cout << std::endl;
  return 0;
}
Output:
input data: 0 1 2 3 4 5 6 7 8 9
max thread: 8
use thread: 8
process [0] in thread 140269431498496
process [1] in thread 140269423105792
process [2] in thread 140269414713088
process [4] in thread 140269397927680
process [3] in thread 140269406320384
process [6] in thread 140269381142272
process [7] in thread 140269372749568
process [5] in thread 140269389534976
process [8] in thread 140269372749568
process [9] in thread 140269381142272
total use time: 16.0009
output data: 0 1 4 9 16 25 36 49 64 81
Parallel threads (threads do not wait for each other, so no time is wasted)
#include <algorithm>
#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>
#include <vector>

class Test {
 public:
  void Process(std::vector<int> *datas) {
    for (;;) {
      // the atomic fetch-and-increment hands each index to exactly one
      // thread, so no extra mutex is needed
      size_t i = idx_++;
      if (i >= datas->size()) break; // nothing left to claim
      std::cout << "process [" << i << "] in thread " << std::this_thread::get_id() << std::endl;
      std::this_thread::sleep_for(std::chrono::seconds(i));
      datas->at(i) = datas->at(i) * datas->at(i); // square
    }
  }
  std::atomic<size_t> idx_{0};
};

int main() {
  std::vector<int> datas;
  std::cout << "input data: ";
  for (int i = 0; i < 10; ++i) {
    datas.push_back(i);
    std::cout << datas.at(i) << " ";
  }
  std::cout << std::endl;

  unsigned int max_thread = std::thread::hardware_concurrency();
  std::cout << "max thread: " << max_thread << std::endl;
  size_t thread_num = std::min(datas.size(), static_cast<size_t>(max_thread));
  std::cout << "use thread: " << thread_num << std::endl;

  Test test;
  std::chrono::steady_clock::time_point t1 = std::chrono::steady_clock::now();
  // every thread keeps claiming the next unprocessed index until none is left
  std::vector<std::thread> threads;
  for (size_t i = 0; i < thread_num; ++i) {
    std::thread thread(&Test::Process, &test, &datas);
    threads.push_back(std::move(thread));
  }
  for (auto &thread : threads) {
    thread.join();
  }
  std::chrono::steady_clock::time_point t2 = std::chrono::steady_clock::now();
  std::chrono::duration<double> time_used =
      std::chrono::duration_cast<std::chrono::duration<double>>(t2 - t1);
  std::cout << "total use time: " << time_used.count() << std::endl;

  std::cout << "output data: ";
  for (int i = 0; i < 10; ++i) {
    std::cout << datas.at(i) << " ";
  }
  std::cout << std::endl;
  return 0;
}
Output:
input data: 0 1 2 3 4 5 6 7 8 9
max thread: 8
use thread: 8
process [0] in thread 140702719661824
process [2] in thread 140702719661824
process [1] in thread 140702711269120
process [3] in thread 140702702876416
process [4] in thread 140702694483712
process [5] in thread 140702686091008
process [6] in thread 140702677698304
process [7] in thread 140702669305600
process [8] in thread 140702660912896
process [9] in thread 140702711269120
total use time: 10.0005
output data: 0 1 4 9 16 25 36 49 64 81
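Thread pool (threadpool)
In some multithreaded programs, if the number of requests being served (that is, the number of threads) grows too large, the system can run out of resources and go down. There is usually an upper limit on the number of threads, and every thread creation and destruction costs considerable resources and time. One solution is to use a thread pool to cap the number of threads and reuse the threads that have already been created. A thread pool reduces the overhead of creating and switching threads by letting existing threads execute many tasks in a loop, which increases the processing capacity of the system.
1. boost threadpool
Download threadpool from http://threadpool.sourceforge.net/ and copy the boost folder it contains to /usr/local/include/. Using threadpool requires linking two Boost shared libraries, boost_thread and boost_system (with static linking, the pthread library must additionally be linked dynamically), and including <boost/threadpool.hpp>.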
#include <algorithm>
#include <chrono>
#include <functional>
#include <iostream>
#include <string>
#include <thread>
#include <vector>
#include <boost/threadpool.hpp>

void Notice(std::string str, int val) {
  std::this_thread::sleep_for(std::chrono::seconds(0));
  // boost::this_thread::sleep(boost::posix_time::seconds(0));
  std::cout << str << ": " << val << std::endl;
}

class Test {
 public:
  void Process(int &i) {
    std::cout << "process [" << i << "] in thread " << std::this_thread::get_id() << std::endl;
    std::this_thread::sleep_for(std::chrono::seconds(i));
    i = i * i; // square
  }
};

int main() {
  std::vector<int> datas;
  std::cout << "input data: ";
  for (int i = 0; i < 10; ++i) {
    datas.push_back(i);
    std::cout << datas.at(i) << " ";
  }
  std::cout << std::endl;

  unsigned int max_thread = std::thread::hardware_concurrency();
  std::cout << "max thread: " << max_thread << std::endl;
  size_t thread_num = std::min(datas.size(), static_cast<size_t>(max_thread));
  std::cout << "use thread: " << thread_num << std::endl;

  Test test;
  std::chrono::steady_clock::time_point t1 = std::chrono::steady_clock::now();
  boost::threadpool::pool tp(thread_num);
  tp.schedule(boost::bind(Notice, "thread_num", thread_num));
  tp.wait(); // wait until the queued task has finished
  for (size_t i = 0; i < datas.size(); ++i) {
    // std::bind is named explicitly; the unqualified bind only resolved
    // through argument-dependent lookup
    tp.schedule(std::bind(&Test::Process, &test, std::ref(datas.at(i))));
  }
  tp.wait();
  std::chrono::steady_clock::time_point t2 = std::chrono::steady_clock::now();
  std::chrono::duration<double> time_used =
      std::chrono::duration_cast<std::chrono::duration<double>>(t2 - t1);
  std::cout << "total use time: " << time_used.count() << std::endl;

  std::cout << "output data: ";
  for (int i = 0; i < 10; ++i) {
    std::cout << datas.at(i) << " ";
  }
  std::cout << std::endl;
  return 0;
}
Output:
input data: 0 1 2 3 4 5 6 7 8 9
max thread: 8
use thread: 8
thread_num: 8
process [0] in thread 140629957658368
process [1] in thread 140629974443776
process [2] in thread 140629924087552
process [3] in thread 140629932480256
process [4] in thread 140629949265664
process [5] in thread 140629940872960
process [6] in thread 140629957658368
process [7] in thread 140629966051072
process [8] in thread 140629915694848
process [9] in thread 140629974443776
total use time: 10.004
output data: 0 1 4 9 16 25 36 49 64 81
CMakeLists.txt:
cmake_minimum_required(VERSION 2.8.3)
project(test)
set(EXECUTABLE_OUTPUT_PATH ${PROJECT_SOURCE_DIR}/bin)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11 -pthread")
add_executable(test main.cc)
target_link_libraries(test
boost_thread boost_system)
2. A thread pool built on C++11 std::thread
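A minimal sketch of what such a pool might look like, using only the C++11 standard library. The class name ThreadPool, its Schedule method and the helper square_all are hypothetical and chosen for illustration; this is one common queue-plus-condition-variable design, not a specific library's API. The destructor is assumed to drain the queue before joining the workers.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class ThreadPool {
 public:
  explicit ThreadPool(std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
      workers_.emplace_back([this] { Run(); });
    }
  }
  // Lets the workers finish all queued tasks, then joins them.
  ~ThreadPool() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stop_ = true;
    }
    cv_.notify_all();
    for (auto& w : workers_) w.join();
  }
  // Queues a task; one idle worker is woken up to run it.
  void Schedule(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.push(std::move(task));
    }
    cv_.notify_one();
  }

 private:
  // Worker loop: sleep until a task or the stop flag arrives.
  void Run() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return stop_ || !tasks_.empty(); });
        if (tasks_.empty()) return;  // stop requested and queue drained
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();  // run outside the lock so other workers can proceed
    }
  }

  std::vector<std::thread> workers_;
  std::queue<std::function<void()>> tasks_;
  std::mutex mutex_;
  std::condition_variable cv_;
  bool stop_ = false;
};

// Hypothetical usage: square every element with `threads` pooled workers.
std::vector<int> square_all(std::vector<int> data, std::size_t threads) {
  {
    ThreadPool pool(threads);
    for (std::size_t i = 0; i < data.size(); ++i) {
      pool.Schedule([&data, i] { data[i] *= data[i]; });
    }
  }  // pool destructor returns only after all tasks have run
  return data;
}
```

Each task touches a different element of data, so the tasks themselves need no locking; only the task queue is shared and it is protected by mutex_.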
Reference:
boost thread group
A thread group is very similar to a thread pool in that it manages a set of threads. Internally it uses boost::thread.
#include <functional>
#include <iostream>
#include <string>
#include <boost/thread.hpp>

void func1(const std::string &str) {
  std::cout << "call func1()..." << str << std::endl;
}

void func2() {
  std::cout << "call func2()..." << std::endl;
}

int main() {
  boost::thread_group group;
  for (int i = 0; i < 3; ++i) {
    group.create_thread(std::bind(func1, "alan"));
  }
  group.add_thread(new boost::thread(func2));
  boost::thread t(func1, "alan");
  group.add_thread(&t);
  group.remove_thread(&t); // remove the thread from the group again
  std::cout << "group size: " << group.size() << std::endl;
  group.join_all();
  t.join(); // t was removed from the group, so it must be joined separately
  return 0;
}