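Implementing a Thread Pool in C++11 and C++17 (and How to Size It)

Single-Task-Queue Thread Pool

Implementing a thread pool with a single task queue using the modern C++ standard library (threads, locks, and condition variables) is straightforward. The basic idea: the constructor starts a fixed number of worker threads, and the destructor stops the pool. The only interface callers need is one for submitting tasks.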

Interface design

Return type

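explicit ThreadPool(size_t threads = std::thread::hardware_concurrency());  // constructor

template<typename F, typename... Args>
auto enqueue(F &&f, Args &&...args); // enqueue interface

The enqueue() function template declares its return type as auto and lets it be deduced; the value actually returned is a std::future that will hold the result of the submitted call. In C++11 this needs a trailing return type, as in the full implementation further down.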

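Input parameters

The parameters are a callable object and its arguments. C++11 variadic templates make it possible to pass any number of arguments through to the callable.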
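To make the "callable plus arguments in, future out" idea concrete, here is a minimal sketch outside of any pool. The helper name package is hypothetical, and the sketch assumes a C++17 compiler (it uses std::invoke_result_t and a deduced return type; the C++11 listing later in this post spells the same thing with std::result_of and a trailing return type).

```cpp
#include <functional>
#include <future>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical helper, not part of the pool: binds f(args...) into a
// type-erased job and returns it together with the future for its result.
template <typename F, typename... Args>
auto package(F&& f, Args&&... args)
{
    using return_type = std::invoke_result_t<F, Args...>;

    // std::packaged_task is move-only while std::function needs a copyable
    // callable, so the task is held through a shared_ptr.
    auto task = std::make_shared<std::packaged_task<return_type()>>(
        std::bind(std::forward<F>(f), std::forward<Args>(args)...));

    std::future<return_type> result = task->get_future();
    std::function<void()> job = [task]() { (*task)(); };
    return std::make_pair(std::move(job), std::move(result));
}
```

Calling the returned job (on any thread) runs the callable, after which get() on the future yields its return value. A pool does the same thing, except that the job goes into a queue and a worker thread calls it.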

Basic implementation

class ThreadPool {
public:
    explicit ThreadPool(size_t threads = std::thread::hardware_concurrency());

    template<typename F, typename... Args>
    auto enqueue(F &&f, Args &&...args);

    ~ThreadPool();

private:
    std::vector<std::thread> workers;
    std::queue<std::function<void()>> tasks;
    std::mutex queue_mutex;
    std::condition_variable condition;
    bool stop;
};

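Note: std::thread::hardware_concurrency(), available since C++11, returns a hint for the number of threads that can really run concurrently. On a multi-core system this is typically the number of CPU cores. The value is only a hint, and the function returns 0 when the information cannot be obtained, so a pool that uses it as the default size should guard against 0. Even so, it is a useful starting point for choosing how many threads to launch.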
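Because the constructor sketched above uses hardware_concurrency() as its default argument, a return value of 0 would produce a pool with no workers, and submitted tasks would never run. One possible guard is shown below; the helper name default_worker_count and the fallback of 2 are assumptions made for illustration, not part of the original code.

```cpp
#include <cstddef>
#include <thread>

// Hypothetical helper: choose a worker count, using `fallback` when
// hardware_concurrency() cannot determine a value and returns 0.
inline std::size_t default_worker_count(std::size_t fallback = 2)
{
    const unsigned int hint = std::thread::hardware_concurrency();
    return hint != 0 ? static_cast<std::size_t>(hint) : fallback;
}
```

The pool could then be constructed as ThreadPool pool(default_worker_count()); instead of relying on the raw hint.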

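This simple pool has just two main members: a group of worker threads and a task queue. A mutex keeps the task queue thread-safe, and a condition variable, used together with the mutex, implements thread notification: initially the pool holds no tasks and every worker waits; when a task is enqueued, one waiting worker is notified to pick it up.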
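The wait/notify pattern described here can be sketched in isolation. The class below is a hypothetical stand-in that queues plain ints instead of tasks; it only illustrates how the mutex and the condition variable cooperate, and is not the pool's actual code.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>

// Hypothetical minimal queue illustrating the wait/notify pattern.
class NotifyingQueue {
public:
    void push(int value)
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(value);
        }
        // Wake one waiting consumer, after the lock has been released.
        ready_.notify_one();
    }

    int wait_and_pop()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        // Sleeps while the queue is empty; the predicate also guards
        // against spurious wakeups.
        ready_.wait(lock, [this] { return !items_.empty(); });
        int value = items_.front();
        items_.pop();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<int> items_;
};
```

A worker thread would call wait_and_pop() in a loop, while producers call push(). The real pool adds the stop flag to the wait predicate so that workers can also be woken up to exit.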

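There is also a stop flag, used in the destructor to stop the workers and clean up. Out of laziness (or, put more diplomatically: this is an RAII-style pool whose lifetime is essentially that of the application), no explicit stop interface is provided.

The concrete implementation (C++11 version) follows: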

#ifndef THREAD_POOL_H
#define THREAD_POOL_H

#include <vector>
#include <queue>
#include <memory>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <future>
#include <functional>
#include <stdexcept>

class ThreadPool {
public:
    ThreadPool(size_t);
    template<class F, class... Args>
    auto enqueue(F&& f, Args&&... args) 
        -> std::future<typename std::result_of<F(Args...)>::type>;
    ~ThreadPool();
private:
    // need to keep track of threads so we can join them
    std::vector< std::thread > workers;
    // the task queue
    std::queue< std::function<void()> > tasks;
    
    // synchronization
    std::mutex queue_mutex;
    std::condition_variable condition;
    bool stop;
};
 
// the constructor just launches some amount of workers
inline ThreadPool::ThreadPool(size_t threads)
    :   stop(false)
{
    for(size_t i = 0;i<threads;++i)
        workers.emplace_back(
            [this]
            {
                for(;;)
                {
                    std::function<void()> task;

                    {
                        std::unique_lock<std::mutex> lock(this->queue_mutex);
                        this->condition.wait(lock,
                            [this]{ return this->stop || !this->tasks.empty(); });
                        if(this->stop && this->tasks.empty())
                            return;
                        task = std::move(this->tasks.front());
                        this->tasks.pop();
                    }

                    task();
                }
            }
        );
}

// add new work item to the pool
template<class F, class... Args>
auto ThreadPool::enqueue(F&& f, Args&&... args) 
    -> std::future<typename std::result_of<F(Args...)>::type>
{
    using return_type = typename std::result_of<F(Args...)>::type;

    auto task = std::make_shared< std::packaged_task<return_type()> >(
            std::bind(std::forward<F>(f), std::forward<Args>(args)...)
        );
        
    std::future<return_type> res = task->get_future();
    {
        std::unique_lock<std::mutex> lock(queue_mutex);

        // don't allow enqueueing after stopping the pool
        if(stop)
            throw std::runtime_error("enqueue on stopped ThreadPool");

        tasks.emplace([task](){ (*task)(); });
    }
    condition.notify_one();
    return res;
}

// the destructor joins all threads
inline ThreadPool::~ThreadPool()
{
    {
        std::unique_lock<std::mutex> lock(queue_mutex);
        stop = true;
    }
    condition.notify_all();
    for(std::thread &worker: workers)
        worker.join();
}

#endif

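So what changes if we port the version above to C++17?

Notable changes:

std::result_of is deprecated in C++17. Use std::invoke_result instead.
Prefer std::invoke over writing f(args...) directly, so that member function pointers and other callables work too. The listing below keeps std::bind, which already calls its target with the same invoke semantics.
Since C++14 a lambda capture list can contain an initializer, so instead of wrapping the task in make_shared<std::packaged_task> and copying the pointer into the lambda, we can write [task = std::move(task)]. The listing below goes one step further: its queue stores std::packaged_task<void()> rather than std::function<void()>, so the task is moved straight into the queue.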
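The three language features in this list can be tried out separately from the pool. The snippet below is only an illustrative sketch: the Counter type and both function names are hypothetical, and it assumes a C++17 compiler.

```cpp
#include <functional>
#include <future>
#include <type_traits>
#include <utility>

struct Counter {  // hypothetical type, used only for this illustration
    int value = 0;
    int add(int n) { value += n; return value; }
};

// std::invoke_result_t takes the callable type and the argument types as
// separate template arguments (std::result_of used the F(Args...) form).
static_assert(std::is_same_v<
    std::invoke_result_t<decltype(&Counter::add), Counter&, int>, int>);

// std::invoke can call a member function pointer; plain f(args...) cannot.
inline int add_via_invoke(Counter& c, int n)
{
    return std::invoke(&Counter::add, c, n);
}

// A C++14 init-capture moves the move-only packaged_task into the lambda,
// so no shared_ptr is needed. `mutable` is required because invoking a
// packaged_task modifies it.
inline int run_moved_task()
{
    std::packaged_task<int()> task([] { return 42; });
    std::future<int> result = task.get_future();
    auto job = [task = std::move(task)]() mutable { task(); };
    job();
    return result.get();
}
```

Note that a lambda holding a move-only capture is itself move-only, so it cannot be stored in a std::function, which requires a copyable callable.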

Full code (C++17 version, adapted from https://github.com/jhasse/ThreadPool):

#ifndef THREAD_POOL_HPP
#define THREAD_POOL_HPP

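#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <stdexcept>
#include <thread>
#include <type_traits>
#include <vector>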

class ThreadPool {
public:
    explicit ThreadPool(size_t);
    template<class F, class... Args>
    decltype(auto) enqueue(F&& f, Args&&... args);
    ~ThreadPool();
private:
    // need to keep track of threads so we can join them
    std::vector< std::thread > workers;
    // the task queue
    std::queue< std::packaged_task<void()> > tasks;

    // synchronization
    std::mutex queue_mutex;
    std::condition_variable condition;
    std::condition_variable condition_producers;
    bool stop;
};

// the constructor just launches some amount of workers
inline ThreadPool::ThreadPool(size_t threads)
    :   stop(false)
{
    for(size_t i = 0;i<threads;++i)
        workers.emplace_back(
            [this]
            {
                for(;;)
                {
                    std::packaged_task<void()> task;

                    {
                        std::unique_lock<std::mutex> lock(this->queue_mutex);
                        this->condition.wait(lock,
                            [this]{ return this->stop || !this->tasks.empty(); });
                        if(this->stop && this->tasks.empty())
                            return;
                        task = std::move(this->tasks.front());
                        this->tasks.pop();
                        if (tasks.empty()) {
                            condition_producers.notify_one(); // notify the destructor that the queue is empty
                        }
                    }

                    task();
                }
            }
        );
}

// add new work item to the pool
template<class F, class... Args>
decltype(auto) ThreadPool::enqueue(F&& f, Args&&... args)
{
    using return_type = std::invoke_result_t<F, Args...>;

    std::packaged_task<return_type()> task(
            std::bind(std::forward<F>(f), std::forward<Args>(args)...)
        );

    std::future<return_type> res = task.get_future();
    {
        std::unique_lock<std::mutex> lock(queue_mutex);

        // don't allow enqueueing after stopping the pool
        if(stop)
            throw std::runtime_error("enqueue on stopped ThreadPool");

        tasks.emplace(std::move(task));
    }
    condition.notify_one();
    return res;
}

// the destructor joins all threads
inline ThreadPool::~ThreadPool() {
    {
        std::unique_lock<std::mutex> lock(queue_mutex);
        // wait until the workers have drained the queue before stopping
        condition_producers.wait(lock, [this] { return tasks.empty(); });
        stop = true;
    }
    condition.notify_all();
    for (std::thread& worker : workers) {
        worker.join();
    }
}

#endif

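Usage example:

#include <iostream>
#include "ThreadPool.h"  // assumption: the header above is saved under this name

int main() {
    // create thread pool with 4 worker threads
    ThreadPool pool(4);

    // enqueue and store future
    auto result = pool.enqueue([](int answer) { return answer; }, 42);

    // get result from future (blocks until a worker has run the task)
    std::cout << result.get() << std::endl;
}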

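How should the thread pool be sized?

Sizing a thread pool depends on three things: the server's hardware, the resource budget available on it, and the nature of the tasks. Concretely: how many CPUs and how much memory the server has, the maximum QPS its I/O can sustain, whether the tasks are mostly computation, mostly I/O, or a mix, and whether they hold scarce resources such as database connections. The thread count is chosen mainly from these factors.

CPU-bound tasks

Also called compute-intensive tasks. In most cases CPU time far exceeds I/O time: tasks that do heavy computation and a lot of branching logic with almost no I/O are CPU-bound. For such tasks frequent context switching between threads is wasteful, so the thread count should be small, for example the number of CPUs plus 1.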

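I/O-bound tasks

An I/O-bound workload spends most of its running time doing I/O, and CPU utilization is low.

For example, a task may depend on other system resources, such as waiting for a result to come back over a database connection. The longer the wait, the longer the CPU sits idle, so more threads are needed to keep the CPU busy. The count should still not be excessive, because switching between threads has a cost.

I/O-bound tasks should therefore get more threads than CPU-bound ones, since a thread blocked on I/O does not occupy the CPU. A common starting point is twice the number of CPUs plus 1.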
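The two rules of thumb above (CPUs + 1 for CPU-bound work, 2 x CPUs + 1 for I/O-bound work) are simple enough to write down directly. The function names below are hypothetical, and the results are only starting points to be tuned by measurement.

```cpp
#include <cstddef>

// Hypothetical helpers encoding the rules of thumb from the text.

// CPU-bound work: number of CPUs plus 1.
constexpr std::size_t cpu_bound_pool_size(std::size_t cpus)
{
    return cpus + 1;
}

// I/O-bound work: twice the number of CPUs plus 1.
constexpr std::size_t io_bound_pool_size(std::size_t cpus)
{
    return 2 * cpus + 1;
}
```

On an 8-CPU machine these give 9 and 17 threads respectively.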

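Mixed tasks

If a mixed task can be split, split it into an I/O-bound part and a CPU-bound part and handle them separately, provided the two parts take roughly the same time to run. If their running times differ greatly, splitting is not worthwhile.

Rule for sizing the pool: the impedance-matching formula

The thread count is related to the ratio of CPU time to I/O time. One sizing rule, the impedance-matching principle, gives this empirical formula:

N = number of CPUs
P = CPU busy time / total running time   // 0 < P <= 1
T = number of threads to configure
T = N / P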
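As a worked example of T = N / P: with N = 8 CPUs and tasks that keep the CPU busy half of the time (P = 0.5), the formula gives T = 8 / 0.5 = 16 threads. The sketch below computes the same thing; the function name is hypothetical, and rounding up to a whole thread is an assumption, since the formula itself does not say how to round.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper for T = N / P.
// cpus          : N, the number of CPUs
// busy_fraction : P, CPU busy time / total running time, with 0 < P <= 1
// Rounds up to the next whole thread (an assumption, see above).
inline std::size_t pool_size_from_utilization(std::size_t cpus,
                                              double busy_fraction)
{
    if (!(busy_fraction > 0.0 && busy_fraction <= 1.0))
        throw std::invalid_argument("busy_fraction must be in (0, 1]");
    return static_cast<std::size_t>(
        std::ceil(static_cast<double>(cpus) / busy_fraction));
}
```

With P = 1 (purely CPU-bound work) the result is simply N, and the thread count grows as tasks spend a larger share of their time waiting.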

References:

Implementing a thread pool in C++11

https://github.com/lzpong/threadpool

https://github.com/progschj/ThreadPool

Using multithreading in C++ and building a thread pool

Posted 2022-10-17 14:49 by 小金乌会发光－Z&M