并行计算一个大向量的和

Boost.Asio的主要目的是为网络@H_404_4@ 和 I / O编程@H_404_4@ 提供异步模型，您描述的问题似乎与网络和I / O无关。

我认为最简单的解决方案是使用Boost或C ++标准库提供的 线程原语@H_404_4@ 。

这是accumulate仅使用标准库创建的并行版本的示例。

/* Minimum number of elements for multithreaded algorithm.
   Less than this and the algorithm is executed on single thread. */
static const int MT_MIN_SIZE = 10000;

template <typename InputIt, typename T>
auto parallel_accumulate(InputIt first, InputIt last, T init) {
    // Determine total size.
    const auto size = std::distance(first, last);
    // Determine how many parts the work shall be split into.
    const auto parts = (size < MT_MIN_SIZE)? 1 : std::thread::hardware_concurrency();

    std::vector<std::future<T>> futures;

    // For each part, calculate size and run accumulate on a separate thread.
    for (std::size_t i = 0; i != parts; ++i) {
        const auto part_size = (size * i + size) / parts - (size * i) / parts;
        futures.emplace_back(std::async(std::launch::async,
            [=] { return std::accumulate(first, std::next(first, part_size), T{}); }));
        std::advance(first, part_size);
    }

    // Wait for all threads to finish execution and accumulate results.
    return std::accumulate(std::begin(futures), std::end(futures), init,
        [] (const T prev, auto& future) { return prev + future.get(); });
}

[**Live example**](http://coliru.stacked-crooked.com/a/4807261d61b7a726) （并行版本的性能与Coliru上的顺序版本大致相同，可能只有1个内核可用）

在我的机器上（使用8个线程），并行版本平均提高了约120％的性能。

顺序求和：花费时间：46毫秒 5000000050000000 -------------------------------- 并行和：花费时间：21毫秒 5000000050000000

但是，100,000,000个元素的绝对增益仅为边际（25 ms）。虽然，当累积不同类型的元素时，性能提升可能会大于int。

正如@sehe在评论中所提到的，值得一提的是 OpenMP@H_404_4@ 可以为该问题提供简单的解决方案，例如

template <typename T, typename U>
auto omp_accumulate(const std::vector<T>& v, U init) {
    U sum = init;

    #pragma omp parallel for reduction(+:sum)
    for(std::size_t i = 0; i < v.size(); i++) {
        sum += v[i];
    }

    return sum;
}

在我的机器上，此方法的执行效果与使用标准线程基元的并行方法相同。

顺序求和：花费时间：46毫秒 5000000050000000 -------------------------------- 并行求和：花费时间：21毫秒求和：5000000050000000 -------------------------------- OpenMP总和：花费时间：21毫秒总和：5000000050000000

其他 2022/1/1 18:16:03 有549人围观

撰写回答

你尚未登录，登录后可以

和开发者交流问题的细节

关注并接收问题和回答的更新提醒

参与内容的编辑和改进，让解决方法与时俱进

请先登录

并行计算一个大向量的和

撰写回答

推荐问题

Spring RestController 6并行执行？

这是Files.lines（）中的错误，还是我对并行流有误解？

Jenkins工作流程：基于工具输出的并行化步骤

Node.js本机Promise.all是并行还是顺序处理？

如何在Jenkins声明式管道中循环参数化并行阶段？

Golang net.Conn并行写入

Jenkins中的数组简单并行执行

Jenkins在不同的代理上并行构建

与Jenkins工作流程/管道并行运行的阶段

Currying groovy CPS封闭以并行执行

并行运行两个异步任务，并在.NET 4.5中收集结果

Jenkins运行并行脚本

并行计算一个大向量的和

Golang中的并行处理

在同一终端中一次运行多个并行命令

尝试并行运行多个HTTP请求，但受到Windows（注册表）的限制

如何跨多个拉取请求并行运行持续集成？

并行高效地运行多个作业

使用jQuery的并行异步Ajax请求

在node.js中协调并行执行

分类汇总

您的鼓励是对我最大的支持