我试图扩大达维德斯帕塔罗的解决方案使用atomic_flag
它“不同于标准::原子的所有专业,它是保证是无锁的”解决atomic<bool>
同步的问题http://en.cppreference.com/w/cpp/atomic/atomic_flag
编辑: 不相关的前一个问题,但我已经基准哪种方法更快,什么是我的惊喜atomic<bool>
有约100快于atomic_flag
。
基准测试结果:
num_threads:2
400000001 iterations flag
401386195 iterations flag
atomic_flag : it took 24.1202 seconds. Result: 1
400000001 iterations bool
375842699 iterations bool
atomic<bool>: it took 0.334785 seconds. Result: 1
num_threads:3
229922451 iterations flag
229712046 iterations flag
233333335 iterations flag
atomic_flag : it took 21.5974 seconds. Result: 1
219564626 iterations bool
233333335 iterations bool
196877803 iterations bool
atomic<bool>: it took 0.200942 seconds. Result: 1
num_threads:4
151745683 iterations flag
150000001 iterations flag
148849108 iterations flag
148933269 iterations flag
atomic_flag : it took 18.6651 seconds. Result: 1
150000001 iterations bool
112825220 iterations bool
151838008 iterations bool
112857688 iterations bool
atomic<bool>: it took 0.167048 seconds. Result: 1
基准代码:
#include <thread>
#include <atomic>
#include <vector>
#include <iostream>
#include <algorithm>
template<typename Iterator>
static void any_of_flag(Iterator & begin, Iterator& end, std::atomic_flag & result)
{
int counter = 0;
for (auto it = begin; it != end; ++it)
{
counter++;
if (!result.test_and_set() || (*it) != 0)
{
result.clear();
std::cout << counter << " iterations flag\n";
return;
}
}
}
template<typename Iterator>
static void any_of_atomic(Iterator & begin, Iterator& end, std::atomic<bool> & result)
{
int counter = 0;
for (auto it = begin; it != end; ++it)
{
counter++;
if (result || (*it) != 0)
{
result = true;
std::cout << counter << " iterations bool\n";
return;
}
}
}
void test_atomic_flag(std::vector<int>& input, int num_threads)
{
using namespace std::chrono;
high_resolution_clock::time_point t1 = high_resolution_clock::now();
size_t chunk_size = input.size()/num_threads;
std::atomic_flag result = ATOMIC_FLAG_INIT;
result.test_and_set();
std::vector<std::thread> threads;
for (size_t i = 0; i < num_threads; ++i)
{
auto & begin = input.begin() + i *chunk_size;
auto & end = input.begin() + std::min((i + 1) * chunk_size, input.size());
// had to use lambda in VS 2017
threads.emplace_back([&begin, &end, &result] {any_of_flag(begin, end, result); });
}
for (auto & thread : threads)
thread.join();
bool hasNonZero = !result.test_and_set();
high_resolution_clock::time_point t2 = high_resolution_clock::now();
duration<double> time_span = duration_cast<duration<double>>(t2 - t1);
std::cout << "atomic_flag : it took " << time_span.count() << " seconds. Result: " << hasNonZero << std::endl;
}
void test_atomic_bool(std::vector<int>& input, int num_threads)
{
using namespace std::chrono;
high_resolution_clock::time_point t1 = high_resolution_clock::now();
size_t chunk_size = input.size()/num_threads;
std::atomic<bool> result(false);
std::vector<std::thread> threads;
for (size_t i = 0; i < num_threads; ++i)
{
auto & begin = input.begin() + i *chunk_size;
auto & end = input.begin() + std::min((i + 1) * chunk_size, input.size());
// had to use lambda in VS 2017
threads.emplace_back([&begin, &end, &result] {any_of_atomic(begin, end, result); });
}
for (auto & thread : threads)
thread.join();
bool hasNonZero = result;
high_resolution_clock::time_point t2 = high_resolution_clock::now();
duration<double> time_span = duration_cast<duration<double>>(t2 - t1);
std::cout << "atomic<bool>: it took " << time_span.count() << " seconds. Result: " << hasNonZero << std::endl;
}
int main()
{
std::vector<int> input(1e9, 0);
input[1e9 - 1e8] = 1;
for (int num_threads : {2, 3, 4})
{
std::cout << "num_threads:" << num_threads << std::endl;
test_atomic_flag(input, num_threads);
test_atomic_bool(input, num_threads);
}
int q;
std::cin >> q;
return 0;
};
旧文章: 我有一些问题,迭代器的常量性和安放线程,但核心的变化,那就是atomic_flag的使用似乎上班。它不会立即停止所有线程,但在最坏的情况下,每次迭代只会有一个线程(因为每次迭代只有一个线程会知道它应该由于清除标志而停止)。
#include <thread>
#include <atomic>
#include <vector>
#include <iostream>
#include <algorithm>
template<typename Iterator>
static void any_of(Iterator & begin, Iterator& end, std::atomic_flag & result)
{
for (auto it = begin; it != end; ++it)
{
if (!result.test_and_set() || (*it) != 0)
{
result.clear();
return;
}
}
}
int main()
{
int num_threads = 3;
std::vector<int> input = { 0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0, 1,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0};
size_t chunk_size = input.size()/num_threads;
std::atomic_flag result = ATOMIC_FLAG_INIT;
result.test_and_set();
std::vector<std::thread> threads;
for (size_t i = 0; i < num_threads; ++i)
{
auto & begin = input.begin() + i *chunk_size;
auto & end = input.begin() + std::min((i + 1) * chunk_size, input.size());
// had to use lambda in VS 2017
threads.emplace_back([&begin, &end, &result] {any_of(begin, end, result); });
}
for (auto & thread : threads)
thread.join();
bool hasNonZero = !result.test_and_set();
return 0;
};
请注意,如果您检查该标志在每个迭代上,你再次有效序列化的算法,为访问做原子这里招致了一些同步开销(多少恰恰是高度依赖于平台的)。最好在每次* n *次迭代中检查标志,然后为你的平台选择一个相当大的* n *。 – ComicSansMS
@Davide Spataro令人印象深刻!我已经尝试过fork()函数。但它是一个真正的麻烦制造者。是范围越野车的计算? 如果input.size()== 1001和num_threads == 4,那么我猜测上一个值没有被检查。如果我错了,请原谅我。 –
@ComicSansMS这是一个非常好的观点。为每个线程选择(或随机生成)不同的_n_以避免可能的冲突是一个好主意吗? –