2016-08-19 68 views
6

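How can I avoid Helgrind false positives?

My thread synchronization "style" seems to be throwing off helgrind. Here is a simple program that reproduces the problem: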

#include <thread> 
#include <atomic> 
#include <iostream> 

int main() 
{ 
    std::atomic<bool> isReady(false); 

    int i = 1; 

    std::thread t([&isReady, &i]() 
    { 
     i = 2; 
     isReady = true; 
    }); 

    while (!isReady) 
     std::this_thread::yield(); 

    i = 3; 

    t.join(); 

    std::cout << i; 

    return 0; 
} 

As far as I can tell, the above is a perfectly well-formed program. However, when I run helgrind with the following command, I get errors:

valgrind --tool=helgrind ./a.out 

The output is:

==6247== Helgrind, a thread error detector 
==6247== Copyright (C) 2007-2015, and GNU GPL'd, by OpenWorks LLP et al. 
==6247== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info 
==6247== Command: ./a.out 
==6247== 
==6247== ---Thread-Announcement------------------------------------------ 
==6247== 
==6247== Thread #1 is the program's root thread 
==6247== 
==6247== ---Thread-Announcement------------------------------------------ 
==6247== 
==6247== Thread #2 was created 
==6247== at 0x56FBB1E: clone (clone.S:74) 
==6247== by 0x4E46189: create_thread (createthread.c:102) 
==6247== by 0x4E47EC3: pthread_create@@GLIBC_2.2.5 (pthread_create.c:679) 
==6247== by 0x4C34BB7: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so) 
==6247== by 0x5115DC2: std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>, void (*)()) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21) 
==6247== by 0x4010EF: std::thread::thread<main::{lambda()#1}>(main::{lambda()#1}&&) (in /home/arman/a.out) 
==6247== by 0x400F93: main (in /home/arman/a.out) 
==6247== 
==6247== ---------------------------------------------------------------- 
==6247== 
==6247== Possible data race during read of size 1 at 0xFFF00035B by thread #1 
==6247== Locks held: none 
==6247== at 0x4022C3: std::atomic<bool>::operator bool() const (in /home/arman/a.out) 
==6247== by 0x400F9F: main (in /home/arman/a.out) 
==6247== 
==6247== This conflicts with a previous write of size 1 by thread #2 
==6247== Locks held: none 
==6247== at 0x40233D: std::__atomic_base<bool>::operator=(bool) (in /home/arman/a.out) 
==6247== by 0x40228E: std::atomic<bool>::operator=(bool) (in /home/arman/a.out) 
==6247== by 0x400F4A: main::{lambda()#1}::operator()() const (in /home/arman/a.out) 
==6247== by 0x40204D: void std::_Bind_simple<main::{lambda()#1}()>::_M_invoke<>(std::_Index_tuple<>) (in /home/arman/a.out) 
==6247== by 0x401FA3: std::_Bind_simple<main::{lambda()#1}()>::operator()() (in /home/arman/a.out) 
==6247== by 0x401F33: std::thread::_Impl<std::_Bind_simple<main::{lambda()#1}()> >::_M_run() (in /home/arman/a.out) 
==6247== by 0x5115C7F: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21) 
==6247== by 0x4C34DB6: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so) 
==6247== Address 0xfff00035b is on thread #1's stack 
==6247== in frame #1, created by main (???:) 
==6247== 
==6247== ---------------------------------------------------------------- 
==6247== 
==6247== Possible data race during write of size 4 at 0xFFF00035C by thread #1 
==6247== Locks held: none 
==6247== at 0x400FAE: main (in /home/arman/a.out) 
==6247== 
==6247== This conflicts with a previous write of size 4 by thread #2 
==6247== Locks held: none 
==6247== at 0x400F35: main::{lambda()#1}::operator()() const (in /home/arman/a.out) 
==6247== by 0x40204D: void std::_Bind_simple<main::{lambda()#1}()>::_M_invoke<>(std::_Index_tuple<>) (in /home/arman/a.out) 
==6247== by 0x401FA3: std::_Bind_simple<main::{lambda()#1}()>::operator()() (in /home/arman/a.out) 
==6247== by 0x401F33: std::thread::_Impl<std::_Bind_simple<main::{lambda()#1}()> >::_M_run() (in /home/arman/a.out) 
==6247== by 0x5115C7F: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21) 
==6247== by 0x4C34DB6: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so) 
==6247== by 0x4E476F9: start_thread (pthread_create.c:333) 
==6247== by 0x56FBB5C: clone (clone.S:109) 
==6247== Address 0xfff00035c is on thread #1's stack 
==6247== in frame #0, created by main (???:) 
==6247== 
3==6247== 
==6247== For counts of detected and suppressed errors, rerun with: -v 
==6247== Use --history-level=approx or =none to gain increased speed, at 
==6247== the cost of reduced accuracy of conflicting-access information 
==6247== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0) 

Helgrind seems to be picking up my while loop as a race condition. How should I structure this program so that Helgrind does not report false positives?

+1

In any case, yielding in a busy loop is a poor form of synchronization; consider using a `condition_variable` instead.

+0

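@JonathanWakely How so? I tried compiling the example code for [std::condition_variable](http://en.cppreference.com/w/cpp/thread/condition_variable) from cppreference, but the resulting program gives a "dubious: associated lock is not held by any thread" error when run under helgrind. – arman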

+0

Use it correctly :-)

Answers

4

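The problem is that Helgrind doesn't understand GCC's atomic built-ins, so it doesn't realize that they are race-free and impose an ordering on the program.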

There are ways to annotate your code to help Helgrind; see http://valgrind.org/docs/manual/hg-manual.html#hg-manual.effective-use (but I'm not sure how to use them here; I tried what sbabbi shows, and it only solves part of the problem).

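In any case, I would avoid yielding in a busy loop; it is a poor form of synchronization. The same thing can be done with a condition variable, like this: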

#include <thread> 
#include <mutex> 
#include <iostream> 
#include <condition_variable> 

int main() 
{ 
    bool isReady(false); 
    std::mutex mx; 
    std::condition_variable cv; 

    int i = 1; 

    std::thread t([&isReady, &i, &mx, &cv]() 
    { 
     i = 2; 
     std::unique_lock<std::mutex> lock(mx); 
     isReady = true; 
     cv.notify_one(); 
    }); 

    { 
     std::unique_lock<std::mutex> lock(mx); 
     cv.wait(lock, [&] { return isReady; }); 
    } 

    i = 3; 

    t.join(); 

    std::cout << i; 

    return 0; 
} 
+1

By the way, I actually like to use `std::promise` for cases like this, where one thread sends a one-time signal to another thread so that it can go and do some processing.

+0

Eventually we will have [`barrier`](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4392.pdf) for this.

1

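Valgrind cannot know that the while (!isReady) loop, together with the ordering on the store and the load (the default memory_order_seq_cst, which gives at least release and acquire semantics), implies that the statement i = 2 happens before i = 3.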

You have to state this invariant explicitly, using valgrind's ANNOTATE_HAPPENS_BEFORE and ANNOTATE_HAPPENS_AFTER macros:

#include <valgrind/drd.h> 

#include <thread> 
#include <atomic> 
#include <iostream> 

int main() 
{ 
    std::atomic<bool> isReady(false); 

    int i = 1; 

    std::thread t([&isReady, &i]() 
    { 
     i = 2; 
     ANNOTATE_HAPPENS_BEFORE(&isReady); 
     isReady = true; 
    }); 

    while (!isReady) 
     std::this_thread::yield(); 

    ANNOTATE_HAPPENS_AFTER(&isReady); 
    i = 3; 

    t.join(); 

    std::cout << i; 

    return 0; 
} 

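What we are saying here is that the line at ANNOTATE_HAPPENS_BEFORE always happens before the line at ANNOTATE_HAPPENS_AFTER. We know this from inspecting the program logic, but valgrind cannot prove it for you.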

This program produces:

valgrind --tool=helgrind ./a.out 
==714== Helgrind, a thread error detector 
==714== Copyright (C) 2007-2015, and GNU GPL'd, by OpenWorks LLP et al. 
==714== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info 
==714== Command: ./val 
==714== 
==714== ---Thread-Announcement------------------------------------------ 
==714== 
==714== Thread #1 is the program's root thread 
==714== 
==714== ---Thread-Announcement------------------------------------------ 
==714== 
==714== Thread #2 was created 
==714== at 0x59E169E: clone (in /usr/lib/libc-2.23.so) 
==714== by 0x4E421D9: create_thread (in /usr/lib/libpthread-2.23.so) 
==714== by 0x4E43C42: pthread_create@@GLIBC_2.2.5 (in /usr/lib/libpthread-2.23.so) 
==714== by 0x4C316F3: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so) 
==714== by 0x4C327D7: pthread_create@* (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so) 
==714== by 0x5113DB4: __gthread_create (gthr-default.h:662) 
==714== by 0x5113DB4: std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) (thread.cc:163) 
==714== by 0x40109C: std::thread::thread<main::{lambda()#1}>(main::{lambda()#1}&&) (in /home/ennio/val) 
==714== by 0x400F55: main (in /home/ennio/val) 
==714== 
==714== ---------------------------------------------------------------- 
==714== 
==714== Possible data race during read of size 1 at 0xFFF00061F by thread #1 
==714== Locks held: none 
==714== at 0x401585: std::atomic<bool>::operator bool() const (in /home/ennio/val) 
==714== by 0x400F61: main (in /home/ennio/val) 
==714== 
==714== This conflicts with a previous write of size 1 by thread #2 
==714== Locks held: none 
==714== at 0x4015D5: std::__atomic_base<bool>::operator=(bool) (in /home/ennio/val) 
==714== by 0x401550: std::atomic<bool>::operator=(bool) (in /home/ennio/val) 
==714== by 0x400F1B: main::{lambda()#1}::operator()() const (in /home/ennio/val) 
==714== by 0x40146F: void std::_Bind_simple<main::{lambda()#1}()>::_M_invoke<>(std::_Index_tuple<>) (in /home/ennio/val) 
==714== by 0x40140C: std::_Bind_simple<main::{lambda()#1}()>::operator()() (in /home/ennio/val) 
==714== by 0x4013EB: std::thread::_State_impl<std::_Bind_simple<main::{lambda()#1}()> >::_M_run() (in /home/ennio/val) 
==714== by 0x5113A9E: execute_native_thread_routine (thread.cc:83) 
==714== by 0x4C318E7: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so) 
==714== Address 0xfff00061f is on thread #1's stack 
==714== in frame #1, created by main (???:) 
==714== 
3==714== 
==714== For counts of detected and suppressed errors, rerun with: -v 
==714== Use --history-level=approx or =none to gain increased speed, at 
==714== the cost of reduced accuracy of conflicting-access information 
==714== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0) 

To get rid of the error on isReady itself, I think an entry for __atomic_base<bool>::operator= in a suppression file would be enough.