GLSL SpinLock only Mostly Works

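I have implemented a depth peeling algorithm using a GLSL spinlock (inspired by this). In the visualization below, note that the depth peeling algorithm works correctly overall (first layer top left, second layer top right, third layer bottom left, fourth layer bottom right). The four depth layers are stored in a single RGBA texture.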

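Unfortunately, the spinlock sometimes fails to prevent errors: you can see a few small white speckles, particularly in the fourth layer. There is also one on the spaceship in the second layer. These speckles vary from frame to frame.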

[Image: visualization of the four depth layers]

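In my GLSL spinlock, when a fragment is to be drawn, the fragment program atomically reads and writes a locking value in a separate locking texture, waiting until a 0 shows up, which indicates that the lock is open. In practice, I found that the whole program must loop rather than block on the lock, because if two threads in a warp are on the same pixel, the warp cannot continue (one thread must wait while the other continues, yet all threads in a GPU thread warp must execute simultaneously).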
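For contrast, here is a minimal sketch of the blocking form that the paragraph above warns against. It assumes the same lock image `img2D_0` as the shader in this post, cleared to 0 before drawing; the critical section is left as a placeholder. If two fragments in the same warp target the same pixel, the spinning thread can keep the warp inside the `while` loop so the lock holder never reaches the release.

```glsl
#version 420 core

// Lock texture (assumed to be cleared to 0 before the draw call).
layout(r32ui) coherent uniform uimage2D img2D_0;

void main() {
    ivec2 coord = ivec2(gl_FragCoord.xy);

    // Blocking acquire: spin until the previous value was 0 (lock was free).
    // On lockstep hardware this loop can stall the whole warp when another
    // thread in the same warp currently holds the lock for this pixel.
    while (imageAtomicExchange(img2D_0, coord, 1u) != 0u) {
        // busy-wait
    }

    // ... critical section would go here (placeholder) ...

    memoryBarrier();
    imageAtomicExchange(img2D_0, coord, 0u); // release the lock

    discard;
}
```

This is why the shader in this post wraps the acquire, the work, and the release in a single retry loop instead of spinning on the acquire alone.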

My fragment program looks like this (comments and spacing added):

#version 420 core 

//locking texture 
layout(r32ui) coherent uniform uimage2D img2D_0; 
//data texture, also render target 
layout(RGBA32F) coherent uniform image2D img2D_1; 

//Inserts "new_data" into "data", a sorted list 
vec4 insert(vec4 data, float new_data) { 
    if      (new_data < data.x) return vec4(new_data, data.xyz);
    else if (new_data < data.y) return vec4(data.x, new_data, data.yz);
    else if (new_data < data.z) return vec4(data.xy, new_data, data.z);
    else if (new_data < data.w) return vec4(data.xyz, new_data);
    else                        return data;
} 

void main() { 
    ivec2 coord = ivec2(gl_FragCoord.xy); 

    //The idea here is to keep looping over a pixel until a value is written. 
    //By looping over the entire logic, threads in the same warp aren't stalled 
    //by other waiting threads. The first imageAtomicExchange call sets the 
    //locking value to 1. If the locking value was already 1, then someone 
    //else has the lock, and can_write is false. If the locking value was 0, 
    //then the lock is free, and can_write is true. The depth is then read, 
    //the new value inserted, but only written if can_write is true (the 
    //locking texture was free). The second imageAtomicExchange call resets 
    //the lock back to 0. 

    bool have_written = false; 
    while (!have_written) {
        bool can_write = (imageAtomicExchange(img2D_0,coord,1u) != 1u);

        memoryBarrier();

        vec4 depths = imageLoad(img2D_1,coord);
        depths = insert(depths,gl_FragCoord.z);

        if (can_write) {
            imageStore(img2D_1,coord,depths);
            have_written = true;
        }

        memoryBarrier();

        imageAtomicExchange(img2D_0,coord,0);

        memoryBarrier();
    }
    discard; //Already wrote to render target with imageStore 
}

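My question is: why does this speckling behavior occur? I want the spinlock to work 100% of the time! Could it be related to the placement of my memoryBarrier() calls?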

2012-08-05 imallett

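The "imageAtomicExchange(img2D_0,coord,0);" needs to be inside the if statement, because otherwise it resets the lock variable even for threads that did not hold the lock! Changing that fixes it.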
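A minimal sketch of that change, assuming the same two images and the insert() helper from the question. Only the thread that actually acquired the lock releases it; the load and insert are also moved into that branch here, which is my own tidy-up rather than something stated above. The lowercase `rgba32f` qualifier and the `0u` literal are likewise my spelling choices. Treat this as an illustration of the described change, not as the final (procedurally generated) shader or a guaranteed fix.

```glsl
#version 420 core

// Locking texture (assumed cleared to 0) and data texture, as in the question.
layout(r32ui)   coherent uniform uimage2D img2D_0;
layout(rgba32f) coherent uniform image2D  img2D_1;

// Inserts "new_data" into "data", a sorted list of four depths.
vec4 insert(vec4 data, float new_data) {
    if      (new_data < data.x) return vec4(new_data, data.xyz);
    else if (new_data < data.y) return vec4(data.x, new_data, data.yz);
    else if (new_data < data.z) return vec4(data.xy, new_data, data.z);
    else if (new_data < data.w) return vec4(data.xyz, new_data);
    else                        return data;
}

void main() {
    ivec2 coord = ivec2(gl_FragCoord.xy);

    bool have_written = false;
    while (!have_written) {
        // Try to take the lock; a previous value of 0 means it was free.
        bool can_write = (imageAtomicExchange(img2D_0, coord, 1u) != 1u);

        if (can_write) {
            memoryBarrier();

            // Read-modify-write of the depth list while holding the lock.
            vec4 depths = imageLoad(img2D_1, coord);
            depths = insert(depths, gl_FragCoord.z);
            imageStore(img2D_1, coord, depths);

            memoryBarrier();

            // Release only here, so a thread that failed to acquire the
            // lock can no longer reset it on behalf of the real holder.
            imageAtomicExchange(img2D_0, coord, 0u);
            have_written = true;
        }
    }
    discard; // Already wrote to the data texture with imageStore
}
```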

2013-01-31 05:48:55 imallett

What does the final fragment shader look like? Does it still have the memoryBarrier() operations? – ragnar 2013-02-12 18:55:05

Yes, but in more succinct places. IIRC (it is procedurally generated), they come only after each imageAtomicExchange call. – imallett 2013-02-12 21:35:06

I was actually mistaken in thinking this solved the problem. I have put together a more complete list here: http://stackoverflow.com/questions/21538555/broken-glsl-spinlock-glsl-locks-compendium – imallett 2014-02-03 21:53:29

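For reference, here is locking code that has been tested to work with Nvidia drivers 314.22 and 320.18 on a GTX670. Note that if the code is reordered or rewritten into logically equivalent code, existing compiler optimization bugs are triggered (see the comments in the code below). Note also that the code below uses bindless image references.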

// sem is initialized to zero 
coherent uniform layout(size1x32) uimage2D sem; 

void main(void) 
{ 
    ivec2 coord = ivec2(gl_FragCoord.xy); 

    bool done = false; 
    uint locked = 0; 
    while(!done)
    {
        // locked = imageAtomicCompSwap(sem, coord, 0u, 1u); will NOT work
        locked = imageAtomicExchange(sem, coord, 1u);
        if (locked == 0)
        {
            performYourCriticalSection();

            memoryBarrier();

            imageAtomicExchange(sem, coord, 0u);

            // replacing this with a break will NOT work
            done = true;
        }
    }

    discard; 
}

2013-05-28 21:51:03
