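I have implemented a depth-peeling algorithm using a GLSL spinlock (inspired by this). In the visualization below, notice that the depth peeling works correctly overall (first layer top left, second layer top right, third layer bottom left, fourth layer bottom right). The four depth layers are stored in a single RGBA texture.
[Image: GLSL SpinLock only Mostly Works]
Unfortunately, the spinlock sometimes fails to prevent errors: you can see a few small white speckles, particularly in the fourth layer. There is also one on the spaceship in the second layer. The speckles differ from frame to frame.
In my GLSL spinlock, when a fragment is about to be drawn, the fragment program atomically reads and writes a lock value in a separate lock texture, waiting until a 0 shows up, which indicates that the lock is open. In practice, I found that the program must stay parallel: if two threads of a warp are on the same pixel, the warp cannot make progress (one thread has to wait while the other continues, yet all threads in a GPU warp must execute in lockstep).
My fragment program looks like this (comments and spacing added):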
#version 420 core

//locking texture
layout(r32ui) coherent uniform uimage2D img2D_0;
//data texture, also render target
layout(RGBA32F) coherent uniform image2D img2D_1;

//Inserts "new_data" into "data", a sorted list
vec4 insert(vec4 data, float new_data) {
    if      (new_data<data.x) return vec4(       new_data,data.xyz);
    else if (new_data<data.y) return vec4(data.x,new_data,data.yz);
    else if (new_data<data.z) return vec4(data.xy,new_data,data.z);
    else if (new_data<data.w) return vec4(data.xyz,new_data       );
    else                      return data;
}

void main() {
    ivec2 coord = ivec2(gl_FragCoord.xy);

    //The idea here is to keep looping over a pixel until a value is written.
    //By looping over the entire logic, threads in the same warp aren't stalled
    //by other waiting threads. The first imageAtomicExchange call sets the
    //locking value to 1. If the locking value was already 1, then someone
    //else has the lock, and can_write is false. If the locking value was 0,
    //then the lock is free, and can_write is true. The depth is then read,
    //the new value inserted, but only written if can_write is true (the
    //locking texture was free). The second imageAtomicExchange call resets
    //the lock back to 0.

    bool have_written = false;
    while (!have_written) {
        bool can_write = (imageAtomicExchange(img2D_0,coord,1u) != 1u);

        memoryBarrier();

        vec4 depths = imageLoad(img2D_1,coord);
        depths = insert(depths,gl_FragCoord.z);

        if (can_write) {
            imageStore(img2D_1,coord,depths);
            have_written = true;
        }

        memoryBarrier();

        imageAtomicExchange(img2D_0,coord,0);

        memoryBarrier();
    }
    discard; //Already wrote to render target with imageStore
}
My question is: why does this speckling occur? I want the spinlock to work 100% of the time! Could it be related to where I placed the memoryBarrier() calls?
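One thing that may be worth checking in the loop above (an observation, not a confirmed diagnosis): the second imageAtomicExchange call resets the lock to 0 on every iteration, including iterations where can_write is false, so an invocation that failed to take the lock can still release a lock held by another invocation. The following is a minimal sketch of a variant that releases the lock only when it was actually acquired. It reuses the bindings and insert() from the shader above; the single memoryBarrier() placement is an assumption, and the sketch has not been verified to remove the speckles on any particular GPU or driver.

```glsl
#version 420 core

// Same images as in the shader above: lock texture and data texture.
layout(r32ui)   coherent uniform uimage2D img2D_0;
layout(rgba32f) coherent uniform image2D  img2D_1;

// Same sorted insert as in the shader above.
vec4 insert(vec4 data, float new_data) {
    if      (new_data < data.x) return vec4(new_data, data.xyz);
    else if (new_data < data.y) return vec4(data.x, new_data, data.yz);
    else if (new_data < data.z) return vec4(data.xy, new_data, data.z);
    else if (new_data < data.w) return vec4(data.xyz, new_data);
    else                        return data;
}

void main() {
    ivec2 coord = ivec2(gl_FragCoord.xy);

    bool have_written = false;
    while (!have_written) {
        // A previous value of 0 means the lock was free and is now ours.
        if (imageAtomicExchange(img2D_0, coord, 1u) == 0u) {
            vec4 depths = imageLoad(img2D_1, coord);
            imageStore(img2D_1, coord, insert(depths, gl_FragCoord.z));

            // Assumed placement: make the store visible before unlocking.
            memoryBarrier();

            // Release only here, where this invocation holds the lock.
            imageAtomicExchange(img2D_0, coord, 0u);
            have_written = true;
        }
        // Invocations that did not get the lock leave it untouched and retry.
    }
    discard; // Already wrote to the render target with imageStore
}
```

The loop still wraps the whole acquire, update, and release sequence, so the idea of not stalling other threads in the same warp is kept; the only intended behavioural change is that a failed acquire no longer writes 0 to the lock texture.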
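What does the final fragment shader look like? Does it still have memoryBarrier() operations? – ragnar 2013-02-12 18:55:05
Yes, but in fewer, more concise places. IIRC (it is procedurally generated), they come only after the imageAtomicExchange calls. – imallett 2013-02-12 21:35:06
I was actually wrong in thinking that it solved the problem. I have put together a more complete list here: http://stackoverflow.com/questions/21538555/broken-glsl-spinlock-glsl-locks-compendium – imallett 2014-02-03 21:53:29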