🎉 Celebrating 25 Years of GameDev.net! 🎉

Not many can claim 25 years on the Internet! Join us in celebrating this milestone. Learn more about our history, and thank you for being a part of our community!

Reading an area from a texture

Started by
7 comments, last by alikim 2 years, 7 months ago

I'm writing a function that should count how many pixels of the same color are around a certain pixel, this is for WebGL. The width of the (rectangle) area w has a large effect on performance (as expected), so I'd like to ask if I there are any obvious ways to make this code work faster, thank you!

  float neighCount(float id, sampler2D tex, vec2 vuv, vec2 uvpix, float w) {
    float cnt = 0.0;
    vec2 beg = vuv - vec2(0.5 * w * uvpix);
    for(float r = 0.0; r < w; r++) {
      float y = uvpix.y * r;
      for(float c = 0.0; c < w; c++) {
        float tid = texture2D(tex, beg + vec2(uvpix.x * c, y)).r;
        if(abs(id - tid) < 0.05 || tid == 0.0) cnt += 1.0;
      }
    }
    return smoothstep(0.0, 1.0, cnt / (w * w));
  }
Advertisement

Actually, with your current solution a single GPU thread does all the work, so you probably want to parallelize.

The fastest way would be using a compute shader, where each workgroup processes a tile of texture, e.g. 16*16 pixels.
Each thread of the workgroup would process one pixel. To merge the counted results, there are two options: 1. Increase a counter in shared memory using atomic add, or 2. Each thread having it's own counter in a shared memory array, and using a prefix sum scan algorithm to calculate the sum. (1. is less work and i guess it's faster as well)
After the sum for this tile is known, increase a final counter in video memory with one atomic add per workgroup, done from only one thread of the group.

This however means you need a dispatch for the compute shaders, so rectanlge dimensions need to be known at API level. You could not set dimension from a pixel shader and call a function to do the counting work.
If you need this, or if you eventually want to do one such counting per pixel at multiple rectangles of variable sizes from the same image, you could use the Summed Area Table (SAT) method. This processes the whole image just once, and after that you can query rectangles of any size per pixel at a minimal cost of O(1).

Your question is a good example requiring the basics of parallel programming on GPU. The book ‘OpenGL Super Bible’ has a very good chapter about compute shaders, covering exactly this.

I don't think there are compute shaders in WebGL

alikim said:
I don't think there are compute shaders in WebGL

I somehow expected this, seems true: https://www.khronos.org/registry/webgl/specs/latest/2.0-compute/
Well,

Well, probably there is not much you can do to improve your solution. But for what do you need this? Maybe there is some alternative…

Thanks for your reply, somehow I didn't think about multithreading, maybe it'll be possible in WebGPU. I need this to estimate the number of illuminated pixels around the current one to create shadows with penumbra, thre are other ways to do it of course, I'm experimenting.

alikim said:
I need this to estimate the number of illuminated pixels around the current one to create shadows with penumbra

Perfect application of SAT i guess. Other people had the same problem, found this for example: https://stackoverflow.com/questions/36692088/summed-area-table-in-glsl-and-gpu-fragment-shader-execution

Thanks! Very useful reads, I'll post a working example here if I manage to come up with some significant improvements.

This topic is closed to new replies.

Advertisement