HLSL indexing issue

4 comments, last by Vilem Otte 3 years, 11 months ago

So, I wanted to add cascaded shadow maps just to allow some additional shadowing mode for directional lights. And here I hit a wall with HLSL.

This code is seemingly alright:

DirectionalLight dirLight = (DirectionalLight)lightsData[lightInput.id];


...


// Just for the check - assign red/green/blue/yellow color to each cascade (along with storing index)
float3 cascadeMul = float3(0.0f, 0.0f, 0.0f);
int cascadeID = -1;
if (posz < dirLight.cascadeShadowClip[0])
{
	cascadeID = 0;
	cascadeMul = float3(1.0f, 0.0f, 0.0f);
}
else if (posz < dirLight.cascadeShadowClip[1])
{
	cascadeID = 1;
	cascadeMul = float3(0.0f, 1.0f, 0.0f);
}
else if (posz < dirLight.cascadeShadowClip[2])
{
	cascadeID = 2;
	cascadeMul = float3(0.0f, 0.0f, 1.0f);
}
else if (posz < dirLight.cascadeShadowClip[3])
{
	cascadeID = 3;
	cascadeMul = float3(1.0f, 1.0f, 0.0f);
}


// Index of shadowmap in shadow map atlas
int index = dirLight.cascadeShadowID[cascadeID];


// Shadow mapping
if (cascadeID != -1 && index >= 0)
{
	float4 positionProj = mul(shadowAtlasData[index].shadowMatrix, positionWS);
	positionProj.xy /= positionProj.w;
	positionProj.y = -positionProj.y;
	positionProj.x *= 0.5f;
	positionProj.x += 0.5f;
	positionProj.y *= 0.5f;
	positionProj.y += 0.5f;
	positionProj.z -= lightsData[lightInput.id].offset;

	positionProj.xy *= shadowAtlasData[index].size;
	positionProj.xy += shadowAtlasData[index].offset;

	if (positionProj.x > 0.0f && positionProj.x < 1.0f && positionProj.y > 0.0f && positionProj.y < 1.0f && positionProj.z < 1.0f)
	{
		shadowMask *= ShadowMap(shadowMap, shadowSampler, positionProj.xyz);
	}
}

shadowMask = clamp(shadowMask, 0.0f, 1.0f);

...

Now, theoretically this code is correct: when I hard-code index to 0, 1, 2 or 3 it works properly, and it also works when I hard-code the indices into shadowAtlasData directly. That removes the purpose of the texture atlas, though, and makes multiple directional lights practically impossible. The problem appears when I want to set index dynamically, based on values stored in a structured buffer.

For further information - lightsData is a structured buffer, and shadowAtlasData is also a structured buffer.

Also - this will end up with a divergent index within a draw call, I know that. One could say that on the line where I initialize index I should use NonUniformResourceIndex(cascadeID), and NonUniformResourceIndex(index) further on. I have tried that and it does not work. The whole code executes as if index were 0 all the time.
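
For reference, NonUniformResourceIndex is documented as a marker for a divergent index into an array of resources (for example an array of textures), not for an element index inside a single StructuredBuffer. The sketch below contrasts the two cases. It also checks cascadeID before indexing, since the listing above reads cascadeShadowID[cascadeID] while cascadeID may still be -1. The struct layout, the register assignments, the shadowMaps array and FetchCascadeDepth are assumptions made for illustration, not the original shader.

```hlsl
// Hypothetical declarations; names, layouts and registers are assumptions.
struct ShadowAtlasEntry
{
    float4x4 shadowMatrix;
    float2   size;
    float2   offset;
};

// One buffer, many elements: the element index may diverge freely.
StructuredBuffer<ShadowAtlasEntry> shadowAtlasData : register(t0);

// An array of resources (Shader Model 5.1+): this is the case the
// NonUniformResourceIndex intrinsic is meant for.
Texture2D    shadowMaps[] : register(t0, space1);
SamplerState pointSampler : register(s0);

// Returns the stored depth of the selected cascade, or 1.0 if none applies.
float FetchCascadeDepth(int cascadeShadowID[4], int cascadeID, float4 positionWS)
{
    // Guard first, so cascadeID == -1 never reaches the array.
    if (cascadeID < 0)
        return 1.0f;

    int index = cascadeShadowID[cascadeID];
    if (index < 0)
        return 1.0f;

    // Divergent element index into a single buffer: no intrinsic needed.
    float4 positionProj = mul(shadowAtlasData[index].shadowMatrix, positionWS);
    positionProj.xy /= positionProj.w;

    // Divergent index that selects which resource is used: intrinsic needed.
    return shadowMaps[NonUniformResourceIndex(index)]
        .SampleLevel(pointSampler, positionProj.xy, 0).r;
}
```

If the guarded version still behaves as if index were 0, the intrinsic is probably not the cause, and comparing the CPU-side struct layout with the HLSL one would be a reasonable next check.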

I'd prefer to avoid doing multiple render passes for directional lights (one for each cascade), but it still looks like the only option (unless I really wanted to compute the shadow N times, once for each slice of the cascaded shadow map, which is huge overkill).

And yet more information - I was originally using D3DCompileFromFile, which only allows up to Shader Model 5.1. I suspected that the old compiler might break this, so I've switched to dxc (DirectX Shader Compiler), and I'm using a recent build (binary distribution). I have tried the Shader Model 6.0 and Shader Model 6.1 profiles, without any change.

So, my questions are:

  • Why does NonUniformResourceIndex not work here (is this a compiler bug)?
  • Is there any other reasonable workaround apart from doing N-passes?

My current blog on programming, linux and stuff - http://gameprogrammerdiary.blogspot.com


Not being able to index dynamically rings a bell (https://docs.microsoft.com/en-us/windows/win32/direct3d12/dynamic-indexing-using-hlsl-5-1), but on principle with HLSL consider the following:

Have you tried playing with [flatten] (https://docs.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-if) or the compiler directives (https://docs.microsoft.com/en-us/windows/win32/direct3dhlsl/d3dcompile-constants)?

Have you tried:

float dir0 = dirLight.cascadeShadowClip[0]; // reading through a local may highlight a compiler optimisation issue (I assume this is a float, and you haven't made a clipping error in your referencing)

if (posz < dir0)
{

}
else // nesting the else-ifs in braces to avoid confusion
{
  ...
  if (...)
  {

  }
  else
  {
    ...
    if (...)
    {

    }
  }
}  

@teatreetim Thanks! I have tried playing with both. Sadly, compiler flags don't help (except for disabling optimization … forcing strict does NOT make it work). My hope for somewhat clean code is being shattered more with each minute. Some hack-ish solutions do work though (all have one thing in common - they read only the first index and add an offset to it, instead of reading different indices).

And yes - in DirectionalLight the member is float cascadeShadowClip[4].

Using ifs may help to some extent, but I really suspect the compiler is acting weirdly here. Compare the following two pieces of code:

int index = dirLight.cascadeShadowID[NonUniformResourceIndex(0)];

[flatten] if (posz < dirLight.cascadeShadowClip[0])
{
    if (index >= 0)
    {
        float4 positionProj = mul(shadowAtlasData[NonUniformResourceIndex(index)].shadowMatrix, positionWS);
        ...
    }
    cascadeMul = float3(1.0f, 0.0f, 0.0f);
}
else
{
    [flatten] if (posz < dirLight.cascadeShadowClip[1])
    {
        index = dirLight.cascadeShadowID[NonUniformResourceIndex(1)];
        if (index >= 0)
        {
            float4 positionProj = mul(shadowAtlasData[NonUniformResourceIndex(index)].shadowMatrix, positionWS);
            ...
        }
        cascadeMul = float3(0.0f, 1.0f, 0.0f);
    }
    else
    {
        [flatten] if (posz < dirLight.cascadeShadowClip[2])
        {
            index = dirLight.cascadeShadowID[NonUniformResourceIndex(2)];
            if (index >= 0)
            {
                float4 positionProj = mul(shadowAtlasData[NonUniformResourceIndex(index)].shadowMatrix, positionWS);
                ...
            }
            cascadeMul = float3(0.0f, 0.0f, 1.0f);
        }
        else
        {
            [flatten] if (posz < dirLight.cascadeShadowClip[3])
            {
                index = dirLight.cascadeShadowID[NonUniformResourceIndex(3)];
                if (index >= 0)
                {
                    float4 positionProj = mul(shadowAtlasData[NonUniformResourceIndex(index)].shadowMatrix, positionWS);
                    ...
                }
                cascadeMul = float3(1.0f, 1.0f, 0.0f);
            }
            else
            {
                cascadeMul = float3(1.0f, 1.0f, 1.0f);
            }
        }
    }
}

versus

int index = dirLight.cascadeShadowID[NonUniformResourceIndex(0)];

[flatten] if (posz < dirLight.cascadeShadowClip[0])
{
    if (index >= 0)
    {
        float4 positionProj = mul(shadowAtlasData[NonUniformResourceIndex(index)].shadowMatrix, positionWS);
        ...
    }
    cascadeMul = float3(1.0f, 0.0f, 0.0f);
}
else
{
    index++;
    [flatten] if (posz < dirLight.cascadeShadowClip[1])
    {
        if (index >= 0)
        {
            float4 positionProj = mul(shadowAtlasData[NonUniformResourceIndex(index)].shadowMatrix, positionWS);
            ...
        }
        cascadeMul = float3(0.0f, 1.0f, 0.0f);
    }
    else
    {
        index++;
        [flatten] if (posz < dirLight.cascadeShadowClip[2])
        {
            if (index >= 0)
            {
                float4 positionProj = mul(shadowAtlasData[NonUniformResourceIndex(index)].shadowMatrix, positionWS);
                ...
            }
            cascadeMul = float3(0.0f, 0.0f, 1.0f);
        }
        else
        {
            index++;
            [flatten] if (posz < dirLight.cascadeShadowClip[3])
            {
                if (index >= 0)
                {
                    float4 positionProj = mul(shadowAtlasData[NonUniformResourceIndex(index)].shadowMatrix, positionWS);
                    ...
                }
                cascadeMul = float3(1.0f, 1.0f, 0.0f);
            }
            else
            {
                cascadeMul = float3(1.0f, 1.0f, 1.0f);
            }
        }
    }
}

You can see that in the first case I read a new value into index inside each branch and then use index. That works incorrectly.

In the second case, I rely on the cascades being next to each other in the shadow map atlas definition (which may not always hold unless I force it - which I don't want to do if I don't need to). The second case works.

I still don't get why - to me it still looks like a compiler issue (something either gets optimized away, or NonUniformResourceIndex gets thrown away completely during compilation). I may want to run dxc from the command line to see what the generated code looks like.

EDIT: Just to show - that it sort-of works:

Fig. 01 - Cascaded shadow maps with directional light
Fig. 02 - Red/Green/Blue/Yellow show separate cascades for direct illumination from directional lights (indirect illumination not affected by coloring of cascades)

Indirect illumination was the tricky one; I ended up sampling only the highest cascade for it.


Did you try to directly print the content of cascadeShadowClip (and lightsData)?
With Constant Buffers, for example, there is a trap - they need to be accessed as float4, even if you want to use a single float. The third float value is inside myStructBuff[0].z
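
To illustrate the packing trap: in a constant buffer every array element starts on its own 16-byte register, while the members of a StructuredBuffer element are laid out without that padding. The declarations below are hypothetical and only sketch the difference. They are not the poster's actual layout.

```hlsl
// Constant buffer: each array element starts on a new 16-byte register.
cbuffer LightConstants : register(b0)
{
    float  clipPadded[4]; // four registers; values sit at byte offsets 0, 16, 32, 48
    float4 clipPacked;    // one register; x, y, z, w hold four values
};

// Structured buffer element: members follow each other without that padding,
// so the layout matches a tightly packed C struct of 4-byte scalars.
struct DirectionalLightExample
{
    float cascadeShadowClip[4]; // bytes 0..15, 4 bytes per element
    int   cascadeShadowID[4];   // bytes 16..31
};

StructuredBuffer<DirectionalLightExample> lights : register(t0);

// Reads one clip distance from the structured buffer.
float ReadClip(uint lightID, uint cascade)
{
    return lights[lightID].cascadeShadowClip[cascade];
}

// Reads one clip distance from the packed constant-buffer variant.
float ReadPackedClip(uint cascade)
{
    // A float4 can be indexed dynamically with [].
    return clipPacked[cascade];
}
```

If the CPU side fills one layout while the shader declares the other, only element 0 lands where the shader expects it, so comparing both sides byte by byte is worthwhile.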

If I were you, I would print the content of all the inputs to the shader (from inside the shader, to see how the shader sees them). It is something I always do. It takes a good amount of extra effort, but it has solved my problems on a lot of occasions.
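
One way to "print" from inside a pixel shader is to turn the suspicious value into a color and return it. A rough sketch, where DebugIndexColor and its palette are made up for illustration:

```hlsl
// Maps an integer to a distinct color so it can be inspected on screen.
// Hypothetical helper; the palette is arbitrary.
float3 DebugIndexColor(int index)
{
    // Negative values show as magenta so an unset index stands out.
    if (index < 0)
        return float3(1.0f, 0.0f, 1.0f);

    const float3 palette[4] =
    {
        float3(1.0f, 0.0f, 0.0f), // 0 -> red
        float3(0.0f, 1.0f, 0.0f), // 1 -> green
        float3(0.0f, 0.0f, 1.0f), // 2 -> blue
        float3(1.0f, 1.0f, 0.0f)  // 3 -> yellow
    };

    // Values above 3 wrap around onto the same four colors.
    return palette[index & 3];
}
```

Returning float4(DebugIndexColor(index), 1.0f) from the pixel shader, right after index is read, shows what value the shader really sees, independent of the shadow lookup that follows.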

@nikito I can access them when I specify the offset by hand. So if I do something like:

cascadeShadowClip[2] … this works correctly - the problem happens when I do cascadeShadowClip[index] and index is a variable set by some condition. NonUniformResourceIndex should be used for this, as in cascadeShadowClip[NonUniformResourceIndex(index)] … yet this ends up with an incorrect result.

Also, in the first code example (1st post) cascadeMul actually produces the correct result - therefore the branching and the access to cascadeShadowClip work properly. It is accessing the StructuredBuffer at a variable index, set a few lines above, that causes trouble.

I'm currently just incrementing the index (which seems to work properly) and doing it in an ugly, branchy way. I don't consider that a proper solution, more of a usable workaround.
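
The incrementing workaround can be written more compactly once the assumption it relies on is made explicit. A minimal sketch, assuming the cascades of one light occupy consecutive entries in the shadow atlas (which, as noted earlier in the thread, is not guaranteed). CascadeAtlasIndex is a hypothetical helper name:

```hlsl
// Assumption: the cascades of one light are consecutive atlas entries, so
// only the first entry's index is read from the light data.
int CascadeAtlasIndex(int firstCascadeIndex, int cascadeID)
{
    // -1 means "no shadow map" (light without shadows, or no cascade hit).
    if (firstCascadeIndex < 0 || cascadeID < 0)
        return -1;

    return firstCascadeIndex + cascadeID;
}
```

It would be used as int index = CascadeAtlasIndex(dirLight.cascadeShadowID[0], cascadeID); in place of the nested branches, and it stays a workaround rather than a fix for the underlying indexing problem.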

