Direct3D 11 and 2D: matrix-vector multiplication in HLSL does not give the correct result

6 comments, last by JoeJ 2 years, 1 month ago

I am trying to improve my test program about Direct3D 11 in 2D, without any additional library (see https://stackoverflow.com/q/71772545/688348).

Now, what I want to do is to rotate the content (client) area of the window. The GDI part is working, but not the D3D one. The function d3d_resize() takes an argument named rot and, according to its value (0, 1, 2 or 3), I want to rotate the client area.

For now, I focus on rot == 0, that is no rotation.

I use a constant buffer to store the rotation matrix, defined (mathematically) like this:

|r11 r12 t1|
|r21 r22 t2|

where the sub-matrix (r) is the rotation part, and the sub-vector (t) is the translation.

So for rot == 0, the matrix is basically the identity matrix + no translation. If `(x,y)` is my 2D vector, I expand it with 1 to take into account the translation, hence:

|1 0 0|   |x|   |x|
|0 1 0| * |y| = |y|
          |1|

that is, what I want if rot == 0. My C part for the rotation is:

typedef struct
{
    float rotation[2][3]; /* rotation + translation */
    float dummy[2]; /* for 16 bytes padding */
} Const_Buffer;

void d3d_resize(D3d *d3d, int rot, UINT width, UINT height)
{
    /* snip */

    switch (rot)
    {
    case 0:
        ((Const_Buffer *)mapped.pData)->rotation[0][0] = 1.0f;
        ((Const_Buffer *)mapped.pData)->rotation[0][1] = 0.0f;
        ((Const_Buffer *)mapped.pData)->rotation[0][2] = 0.0f;
        ((Const_Buffer *)mapped.pData)->rotation[1][0] = 0.0f;
        ((Const_Buffer *)mapped.pData)->rotation[1][1] = 1.0f;
        ((Const_Buffer *)mapped.pData)->rotation[1][2] = 0.0f;
        break;
    case 1:
        ((Const_Buffer *)mapped.pData)->rotation[0][0] = 0.0f;
        ((Const_Buffer *)mapped.pData)->rotation[0][1] = -1.0f;
        ((Const_Buffer *)mapped.pData)->rotation[0][2] = 2.0f;
        ((Const_Buffer *)mapped.pData)->rotation[1][0] = 1.0f;
        ((Const_Buffer *)mapped.pData)->rotation[1][1] = 0.0f;
        ((Const_Buffer *)mapped.pData)->rotation[1][2] = 0.0f;
        break;
    case 2:
        ((Const_Buffer *)mapped.pData)->rotation[0][0] = -1.0f;
        ((Const_Buffer *)mapped.pData)->rotation[0][1] = 0.0f;
        ((Const_Buffer *)mapped.pData)->rotation[0][2] = 0.0f;
        ((Const_Buffer *)mapped.pData)->rotation[1][0] = 0.0f;
        ((Const_Buffer *)mapped.pData)->rotation[1][1] = -1.0f;
        ((Const_Buffer *)mapped.pData)->rotation[1][2] = 0.0f;
        break;
    case 3:
        ((Const_Buffer *)mapped.pData)->rotation[0][0] = 0.0f;
        ((Const_Buffer *)mapped.pData)->rotation[0][1] = 1.0f;
        ((Const_Buffer *)mapped.pData)->rotation[0][2] = 0.0f;
        ((Const_Buffer *)mapped.pData)->rotation[1][0] = -1.0f;
        ((Const_Buffer *)mapped.pData)->rotation[1][1] = 0.0f;
        ((Const_Buffer *)mapped.pData)->rotation[1][2] = 2.0f;
        break;
    }

    /* snip */
}
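
To sanity-check the four matrices on the CPU, the affine transform can be applied by hand. This is a minimal sketch, not code from the linked repository: `Affine2x3`, `rot_matrix` and `affine_apply` are hypothetical names, and the entries are copied from the switch above.

```c
#include <assert.h>

/* Hypothetical helper type: same 2x3 layout as Const_Buffer.rotation. */
typedef struct { float m[2][3]; } Affine2x3;

/* Entries copied from the switch in d3d_resize(); rot must be 0..3. */
static Affine2x3 rot_matrix(int rot)
{
    static const Affine2x3 table[4] = {
        {{{  1.0f,  0.0f, 0.0f }, {  0.0f,  1.0f, 0.0f }}},
        {{{  0.0f, -1.0f, 2.0f }, {  1.0f,  0.0f, 0.0f }}},
        {{{ -1.0f,  0.0f, 0.0f }, {  0.0f, -1.0f, 0.0f }}},
        {{{  0.0f,  1.0f, 0.0f }, { -1.0f,  0.0f, 2.0f }}},
    };
    assert(rot >= 0 && rot <= 3);
    return table[rot];
}

/* Computes (x', y') = M * (x, y, 1). */
static void affine_apply(const Affine2x3 *a, float x, float y,
                         float *ox, float *oy)
{
    *ox = a->m[0][0] * x + a->m[0][1] * y + a->m[0][2];
    *oy = a->m[1][0] * x + a->m[1][1] * y + a->m[1][2];
}
```

With these entries, applying case 1 and then case 3 returns the original point, so the two are inverses of each other. Applying case 1 twice gives (2 - x, 2 - y), whereas case 2 produces (-x, -y); if all four cases are meant to rotate about the same center, the translation in case 2 may be worth re-checking.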

The vertex shader is:

cbuffer cv_viewport : register(b0)
{
    float2x3 rotation_matrix;
}

struct vs_input
{
    float2 position : POSITION;
    float4 color : COLOR;
};

struct ps_input
{
    float4 position : SV_POSITION;
    float4 color : COLOR;
};

ps_input main_vs(vs_input input )
{
    ps_input output;
    float2 p = input.position;
    p = mul(rotation_matrix, float3(input.position, 1.0f));
    output.position = float4(p, 0.0f, 1.0f);
    output.color = input.color;
    return output;
}
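
One possible cause, offered as a hypothesis rather than a confirmed diagnosis: by default HLSL stores matrices in constant buffers column-major, and each column starts on its own 16-byte register. Under that rule `float2x3` would occupy three registers, while `Const_Buffer` packs its six floats contiguously. The C sketch below emulates how the shader would then read the buffer. `read_colmajor_2x3` and `decode_packed_as_shader` are hypothetical names, and treating everything after the six packed floats as zero (including the third register, which lies past the 32-byte struct) is an assumption.

```c
#include <assert.h>
#include <string.h>

/* Emulates a default (column-major) HLSL float2x3 read: column j is taken
   from the .xy of 16-byte register j, i.e. floats 4*j and 4*j + 1.
   `regs` must hold 3 registers = 12 floats. */
static void read_colmajor_2x3(const float regs[12], float m[2][3])
{
    for (int col = 0; col < 3; ++col) {
        m[0][col] = regs[4 * col + 0];
        m[1][col] = regs[4 * col + 1];
    }
}

/* Copies six tightly packed floats (row 0 then row 1, as Const_Buffer
   stores them) into the start of a zero-filled 3-register block, then
   decodes that block the way the shader is assumed to. */
static void decode_packed_as_shader(const float packed[6], float m[2][3])
{
    float regs[12] = { 0 };
    assert(packed != NULL);
    memcpy(regs, packed, sizeof(float) * 6);
    read_colmajor_2x3(regs, m);
}
```

Under these assumptions the identity decodes to rows (1, 1, 0) and (0, 0, 0), so every vertex would land on y = 0 and the triangles would collapse, which would be consistent with the empty output. One remedy to try is declaring the matrix `row_major` in the shader and padding each row to four floats on the C side (`float rotation[2][4]`), so that both sides agree on two 16-byte rows.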

I used the Visual Studio graphics debugger to check the constant buffer values when rot == 0, and they seem correct.

But nothing is displayed when rot == 0. If I comment out the line where the multiplication is done, the content is displayed. I've done something wrong, but I don't know what or where. Below are links to the complete C code and the shader code on my GitHub (they are too large to post inline):

C code: https://github.com/vtorri/d3d_rot/blob/master/d3d_rot.c

shader code: https://github.com/vtorri/d3d_rot/blob/master/shader_3.hlsl


I just want to add that by “nothing is displayed”, I mean that, normally, a yellow triangle and a blue rectangle should be displayed.

When I comment out the multiplication in the shader:

With the multiplication in the shader:

I would try two things:

  1. Replace float2x3 with float3x2 in the shader. Maybe it is a row-major / column-major convention issue.
  2. Don't use such a type in the shader at all, but write your own multiplication. You should be able to make it work at least this way. Currently that's no performance issue either, because current GPUs are scalar and have no native vector / matrix types. Hand-written matrix multiplication was often even faster for me, because it used fewer registers. But I'm not sure whether that's still good advice on the most recent hardware. Tensor cores have hardware matrix multiplication, for example, but only for low-precision types.
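
The second suggestion could look like the following on the C side. This is a sketch under assumptions: the cbuffer would be redeclared as two `float4` rows (hypothetical names `row0` and `row1`), and the shader would compute each output coordinate as a dot product of a row's xyz with `float3(position, 1)`. The C function mirrors that arithmetic so it can be checked on the CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical replacement for Const_Buffer: one 16-byte row per float4,
   which sidesteps the matrix packing rules. The fourth component of each
   row is unused. */
typedef struct {
    float row0[4]; /* r11 r12 t1 (unused) */
    float row1[4]; /* r21 r22 t2 (unused) */
} Const_Buffer_Rows;

/* Mirrors what the shader would compute:
   p.x = dot(row0.xyz, float3(pos, 1));
   p.y = dot(row1.xyz, float3(pos, 1)); */
static void transform_rows(const Const_Buffer_Rows *cb,
                           const float pos[2], float out[2])
{
    assert(cb != NULL && pos != NULL && out != NULL);
    out[0] = cb->row0[0] * pos[0] + cb->row0[1] * pos[1] + cb->row0[2];
    out[1] = cb->row1[0] * pos[0] + cb->row1[1] * pos[1] + cb->row1[2];
}
```

The struct is 32 bytes with the second row at offset 16, so it already satisfies the requirement that a D3D11 constant buffer size be a multiple of 16 bytes, without a separate padding member.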

JoeJ said:

I would try two things:

  1. Replace float2x3 with float3x2 in the shader. Maybe it is a row-major / column-major convention issue.

I've tried; no luck with either matrix layout.

  2. Don't use such a type in the shader at all, but write your own multiplication. You should be able to make it work at least this way. Currently that's no performance issue either, because current GPUs are scalar and have no native vector / matrix types. Hand-written matrix multiplication was often even faster for me, because it used fewer registers. But I'm not sure whether that's still good advice on the most recent hardware. Tensor cores have hardware matrix multiplication, for example, but only for low-precision types.

OK, I'll try this. I could also do that on the CPU; maybe there would be fewer problems that way.

thank you

vtorri said:
I could also do that on the CPU; maybe there would be fewer problems that way.

Yeah, but this really should not be necessary. My proposals are meant only as temporary trial and error, to figure out which of your assumptions breaks. After clarifying that, you should be able to use the built-in type without further issues.

But you need to go through the tedious process of figuring out what conventions, padding rules, or whatever else the API expects.
I remember I had such pain replicating my custom structs in GPU constant memory. Totally cumbersome due to padding, and the specs read like patents. I really started to hate OpenGL at that time ; )

The main problem here is the lack of debugging options on the GPU. I use a plain memory buffer for that. You could write the numbers you get from the float2x3 type to such a buffer, download it to the CPU after the frame, and check whether they are the expected numbers in the expected order. (Make sure only one shader invocation writes to the buffer, otherwise write hazards may corrupt it. Or use atomics.)
It's some work to set this up just for debugging purposes, but over time it's worth it.
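
Once the values have been read back to the CPU, the comparison itself is simple. A minimal sketch, assuming the read-back data is available as a plain float array; `first_mismatch` is a hypothetical helper, and the D3D11 staging-buffer copy that would fill `got` is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the first element where `got` differs from
   `expected` by more than `eps` (or where the difference is NaN),
   or -1 if all `count` elements match. */
static int first_mismatch(const float *expected, const float *got,
                          size_t count, float eps)
{
    assert(expected != NULL && got != NULL);
    for (size_t i = 0; i < count; ++i) {
        float d = expected[i] - got[i];
        if (d < 0.0f)
            d = -d;
        if (!(d <= eps)) /* written this way so NaN counts as a mismatch */
            return (int)i;
    }
    return -1;
}
```

Comparing in the order the shader wrote the values makes a layout problem visible immediately: the first mismatching index shows where the CPU and GPU views of the matrix start to disagree.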

@JoeJ Is the Visual Studio graphics debugger of any help for that? I ask because it seems to report correct values for my constant buffer.

vtorri said:
@JoeJ Is the Visual Studio graphics debugger of any help for that?

Never used it, but I'm sure it is useful.

Maybe you can just display color values taken from the matrix type in the shader? That would show whether the type is filled from the constant buffer as expected, in case the debugger does not show this already.
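
For such a color test to be readable, the matrix entries need to be mapped into the displayable range first, since entries such as -1 or 2 fall outside [0, 1]. A minimal sketch of one possible mapping, written in C so it can be checked on the CPU; `entry_to_channel` and the choice of `max_abs` are assumptions, and the same arithmetic would have to be written in the shader.

```c
#include <assert.h>

/* Maps a matrix entry in [-max_abs, +max_abs] to a color channel in
   [0, 1]; values outside that range are clamped. An entry of 0 maps to
   mid-gray (0.5). */
static float entry_to_channel(float v, float max_abs)
{
    assert(max_abs > 0.0f);
    float c = 0.5f + v / (2.0f * max_abs);
    if (c < 0.0f)
        c = 0.0f;
    if (c > 1.0f)
        c = 1.0f;
    return c;
}
```

With `max_abs` set to 2, the entries used in this thread (-1, 0, 1 and 2) map to four distinct channel values, so each one can be told apart on screen.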
