
reconstruct position from depth - Inverse projection error?

Started by Lewa, November 19, 2019 04:34 PM
16 comments, last by JohnnyCode 4 years, 9 months ago

So, to give a bit of context as to what I'm trying to achieve and where I seem to be failing:

I'm currently implementing an SSAO approach which requires me to reconstruct the view-space position from the depth buffer.

The depth is encoded into an RGB (24-bit) texture, as I can't access the depth buffer directly.




//member of the v2p struct (clip-space position passed on to the fragment shader)
float4 ClipPos : COLOR0;

//vertex
void main(in input IN, out v2p OUT)
{
    OUT.Position = mul(gm_Matrices[MATRIX_WORLD_VIEW_PROJECTION], IN.Position);
    OUT.ClipPos = OUT.Position;
}


//=============================
//fragment
//interpolated clip-space position from the vertex shader
float4 ClipPos : COLOR0;

float3 ftov(float depth)
{
    //encodes a [0,1] float into 24-bit RGB
    float depthVal = depth * (256.0*256.0*256.0 - 1.0) / (256.0*256.0*256.0);
    float4 encode = frac( depthVal * float4(1.0, 256.0, 256.0*256.0, 256.0*256.0*256.0) );
    return encode.xyz - encode.yzw / 256.0 + 1.0/512.0;
}

void main(in v2p IN, out p2s OUT)
{
    //store the post-divide depth (z/w) in the colour target
    OUT.Depth = float4(ftov(IN.ClipPos.z / IN.ClipPos.w), 1.0);
}

 

To prepare for the depth reconstruction I invert the projection matrix (and I think herein lies the issue):



    scr_set_camera_projection();//set matrix
    obj_controller.projectionMatrix_player = matrix_get(matrix_projection);//get projection matrix
    
    var pLin = r44_create_from_gmMat(obj_controller.projectionMatrix_player);//convert
    pLin = r44_invert(pLin);//invert
    obj_controller.projectionMatrix_player_inverted = gmMat_create_from_r44(pLin);//convert back to send it to the shader

 

Now, I noticed that there is an issue with the reconstructed depth component in the shader.



varying vec2 v_vTexcoord;
uniform sampler2D gDepth;
uniform mat4 projection_inv; //inverse of the projection matrix, set from the application

float vtof(vec3 pack)
{
    //decodes 24-bit RGB back to a [0,1] float
    float depth = dot( pack, 1.0 / vec3(1.0, 256.0, 256.0*256.0) );
    return depth * (256.0*256.0*256.0) / (256.0*256.0*256.0 - 1.0);
}

//reconstructs the view-space position from depth
vec3 getViewPos(vec2 texcoord){
    vec4 clipPos;
    clipPos.xy = texcoord.xy * 2.0 - 1.0;
    clipPos.z = vtof(texture2D(gDepth, texcoord).rgb);
    clipPos.w = 1.0;

    vec4 viewP = projection_inv * clipPos;
    viewP = viewP / viewP.w;

    return viewP.xyz;
}

void main()
{

  vec3 fragPos = getViewPos(v_vTexcoord);
  float si = sign(fragPos.z);//get sign from z component
  if(si < 0.0){
    gl_FragColor = vec4(1.0,0.0,0.0,1.0);
  }else{
    gl_FragColor = vec4(0.0,1.0,0.0,1.0);
  }

}

 

So in DirectX, +Z conventionally points into the screen, while in OpenGL it's -Z that goes into the distance (the two conventions are effectively reversed).

However, after hours of debugging I simply tried to render the sign of the Z coordinate, and this is the result:

[screenshot: sign of the reconstructed Z rendered over the scene, green and red regions]

For some reason, the Z coordinate is positive (up until a point), which is shown as the green coloring in the scene.

And after that it becomes red (where Z becomes negative).

From my understanding it should be all red, as the Z component should be negative starting from the camera's origin point.

(The encoding/decoding of the RGB values was tested and shouldn't be the culprit.) Does anyone have an idea/direction as to what the issue could be? (I think the conversion process for the matrix inversion might be the cause of this, or maybe the reconstruction in the shader is wrong?)
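For reference, another quick check would be to visualize the reconstructed view-space depth itself rather than just its sign; a minimal sketch for the fragment shader above (the 100.0 divisor is just a hypothetical far-plane-ish scale, not an actual value from my setup):

vec3 fragPos = getViewPos(v_vTexcoord);
// map the (expected negative) view-space depth onto a visible 0..1 grey ramp
float d = clamp(-fragPos.z / 100.0, 0.0, 1.0); // 100.0 is a hypothetical scale factor
gl_FragColor = vec4(vec3(d), 1.0);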

To me it seems that the near/far planes are not utilized during encoding of the depth, but are during decoding, due to the projection inverse. How do you obtain ClipPos?

You seem to be using only the world and view transformations, not projecting it, but you encode it as if it were projected with the w division.
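(In GLSL terms, the distinction being drawn is roughly the following; the matrix name is a placeholder:)

vec4 clipPos = worldViewProjection * vec4(position, 1.0); // clip space: w still carries the view depth
float ndcZ = clipPos.z / clipPos.w;                       // normalized device depth, which is what gets encoded above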

3 hours ago, JohnnyCode said:

To me it seems that the near/far planes are not utilized during encoding of the depth, but are during decoding, due to the projection inverse. How do you obtain ClipPos?

You seem to be using only the world and view transformations, not projecting it, but you encode it as if it were projected with the w division.

I'm utilizing the near/far planes (the projection matrix):


//MATRIX_WORLD_VIEW_PROJECTION is the modelViewProjection matrix
OUT.Position = mul(gm_Matrices[MATRIX_WORLD_VIEW_PROJECTION],IN.Position);
OUT.ClipPos = OUT.Position;

I'm then dividing by W in the fragment shader before storing the depth information in the RGB channels:


OUT.Depth = float4(ftov(IN.ClipPos.z/IN.ClipPos.w),1.0);

Or is this the wrong way of doing it?

From debugging I can confidently say that the depth storage as well as the matrix inverse function work as intended.

Though I have no idea what I'm missing here.

 

To clarify, is this the correct order? (A code sketch of the round trip follows below the list.)

  1. Multiply position "P" with the MVP matrix
  2. Divide "P" by w
  3. Store the z component of "P" (which should be in the 0-1 range) in the texture (encoded)

Back:

  1. Get Z from the depth texture (by decoding) and convert the XY texture coordinates from the (0,1) to the (-1,+1) range
  2. Store the XY texture coordinates and the Z component in P
  3. Multiply P with the inverse projection matrix
  4. Divide P by the w component
  5. Retrieve the Z value from P
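For reference, here is the order above written out as a GLSL-style sketch (uniform and variable names are placeholders for illustration, not my actual ones):

// Encode side (conceptually; the real shader packs ndcZ into RGB):
vec4 clipPos = worldViewProjection * vec4(position, 1.0); // step 1
float ndcZ = clipPos.z / clipPos.w;                       // step 2: perspective divide
// step 3: ndcZ (expected to be in the 0-1 range) is encoded and written to the texture

// Decode side:
vec4 p;
p.xy = texcoord * 2.0 - 1.0;        // step 1: texture coordinates from 0..1 to -1..+1
p.z  = storedDepth;                 // step 1/2: decoded depth from the texture
p.w  = 1.0;
vec4 viewPos = projection_inv * p;  // step 3: multiply with the inverse projection
viewPos /= viewPos.w;               // step 4: divide by w
float viewZ = viewPos.z;            // step 5: the reconstructed view-space depth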
27 minutes ago, Lewa said:

4. Divide P by the w component

Nope.

You store the normalized device (projected) coordinate in the depth texture, which is not correct; you will never retrieve the clip-space coordinate from that without also storing the w divisor. Store the component without the division, in clip space as is, only transformed by the WVP matrix; then sample it and multiply with the projection inverse to get the view-space z that was originally there.
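As an aside, a related technique that sidesteps the w-divisor question entirely is to store linear view-space depth and rescale an unprojected ray; this is not what either of the shaders above does. A minimal GLSL sketch, assuming a view space where z is positive in front of the camera and a hypothetical farPlane uniform:

uniform mat4 projection_inv;
uniform sampler2D gDepth;
uniform float farPlane; // hypothetical uniform holding the far clip distance

vec3 getViewPosFromLinearDepth(vec2 texcoord)
{
    // assumes the depth pass stored viewPos.z / farPlane (linear view depth), not z/w
    float viewZ = vtof(texture2D(gDepth, texcoord).rgb) * farPlane;

    // unproject this pixel at the far plane to get a point along its view ray
    vec4 farClip = vec4(texcoord * 2.0 - 1.0, 1.0, 1.0);
    vec4 farView = projection_inv * farClip;
    farView /= farView.w;

    // scale the ray so its z matches the stored view-space depth
    return farView.xyz * (viewZ / farView.z);
}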

 

11 minutes ago, JohnnyCode said:

Nope.

You store the normalized device (projected) coordinate in the depth texture, which is not correct; you will never retrieve the clip-space coordinate from that without also storing the w divisor. Store the component without the division, in clip space as is, only transformed by the WVP matrix; then sample it and multiply with the projection inverse to get the view-space z that was originally there.

 

Oh, alright. So I should basically never divide by the w component (both during the encoding and the decoding phase)?

What is the range the z value can then reach if I don't divide it by w? (Is it znear to zfar?) If that's the case, then I need to change the encoding/decoding algorithm, as it currently only works with values in the 0-1 range.

 

The value will be in the -1.0 to 1.0 range. Just use a 32-bit texture target, as GPUs are optimized for it; the data type of the target render texture should be a single-channel 32-bit float. That will free you from the special encodings.

By the way, I am forgetting that there is something special you may need to do when you render the projection to a texture. You are outputting to a texture channel and I am not sure whether the w division for the X and Y components happens there; if not, do it manually for the X and Y components so the fragment function runs correctly on the target texture (but I guess it does run for all vectors with a 1.0 fourth component, just not sure).

4 minutes ago, JohnnyCode said:

The value will be in the -1.0 to 1.0 range. Just use a 32-bit texture target, as GPUs are optimized for it; the data type of the target render texture should be a single-channel 32-bit float. That will free you from the special encodings.

Does the -1.0 to 1.0 range only apply to OpenGL, or also to DirectX? (I would have to normalize the values with the (v+1.0)/2.0 method during encoding and do the reverse while decoding.)

(I have a DirectX backend.) Sadly, I don't have access to 32-bit floating-point texture formats, so I have to use the encoding/decoding method. :/

It depends on the projection matrix, but yes, everything outside this range will be clipped.

Then perform the classic -1..+1 to 0..1 encoding, which is

0.5*x + 0.5, and decode appropriately.
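In GLSL terms, that remap and its inverse are simply:

float encodeZ(float z) { return 0.5 * z + 0.5; }  // -1..+1 -> 0..1, before packing into RGB
float decodeZ(float e) { return e * 2.0 - 1.0; }  // 0..1 -> -1..+1, after unpacking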

Getting closer. (The view-space X and Y axes looked OK in the debug view.)

Though there seems to be something wrong with the depth coordinate.

In the depth shader I tried to check if the values exceed 1.0 or 2.0, and this is the result:

[screenshot: depth-map debug view, with red/blue stripes in the bottom-right corner]

Check the bottom-right corner (with the red/blue stripes).

That's the depth-map view. This shader is applied:


//vertex
OUT.Position = mul(gm_Matrices[MATRIX_WORLD_VIEW_PROJECTION], IN.Position);
OUT.ClipPos = OUT.Position;

//fragment:

OUT.Depth = float4(IN.ClipPos.z, IN.ClipPos.z, IN.ClipPos.z, 1.0);

float v = IN.ClipPos.z;
if(v > 1.0){
    //red
    OUT.Depth = float4(1.0, 0.0, 0.0, 1.0);
}
if(v > 2.0){
    //blue
    OUT.Depth = float4(0.0, 0.0, 1.0, 1.0);
}
if(v < 0.0){
    //green
    OUT.Depth = float4(0.0, 1.0, 0.0, 1.0);
}

Which means that after multiplying the coordinate with the MVP matrix (without the perspective divide), it ends up in a range from 0 to some X.
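For a standard D3D-style perspective projection (an assumption here, not verified against the actual matrix GameMaker builds), that 0-to-X behavior of the undivided z looks roughly like this:

// n = near plane, f = far plane, viewZ = view-space depth of the vertex
float clipZ(float viewZ, float n, float f)
{
    return viewZ * f / (f - n) - n * f / (f - n); // 0.0 at viewZ == n, f at viewZ == f
}
// clip.w equals viewZ, so clipZ(viewZ, n, f) / viewZ is the usual 0-1 post-divide depth;
// without the divide, the value runs from 0 up to roughly the far-plane distance ("0 - X").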

 

One thing: how do you feed IN.Position? Commonly the fourth component is set manually to 1.0 in the 4D object-space vector before transforming. Is it streamed as 1.0 or as some arbitrary value? Try setting it to 1.0 explicitly.

Also, outputting a floating-point value from the fragment function to the target texture actually performs a 0.0-1.0 to byte 0-255 conversion and does not store it as is. I don't know the details though.
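In GLSL terms, forcing the fourth component explicitly would look something like this (a sketch using GameMaker's GLSL-side attribute and matrix names, which is an assumption about the project's vertex format):

attribute vec3 in_Position; // assumed 3-component position attribute

void main()
{
    // force w = 1.0 instead of relying on whatever the vertex stream provides
    gl_Position = gm_Matrices[MATRIX_WORLD_VIEW_PROJECTION] * vec4(in_Position, 1.0);
}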

