Compute Shader Works Intel not Nvidia

19 comments, last by NikiTo 3 years, 11 months ago

Hello,

I'm trying to run a compute shader to generate a coarse grid of intensities for an MRI volumetric scan. When I run the program on an Intel integrated GPU it works, but on an Nvidia card it doesn't.

Intel Result:

Nvidia Result:

Shader Code:

https://pastebin.com/ZxzxfiCR


Are there any D3D errors? Do you have that enabled? Is it D3D11 or D3D12? You haven't provided very much info…

Ah yes, sorry. It is DX11 and the debug device is enabled; the code run on the Intel and Nvidia GPUs is identical. Both images are taken from RenderDoc. The data is stored as R8_UNORM but generated into a RWTexture of uint; a staging texture is used to get the data from the GPU so I can cache it out to disk for future reuse.

You use “SV_DispatchThreadID”, could you show us the Dispatch(?, ?, ?) call parameters?

It uses the volume dimensions and voxels per cell; it's the same in both the Nvidia and Intel cases. In my case it's 2, 4, 4.

m_GraphicsDevice->Dispatch((Uint32)((width / m_GridData.m_VoxelsPerCell.x) / 8), (Uint32)((height / m_GridData.m_VoxelsPerCell.y)/8),
							   (Uint32)((depth / m_GridData.m_VoxelsPerCell.z)/8));

Is it possible to try switching to Texture2DArray instead of Texture3D and see if anything changes? (Switching both in the shader and CPU side UAV creation)

If i understand it correctly, you attempt to get a cube-

But you have 2, 4, 4, shouldn't those three be equal?
Do you still get two different results between Intel and NV with 4, 4, 4?

It smells like an addressing problem to me.

@NikiTo It's MRI or CT volumetric data, so it doesn't have to be cubic. For instance, the skull I'm using is 128,256,256 and thus reduces to a 2,4,4 dispatch for 8,8,8 threads when using 8,8,8 voxels per coarse grid cell. Wonder why the addressing works on Intel but not Nvidia though?
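The arithmetic behind that 2,4,4 figure can be sketched as below. This is only an illustration under stated assumptions, not the code from the thread: `Dim3`, `divRoundUp`, and `computeGroupCounts` are hypothetical names, and the round-up division is a swapped-in choice. The Dispatch call posted earlier uses truncating integer division, which gives the same counts only when every dimension divides evenly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for this sketch; not part of the original code.
struct Dim3 {
    std::uint32_t x, y, z;
};

// Integer division that rounds up, so a partial cell or partial thread
// group at the edge of the volume is still covered.
inline std::uint32_t divRoundUp(std::uint32_t a, std::uint32_t b) {
    assert(b != 0);
    return (a + b - 1) / b;
}

// Thread groups per axis: volume size -> coarse grid size -> group count.
// threadsPerGroup is assumed to match the shader's [numthreads(...)] values.
inline Dim3 computeGroupCounts(Dim3 volume, Dim3 voxelsPerCell, Dim3 threadsPerGroup) {
    return Dim3{
        divRoundUp(divRoundUp(volume.x, voxelsPerCell.x), threadsPerGroup.x),
        divRoundUp(divRoundUp(volume.y, voxelsPerCell.y), threadsPerGroup.y),
        divRoundUp(divRoundUp(volume.z, voxelsPerCell.z), threadsPerGroup.z)
    };
}
```

For a 128,256,256 volume with 8,8,8 voxels per cell and 8,8,8 threads per group this yields 2,4,4, the same as the truncating version. The two only differ for sizes that do not divide evenly, in which case the shader would also need a bounds check on the thread ID.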

@turanszkij Yeah, I could give it a go and see what happens.

Jman2 said:
Wonder why the addressing works on Intel but not Nvidia though?

Intel and Nvidia have different wavefront sizes (16 and 32). The way the two images differ smells to me like something related to the wavefront size differences. And maybe synchronization, but I personally would investigate addressing first.

Would it be easy for you to render only part of the skull? Just for the test. Render on both GPUs a cubic chunk of the skull of size 128x128x128 with 4, 4, 4. Just to see if the problem persists.

@NikiTo

I swapped out to a 256x256x256 data set and the result is the same:

256x256x256 data

Is it also worth noting that I generate the volumetric normals using central differences, and that this works perfectly well for both the cuboid volume and the cube volume?
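For readers unfamiliar with the technique, a central-difference gradient on a voxel grid can be sketched on the CPU as below. The actual shader is not shown in this thread, so everything here is an assumption for illustration: the `Volume` struct, the x-fastest memory layout, the clamped borders, and the 8-bit normalized intensities. Normalizing the gradient into a unit normal is left out.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU-side volume: 8-bit intensities, x varies fastest.
struct Volume {
    int width, height, depth;
    std::vector<std::uint8_t> voxels;

    // Intensity in [0, 1]; coordinates are clamped to the volume bounds.
    float at(int x, int y, int z) const {
        x = std::max(0, std::min(x, width - 1));
        y = std::max(0, std::min(y, height - 1));
        z = std::max(0, std::min(z, depth - 1));
        const std::size_t index =
            (static_cast<std::size_t>(z) * height + y) * width + x;
        return voxels[index] / 255.0f;
    }
};

// Central difference: half the difference between the two neighbours
// on each axis. The result is the (unnormalized) intensity gradient.
inline std::array<float, 3> centralDifference(const Volume& v, int x, int y, int z) {
    return std::array<float, 3>{{
        (v.at(x + 1, y, z) - v.at(x - 1, y, z)) * 0.5f,
        (v.at(x, y + 1, z) - v.at(x, y - 1, z)) * 0.5f,
        (v.at(x, y, z + 1) - v.at(x, y, z - 1)) * 0.5f
    }};
}
```

Each voxel here reads only its own neighbours and writes only its own result, so nothing depends on how threads are grouped.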

You can view the normal slices here:

https://i.gyazo.com/c86e244c630f1567f7ef4942b3da006a.mp4

