hplus0603 said:
The reason the restrictions came into place, is because to run at the frame rates we do with the shading rates we do, unpredictable flexibility must be removed. If you add that back in, performance will take a significant step backwards. Maybe your argument is “I'm OK with a GeForce 580 level scene, as long as I get to be more flexible about the vertices, pixels, and objects I render,” which is a fine opinion to have, but the general market doesn't seem to agree with this opinion.
No, that's not my point. And it's not that I don't understand the valid arguments for why full flexibility isn't always meaningful.
Some of my arguments in detail:
* Dreams does not use ROPs, yet it shows nice gfx and performance. UE5, in the case of its first demos, uses compute raster for 80%.
Conclusion: It now makes sense to consider removing ROPs, although initially they were the major feature of GPUs. This frees die area for even more compute power to compensate in cases where we are not sure yet.
As general purpose power increases, and rendering methods become more varied and diverge from traditional standards, fixed function units that accelerate only those traditional standards become less justified.
You say yourself ‘The reason the restrictions came into place…’ which is past tense. Are those reasons still valid today? Things have changed a lot. We now have very different bottlenecks.
But I'm not saying I want to remove ROPs right now, and I don't think it would make sense for mobile GPUs, for example. I just think we will get to this point in the near future.
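To make "compute raster" without ROPs a bit more concrete, here is a minimal CPU-side sketch of one commonly described approach: depth and a payload are packed into one 64-bit word, and a single atomic max does the depth test and the write together. This is not the actual code of Dreams or UE5. The names `pack_fragment`, `atomic_max` and `VisibilityBuffer` are made up for illustration, and the "larger depth wins" convention is an assumption. On a GPU the CAS loop would be a single 64-bit atomic in a compute shader.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Pack depth into the high 32 bits and a payload (e.g. a triangle ID) into the low 32 bits.
// For non-negative floats the IEEE-754 bit pattern orders like the value itself,
// so an integer max on the packed word behaves like a depth test.
// Assumption: larger depth wins (a reversed-Z style convention).
inline std::uint64_t pack_fragment(float depth, std::uint32_t payload) {
    std::uint32_t depth_bits;
    std::memcpy(&depth_bits, &depth, sizeof(depth_bits));
    return (static_cast<std::uint64_t>(depth_bits) << 32) | payload;
}

// Atomic max built from a compare-and-swap loop.
inline void atomic_max(std::atomic<std::uint64_t>& target, std::uint64_t value) {
    std::uint64_t current = target.load(std::memory_order_relaxed);
    while (current < value &&
           !target.compare_exchange_weak(current, value, std::memory_order_relaxed)) {
        // On failure 'current' holds the latest stored value; retry while 'value' still wins.
    }
}

// Hypothetical visibility buffer: one packed 64-bit word per pixel, no blending hardware involved.
struct VisibilityBuffer {
    int width;
    int height;
    std::vector<std::atomic<std::uint64_t>> texels;

    VisibilityBuffer(int w, int h)
        : width(w), height(h), texels(static_cast<std::size_t>(w) * h) {
        for (auto& t : texels) t.store(0, std::memory_order_relaxed);  // clear to "nothing drawn"
    }

    // Depth test and write happen in one atomic operation.
    void write(int x, int y, float depth, std::uint32_t payload) {
        atomic_max(texels[static_cast<std::size_t>(y) * width + x], pack_fragment(depth, payload));
    }

    // The low 32 bits hold the payload of the winning fragment.
    std::uint32_t payload_at(int x, int y) const {
        return static_cast<std::uint32_t>(
            texels[static_cast<std::size_t>(y) * width + x].load(std::memory_order_relaxed));
    }
};
```

Many threads can call `write` on the same pixel in any order and the fragment with the largest depth value survives, which is the part of the job the sketch takes over from fixed function output hardware.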
* I've heard one argument very often in response to my request to expose the BVH of raytracing: ‘HW vendors need blackbox and compromised flexibility to get the ball rolling; Initial fixed function acceleration can't support full flexibility yet, but just wait some years to get your flexibility.’
But that's bullshit and misses my point. My request is tangential to fixed function limitations, and I am not requesting more flexible HW at all.
The problem here is only the API, which creates the limitations, not the hardware. If the API provided specifications for all vendor specific BVH data structures, it would already solve all problems.
Supporting all formats would be a lot of work, and new chips mean new work. But only the developers who really need to build / modify the BVH on their own have to go down there. All the others can still enjoy the simple and easy to use API as is.
* Back on the topic of requesting or expecting Nanite HW acceleration, I don't think this would be a win in the long run, because it would again enforce stagnation, or again put any further progress on such tech only into the hands of HW vendors.
But that's wrong. It's our job to invent such software solutions, not the job of chip makers. And to do our job, we need flexibility so we can develop it further at all. If it's fixed function, this opportunity is lost. And it's not that Nanite is the one and only final solution to LOD.
It's nice, but it's not the end of the road. We'll have to explore many other and further roads as well. Which won't happen if there is HW acceleration for the current state of the art. We could not beat it, so there would be no point in improving it further. We'd be stuck again.
But more importantly: I see no need for HW acceleration at all. Nanite is fast, although it enjoys full flexibility. And from my perspective this applies to other future solutions of classical rendering problems as well. I have such a solution for realtime GI, but I could not even implement it if there were no flexible compute shaders. It uses raytraced visibility, but I could not speed it up using HW tracers, even if the flexibility were there.
I just think we are at a point where flexibility is most important in general, and fixed function units rarely help with the remaining bottlenecks.
The sad thing, imo, is this: Even though we now have some nice things like Nanite, which proves how flexible general purpose computing completely destroys former fixed function brute force solutions,
people still don't get it, and even ask to go back to the stale middle ages of ‘fixed function is the only way’.
That's what I'm trying to say.
It's about time game devs finally learn what compute enables, how to use it, and for what. Sorry about the disrespect, but this just did not happen during the PS4 era.