Strange intermittent bottleneck while swapping textures


Got a weird issue that I've been able to drill down to a specific point, but I'm now hitting a roadblock. I've been working on an engine for a while and recently noticed an intermittent (drastic) drop in framerate for a single frame. The first time it happens is almost exactly 30 seconds into the application runtime, every time. I profiled the application and discovered that the issue is somehow related to binding textures: as soon as the drop happened, the profiler reported that my method responsible for binding textures and passing uniform data was taking up to 80ms to complete (tested multiple times; results varied anywhere between 20ms and 80ms).

I find it strange that the issue is this intermittent. The first occurrence is at 30 seconds; after that it might happen again, and it might not, but when it does it's several minutes past the first occurrence. Other than the first time, there's no recognizable pattern. Can anyone think of anything that might be causing this behavior? It almost feels like there's a race condition on the GPU.
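In case it helps, this is roughly how I'm catching the spike per call site (a minimal sketch using std::chrono; the real profiler integration in the engine is more involved, and the 2ms threshold is arbitrary):

```cpp
#include <chrono>
#include <cstdio>

// Minimal per-call timing: wrap a suspect call and log any frame where it
// takes longer than a couple of milliseconds.
template <typename Fn>
void timeCall(const char* label, Fn&& fn)
{
	auto start = std::chrono::steady_clock::now();
	fn();
	auto end = std::chrono::steady_clock::now();
	double ms = std::chrono::duration<double, std::milli>(end - start).count();
	if (ms > 2.0)
		std::printf("%s spiked: %.2f ms\n", label, ms);
}

// Usage inside Scene::draw, e.g.:
// timeCall("useHdr", [&] { gpu->useHdr(s.getActiveHdr()); });
```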

Below is a stripped-down version of what's going on (if anyone would like to see the full codebase, it can be accessed via the “massive_refactor” branch of the repo: https://github.com/vellocet3d/vellocet3d/tree/massive_refactor)

// Scene::draw occurs every tick; there's a lot more code within the actual method, this is just the relevant part

void Scene::draw(float alpha)
{
	// With the calls in this order, the profiler reports useHdr as the bottleneck
	if (r.getShader() != gpu->getActiveShader())
		gpu->useShader(r.getShader());
	if (s.getActiveHdr() != gpu->getActiveHdr())
		gpu->useHdr(s.getActiveHdr());
	if (r.getMesh() != gpu->getActiveMesh())
		gpu->useMesh(r.getMesh());
	if (r.getMaterial() != gpu->getActiveMaterial())
		gpu->useMaterial(r.getMaterial());

	// With the calls reordered as below (commented out), the profiler reports useMaterial as the bottleneck
	// if (r.getShader() != gpu->getActiveShader())
		// gpu->useShader(r.getShader());
	// if (r.getMaterial() != gpu->getActiveMaterial())
		// gpu->useMaterial(r.getMaterial());
	// if (s.getActiveHdr() != gpu->getActiveHdr())
		// gpu->useHdr(s.getActiveHdr());
	// if (r.getMesh() != gpu->getActiveMesh())
		// gpu->useMesh(r.getMesh());
	
	
	for (auto& a : r.actors.getAll())
		this->drawActor(a, alpha);
}


void GPU::useHdr(HDR* h)
{
	this->activeHdr = h;
	
	// bind pre-computed IBL data
	glActiveTexture(GL_TEXTURE0);
	glBindTexture(GL_TEXTURE_CUBE_MAP, h->irradianceMap);
	this->setShaderInt("irradianceMap", 0);
	
	glActiveTexture(GL_TEXTURE1);
	glBindTexture(GL_TEXTURE_CUBE_MAP, h->prefilterMap);
	this->setShaderInt("prefilterMap", 1);
	
	glActiveTexture(GL_TEXTURE2);
	glBindTexture(GL_TEXTURE_2D, h->brdfLUTTexture);
	this->setShaderInt("brdfLUT", 2);
}

void GPU::useMaterial(Material* m)
{
	this->activeMaterial = m;

	glActiveTexture(GL_TEXTURE3);
	glBindTexture(GL_TEXTURE_2D, m->albedo->id);
	this->setShaderInt("albedoMap", 3);
	
	glActiveTexture(GL_TEXTURE4);
	glBindTexture(GL_TEXTURE_2D, m->normal->id);
	this->setShaderInt("normalMap", 4);
	
	glActiveTexture(GL_TEXTURE5);
	glBindTexture(GL_TEXTURE_2D, m->metallic->id);
	this->setShaderInt("metallicMap", 5);
	
	glActiveTexture(GL_TEXTURE6);
	glBindTexture(GL_TEXTURE_2D, m->roughness->id);
	this->setShaderInt("roughnessMap", 6);
	
	glActiveTexture(GL_TEXTURE7);
	glBindTexture(GL_TEXTURE_2D, m->ao->id);
	this->setShaderInt("aoMap", 7);
}

void GPU::setShaderInt(const std::string& name, int value) const
{
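	// glGetUniformLocation is looked up on every call here (no caching of uniform locations)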
	glUniform1i(glGetUniformLocation(
		this->activeShader->id,
		name.c_str()),
		value);
}

**EDIT**

Fun fact: I installed NVIDIA Nsight to see if it might give me some clues as to what's going on. Whenever I run the executable through Nsight… the problem's gone. Solid framerate, no drops; the draw method even appears to complete 2-3 times faster 0_0. Starting to think I've missed an OpenGL call somewhere and, when running under Nsight, Nsight is making that missing call… possibly? Maybe?


Well... I added logic to enable the OpenGL debug context, only the error does not happen when the debug context is enabled, same as when running under Nsight. As soon as I disable the debug context logging, the issue resumes 0_0.

What on earth could be going on where enabling the debug context "corrects" the issue?
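For reference, this is roughly how the debug context is being enabled (a minimal sketch assuming GLFW plus a GL loader like glad and an OpenGL 4.3+ context; the callback body is just illustrative):

```cpp
#include <glad/glad.h>   // or whichever GL loader the project uses
#include <GLFW/glfw3.h>
#include <cstdio>

// Print every message the driver emits through the debug output channel.
static void APIENTRY debugCallback(GLenum source, GLenum type, GLuint id,
	GLenum severity, GLsizei length, const GLchar* message, const void* userParam)
{
	std::fprintf(stderr, "GL debug: %s\n", message);
}

// Request a debug context before creating the window:
// glfwWindowHint(GLFW_OPENGL_DEBUG_CONTEXT, GLFW_TRUE);

// Then, once the context is current and the loader is initialized:
// glEnable(GL_DEBUG_OUTPUT);
// glEnable(GL_DEBUG_OUTPUT_SYNCHRONOUS);
// glDebugMessageCallback(debugCallback, nullptr);
```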

Thought I had it figured out after caching shader uniform locations, but false alarm. Attempting to drill down deeper with the profiler, it seems it may be something within GLFW at this point (or possibly how I'm using it, or something I'm not doing that I should be).
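The caching I tried looks roughly like this (a sketch only; the `uniformLocations` member on the shader is illustrative, not the exact code in the repo):

```cpp
#include <string>
#include <unordered_map>
// Assumes a GL loader header (e.g. glad) is included elsewhere in this translation unit.

// Look up each uniform location once per shader and reuse it, instead of
// calling glGetUniformLocation every frame.
void GPU::setShaderInt(const std::string& name, int value) const
{
	auto& cache = this->activeShader->uniformLocations; // assumed std::unordered_map<std::string, GLint>
	auto it = cache.find(name);
	if (it == cache.end())
		it = cache.emplace(name, glGetUniformLocation(this->activeShader->id, name.c_str())).first;
	glUniform1i(it->second, value);
}
```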

You have a state issue, man?

I would try to query the state from the GPU with and without the debug context.

Or query it without the debug context and check the Nvidia Nsight state. See the difference.
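Something like this, dumped in both configurations, would show whether any binding state differs (a rough sketch; assumes a GL loader header is included, and you'd extend the list of pnames to whatever state you care about):

```cpp
#include <cstdio>

// Dump a few pieces of GL binding state so runs with and without the
// debug context can be diffed.
static void dumpGlState()
{
	GLint activeTexture = 0, boundTex2D = 0, program = 0;
	glGetIntegerv(GL_ACTIVE_TEXTURE, &activeTexture);
	glGetIntegerv(GL_TEXTURE_BINDING_2D, &boundTex2D);
	glGetIntegerv(GL_CURRENT_PROGRAM, &program);
	std::printf("active texture unit: 0x%x, bound 2D texture: %d, program: %d\n",
		activeTexture, boundTex2D, program);
}
```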

EddieSpina said:

You have a state issue, man?

I would try to query the state from the GPU with and without the debug context.

Or query it without the debug context and check the Nvidia Nsight state. See the difference.

Yeah, something odd is definitely going on. I'm narrowing it down. I've been able to track it back to GLFW's `glfwSwapBuffers()` call, AND the issue persists if I remove my entire draw method, i.e. when I simply clear and swap buffers every tick. I think I'm going to set up an MVP project with nothing but GLFW and see if I can reproduce the behavior there. This could wind up being an issue with GLFW, or GLFW in relation to my machine.
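The repro I have in mind is basically just this (a minimal sketch assuming GLAD as the loader: create a window, then do nothing but clear and swap every frame):

```cpp
#include <glad/glad.h>
#include <GLFW/glfw3.h>

int main()
{
	if (!glfwInit())
		return -1;

	GLFWwindow* window = glfwCreateWindow(1280, 720, "swap test", nullptr, nullptr);
	if (!window) { glfwTerminate(); return -1; }

	glfwMakeContextCurrent(window);
	gladLoadGLLoader((GLADloadproc)glfwGetProcAddress);
	glfwSwapInterval(1); // vsync on (assumed; match whatever the engine uses)

	// Nothing but clear + swap: if the stall still shows up here,
	// the engine's draw code is off the hook.
	while (!glfwWindowShouldClose(window))
	{
		glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
		glfwSwapBuffers(window);
		glfwPollEvents();
	}

	glfwTerminate();
	return 0;
}
```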

This wound up being due to Nvidia's “Threaded Optimization” setting. Turning it off resolves the issue. I ended up including Nvidia's NVAPI library in my project in order to create an application profile with threaded optimization disabled.
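For anyone who lands here later, the NVAPI side looks roughly like this (a sketch only, assuming the driver-settings headers nvapi.h / NvApiDriverSettings.h; error handling and the profile/application name strings are omitted, so treat the struct setup as approximate and check the NVAPI docs):

```cpp
#include <nvapi.h>
#include <NvApiDriverSettings.h>

// Create a driver profile for this application with OGL_THREAD_CONTROL
// set to "disabled" (i.e. Threaded Optimization off).
static void disableThreadedOptimization()
{
	NvAPI_Initialize();

	NvDRSSessionHandle session = nullptr;
	NvAPI_DRS_CreateSession(&session);
	NvAPI_DRS_LoadSettings(session);

	NVDRS_PROFILE profile = {};
	profile.version = NVDRS_PROFILE_VER;
	// profile.profileName is an NvAPI_UnicodeString; fill it with the profile name.

	NvDRSProfileHandle hProfile = nullptr;
	NvAPI_DRS_CreateProfile(session, &profile, &hProfile);

	NVDRS_APPLICATION app = {};
	app.version = NVDRS_APPLICATION_VER;
	// app.appName is an NvAPI_UnicodeString; fill it with the executable name.
	NvAPI_DRS_CreateApplication(session, hProfile, &app);

	NVDRS_SETTING setting = {};
	setting.version = NVDRS_SETTING_VER;
	setting.settingId = OGL_THREAD_CONTROL_ID;
	setting.settingType = NVDRS_DWORD_TYPE;
	setting.u32CurrentValue = OGL_THREAD_CONTROL_DISABLE;
	NvAPI_DRS_SetSetting(session, hProfile, &setting);

	NvAPI_DRS_SaveSettings(session);
	NvAPI_DRS_DestroySession(session);
	NvAPI_Unload();
}
```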

This topic is closed to new replies.
