
Memory leak in my DX11 program

Started by November 12, 2019 07:58 PM
18 comments, last by joetext 4 years, 9 months ago

Hey all, I'm at my wit's end with this weird memory leak. My issue is with the small preview-render view that comes up on the Windows 10 taskbar when you hover over your application. I don't know the name of it, but it's the little thumbnail view that pops up when you have your mouse hovered over a minimized application. When I do this with my DX11 program, memory absolutely rockets up; I'm talking going from 1 GB to 4 GB in a few seconds. I managed to find a single line that I can comment out and "fix" the leak, but it's not making any sense to me. The graphics portions of my program are somewhat fragmented between different abstraction layers, so bear with me.


// the code that actually handles the render calls

// map buffers and copy data from cpu buffers
m_context->copyDataToBuffer((void*)verts.data(),
sizeof(Vertex) * verts.size(),
m_vertexBuffer);

m_context->copyDataToBuffer((void*)indicies.data(),
sizeof(uint) * indicies.size(),
m_indexBuffer);

m_context->copyDataToBuffer((void*)uniforms.data(),
sizeof(HuiUniforms) * uniforms.size(),
m_uniformBuffer);


// render
ISamplerState* sampler = shaderSys->getSamplerState();
m_context->setVertexBuffer(m_vertexBuffer); // this is the line that is causing the leak
m_context->setIndexBuffer(m_indexBuffer);
m_context->setUniformBuffer(m_uniformBuffer, ShaderType::Vert);
m_context->setUniformBuffer(m_uniformBuffer, ShaderType::Frag);
m_context->setTopology(PrimativeTopology::TriangleList);
m_context->setSamplerState(sampler);
Texture2D** textureArr = &diffuseTexture;
m_context->setTextures(textureArr, 1);
m_context->drawIndexed(indicies.size());

m_vertexBuffer is an abstraction over an ID3D11Buffer. Here's the code for that setVertexBuffer call:


void D3DRenderContext::setVertexBuffer(IBuffer* buffer)
{
	D3DBuffer* d3dBuffer = static_cast<D3DBuffer*>(buffer);
	ID3D11Buffer** d3dBufferResource = d3dBuffer->getBuffer();

	UINT stride = sizeof(Vertex); // Vertex is a simple struct 
	UINT offset = 0;

	m_context->IASetVertexBuffers(0, 1, d3dBufferResource, &stride, &offset);
}

And for completeness' sake, here's the copy-buffer function:


void D3DRenderContext::copyDataToBuffer(void* data, int dataSize, IBuffer* toBuffer)
{
	D3DBuffer* d3dBuffer = static_cast<D3DBuffer*>(toBuffer);
	ID3D11Buffer** d3dBufferResource = d3dBuffer->getBuffer();

	HRESULT result;
	D3D11_MAPPED_SUBRESOURCE resource;
	// subresource index 0, no map flags; result is ASSERT-checked in the real code
	result = m_context->Map(*d3dBufferResource, 0, D3D11_MAP_WRITE_DISCARD, 0, &resource);
	memcpy(resource.pData, data, dataSize);
	m_context->Unmap(*d3dBufferResource, 0);
}

A few things: first, I removed about 15 ASSERTs to make the code shorter; any pointer that could potentially be null, and all of the DX results, are checked in the actual code. Second, I have the DX debug output enabled and it has nothing to say. Third, I've tried flushing the context after every call to drawIndexed, to no avail.
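One more diagnostic I can run: since the device is created with D3D11_CREATE_DEVICE_DEBUG anyway, I can dump every live D3D object and watch whether the object count itself grows while the preview is up. Sketch (the helper name is mine):


#include <d3d11.h>
#include <d3d11sdklayers.h> // ID3D11Debug

// Prints every live D3D object, with ref counts, to the debug output.
// Requires the device to have been created with D3D11_CREATE_DEVICE_DEBUG.
void reportLiveObjects(ID3D11Device* device)
{
	ID3D11Debug* debug = nullptr;
	if (SUCCEEDED(device->QueryInterface(__uuidof(ID3D11Debug), reinterpret_cast<void**>(&debug))))
	{
		debug->ReportLiveDeviceObjects(D3D11_RLDO_DETAIL);
		debug->Release();
	}
}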

This leak is just beyond bizarre to me; granted, I have no clue how the underpinnings of that preview window work. In the meantime, I'm going to figure out how to tell Windows to disable it, but I'd still like to know why this is happening. Any suggestions appreciated!
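For what it's worth, that thumbnail seems to be captured by the Desktop Window Manager (DWM), and from a quick look at the docs, DwmSetWindowAttribute with DWMWA_FORCE_ICONIC_REPRESENTATION looks like the knob for opting a window out of the live capture. Untested sketch, straight from the docs rather than anything I've verified in this app:


#include <windows.h>
#include <dwmapi.h>
#pragma comment(lib, "dwmapi.lib")

// Ask DWM to show a static iconic thumbnail instead of live-capturing the
// window contents for the taskbar hover preview.
void disableLivePreview(HWND hwnd)
{
	BOOL force = TRUE;
	DwmSetWindowAttribute(hwnd, DWMWA_FORCE_ICONIC_REPRESENTATION, &force, sizeof(force));
}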

You could try recording your scenario and analyzing it, as described in this doc page: https://docs.microsoft.com/en-us/windows-hardware/test/wpt/memory-footprint-optimization-exercise-2. I can't imagine any reason (outside of a driver bug) why that preview would cause your app to consume more memory.


Thanks for the suggestion. Interestingly, I can see the memory shoot up in the Visual Studio profiler, but the Visual Studio heap-snapshot inspection tools themselves don't show any change in size. Naive guess: because DX11 owns the memory? I'll make sure I'm up to date on drivers and see if anything changes. I use a Microsoft Surface Book, and it has had a ton of driver issues since I've gotten it, admittedly more on the audio and input side than the graphics side.
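Another cheap check I can add: the debug CRT's leak dump, which would at least tell me whether the growth is going through my heap at all. Sketch (debug builds only):


#include <crtdbg.h>

// Ask the debug CRT to dump any unfreed heap allocations when the process
// exits. If the leaked memory shows up here, it's coming through my own
// heap allocations; if not, it's being allocated elsewhere (driver, OS,
// or the D3D runtime).
int main()
{
	_CrtSetDbgFlag(_CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF);
	// ... run the app as usual ...
	return 0;
}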

I think you are not showing enough code for us to tell.

Would also help if you showed the line that "fixes" the issue.

And if you could also provide your wrapper code.

One easy way to catch a memory leak is to have a small allocation logger class.

In this class you would have something like std::unordered_map<long long, std::tuple<std::string, int>> allocationMap;

Then, before every place you allocate something, you would insert a value into the allocationMap, where the key is the address translated into a long long (for instance, the pointer of the interface if it's not actual RAM), the string is some string you give it to identify the allocation site, and the int could be the size in bytes (if applicable).

When you free, you just give the address and it removes the entry based on the interface address.

In addition, you can add asserts if there was an insert of a null address, or a removal of something that doesn't exist in the map.

This is also useful to find VRAM allocation issues, as often you don't have profilers for VRAM.
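A bare-bones version of what I mean (names are placeholders, and I've used uintptr_t for the key instead of long long):


#include <cassert>
#include <cstdint>
#include <string>
#include <tuple>
#include <unordered_map>

// Tracks every live allocation: key is the address (a CPU pointer or a D3D
// interface pointer alike), value is a tag naming the allocation site plus
// a byte count where applicable.
class AllocationLogger
{
public:
	void onAlloc(const void* ptr, std::string tag, int bytes = 0)
	{
		assert(ptr != nullptr); // catch failed/null allocations
		auto key = reinterpret_cast<std::uintptr_t>(ptr);
		assert(m_map.find(key) == m_map.end()); // catch double inserts
		m_map[key] = std::make_tuple(std::move(tag), bytes);
	}

	void onFree(const void* ptr)
	{
		auto key = reinterpret_cast<std::uintptr_t>(ptr);
		assert(m_map.find(key) != m_map.end()); // catch double/stray frees
		m_map.erase(key);
	}

	// whatever is still in here at shutdown was never released
	const std::unordered_map<std::uintptr_t, std::tuple<std::string, int>>& liveAllocations() const
	{
		return m_map;
	}

private:
	std::unordered_map<std::uintptr_t, std::tuple<std::string, int>> m_map;
};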

Notice you are talking about RAM, not VRAM, so somehow I doubt it's the actual DX object that leaks; buffers are stored mostly in VRAM, and you would see the leak in VRAM as well.

My guess is that maybe your wrapper is leaking, or maybe while the app is minimized, something gets allocated and never released, because you might release it only in the renderer, and it doesn't render in that state.
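If you want to check the VRAM side directly, DXGI can report per-process usage; sketch below (IDXGIAdapter3 comes from dxgi1_4.h and needs Windows 10). If RAM climbs while this number stays flat, the leak is on the CPU side rather than in the GPU resources:


#include <d3d11.h>
#include <dxgi1_4.h>

// Returns how many bytes this process has committed on the adapter's local
// (dedicated) memory segment.
UINT64 queryLocalVramUsage(ID3D11Device* device)
{
	UINT64 usage = 0;
	IDXGIDevice* dxgiDevice = nullptr;
	if (SUCCEEDED(device->QueryInterface(__uuidof(IDXGIDevice), reinterpret_cast<void**>(&dxgiDevice))))
	{
		IDXGIAdapter* adapter = nullptr;
		if (SUCCEEDED(dxgiDevice->GetAdapter(&adapter)))
		{
			IDXGIAdapter3* adapter3 = nullptr;
			if (SUCCEEDED(adapter->QueryInterface(__uuidof(IDXGIAdapter3), reinterpret_cast<void**>(&adapter3))))
			{
				DXGI_QUERY_VIDEO_MEMORY_INFO info = {};
				if (SUCCEEDED(adapter3->QueryVideoMemoryInfo(0, DXGI_MEMORY_SEGMENT_GROUP_LOCAL, &info)))
				{
					usage = info.CurrentUsage;
				}
				adapter3->Release();
			}
			adapter->Release();
		}
		dxgiDevice->Release();
	}
	return usage;
}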

Thanks for the response, Zurtan. I will post some more of the wrapper code, but in the meantime, the line that fixes the issue has a comment in the first block I posted.


m_context->setVertexBuffer(m_vertexBuffer); // this is the line that is causing the leak

Updated to the most recent drivers, so I think you're right that it's something in the wrapper, but I'm not sure what. I'll add the allocation map as well; that's a nice idea. I don't intentionally have any logic that switches on minimization, so I'm not sure why it would make a difference, but I'll go through all my WinAPI code to make sure.


Here's some more wrapper code. This is the init function.


// this init code is run once and only once
void MeshRenderSys::init()
{
	m_device = _getSystem<RenderSystem>()->getDevice();
	m_context = _getSystem<RenderSystem>()->getContext();

	// vertex buffer
	BufferDescription vdescr = {};
	vdescr.usage = BufferUsageType::Dynamic;
	vdescr.size = sizeof(Vertex) * kVertexBufferSize; 
	vdescr.bindType = BufferBindType::Vertex;
	vdescr.access = BufferAccessType::Write;

	// index buffer
	BufferDescription idescr = {};
	idescr.usage = BufferUsageType::Dynamic;
	idescr.size = sizeof(uint) * kVertexBufferSize;
	idescr.bindType = BufferBindType::Index;
	idescr.access = BufferAccessType::Write;

	// uniform buffer
	BufferDescription udescr = {};
	udescr.usage = BufferUsageType::Dynamic;
	udescr.size = sizeof(MeshUniforms);
	udescr.bindType = BufferBindType::Constant;
	udescr.access = BufferAccessType::Write;

	m_vertexBuffer = m_device->createBuffer(BufferType::Vertex, vdescr);
	m_indexBuffer = m_device->createBuffer(BufferType::Index, idescr);
	m_uniformBuffer = m_device->createBuffer(BufferType::Uniform, udescr);

}

And here's the createBuffer function referenced there:



IBuffer* D3DRenderDevice::createBuffer(BufferType type, BufferDescription descr)
{
	D3DBuffer* buffer = new D3DBuffer(type);
	ID3D11Buffer** d3dBuffer = buffer->getBuffer();

	D3D11_BUFFER_DESC d3dDescr = d3dCreateD3DBufferDescription(descr); // merely creates a d3d buffer descr from my abstract buffer descr class

	HRESULT result = m_device->CreateBuffer(
		&d3dDescr,
		NULL,
		d3dBuffer);
	ASSERT(result >= 0);

	return static_cast<IBuffer*>(buffer);
}

Here's the abstract input-layout code, also only run once:



void MeshRenderSys::initShaderLayout()
{
	ShaderSys* shaderSystem = _getSystem<ShaderSys>();

	// input layout
	vector<InputLayoutItem> inputLayout;
	inputLayout.push_back({ "POSITION", DatatypeFormat::R32G32B32_Float });
	inputLayout.push_back({ "NORMAL", DatatypeFormat::R32G32B32_Float });
	inputLayout.push_back({ "BINORMAL", DatatypeFormat::R32G32B32_Float });
	inputLayout.push_back({ "TANGENT", DatatypeFormat::R32G32B32_Float });
	inputLayout.push_back({ "COLOR", DatatypeFormat::R32G32B32A32_Float });
	inputLayout.push_back({ "TEXCOORD", DatatypeFormat::R32G32_Float });
	inputLayout.push_back({ "BLENDINDICES", DatatypeFormat::R32G32B32A32_Uint });
	inputLayout.push_back({ "BLENDWEIGHT", DatatypeFormat::R32G32B32A32_Float });

	ASSERT(shaderSystem->isShaderLoaded("shaders.shader", ShaderType::Vert));
	IShader* shader = shaderSystem->getVertexShader("shaders.shader");
	m_layout = m_device->createInputLayout(inputLayout, shader);
}

And the code it wraps:



IInputLayout* D3DRenderDevice::createInputLayout(vector<InputLayoutItem> layoutItems, IShader* shader)
{
	ASSERT(shader != nullptr);
	ASSERT(shader->getType() == ShaderType::Vert);

	D3DInputLayout* layout = new D3DInputLayout();
	ID3D11InputLayout** d3dLayout = layout->getLayout();
	D3DShader* d3dShader = static_cast<D3DShader*>(shader);

	D3D11_INPUT_ELEMENT_DESC* descr = d3dInputElementDescription(layoutItems);

	HRESULT result = m_device->CreateInputLayout(descr, layoutItems.size(), 
		(*d3dShader->getBlob())->GetBufferPointer(), (*d3dShader->getBlob())->GetBufferSize(),
		d3dLayout);
	ASSERT(result >= 0);

	return static_cast<IInputLayout*>(layout);
}

And here's an expanded version of the render code I showed above, with more of the context:



void MeshRenderSys::render(RenderPass pass, RenderOptions options)
{
	// start draw pass
	ShaderSys* shaderSys = _getSystem<ShaderSys>();
	LightingSystem* lightingSys = _getSystem<LightingSystem>();
	CoCamera* coCam = _getSystem<CameraSystem>()->getPrimaryCam();

	// wait until shaders are loaded to start drawing
	if (shaderSys->isShaderLoaded("shaders.shader", ShaderType::Vert) &&
		shaderSys->isShaderLoaded("shaders.shader", ShaderType::Frag))
	{
		// set shaders
		shaderSys->setShader("shaders.shader", ShaderType::Vert);
		shaderSys->setShader("shaders.shader", ShaderType::Frag);

		// create the input layout if it hasn't been done yet
		if (m_layout == nullptr)
		{
			initShaderLayout();
		}

		//set vert layout
		m_context->setInputLayout(m_layout);

		for (int i = 0; i < m_sortedMeshes.size(); i++)
		{
			CoMesh* coMesh = std::get<0>(m_sortedMeshes[i]);
			Resource<Mesh>* meshRes = coMesh->getMesh().get();
			Entity* meshEnt = coMesh->getEntity();
			Entity* camEnt = coCam->getEntity();

			if (meshRes != nullptr)
			{
				bool render = true;

				// frustum culling
				if (!coCam->isBoxInsideFrustumFast(coMesh->getBounds(), meshEnt))
				{
					render = false;
				}

				if (render)
				{
					Mesh* mesh = meshRes->get();
					ASSERT(mesh != nullptr);

					// ===================== uniform buffer ================================
					vector<MeshUniforms> uniforms;
					MeshUniforms test = {};
					test.worldMat = mat4(meshEnt->position(), meshEnt->rotation(), meshEnt->scale()).transposed();
					test.viewMat = coCam->getViewMatrix().transposed();
					test.projectionMat = coCam->getProjectionMatrix().transposed();

					ASSERT(lightingSys->getLights().size() <= 1); // don't support multiple lights yet
					CoDirectionalLight* light = lightingSys->getLights()[0];

					test.cameraPosition = vec4(camEnt->position(), 0.f);
					test.ambientLightColor = vec4(.15f, .15f, .15f, 1.f);
					test.directionalLightDir = vec4(light->getEntity()->rotation().toEuler(), 0.f);
					test.directionalLightColor = color::toVec(light->getColor());
					test.specularLightColor = vec4(1.f, 1.f, 1.f, 1.f);
					test.specularLightIntensity = 0.2f;
					test.materialAlpha = coMesh->getMaterial()->getAlpha();

					FogSystem* fogSys = _getSystem<FogSystem>();
					if (fogSys != nullptr)
					{
						test.fogTypeMask |= 1 << (int32)fogSys->getFogType();
						test.fogColor = color::toVec(fogSys->getFogColor());
						test.fogNear = fogSys->getFogNear();
						test.fogFar = fogSys->getFogFar();
						test.fogDensity = fogSys->getFogDensity();
					}
					else
					{
						test.fogTypeMask = 0;
						test.fogColor = color::toVec(color::white());
						test.fogNear = 0.f;
						test.fogFar = 0.f;
						test.fogDensity = 0.f;
					}

					//textures
					Texture2D* textures[kNumTextureSlots];
					test.textureUseMask = 0;
					test.textureOffset = vec2(0.f, 0.0f);
					for (int i = 0; i < kNumTextureSlots; i++)
					{
						textures[i] = nullptr;
					}
					if (coMesh->getDiffuseTex() != nullptr)
					{
						textures[TEXTURE_SLOT_DIFFUSE] = coMesh->getDiffuseTex();
						test.textureUseMask |= 1 << TEXTURE_SLOT_DIFFUSE;
					}
					if (coMesh->getBlendTex() != nullptr)
					{
						textures[TEXTURE_SLOT_BLEND] = coMesh->getBlendTex();
						test.textureUseMask |= 1 << TEXTURE_SLOT_BLEND;
					}
					if (coMesh->getAlphaTex() != nullptr)
					{
						textures[TEXTURE_SLOT_ALPHA] = coMesh->getAlphaTex();
						test.textureUseMask |= 1 << TEXTURE_SLOT_ALPHA;
					}
					if (coMesh->getLightTex() != nullptr)
					{
						textures[TEXTURE_SLOT_LIGHT] = coMesh->getLightTex();
						test.textureUseMask |= 1 << TEXTURE_SLOT_LIGHT;
					}
					if (coMesh->getNormalTex() != nullptr)
					{
						textures[TEXTURE_SLOT_NORMAL] = coMesh->getNormalTex();
						test.textureUseMask |= 1 << TEXTURE_SLOT_NORMAL;
					}
					if (coReflect != nullptr && !options.reflect.reflectionPass)
					{
						textures[TEXTURE_SLOT_REFLECT] = coReflect->getReflectTexture();
						test.textureUseMask |= 1 << TEXTURE_SLOT_REFLECT;
					}

					// animations
					if (mesh->getArmature() != nullptr)
					{
						vector<mat4> boneMats = coMesh->getBoneMatrices();

						ASSERT(mesh->getArmature()->bones.size() < kMaxBones);
						int maxBones = int_min(kMaxBones, mesh->getArmature()->bones.size());
						for (int bIndex = 0; bIndex < maxBones; bIndex++)
						{
							test.bones[bIndex] = boneMats[bIndex];
						}
					}

					uniforms.push_back(test);

					// ===================== uniform buffer end ================================

					// map buffers
					m_context->copyDataToBuffer((void*)mesh->getVerts().data(),
						sizeof(Vertex) * mesh->getVerts().size(),
						m_vertexBuffer);

					m_context->copyDataToBuffer((void*)mesh->getIndices().data(),
						sizeof(uint) * mesh->getIndices().size(),
						m_indexBuffer);
					
					m_context->copyDataToBuffer((void*)uniforms.data(),
						sizeof(MeshUniforms) * uniforms.size(),
						m_uniformBuffer);

					// render
					ISamplerState* sampler = shaderSys->getSamplerState();
					m_context->setVertexBuffer(m_vertexBuffer);
					m_context->setIndexBuffer(m_indexBuffer);
					m_context->setUniformBuffer(m_uniformBuffer, ShaderType::Vert);
					m_context->setUniformBuffer(m_uniformBuffer, ShaderType::Frag);
					m_context->setTopology(PrimativeTopology::TriangleList);
					m_context->setSamplerState(sampler);
					Texture2D** textureArr = &textures[0];
					m_context->setTextures(textureArr, kNumTextureSlots);
					m_context->drawIndexed(mesh->getIndices().size());
					
				}
			}
		}
	}
}

And my device init, because why not:



void D3DSystem::createSwapchain(SwapchainDescription swapchainDescription,
	ISwapChain* &outSwapchain, 
	IRenderDevice* &outRenderDevice, 
	IRenderContext* &outRenderContext)
{
	// get abstract window handle and cast it to the platform dependent window handle
	shared_ptr<IWindowHandle> window = _getSystem<OSSystem>()->getOS()->getWindowHandle();
	ASSERT(window != nullptr);
	WindowsWindowHandle* winWindow = static_cast<WindowsWindowHandle*>(window.get());
	ASSERT(winWindow != nullptr);
	HWND hwnd = winWindow->getHWND();

	DXGI_SWAP_CHAIN_DESC descr = d3dSwapchainDescription(swapchainDescription, hwnd);

	D3DSwapChain* swapchain = new D3DSwapChain();
	D3DRenderDevice* device = new D3DRenderDevice();
	D3DRenderContext* context = new D3DRenderContext();

	IDXGISwapChain** d3dSwapchainHandle = swapchain->getSwapchain();
	ID3D11Device** d3dDeviceHandle = device->getDevice();
	ID3D11DeviceContext** d3dContextHandle = context->getContext();

	HRESULT result = D3D11CreateDeviceAndSwapChain(NULL,
		D3D_DRIVER_TYPE_HARDWARE,
		NULL,
		D3D11_CREATE_DEVICE_DEBUG, // TODO: switch on debug versus release
		NULL,
		NULL,
		D3D11_SDK_VERSION,
		&descr,
		d3dSwapchainHandle,
		d3dDeviceHandle,
		NULL,
		d3dContextHandle);

	ASSERT(result >= 0);

	outSwapchain = static_cast<ISwapChain*>(swapchain);
	outRenderDevice = static_cast<IRenderDevice*>(device);
	outRenderContext = static_cast<IRenderContext*>(context);
}


If there's any more context I can provide let me know, but that's a lot of it.

As for checking leaks in my wrapper code, there are very few places in the entire app where I use new, so I'm skeptical. But I will do due diligence and triple-check I'm not leaking anything.
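One thing I might do regardless is move the wrappers over to ComPtr so the D3D interfaces release themselves. A rough sketch of what D3DBuffer could look like (my current version holds a raw ID3D11Buffer*; the other wrappers would be analogous):


#include <d3d11.h>
#include <wrl/client.h>

// RAII version of the buffer wrapper: the ComPtr releases the underlying
// ID3D11Buffer automatically when the wrapper is destroyed, so a missing
// delete or Release can't leak the GPU object.
class D3DBuffer : public IBuffer
{
public:
	explicit D3DBuffer(BufferType type) : m_type(type) {}

	// CreateBuffer and the set* calls can keep using the double-pointer form
	ID3D11Buffer** getBuffer() { return m_buffer.GetAddressOf(); }

private:
	BufferType m_type;
	Microsoft::WRL::ComPtr<ID3D11Buffer> m_buffer;
};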


Another hint is that the leak seems to be proportional to kVertexBufferSize, which is just an arbitrary constant that determines how much space you have to render a single mesh. If I take the size down, the memory leak shrinks (but doesn't go away).

The only thing that doesn't make sense to me is that you set the size of the index buffer to sizeof(uint) multiplied by kVertexBufferSize.

Yeah, that's definitely a mistake on my part; I had forgotten to make a second constant (kIndexBufferSize or something). That being said, the only point of that constant is to make the vertex and index buffers sufficiently big, so I'm sure it's not the source of the issue.
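For reference, the corrected init with that second constant would look like this (kIndexBufferSize is hypothetical, sized to whatever worst-case index count the renderer needs):


	// index buffer, sized independently of the vertex capacity
	BufferDescription idescr = {};
	idescr.usage = BufferUsageType::Dynamic;
	idescr.size = sizeof(uint) * kIndexBufferSize; // was kVertexBufferSize by mistake
	idescr.bindType = BufferBindType::Index;
	idescr.access = BufferAccessType::Write;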

It's strange that the leak only occurs when I activate that taskbar preview, and even stranger to me that there don't seem to be more people with the same issue. Another thing I can try is running the application on another PC to see if I get the same problem.

Does your app also have a console window?

There are some scenarios where Win32 apps can make your main thread get stuck.

For instance, if you have a console window and you click the mouse in it, the thread will be stuck until you press Enter.
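(That click-freeze is the console's QuickEdit mode; if it turns out to be involved, it can be disabled at startup. Sketch:)


#include <windows.h>

// Disable QuickEdit so a stray click in the console window can't pause the
// process until Enter is pressed. ENABLE_EXTENDED_FLAGS must be set for the
// QuickEdit change to take effect.
void disableQuickEdit()
{
	HANDLE input = GetStdHandle(STD_INPUT_HANDLE);
	DWORD mode = 0;
	if (GetConsoleMode(input, &mode))
	{
		mode &= ~ENABLE_QUICK_EDIT_MODE;
		mode |= ENABLE_EXTENDED_FLAGS;
		SetConsoleMode(input, mode);
	}
}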

If you have a worker thread running in the background, it might be allocating stuff.

This is only Win32 and C++, right? No C#?

This topic is closed to new replies.
