@Valakor So you're using a single VkBuffer per mesh?
Roughly-ish. It depends on the vertex streams required by the mesh, but I try to compact vertex streams from separate sub-meshes with the same vertex layout into the same buffer.
VMA seems to be very good, but as a noob I don't understand its value unless you need to allocate more than 4096 (or whatever the limit is) unique VkBuffers. My plan only needs one large vertex buffer and one index buffer, i.e. no need for VMA. Maybe it becomes a necessity for VkImages, when not using texture arrays?
For clarification, the often-used 4096 number refers to allocations (VkDeviceMemory), not buffers (VkBuffer or VkImage) - the actual limit is VkPhysicalDeviceLimits::maxMemoryAllocationCount, which the spec guarantees to be at least 4096. You can (generally) have as many buffers/images as you want, but they must be backed by this more limited number of actual memory allocations. This is why you generally need some form of custom sub-allocation scheme.
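To make the sub-allocation idea concrete, here's a minimal sketch of a bump-pointer sub-allocator. All names here are hypothetical; the point is that in real Vulkan code the block would be one VkDeviceMemory allocation, each returned offset would come with an alignment from VkMemoryRequirements, and you'd pass it to vkBindBufferMemory:

```c
#include <stdint.h>

// Hypothetical sub-allocator: one big block of "device memory" (here just
// a size), from which we hand out aligned offsets. In real code the block
// is a single VkDeviceMemory, and each offset is what you'd pass to
// vkBindBufferMemory / vkBindImageMemory.
typedef struct {
    uint64_t size; // total size of the backing allocation
    uint64_t used; // bump pointer: next free byte
} MemBlock;

#define SUBALLOC_FAILED UINT64_MAX

// Align 'v' up to the next multiple of 'alignment' (a power of two).
static uint64_t align_up(uint64_t v, uint64_t alignment) {
    return (v + alignment - 1) & ~(alignment - 1);
}

// Returns the offset of the new sub-allocation, or SUBALLOC_FAILED.
// 'size' and 'alignment' would come from VkMemoryRequirements.
uint64_t suballoc(MemBlock *block, uint64_t size, uint64_t alignment) {
    uint64_t offset = align_up(block->used, alignment);
    if (offset + size > block->size)
        return SUBALLOC_FAILED;
    block->used = offset + size;
    return offset;
}
```

Many buffers, one allocation: each call only moves an offset forward, so the VkDeviceMemory count never grows. Freeing individual sub-allocations is the part a real allocator (like VMA) has to add on top.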
VMA abstracts away some of the complexities of memory allocation. It does things like:
- Remove the need to think about device memory and allocation limits (it handles allocating chunks of memory and sub-allocating for you, within the limits of your device)
- Abstract away the ideas of memory heaps and memory type bits - you instead give it a simpler memory usage (e.g. CPU-TO-GPU or GPU-ONLY) and it figures out the correct type bits and heap to allocate from
- Handle "dedicated" allocations (VK_KHR_dedicated_allocation)
- Provide simpler map/unmap utilities
- etc.
Of course you can use more advanced features as well - it supports custom memory pools, different allocation strategies, hinting or requiring specific type bits, etc. If nothing else the library documentation is worth a read to understand many of the concepts and problems related to memory allocation in Vulkan.
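For a feel of what "figuring out the correct type bits" means underneath, here's a sketch of the classic memory-type search. The structs and flag values are simplified stand-ins I made up so the logic is self-contained; real code would use VkPhysicalDeviceMemoryProperties (filled in by vkGetPhysicalDeviceMemoryProperties) and the VK_MEMORY_PROPERTY_* flags:

```c
#include <stdint.h>

// Simplified stand-ins, mirroring VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT and
// VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT from vulkan.h.
#define DEVICE_LOCAL_BIT 0x1
#define HOST_VISIBLE_BIT 0x2

typedef struct {
    uint32_t propertyFlags;
} MemoryType;

typedef struct {
    uint32_t   memoryTypeCount;
    MemoryType memoryTypes[32];
} MemoryProperties;

// Find the first memory type index that is allowed by 'typeBits' (from
// VkMemoryRequirements::memoryTypeBits) and has all 'requiredFlags'.
// Returns -1 if no suitable type exists.
int find_memory_type(const MemoryProperties *props,
                     uint32_t typeBits, uint32_t requiredFlags) {
    for (uint32_t i = 0; i < props->memoryTypeCount; ++i) {
        int allowed  = (typeBits & (1u << i)) != 0;
        int hasFlags = (props->memoryTypes[i].propertyFlags & requiredFlags)
                       == requiredFlags;
        if (allowed && hasFlags)
            return (int)i;
    }
    return -1;
}
```

VMA's usage enums (GPU-ONLY, CPU-TO-GPU, etc.) essentially boil down to picking the required/preferred flags for a search like this, plus fallbacks when the ideal type doesn't exist on the device.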
All that being said, your specific use case determines how you'll want to allocate. Using lots of different buffers instead of compacting into 1 may be simpler, but likely has some performance overhead when actually drawing, since you have to bind buffers more often.
Agreed, for a "level" based game/engine it's probably better to use a simple linear allocator and just overwrite everything at the next level loading screen. If fragmentation becomes an issue, I may use different buffers with different "allocators": for persistent content, e.g. fonts, for long-lived content, e.g. "levels", and for short-lived transient stuff like effects.
Yes you'll likely end up with a toolbox of allocation strategies for different systems. You'll find that many classic CPU allocation strategies work great here as well: per-frame stack allocators for uniform buffers, buddy allocators and the like for transient allocations, one-off allocations for persistent content, stack sub-allocation for “levels” or “packs”, one-off pool or chunk allocators for specific systems like debug-draw vertices, etc. etc.
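A per-frame stack allocator from that toolbox might look roughly like this. It's a hypothetical sketch assuming one large persistently-mapped buffer split into one region per frame in flight; a region is reset wholesale when its frame index comes around again (i.e. once you've waited on that frame's fence, so the GPU is done with it):

```c
#include <stdint.h>

#define FRAME_COUNT 2  // frames in flight (assumption)

// Per-frame stack allocator for transient data such as uniform buffers.
// One big buffer is divided into FRAME_COUNT regions of regionSize bytes.
typedef struct {
    uint64_t regionSize;        // bytes per frame region
    uint64_t used[FRAME_COUNT]; // bump pointer within each region
    uint32_t frame;             // current frame index
} FrameAllocator;

// Call at the start of each frame, after waiting on that frame's fence.
void frame_begin(FrameAllocator *a) {
    a->frame = (a->frame + 1) % FRAME_COUNT;
    a->used[a->frame] = 0; // everything from that old frame is free again
}

// Returns a global offset into the shared buffer, or UINT64_MAX on
// overflow. On real hardware 'alignment' would be something like
// minUniformBufferOffsetAlignment from the device limits.
uint64_t frame_alloc(FrameAllocator *a, uint64_t size, uint64_t alignment) {
    uint64_t base   = (uint64_t)a->frame * a->regionSize;
    uint64_t offset = (a->used[a->frame] + alignment - 1) & ~(alignment - 1);
    if (offset + size > a->regionSize)
        return UINT64_MAX;
    a->used[a->frame] = offset + size;
    return base + offset;
}
```

No per-allocation frees, no fragmentation: the whole region is reclaimed in one shot each frame, which is exactly why this pattern is so popular for transient GPU data.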